Prosodic structure and sentence types by using large speech databases supported by deep learning techniques

Funding institution: Nemzeti Kutatási, Fejlesztési és Innovációs Hivatal

ID: K-135038
Domestic tender, Institutional tender

Principal investigator: Katalin Mády

Studying the structure of Hungarian requires a large amount of spontaneous speech data, but manual data processing is time-consuming and expensive. For this reason, we develop models based on deep neural networks to facilitate the automatic processing of speech data now and in the future. Data processing includes automatic speech recognition and the time-alignment of the annotations with the acoustic signal. In parallel, a prosodic annotation system is being developed to reveal the relevant units of Hungarian prosody and their structure. The main research questions we address on the basis of the corpora concern the identification of the communicative functions of complex sentence structure and the description of Hungarian intonation. The databases and language models developed during the project are freely available for research purposes, thus improving the usability of Hungarian language resources. The work is supported by two computer engineers (Gergely Dobsinszki, Máté Kádár) and two research assistants (Péter Csényi, Flóra Hegyi).

Duration: 2020-2024

Participating researchers

Katalin MÁDY
research group leader, senior research fellow
András Ákos BALOG
former junior research fellow
Péter CSÉNYI
junior research fellow
Hans-Martin GÄRTNER
research group leader, research professor
Tekla Etelka GRÁCZI
senior research fellow
Beáta GYURIS
senior research fellow
Anna KOHÁRI
research fellow
Péter MIHAJLIK
senior research fellow
Gergely DOBSINSZKI
IT specialist
Flóra HEGYI
research assistant
Máté Soma KÁDÁR
IT specialist