Languages with a small number of speakers often suffer from limited availability of spoken language resources, owing to the immense cost of developing them. One possible solution in such cases is to combine various existing spoken language resources to recognize speech for a new task or domain. This combined approach was also applied to unsupervised and lightly supervised training of acoustic models for spontaneous speech recognition. The BNSI Broadcast News database was used to recognize SloParl parliamentary debates, utterances from the PoliDat database, and utterances from the Plattos database.
B.03 Paper at an international scientific conference
COBISS.SI-ID: 000000000
Analysis of discourse markers in two different genres of spontaneous speech is important for preserving the Slovenian language in the era of digitalization. As the influence of the information society grows, so does the influence of the English language, which is especially noticeable in state-of-the-art telecommunication services. The analysis, applied to Slovenian speech databases, enables new fundamental research in the area of spontaneous speech recognition. This will result in further development of state-of-the-art telecommunication services with support for the Slovenian language.
F.29 Contribution to the development of national cultural identity
COBISS.SI-ID: 36334434
The paper presents validation results for the annotation of discourse markers in spontaneous speech databases. The analysis showed that variation occurs during manual annotation for specific categories of discourse markers, but that it is insignificant for the later use of the speech databases in building acoustic models for automatic speech recognition. An important result is the identification of those tokens that always occur in the role of discourse markers. This will ease the automatic annotation of spontaneous spoken language resources for spontaneous speech recognition in the future.
B.03 Paper at an international scientific conference
COBISS.SI-ID: 12719894
A framework for modeling filled pauses in Slovenian spontaneous speech recognition was designed. The baseline was set by an analysis of discourse markers in formal and informal genres. Acoustic modeling for speech recognition was performed using four proposed implicit topologies of varying complexity. Expert-defined, language-dependent rules were used as the starting point. The experiment was carried out using Slovenian spoken language resources. Analysis of the results showed that modeling filled pauses has a significant influence on the performance of spontaneous speech recognition.
B.03 Paper at an international scientific conference
COBISS.SI-ID: 12454422