The impact of context on the use of discourse markers in formal and informal conversations was analysed. Several contextual factors that contribute to the differences in discourse marker use were identified. These results were used as a baseline for modeling spontaneous speech. The differences between genres indicated that combining various spoken language resources for acoustic model generation could be a possible solution. The achievement was published in an interdisciplinary journal, which belongs to the A1” category according to the ARRS methodology.
COBISS.SI-ID: 12612886
The paper presents a study on modeling the highly inflective Slovenian language for spontaneous speech recognition. The research focuses on data sparsity, which results from the complex morphology of the language. A new data-driven method based on subword units is proposed for inducing a model of inflectional morphology. No prior knowledge of the language is used. The method searches for a decomposition that yields the minimum entropy of the training corpus. The experiments demonstrate that the proposed models considerably reduce the out-of-vocabulary rate and improve speech recognition performance.
COBISS.SI-ID: 13118230
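The minimum-entropy criterion and the out-of-vocabulary rate mentioned in the preceding entry can be illustrated with a small sketch. It is not the method from the paper: the suffix-stripping rule, the candidate suffix set, the helper names (`unigram_entropy`, `split_suffix`, `decompose`, `oov_rate`) and the toy word lists are all assumptions made for illustration, and the entropy used here is the plain unigram entropy per unit, which may differ from the paper's definition.

```python
import math
from collections import Counter


def unigram_entropy(units):
    """Shannon entropy (bits per unit) of the empirical unigram distribution."""
    counts = Counter(units)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def split_suffix(word, suffixes):
    """Split off the longest matching suffix (hypothetical stand-in for a
    learned decomposition); words without a matching suffix stay whole."""
    for suf in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suf) and len(word) > len(suf):
            return [word[:-len(suf)], "+" + suf]
    return [word]


def decompose(words, suffixes):
    """Apply the suffix split to every word and flatten the result."""
    return [unit for w in words for unit in split_suffix(w, suffixes)]


def oov_rate(train_units, test_units):
    """Fraction of test units that never occur in the training units."""
    vocab = set(train_units)
    return sum(u not in vocab for u in test_units) / len(test_units)


# Toy data (assumed): inflected forms of three Slovenian nouns.
train = ["voda", "vode", "vodi", "knjiga", "knjige", "riba", "ribe", "ribi"]
test = ["knjigi", "vode"]

# Candidate decompositions: whole words versus stem + suffix units.
candidates = {"whole words": set(), "stem + suffix": {"a", "e", "i"}}

# Pick the candidate with the lowest unigram entropy on the training data.
best = min(candidates, key=lambda k: unigram_entropy(decompose(train, candidates[k])))

for name, sufs in candidates.items():
    h = unigram_entropy(decompose(train, sufs))
    oov = oov_rate(decompose(train, sufs), decompose(test, sufs))
    print(f"{name}: entropy={h:.3f} bits/unit, OOV rate={oov:.2f}")
print("lowest entropy:", best)
```

In this toy setting the unseen form `knjigi` is out of vocabulary as a whole word, but both of its subword units were seen in training, which is the kind of effect the entry attributes to subword modeling.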
A new method for modeling filled pauses and onomatopoeias for large vocabulary spontaneous speech recognition was proposed. The acoustic models were defined using the results of the discourse marker analysis. The main emphasis was placed on their acoustic-phonetic properties, which made separate modeling of onomatopoeias necessary. The proposed method is based on implicit modeling using broad phonetic classes. The context of these models is ignored in the language model. The comparison with three other modeling methods showed a statistically significant improvement in speech recognition results.
COBISS.SI-ID: 12706070
The paper presents the results of an analysis of discourse markers in Slovenian spontaneous speech. The analysis compared sets of spontaneous utterances from the BNSI Broadcast News speech database and the Turdis speech database. The first comprises news shows and the second conversations in a tourist office. For modeling spontaneous speech for automatic speech recognition, the most important categories of results are those for filled pauses and background signals. The results showed a statistically significant difference in discourse marker frequency between the two genres.
COBISS.SI-ID: 36334434
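The preceding entry does not state which test was used to compare discourse marker frequencies between the two genres. As one common option, a Pearson chi-square test on a 2x2 contingency table can be sketched as follows. The function name `chi_square_2x2` and all counts are made up for illustration and are not results from the paper.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the
    2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical counts (assumed): discourse markers vs. other words per genre.
news_markers, news_other = 120, 9880
dialog_markers, dialog_other = 300, 9700

chi2 = chi_square_2x2(news_markers, news_other, dialog_markers, dialog_other)

# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CRITICAL_005_DF1 = 3.841

print(f"chi-square = {chi2:.1f}")
print("significant at 0.05:", chi2 > CRITICAL_005_DF1)
```

With real data, the four cells would be filled from the annotated transcriptions of each database, and a statistics package could be used to obtain an exact p-value instead of comparing against a fixed critical value.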