1.

A new operator for dynamic input-output automata

A mathematical model named dynamic input-output automata was proposed for modelling systems that contain dynamically created and mobile components. We found and demonstrated that this model cannot faithfully represent some kinds of dynamic systems that often occur in practice: the real system is closed, i.e. it does not receive messages from its environment, whereas its model is open, i.e. it does receive them. We proposed a new operator for dynamic input-output automata that allows the model of such a system to be closed.

COBISS.SI-ID: 14173206

2.

Three-stage framework for unsupervised acoustic modeling using untranscribed spoken content

A three-stage framework for unsupervised acoustic modeling using untranscribed spoken content is proposed. Efficient use of untranscribed spoken content can significantly increase the amount of available language resources. The first proposed method focuses on building the initial acoustic models. The second method focuses on the first few training iterations after the untranscribed spoken content is added, in which the phonetic vocabulary is modified with broad phonetic classes. The proposed method significantly improved speech recognition results.

COBISS.SI-ID: 14411542

3.

Modelling of filled pauses and onomatopoeias for spontaneous speech recognition

Filled pauses and onomatopoeias are among the key factors that degrade spontaneous speech recognition performance and increase system complexity. An improved method for implicit acoustic modelling of filled pauses is presented in the chapter. The UMB Broadcast News speech recognition system was applied to the recognition of spontaneous speech.

COBISS.SI-ID: 14486806

4.

Statistical machine translation of highly inflected language

Statistical machine translation of a highly inflected language was performed. The main idea was to reduce the morphological complexity of the highly inflected language when translating it into a less inflected language. Experimental results show improved translation quality.

COBISS.SI-ID: 13979670

5.

Online speaker segmentation and clustering using cross-likelihood ratio calculation with reference criterion selection

The authors presented a new online method for speaker segmentation and clustering in real-world environments, in which the Bayesian Information Criterion (BIC) and the Normalized Cross-Likelihood Ratio (NCLR) are combined into an online speaker diarization system. A new decision parameter for BIC and NCLR is proposed, based on Normalization with Reference Criterion Selection, together with a window normalization technique called Window Length Compensation, which normalizes the criterion value according to the length of the analyzed window.

COBISS.SI-ID: 14756630

P2-0069 — Annual report 2010
