1.

Acoustic classification and segmentation using modified spectral roll-off and variance-based features

This paper presents novel features and an architecture for an automatic on-line acoustic classification and segmentation system. The system performs speech/non-speech segmentation (with an emphasis on accurate speech/music segmentation), gender segmentation, and speech bandwidth segmentation. It can be easily integrated into an automatic continuous speech recognition system, where information about individual acoustic segments can be used for acoustic model selection and adaptation, or as additional information for rich transcription output. Acoustic model adaptation can improve the speech recognition rate, and the additional information in a rich transcription can be useful when searching for particular events or circumstances (a male speaker talking over a phone line, etc.). For speech/non-speech segmentation we propose a new set of features, called EVFB (Energy Variance of Filter Bank), which are based on the energy variance in a narrow frequency sub-band. The proposed features also prove to be an efficient discriminator between speech and music. Segmentation cross-test results show that EVFB features are more robust than MFCC features. Two new features (modified spectral roll-off and high-frequency energy variance) are also proposed for speech bandwidth classification and segmentation. The results show good and robust performance by the automatic on-line acoustic segmentation system. All experiments and tests were performed on a radio broadcast database and the Slovenian BNSI Broadcast News database. Integrating the automatic on-line acoustic segmentation system into a continuous speech recognition system based on MFCC (mel-frequency cepstral coefficient) features requires only a small additional computational cost, because many of the proposed system's feature calculation procedures are shared with the MFCC feature calculation procedure.
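To make the two kinds of features named in the abstract more concrete, the following minimal sketch computes a conventional spectral roll-off and the variance over time of the energy in one frequency sub-band. It is an illustration under stated assumptions, not the paper's method: the abstract does not define the modified roll-off or the exact EVFB computation, so the function names, the 0.85 roll-off fraction, and the use of raw spectral bins instead of a filter bank are all hypothetical choices.

```python
from statistics import pvariance


def spectral_rolloff(power_spectrum, fraction=0.85):
    """Return the lowest bin index at which the cumulative energy reaches
    `fraction` of the total energy (the conventional roll-off definition;
    the paper's *modified* roll-off is not specified in the abstract)."""
    total = sum(power_spectrum)
    if total == 0:
        return 0  # silent frame: no energy to accumulate
    threshold = fraction * total
    cumulative = 0.0
    for k, power in enumerate(power_spectrum):
        cumulative += power
        if cumulative >= threshold:
            return k
    return len(power_spectrum) - 1


def band_energy_variance(frames, lo, hi):
    """Variance over time of the energy in bins [lo, hi).

    `frames` is a sequence of per-frame power spectra. Summing raw bins
    stands in for a filter-bank channel (an assumption for illustration).
    """
    energies = [sum(frame[lo:hi]) for frame in frames]
    return pvariance(energies)
```

Under these assumptions, a segment whose sub-band energy fluctuates strongly from frame to frame yields a large `band_energy_variance`, while a segment with steady energy yields a value near zero.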

COBISS.SI-ID: 16450838

2.

Speech recognition for interaction with a robot in noisy environment

One of the main problems with speech recognition for robots is noise. In this paper we propose two methods to enhance the robustness of continuous speech recognition in noisy environments. The first improves recognition accuracy by better weighting the language model in the decision process. The second is based on language model adaptation. The experiments showed that both proposed techniques improve speech recognition accuracy by approximately 2%.
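The role of language model weighting in the decision process can be sketched as follows. This is a generic illustration of how a decoder commonly combines acoustic and language model log-probabilities, not the weighting scheme proposed in the paper; the function names, the hypotheses, and their scores are made up for the example.

```python
def combined_score(acoustic_logprob, lm_logprob, lm_weight):
    """Log-linear combination of acoustic and language model scores."""
    return acoustic_logprob + lm_weight * lm_logprob


def best_hypothesis(hypotheses, lm_weight):
    """Pick the text of the highest-scoring hypothesis.

    Each hypothesis is a tuple (text, acoustic_logprob, lm_logprob).
    """
    best = max(hypotheses, key=lambda h: combined_score(h[1], h[2], lm_weight))
    return best[0]


# Hypothetical n-best list: noise makes the acoustically best candidate wrong.
hypotheses = [
    ("turn left", -120.0, -4.0),
    ("turn lift", -118.0, -9.0),
]

print(best_hypothesis(hypotheses, lm_weight=0.0))  # acoustic score only
print(best_hypothesis(hypotheses, lm_weight=1.0))  # language model included
```

In this toy case the acoustic score alone prefers "turn lift", while giving the language model a weight of 1.0 shifts the decision to "turn left". Raising the weight makes the decoder rely more on linguistic plausibility when the acoustic evidence is degraded.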

COBISS.SI-ID: 16824598

3.

Viseme recognition system based on transformed acoustic models

Viseme recognition from speech is one of the methods needed to operate a talking head system, which can be used in various areas, such as mobile services and applications, gaming, and the entertainment industry. This paper proposes a novel method for generating acoustic models for viseme recognition from speech. The viseme acoustic models were generated using transformations from trained phoneme acoustic models. The proposed transformation method is language-independent; only the available speech resources are needed. Recognition with context-dependent acoustic models produced the viseme sequence with corresponding time information. The proposed acoustic model transformation method was evaluated on a test scenario with phonetically balanced words, and the results were compared to a baseline viseme recognition system. The improvement in viseme accuracy obtained with the proposed method for transforming acoustic models was statistically significant.

COBISS.SI-ID: 17578262

4.

Model checking using Spin and SpinRCP

Spin is one of the leading verification tools for the model checking of distributed systems. It is used over a broad spectrum of applications where systems can be represented as asynchronously running processes. This paper provides an overview of the concepts of model checking, of the Spin model checker together with its input language Promela, and of the available graphical user interfaces to Spin. To offer Spin users an integrated development environment, we have developed SpinRCP. We introduce its structure and demonstrate some of its features by considering a standard algorithm for leader election in a unidirectional ring.

COBISS.SI-ID: 17523222

5.

Self-adaptive differential evolution algorithm using population size reduction and three strategies

Many real-world optimization problems are large-scale in nature. Solving these problems requires an optimization algorithm that is able to apply a global search regardless of the problems' particularities. This paper proposes a self-adaptive differential evolution algorithm, called jDElscop, for solving large-scale optimization problems with continuous variables. The proposed algorithm employs three strategies and a population size reduction mechanism. The performance of the jDElscop algorithm is evaluated on a set of benchmark problems provided for the Special Issue on the Scalability of Evolutionary Algorithms and other Metaheuristics for Large Scale Continuous Optimization Problems. Nonparametric statistical procedures were performed for multiple comparisons between the proposed algorithm and three well-known algorithms from the literature. The results show that the jDElscop algorithm can deal with large-scale continuous optimization effectively. In most cases it also behaves significantly better than the other three algorithms used in the comparison.
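The general idea of combining self-adaptive control parameters with population size reduction can be sketched as below. This is a simplified illustration, not the jDElscop algorithm itself: it uses a single DE/rand/1/bin strategy rather than three, and the parameter ranges, resampling probabilities, halving rule, reduction schedule, and test function are assumptions chosen for the example.

```python
import random

F_LOWER, F_UPPER = 0.1, 0.9  # assumed range parameters for the scale factor F
TAU_F, TAU_CR = 0.1, 0.1     # assumed probabilities of resampling F and CR


def sphere(x):
    """Simple test objective (hypothetical benchmark): sum of squares."""
    return sum(v * v for v in x)


def halve_population(pop):
    """Keep the better individual of each pair (i, i + half)."""
    half = len(pop) // 2
    return [min(pop[i], pop[i + half], key=lambda ind: ind["fit"])
            for i in range(half)]


def jde_sketch(objective, dim, bounds, pop_size=40, generations=200,
               reductions=2, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    pop = []
    for _ in range(pop_size):
        x = [rng.uniform(lo, hi) for _ in range(dim)]
        pop.append({"x": x, "fit": objective(x), "F": 0.5, "CR": 0.9})
    # Generations at which the population is halved (evenly spaced).
    reduce_at = {generations * k // (reductions + 1)
                 for k in range(1, reductions + 1)}
    for gen in range(generations):
        if gen in reduce_at and len(pop) >= 8:
            pop = halve_population(pop)
        for i, parent in enumerate(pop):
            # Self-adaptation: occasionally resample F and CR, else inherit.
            F = (F_LOWER + rng.random() * F_UPPER
                 if rng.random() < TAU_F else parent["F"])
            CR = rng.random() if rng.random() < TAU_CR else parent["CR"]
            a, b, c = rng.sample([j for j in range(len(pop)) if j != i], 3)
            forced = rng.randrange(dim)  # at least one mutated coordinate
            trial = []
            for d in range(dim):
                if rng.random() < CR or d == forced:
                    v = pop[a]["x"][d] + F * (pop[b]["x"][d] - pop[c]["x"][d])
                    v = min(max(v, lo), hi)  # clip to the search bounds
                else:
                    v = parent["x"][d]
                trial.append(v)
            fit = objective(trial)
            # Greedy selection: successful F and CR survive with the trial.
            if fit <= parent["fit"]:
                pop[i] = {"x": trial, "fit": fit, "F": F, "CR": CR}
    return min(pop, key=lambda ind: ind["fit"])


best = jde_sketch(sphere, dim=5, bounds=(-5.0, 5.0))
print(best["fit"])
```

Because the control parameters travel with each individual and are kept only when the trial vector wins, no manual tuning of F and CR is needed. Halving the population during the run shifts effort from exploration early on to exploitation later, at a fixed evaluation budget per generation that shrinks with the population.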

COBISS.SI-ID: 14398230

P2-0069 — Final report
