
Improvement of noisy speech recognition using a proportional alignment decoding algorithm in the training phase

✍ Scribed by Wei-Wen Hung; Hsiao-Chuan Wang


Publisher: Elsevier Science
Year: 1998
Language: English
File size: 670 KB
Volume: 12
Category: Article
ISSN: 0885-2308


✦ Synopsis


Modelling the state duration of hidden Markov models (HMMs) can improve the accuracy of decoding the state sequence of an utterance and thereby improve speech recognition accuracy. However, when a speech signal is contaminated by ambient noise, the decoded state sequence may be distorted: it may stay in some states for too long or too briefly, even with the help of state duration models. This paper presents a proportional alignment decoding (PAD) algorithm for retraining the HMMs. A multi-speaker isolated Mandarin digit recognition task was conducted to demonstrate the effectiveness and robustness of the PAD-based variable duration hidden Markov model (VDHMM/PAD) method. Experimental results show that the discriminative ability of VDHMM/PAD in a noisy environment is significantly enhanced. Moreover, the proposed method outperforms widely used state duration modelling methods, such as those based on Poisson, gamma, Gaussian, bounded and nonparametric probability density functions.
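
The synopsis does not say how PAD distributes frames among states, so the following is only an illustrative sketch of what a proportional alignment could look like, not the authors' algorithm. It assumes a left-to-right HMM in which every state is visited at least once, and it assumes that each state's share of the utterance is proportional to a per-state mean duration estimated elsewhere. The function name `proportional_alignment` and the parameter `mean_durations` are hypothetical.

```python
import math


def proportional_alignment(num_frames, mean_durations):
    """Split num_frames among left-to-right states in proportion to mean_durations.

    Assumption (not from the source): every state receives at least one frame,
    and the remaining frames are shared out by the largest-remainder method.
    Returns (frames per state, state index for each frame).
    """
    n_states = len(mean_durations)
    if n_states == 0 or num_frames < n_states:
        raise ValueError("need at least one frame per state")
    if any(d <= 0 for d in mean_durations):
        raise ValueError("mean durations must be positive")

    total = float(sum(mean_durations))
    spare = num_frames - n_states  # frames left after one per state

    # Ideal (fractional) share of the spare frames for each state.
    shares = [spare * d / total for d in mean_durations]
    counts = [1 + math.floor(s) for s in shares]

    # Hand out frames lost to rounding, largest fractional part first.
    leftover = num_frames - sum(counts)
    by_fraction = sorted(
        range(n_states),
        key=lambda i: shares[i] - math.floor(shares[i]),
        reverse=True,
    )
    for i in by_fraction[:leftover]:
        counts[i] += 1

    # Expand the counts into a monotonic frame-to-state path.
    path = []
    for state, count in enumerate(counts):
        path.extend([state] * count)
    return counts, path


counts, path = proportional_alignment(11, [1.0, 2.0, 1.0])
print(counts)  # frames assigned to each state
print(path)    # state index for every frame
```

Under these assumptions the alignment depends only on the utterance length and the duration statistics, not on the frame-level likelihoods that noise can corrupt, which is the intuition the synopsis points to.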