Bobbio Scriptorium

Integrated bias removal techniques for robust speech recognition

โœ Scribed by Craig Lawrence; Mazin Rahim


Publisher: Elsevier Science
Year: 1999
Language: English
File size: 522 KB
Volume: 13
Category: Article
ISSN: 0885-2308


Synopsis


In this paper, we present a family of maximum likelihood (ML) techniques that aim at reducing an acoustic mismatch between the training and testing conditions of hidden Markov model (HMM)-based automatic speech recognition (ASR) systems. Our study is conducted in two phases. In the first phase, we evaluate two classes of robustness techniques: those that represent the acoustic mismatch for the entire utterance as a single additive bias and those that represent the mismatch as a non-stationary bias. In the second phase, we propose a codebook-based stochastic matching (CBSM) approach for bias removal both at the feature level and at the model level. CBSM associates each bias with an ensemble of HMM mixture components that share similar acoustic characteristics. It is integrated with hierarchical signal bias removal and further extended to account for n-best candidates. Experimental results on connected digits, recorded over a cellular network, show that incorporating bias removal reduces both the word and string error rates by about 12% and 16%, respectively, when using a global bias, and 36% and 31%, respectively, when using a non-stationary bias.
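
To make the "single additive bias" idea in the synopsis concrete, here is a minimal sketch of a global bias estimate. It is not the authors' implementation. It assumes a hard nearest-centroid assignment against a small codebook of clean-speech centroids and equal variances in every dimension, in which case the ML estimate of the bias reduces to the average residual between each observed frame and its assigned centroid. The function names (`nearest_centroid`, `estimate_global_bias`, `remove_bias`) and the iteration count are hypothetical choices made for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_centroid(frame: Vector, codebook: Sequence[Vector]) -> Vector:
    """Return the codebook centroid closest to the frame (squared Euclidean distance)."""
    return min(codebook, key=lambda c: sum((x - y) ** 2 for x, y in zip(frame, c)))


def estimate_global_bias(
    frames: Sequence[Vector], codebook: Sequence[Vector], iterations: int = 3
) -> List[float]:
    """Estimate one additive bias for a whole utterance.

    Assumption (not from the source): hard assignments and equal variances,
    so the ML bias is the mean of (observed frame - assigned centroid).
    """
    dim = len(frames[0])
    bias = [0.0] * dim
    for _ in range(iterations):
        residual_sum = [0.0] * dim
        for frame in frames:
            # Assign the frame using the current bias estimate.
            compensated = [x - b for x, b in zip(frame, bias)]
            centroid = nearest_centroid(compensated, codebook)
            for d in range(dim):
                residual_sum[d] += frame[d] - centroid[d]
        # Re-estimate the bias as the average residual.
        bias = [s / len(frames) for s in residual_sum]
    return bias


def remove_bias(frames: Sequence[Vector], bias: Vector) -> List[List[float]]:
    """Subtract the estimated bias from every frame."""
    return [[x - b for x, b in zip(frame, bias)] for frame in frames]
```

A non-stationary variant would, under the same assumptions, keep a separate bias per group of centroids rather than one bias for the whole utterance. The sketch above covers only the global case.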


SIMILAR VOLUMES


Near-field Adaptive Beamformer for Robus
โœ Iain A. McCowan; Darren C. Moore; S. Sridharan ๐Ÿ“‚ Article ๐Ÿ“… 2002 ๐Ÿ› Elsevier Science ๐ŸŒ English โš– 237 KB

This paper investigates a new microphone array processing technique specifically for the purpose of speech enhancement and recognition. The main objective of the proposed technique is to improve the low frequency directivity of a conventional adaptive beamformer, as low frequency performance is crit

New temporal features for robust speech
โœ Jia-lin Shen; Wen L. Hwang ๐Ÿ“‚ Article ๐Ÿ“… 1999 ๐Ÿ› Elsevier Science ๐ŸŒ English โš– 131 KB

Although the delta and RASTA methods have been widely used in extracting the temporal properties of stationary features for robust speech recognition, there is still a need to investigate new temporal features for better performance. In this paper, we present two new temporal features for robust pro