This paper proposes a new statistical approach, namely the probabilistic union model, for speech recognition subjected to unknown burst noise during the utterance. The model combines the local temporal information based on the union of random events, to reduce the dependence of the model on informat
Speech recognition with unknown partial feature corruption – a review of the union model
✍ Scribed by Ji Ming; F. Jack Smith
- Publisher: Elsevier Science
- Year: 2003
- Language: English
- File size: 272 KB
- Volume: 17
- Category: Article
- ISSN: 0885-2308
✦ Synopsis
This paper summarizes our studies on robust speech recognition based on a new statistical approach, the probabilistic union model. We consider speech recognition when part of the acoustic features may be corrupted by noise. The union model bases recognition on the clean part of the features, thereby reducing the effect of the noise on recognition. In this goal, the union model is similar to the missing feature method, but the two methods reach it by different routes. The missing feature method usually requires the identity of the noisy data for noise removal, while the union model combines the local features based on the union of random events, to reduce the dependence of the model on information about the noise. We previously investigated applications of the union model to speech recognition involving unknown partial corruption in frequency bands, in time duration, and in feature streams. Additionally, a combination of the union model with conventional noise-reduction techniques was studied as a means of dealing with a mixture of known or trainable noise and unknown, unexpected noise. This paper provides a unified review of each of these applications in the context of unknown partial feature corruption, giving the appropriate theory and implementation algorithms along with an experimental evaluation.
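The synopsis does not give the combination formula, so the following is only a minimal sketch under a stated assumption: that an order-M union score over N local feature likelihoods can be approximated, up to a normalizing constant, by summing the products of every subset of N − M likelihoods. The function name `union_score`, the `order` parameter, and the numbers are hypothetical and chosen purely for illustration; they are not taken from the paper.

```python
from itertools import combinations
from math import prod


def union_score(likelihoods, order):
    """Assumed product-sum form of a union score (illustrative only).

    likelihoods: per-band (or per-stream) likelihoods for one class.
    order: number of features allowed to be corrupted (0 <= order < N).
    Returns the sum, over all subsets of size N - order, of the product
    of the likelihoods in that subset. Order 0 is the plain product.
    """
    n = len(likelihoods)
    if not 0 <= order < n:
        raise ValueError("order must satisfy 0 <= order < number of features")
    keep = n - order
    return sum(prod(subset) for subset in combinations(likelihoods, keep))


# Hypothetical example: the correct class matches three bands well, but a
# fourth band is hit by noise and gets a near-zero likelihood.
correct_class = [0.9, 0.8, 0.7, 1e-6]
wrong_class = [0.3, 0.3, 0.3, 0.3]

for order in (0, 1):
    print(order, union_score(correct_class, order), union_score(wrong_class, order))
```

With order 0 the single corrupted band drags the correct class's product below that of the wrong class; with order 1 the subset that excludes the corrupted band dominates the sum, and the correct class scores higher without the code ever being told which band was noisy.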
📜 SIMILAR VOLUMES
A speech recognizer is developed using a layered feedforward neural network to implement speech-frame prediction. A Markov chain is used to control changes in the network's weight parameters. We postulate that speech recognition accuracy is closely linked to the capability of the predictive model in represe