This paper describes a method for the construction of a word graph (or lattice) for large vocabulary, continuous speech recognition. The advantage of a word graph is that a fairly good degree of decoupling between acoustic recognition at the 10-ms level and the final search at the word level using a
Bi-directional graph search strategies for speech recognition
Scribed by Z. Li; G. Boulianne; P. Labute; M. Barszcz; H. Garudadri; P. Kenny
- Publisher
- Elsevier Science
- Year
- 1996
- Language
- English
- File size
- 354 KB
- Volume
- 10
- Category
- Article
- ISSN
- 0885-2308
Synopsis
We describe a new search algorithm for speech recognition which applies the monotone graph search procedure to the problem of building a word graph. A first backward pass provides a method for estimating the word boundary times and phone segment boundary times needed to build the word graph using either the 1-phone or 2-phone lookahead assumptions. It also provides a heuristic for the search which satisfies the monotonicity condition. A second backward pass applies forward-backward pruning to the word graph.
We show how the search can be made to run very quickly if the 1-phone lookahead assumption holds. We present the results of experiments performed on the 5000-word speaker-independent Wall Street Journal task under both the 1-phone and 2-phone lookahead assumptions. These results show that the 1-phone lookahead assumption leads to unacceptably large error rates for speaker-independent recognition using current acoustic phonetic modelling techniques.
Finally, we give an account of the methods we have developed to process speech data in successive blocks so as to address the real-time issue and to control the memory requirements of the search.
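As an illustration of the forward-backward pruning step mentioned in the synopsis, the following is a minimal sketch and not the authors' implementation. It assumes a word graph stored as a list of `(src, dst, word, log_score)` edges with integer node ids in time order (`src < dst`), uses best-path (Viterbi-style) scores rather than summed probabilities, and keeps an edge only if the best complete path through it lies within a beam of the overall best path. The function name `forward_backward_prune` and the `beam` parameter are hypothetical.

```python
import math
from collections import defaultdict


def forward_backward_prune(edges, start, end, beam):
    """Prune a word graph, keeping edges on near-best paths.

    edges: list of (src, dst, word, log_score) tuples.
    Assumption: node ids are integers in topological order (src < dst).
    """
    neg_inf = -math.inf

    # alpha[n]: best log score of any path from start to node n.
    alpha = defaultdict(lambda: neg_inf)
    alpha[start] = 0.0
    for src, dst, _, score in sorted(edges, key=lambda e: e[0]):
        alpha[dst] = max(alpha[dst], alpha[src] + score)

    # beta[n]: best log score of any path from node n to end.
    beta = defaultdict(lambda: neg_inf)
    beta[end] = 0.0
    for src, dst, _, score in sorted(edges, key=lambda e: e[1], reverse=True):
        beta[src] = max(beta[src], score + beta[dst])

    best = alpha[end]
    if best == neg_inf:
        return []  # no complete path from start to end

    # Keep an edge if the best path through it is within the beam.
    return [
        (src, dst, word, score)
        for src, dst, word, score in edges
        if alpha[src] + score + beta[dst] >= best - beam
    ]


# Toy word graph (hypothetical scores, for illustration only).
toy_edges = [
    (0, 1, "the", -1.0),
    (0, 1, "a", -3.0),
    (1, 2, "cat", -1.0),
    (1, 2, "cap", -1.5),
    (0, 2, "that", -6.0),
]
pruned = forward_backward_prune(toy_edges, start=0, end=2, beam=1.0)
print([word for _, _, word, _ in pruned])
```

A wider beam keeps more alternatives in the graph at the cost of a larger search space for the later word-level pass; a beam of zero keeps only edges on a best-scoring path.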
SIMILAR VOLUMES
In this paper, by using the formulation of the missing-data problem, a general framework for statistical acoustic modelling of speech is presented. With the motivation of utilizing bi-directional contextual dependence in acoustic modelling, a bi-directional hidden Markov modelling approach for speec
In this paper, we propose the use of a tree-trellis search scheme for the task of large vocabulary Mandarin polysyllabic word recognition. Usually, the task of large vocabulary word recognition is computationally intractable by whole-word based approach. We convert this task into a tree network sear