𝔖 Bobbio Scriptorium
✦   LIBER   ✦

An overview of decoding techniques for large vocabulary continuous speech recognition

โœ Scribed by Xavier L. Aubert


Publisher: Elsevier Science
Year: 2002
Tongue: English
Weight: 351 KB
Volume: 16
Category: Article
ISSN: 0885-2308


✦ Synopsis


A number of decoding strategies for large vocabulary continuous speech recognition (LVCSR) are examined from the viewpoint of their search space representation. Different design solutions are compared with respect to the integration of linguistic and acoustic constraints, as implied by m-gram language models (LM) and cross-word (CW) phonetic contexts. This study is structured along two main axes: the network expansion and the search algorithm itself. The network can be expanded statically or dynamically, while the search can proceed either time-synchronously or asynchronously, which leads to distinct architectures. Three broad classes of decoding methods are briefly reviewed: the use of weighted finite state transducers (WFST) for static network expansion, the time-synchronous dynamic-expansion search and the asynchronous stack decoding. Heuristic methods for further reducing the search space are also considered. The main approaches are compared and some prospective views are formulated regarding possible future avenues.


📜 SIMILAR VOLUMES


A word graph algorithm for large vocabul
โœ Stefan Ortmanns; Hermann Ney; Xavier Aubert ๐Ÿ“‚ Article ๐Ÿ“… 1997 ๐Ÿ› Elsevier Science ๐ŸŒ English โš– 360 KB

This paper describes a method for the construction of a word graph (or lattice) for large vocabulary, continuous speech recognition. The advantage of a word graph is that a fairly good degree of decoupling between acoustic recognition at the 10-ms level and the final search at the word level using a

The use of tree-trellis search for large
โœ Eng-Fong Huang; Frank K. Soong; Hsiao-Chuan Wang ๐Ÿ“‚ Article ๐Ÿ“… 1994 ๐Ÿ› Elsevier Science ๐ŸŒ English โš– 483 KB

In this paper, we propose the use of a tree-trellis search scheme for the task of large vocabulary Mandarin polysyllabic word recognition. Usually, the task of large vocabulary word recognition is computationally intractable by whole-word based approach. We convert this task into a tree network sear

Automatic selection of phonetically dist
โœ Jia-lin Shen; Hsin-min Wang; Ren-yuan Lyu; Lin-shan Lee ๐Ÿ“‚ Article ๐Ÿ“… 1999 ๐Ÿ› Elsevier Science ๐ŸŒ English โš– 163 KB

This paper presents an approach of automatic selection of phonetically distributed sentence sets for speaker adaptation, and applies the concept to the task of Mandarin speech recognition with very large vocabulary. This is a different approach to the adaptation data selection problem. A computer al