
Language modelling for efficient beam-search

✍ Scribed by Marcello Federico; Mauro Cettolo; Fabio Brugnara; Giuliano Antoniol


Publisher: Elsevier Science
Year: 1995
Tongue: English
Weight: 263 KB
Volume: 9
Category: Article
ISSN: 0885-2308


✦ Synopsis


This paper considers the problems of estimating bigram language models and of efficiently representing them as a finite-state network that can be used by a hidden Markov model based, beam-search, continuous speech recognizer.

A review of the best-known bigram estimation techniques is given, together with a description of the original Stacked model. Language models are compared in terms of perplexity on three text corpora with different degrees of data sparseness, and speech recognition accuracy is tested on a 10 000-word, real-time, speaker-independent dictation task. The Stacked estimation method compares favourably with the others, achieving about 93% word accuracy.
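As a rough illustration of the kind of perplexity comparison described above, the following Python sketch estimates a bigram model by simple linear interpolation and computes its perplexity on a set of sentences. It is not the Stacked model, whose details this synopsis does not give; the interpolation weight `lam`, the add-one unigram smoothing, and all function names are assumptions made for the example.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams, padding each sentence with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for words in sentences:
        tokens = ["<s>"] + list(words) + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams, lam=0.7):
    """Interpolated estimate: lam * relative-frequency bigram
    + (1 - lam) * add-one smoothed unigram (an assumed, simple scheme)."""
    vocab_size = len(unigrams)
    total = sum(unigrams.values())
    p_uni = (unigrams[word] + 1) / (total + vocab_size)
    p_bi = bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0
    return lam * p_bi + (1 - lam) * p_uni


def perplexity(sentences, unigrams, bigrams, lam=0.7):
    """Perplexity = 2 ** (average negative log2 probability per predicted token)."""
    log_sum, n = 0.0, 0
    for words in sentences:
        tokens = ["<s>"] + list(words) + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            log_sum += math.log2(bigram_prob(prev, word, unigrams, bigrams, lam))
            n += 1
    return 2 ** (-log_sum / n)


# Toy data (hypothetical), only to show how the functions fit together.
train = [["a", "b"], ["a", "c"]]
uni, bi = train_bigram(train)
seen = perplexity([["a", "b"]], uni, bi)
unseen = perplexity([["b", "a"]], uni, bi)
```

On this toy data the word order seen in training gets the lower perplexity, which is the sense in which lower perplexity indicates a better fit to the text. A real comparison would use held-out data and a properly normalized smoothing scheme.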

Just as better language model estimates can improve recognition accuracy, representations better suited to the search algorithm can improve its speed. Two static representations of language models are introduced: linear and tree-based. Results show that the beam-search algorithm exploits the tree-based organization better, as it gives a five times faster response with the same word accuracy. Finally, an off-line reduction algorithm is presented that cuts the space requirements of the tree-based topology to about 40%.
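The difference between the two organizations can be sketched in Python under one assumption that the synopsis does not spell out: that the tree-based representation shares common word prefixes, as a lexical prefix tree does. Letters stand in for sub-word units here, and `linear_size`, `build_tree`, and `tree_size` are hypothetical names used only for this example.

```python
def linear_size(words):
    """Linear organization: every word keeps its own chain of units."""
    return sum(len(word) for word in words)


def build_tree(words):
    """Tree organization: words with a common prefix share the same nodes."""
    root = {}
    for word in words:
        node = root
        for unit in word:
            node = node.setdefault(unit, {})
        node[None] = word  # end-of-word marker stored under the key None
    return root


def tree_size(node):
    """Count unit nodes in the tree, skipping end-of-word markers."""
    return sum(1 + tree_size(child)
               for key, child in node.items() if key is not None)


words = ["car", "card", "care", "cat"]
print(linear_size(words))            # 14 units in the linear layout
print(tree_size(build_tree(words)))  # 6 shared nodes in the tree layout
```

With fewer distinct nodes at the start of each word, a beam search has fewer hypotheses to evaluate per frame, which is one plausible reading of the speed-up reported above. The paper's actual network topology and its reduction algorithm may differ from this sketch.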

The solutions presented here have been successfully employed in a real-time, speaker-independent, 10 000-word dictation system for radiological reporting.


📜 SIMILAR VOLUMES


EquiX: A search and query language for XM
✍ Sara Cohen; Yaron Kanza; Yakov Kogan; Yehoshua Sagiv; Werner Nutt; Alexander Ser 📂 Article 📅 2002 🏛 John Wiley and Sons 🌐 English ⚖ 287 KB 👁 1 views
Sabine: parametric data input language f
✍ G.R. Moore 📂 Article 📅 1986 🏛 Elsevier Science 🌐 English ⚖ 127 KB

One of the central problems in assessing the role of process planning in CIM is the paucity of current systems available to address the stringent needs of CAPP in a CIM context. This paper reviews these stringent needs and classifies the available systems accordingly. It concludes with a summary of

Stochastic automata for language modelin
✍ Giuseppe Riccardi; Roberto Pieraccini; Enrico Bocchieri 📂 Article 📅 1996 🏛 Elsevier Science 🌐 English ⚖ 434 KB

Stochastic language models are widely used in spoken language understanding to recognize and interpret the speech signal: the speech samples are decoded into word transcriptions by means of acoustic and syntactic models and then interpreted according to a semantic model. Both for speech recognition