Interpolation of n-gram and mutual-information based trigger pair language models for Mandarin speech recognition

✍ Scribed by Z. GuoDong; L. KimTeng


Publisher
Elsevier Science
Year
1999
Language
English
File size
191 KB
Volume
13
Category
Article
ISSN
0885-2308


✦ Synopsis


While n-gram modeling is simple and dominant in speech recognition, it captures only short-distance context dependencies within an n-word window, and the largest n currently practical for natural language is three. However, many context dependencies in natural language extend beyond a three-word window. This paper proposes a new language modeling approach that captures preferred relationships between words over short or long distances through the concept of MI-Trigger pairs. Different MI-Trigger-based models are constructed in either a distance-dependent or a distance-independent way within a window of 1 to 10 words. The MI-Trigger-based modeling is also compared and merged with word bigram modeling. MI-Trigger-based modeling is found to perform better than word bigram modeling. The n-gram and MI-Trigger models are also found to be complementary, and merging them properly further increases the recognition rate when tested on Mandarin speech recognition. One advantage of MI-Trigger-based modeling is that it needs far fewer parameters than word bigram modeling. Another is that the number of trigger pairs in an MI-Trigger model can be kept to a reasonable size without losing too much modeling power.
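
The idea summarized above can be illustrated with a minimal Python sketch. The synopsis does not give the paper's formulas, so everything here is an assumption made for illustration: the function names `mi_trigger_pairs` and `interpolate`, the use of pointwise mutual information as the ranking score, the distance-independent counting scheme, and the linear interpolation weight `lam`. It is not the authors' implementation.

```python
import math
from collections import Counter


def mi_trigger_pairs(sentences, window=10, top_k=5, min_count=1):
    """Rank (trigger, triggered) word pairs by pointwise mutual information.

    Assumed, distance-independent scheme: a pair is counted once for every
    position where `trigger` occurs up to `window` words before `triggered`.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for words in sentences:
        word_counts.update(words)
        for i, triggered in enumerate(words):
            # Look back over at most `window` preceding words.
            for j in range(max(0, i - window), i):
                pair_counts[(words[j], triggered)] += 1

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())

    scored = []
    for (a, b), count in pair_counts.items():
        if count < min_count:
            continue
        p_ab = count / total_pairs
        p_a = word_counts[a] / total_words
        p_b = word_counts[b] / total_words
        scored.append(((a, b), math.log(p_ab / (p_a * p_b))))

    # Keeping only the top_k pairs is how the model size stays bounded.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


def interpolate(p_bigram, p_trigger, lam=0.5):
    """Linearly merge two probability estimates (assumed merging scheme)."""
    return lam * p_bigram + (1.0 - lam) * p_trigger


if __name__ == "__main__":
    corpus = [
        ["rain", "umbrella"],
        ["rain", "umbrella"],
        ["sun", "hat"],
    ]
    for pair, score in mi_trigger_pairs(corpus, window=10, top_k=2):
        print(pair, round(score, 3))
    print(interpolate(0.2, 0.4, lam=0.5))
```

A distance-dependent variant could key the pair counts on `(trigger, triggered, distance)` instead, at the cost of more parameters; the `top_k` cutoff reflects the synopsis's point that the number of trigger pairs can be limited without losing too much modeling power.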