✦ LIBER ✦

Interpolation of n-gram and mutual-information based trigger pair language models for Mandarin speech recognition

✍ Scribed by Z. GuoDong; L. KimTeng

Publisher: Elsevier Science
Year: 1999
Tongue: English
Weight: 191 KB
Volume: 13
Category: Article
ISSN: 0885-2308
DOI: 10.1006/csla.1998.0118


✦ Synopsis

Although n-gram modeling is simple and dominant in speech recognition, it captures only short-distance context dependencies within an n-word window, and the largest practical n for natural language is currently three. Many context dependencies in natural language, however, span more than three words. This paper proposes a new language modeling approach that captures preferred relationships between words over short or long distances through the concept of MI-Trigger pairs. Different MI-Trigger-based models are constructed in either a distance-dependent or a distance-independent way, within a window of 1 to 10 words. The MI-Trigger-based modeling is also compared and merged with word bigram modeling. MI-Trigger-based modeling is found to perform better than word bigram modeling. The n-gram and MI-Trigger models are also found to be complementary, and merging them properly further increases the recognition rate when tested on Mandarin speech recognition. One advantage of MI-Trigger-based modeling is that it needs far fewer parameters than word bigram modeling. Another is that the number of trigger pairs in an MI-Trigger model can be kept to a reasonable size without losing too much modeling power.
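
As a rough illustration of the MI-Trigger idea in the synopsis, the Python sketch below scores ordered word pairs by pointwise mutual information in a distance-independent way within a fixed window, and then linearly interpolates two model probabilities. The synopsis does not give the paper's exact MI formula, pair-selection rule, or merging scheme, so the PMI estimate, the linear interpolation, the function names `mi_trigger_pairs` and `interpolate`, and the parameters `window`, `min_count`, and `lam` are all assumptions made for illustration.

```python
import math
from collections import Counter


def mi_trigger_pairs(sentences, window=10, min_count=2):
    """Score ordered pairs (trigger, triggered) by pointwise mutual information.

    Assumption: distance-independent counting, i.e. every word that follows
    the trigger within `window` positions is counted the same way.
    `sentences` is an iterable of already-segmented word lists.
    """
    pair_counts = Counter()
    for words in sentences:
        for i, trigger in enumerate(words):
            # Words at distance 1..window to the right of the trigger.
            for triggered in words[i + 1 : i + 1 + window]:
                pair_counts[(trigger, triggered)] += 1

    # Marginal counts over the same pair events, so the joint and marginal
    # estimates share one sample space.
    left, right = Counter(), Counter()
    for (a, b), c in pair_counts.items():
        left[a] += c
        right[b] += c
    total = sum(pair_counts.values())

    scores = {}
    for (a, b), c in pair_counts.items():
        if c >= min_count:  # drop rare pairs to keep the model small
            # PMI = log( P(a, b) / (P(a) * P(b)) )
            scores[(a, b)] = math.log(c * total / (left[a] * right[b]))
    return scores


def interpolate(p_ngram, p_trigger, lam=0.5):
    """Linearly interpolate two probabilities (hypothetical merging scheme)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_ngram + (1.0 - lam) * p_trigger
```

Keeping only the highest-scoring pairs from `mi_trigger_pairs` is one simple way to bound the number of trigger pairs. A distance-dependent variant could key the counts on `(trigger, triggered, distance)` instead of `(trigger, triggered)`.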