Interpolation of n-gram and mutual-information based trigger pair language models for Mandarin speech recognition
✍ Scribed by Z. GuoDong; L. KimTeng
- Publisher
- Elsevier Science
- Year
- 1999
- Tongue
- English
- Weight
- 191 KB
- Volume
- 13
- Category
- Article
- ISSN
- 0885-2308
No coin nor oath required. For personal study only.
✦ Synopsis
While n-gram modeling is simple and dominant in speech recognition, it can only capture the short-distance context dependency within an n-word window where currently the largest practical n for natural language is three. However, many of the context dependencies in natural language occur beyond a three-word window. This paper proposes a new language modeling approach to capture the preferred relationships between words over a short or long distance through the concept of MI-Trigger pairs. Different MI-Trigger-based models are constructed in either a distance-dependent or a distance-independent way within a window from 1 to 10 words. This new MI-Trigger-based modeling is also compared and merged with word bigram modeling. It is found that the MI-Trigger-based modeling has better performance than word bigram modeling. It is also found that n-gram and MI-Trigger models have good complementarity and their proper merging can further increase the recognition rate when tested on Mandarin speech recognition. One advantage of MI-Trigger-based modeling is that the number of parameters needed for MI-Trigger modeling is much less than that of word bigram modeling. Another advantage is that the number of trigger pairs in an MI-Trigger model can be kept to a reasonable size without losing too much of its modeling power.