A weighted average n-gram model of natural language
Authors: P. O'Boyle; M. Owens; F.J. Smith
- Publisher
- Elsevier Science
- Year
- 1994
- Language
- English
- File size
- 403 KB
- Volume
- 8
- Category
- Article
- ISSN
- 0885-2308
No payment or registration required. For personal study only.
Synopsis
A new n-gram model of natural language designed to aid speech recognition is presented, in which the probabilities are calculated as a weighted average of maximum-likelihood probabilities obtained from a training corpus. This simple approach produces a model that can be constructed quickly and is easily adapted, either by changing the weights or by changing the training corpus. The model is compared with two other models: the first is based on Turing-Good estimates and uses Katz's back-off approach; the second is a deleted estimate model which combines different probability distributions in approximately optimal proportions.
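As an illustration of the weighted-average idea, the sketch below interpolates maximum-likelihood unigram, bigram, and trigram estimates with fixed weights. The function names, toy corpus, and weight values are assumptions for illustration, not the paper's implementation or its weighting scheme.

```python
# Minimal sketch of a weighted-average n-gram model: P(word | context) is a
# weighted average of ML estimates from k-gram counts, k = 1..max_n.
from collections import Counter

def train(tokens, max_n=3):
    """Count all k-grams for k = 1..max_n over a token sequence."""
    counts = {k: Counter() for k in range(1, max_n + 1)}
    for k in range(1, max_n + 1):
        for i in range(len(tokens) - k + 1):
            counts[k][tuple(tokens[i:i + k])] += 1
    return counts

def prob(word, context, counts, weights):
    """Weighted average of ML probabilities P(word | last k-1 context words)."""
    total, wsum = 0.0, 0.0
    n_tokens = sum(counts[1].values())
    for k, w in enumerate(weights, start=1):
        if k == 1:
            p = counts[1][(word,)] / n_tokens  # ML unigram estimate
        else:
            ctx = tuple(context[-(k - 1):])
            denom = counts[k - 1][ctx]
            if denom == 0:
                continue  # unseen context: this order contributes nothing
            p = counts[k][ctx + (word,)] / denom  # ML k-gram estimate
        total += w * p
        wsum += w
    return total / wsum if wsum else 0.0
```

For example, after `counts = train("the cat sat on the mat the cat sat".split())`, the call `prob("sat", ["the", "cat"], counts, [0.1, 0.3, 0.6])` blends the unigram estimate 2/9 with bigram and trigram estimates of 1.0. Adapting the model then amounts to re-counting a new corpus or retuning the weight vector.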
We introduce a new measure for language models based on their performance when predicting words removed at random from samples of unseen text. The performance of all three models has been compared using both this new measure and the existing measure of perplexity. Results indicate that the performance of the new model is close to that of the deleted estimate model, while both are superior to the Turing-Good model.
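A removed-word test of this kind might be scored roughly as follows. The helper names and the scoring rule (top-1 accuracy over the left context only) are assumptions for illustration; the paper's exact measure may differ.

```python
# Sketch of a removed-word evaluation: delete words at random positions in
# held-out text and count how often the model's top-ranked candidate matches
# the removed word.
import random

def removed_word_score(model_prob, vocab, tokens, n_remove, seed=0):
    """Fraction of randomly removed words the model predicts correctly.

    model_prob(word, context) returns the model's probability of `word`
    given the preceding tokens `context`.
    """
    rng = random.Random(seed)
    positions = rng.sample(range(2, len(tokens)), n_remove)
    correct = 0
    for i in positions:
        context = tokens[:i]  # left context of the removed word
        guess = max(vocab, key=lambda w: model_prob(w, context))
        if guess == tokens[i]:
            correct += 1
    return correct / n_remove
```

Unlike perplexity, which averages log-probabilities over every token, this score rewards a model only when the removed word is its single best guess, so the two measures can rank models differently.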
SIMILAR VOLUMES
While n-gram modeling is simple and dominant in speech recognition, it can only capture the short-distance context dependency within an n-word window where currently the largest practical n for natural language is three. However, many of the context dependencies in natural language occur beyond a th
An expression for the average energy of an N-electron system in a finite-dimensional, antisymmetric, and spin-adapted model space as, e.g., a full-configuration-interaction space is derived using elementary properties of the Hamiltonian in the Fock space.