𝔖 Bobbio Scriptorium
✦   LIBER   ✦

A weighted average n-gram model of natural language

โœ Scribed by P. O'Boyle; M. Owens; F.J. Smith


Publisher: Elsevier Science
Year: 1994
Tongue: English
Weight: 403 KB
Volume: 8
Category: Article
ISSN: 0885-2308

No coin nor oath required. For personal study only.

✦ Synopsis


A new n-gram model of natural language designed to aid speech recognition is presented, in which the probabilities are calculated as a weighted average of maximum likelihood probabilities obtained from a training corpus. This simple approach produces a model that can be constructed quickly and is easily adapted, either by changing the weights or by changing the training corpus. The model is compared with two other models: the first is based on Turing-Good estimates and uses Katz's back-off approach; the second is a deleted estimate model which combines different probability distributions in approximately optimal proportions.
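The core idea, a weighted average of maximum likelihood n-gram estimates, can be sketched briefly. The weights and orders below (fixed weights over unigram, bigram, and trigram ML estimates) are illustrative assumptions; the abstract does not specify the paper's actual weighting scheme.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count all n-grams of order n in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


class WeightedAverageNGram:
    """Weighted average of maximum likelihood n-gram probabilities.

    The fixed weights here are a placeholder; the paper's actual
    weighting is not given in the abstract.
    """

    def __init__(self, tokens, weights=(0.2, 0.3, 0.5)):
        self.weights = weights  # weights for unigram, bigram, trigram
        self.counts = {n: ngram_counts(tokens, n) for n in (1, 2, 3)}
        self.total = len(tokens)

    def ml_prob(self, word, history, n):
        """Maximum likelihood estimate P(word | last n-1 words of history)."""
        if n == 1:
            return self.counts[1][(word,)] / self.total
        context = tuple(history[-(n - 1):])
        # Unseen or too-short context contributes zero at this order;
        # lower-order terms in the average then carry the estimate.
        context_count = self.counts[n - 1][context]
        if context_count == 0:
            return 0.0
        return self.counts[n][context + (word,)] / context_count

    def prob(self, word, history):
        """Weighted average (linear interpolation) over n-gram orders."""
        return sum(w * self.ml_prob(word, history, n)
                   for n, w in zip((1, 2, 3), self.weights))
```

Adapting the model then amounts to retuning `weights` or rebuilding `counts` from a new corpus, which matches the abstract's claim that the model is quickly constructed and easily adapted.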

We introduce a new measure for language models based on their performance when predicting words removed randomly from samples of unseen text. All three models have been compared using both this new measure and the existing measure of perplexity. Results indicate that the performance of the new model is close to that of the deleted estimate model, while both are superior to the Turing-Good model.
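The new evaluation measure can be approximated as follows: delete words at random positions in held-out text and score the fraction the model recovers. The scoring rule below (pick the vocabulary word with the highest model probability given the preceding context) is an assumption; the paper's exact protocol is not given in the abstract, and `model.prob(word, history)` is a hypothetical interface.

```python
import random


def removed_word_accuracy(model, text_tokens, vocab, n_removals=100, seed=0):
    """Fraction of randomly removed words the model predicts correctly.

    `model` is assumed to expose prob(word, history); the single-best-guess
    scoring used here is an illustrative choice, not the paper's.
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_removals):
        pos = rng.randrange(1, len(text_tokens))  # position of removed word
        history = text_tokens[:pos]
        truth = text_tokens[pos]
        # Model's guess: most probable vocabulary word in this context.
        guess = max(vocab, key=lambda w: model.prob(w, history))
        correct += (guess == truth)
    return correct / n_removals
```

Unlike perplexity, which scores the probability assigned to the true continuation, this measure scores hard prediction accuracy, so the two can rank models differently.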


📜 SIMILAR VOLUMES


Relevance weighting for combining multi-
โœ R. Iyer; M. Ostendorf ๐Ÿ“‚ Article ๐Ÿ“… 1999 ๐Ÿ› Elsevier Science ๐ŸŒ English โš– 155 KB

Standard statistical language modeling techniques suffer from sparse-data problems in tasks where large amounts of domain-specific text are not available. In this paper, we focus on improving the estimation of domain-dependent n-gram models by the selective use of out-of-domain text data. Previous a

Interpolation of n-gram and mutual-infor
โœ Z. GuoDong; L. KimTeng ๐Ÿ“‚ Article ๐Ÿ“… 1999 ๐Ÿ› Elsevier Science ๐ŸŒ English โš– 191 KB

While n-gram modeling is simple and dominant in speech recognition, it can only capture the short-distance context dependency within an n-word window where currently the largest practical n for natural language is three. However, many of the context dependencies in natural language occur beyond a th

Average energy of an N-electron system i
โœ J. Karwowski; J. Planelles; F. Rajadell ๐Ÿ“‚ Article ๐Ÿ“… 1997 ๐Ÿ› John Wiley and Sons ๐ŸŒ English โš– 112 KB

An expression for the average energy of an N-electron system in a finite-dimensional, antisymmetric, and spin-adapted model space as, e.g., a full-configuration interaction space is derived using elementary properties of the Hamiltonian in the Fock space.