On structuring probabilistic dependences in stochastic language modelling

โœ Scribed by Hermann Ney; Ute Essen; Reinhard Kneser


Publisher: Elsevier Science
Year: 1994
Tongue: English
Weight: 975 KB
Volume: 8
Category: Article
ISSN: 0885-2308


Synopsis


In this paper, we study the problem of stochastic language modelling from the viewpoint of introducing suitable structures into the conditional probability distributions. The task of these distributions is to predict the probability of a new word by looking at M or even all predecessor words. The conventional approach is to limit M to 1 or 2 and to interpolate the resulting bigram and trigram models with a unigram model in a linear fashion. However, there are many other structures that can be used to model the probabilistic dependences between the predecessor words and the word to be predicted. The structures considered in this paper are: nonlinear interpolation as an alternative to linear interpolation; equivalence classes for word histories and single words; cache memory and word associations. For the optimal estimation of nonlinear and linear interpolation parameters, the leaving-one-out method is systematically used. For the determination of word equivalence classes in a bigram model, an automatic clustering procedure has been adapted. To capture long-distance dependences, we consider various models for word-by-word dependences; the cache model may be viewed as a special type of self-association. Experimental results are presented for two text databases, a German database and an English database.
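
The conventional linear interpolation mentioned in the synopsis can be illustrated with a minimal sketch for the bigram case (M = 1). This is not the authors' implementation: the function names, the toy corpus, and the fixed weight `lam` are assumptions made for illustration only. The paper estimates interpolation parameters with the leaving-one-out method, which this sketch does not attempt.

```python
from collections import Counter


def train_counts(tokens):
    """Collect unigram and bigram counts from a list of tokens."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def interpolated_bigram_prob(word, prev, unigrams, bigrams, lam=0.7):
    """Linear interpolation of relative frequencies:

        p(word | prev) = lam * f(word | prev) + (1 - lam) * f(word)

    `lam` is a hypothetical fixed weight, not an estimated parameter.
    """
    total = sum(unigrams.values())
    p_uni = unigrams[word] / total

    # Number of times `prev` occurs as a history (i.e. is followed by a word).
    history_count = sum(c for (h, _), c in bigrams.items() if h == prev)
    if history_count == 0:
        # Unseen history: fall back to the unigram model alone so that
        # the distribution over the vocabulary still sums to one.
        return p_uni

    p_bi = bigrams[(prev, word)] / history_count
    return lam * p_bi + (1 - lam) * p_uni


tokens = "the cat sat on the mat".split()
unigrams, bigrams = train_counts(tokens)
p = interpolated_bigram_prob("cat", "the", unigrams, bigrams)
```

With this toy corpus, `p` combines the bigram relative frequency of "cat" after "the" (1 of 2) with the unigram relative frequency of "cat" (1 of 6). Because both component distributions sum to one over the vocabulary, so does their weighted mixture.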


SIMILAR VOLUMES


Population Extinction and Quasi-stationa
โœ Garry L. Block; Linda J.S. Allen ๐Ÿ“‚ Article ๐Ÿ“… 2000 ๐Ÿ› Springer ๐ŸŒ English โš– 357 KB

Density-independent and density-dependent, stochastic and deterministic, discrete-time, structured models are formulated, analysed and numerically simulated. A special case of the deterministic, density-independent, structured model is the well-known Leslie age-structured model. The stochastic, dens

Boolean queries and term dependencies in
โœ Croft, W. Bruce ๐Ÿ“‚ Article ๐Ÿ“… 1986 ๐Ÿ› John Wiley and Sons ๐ŸŒ English โš– 741 KB

A method of integrating Boolean queries with probabilistic retrieval models is proposed. Boolean queries are interpreted as specifying term dependencies that can be used to correct the document scores obtained with a basic probabilistic model. Alternative methods of obtaining dependency information,

Periodicity of equilibrium structures in
โœ Tsantas, N. ;Georgiou, A. C. ๐Ÿ“‚ Article ๐Ÿ“… 1994 ๐Ÿ› John Wiley and Sons ๐ŸŒ English โš– 478 KB ๐Ÿ‘ 2 views

The problem of periodicity for a non-homogeneous Markov model in a stochastic environment is studied. The stochastic concept is established through the notion of optional scenarios applied on the transition process. It is proved that the sequence of so-called aggregate structures follows a certain p