On the law of Zipf-Mandelbrot for multi-word phrases
✍ Author: Egghe, L.
- Publisher
- John Wiley and Sons
- Year
- 1999
- Language
- English
- File size
- 92 KB
- Volume
- 50
- Category
- Article
- ISSN
- 0002-8231
✦ Synopsis
This article studies the probabilities of occurrence of multi-word (m-word) phrases (m = 2, 3, ...) in relation to the probabilities of occurrence of the single words. It is well known that, in the latter case, the law of Zipf (i.e., a power law) is valid. We prove that this is not so for m-word phrases (m ≥ 2), and we present two independent proofs of this. We furthermore show that, if we want to approximate the distribution found by Zipf's law, we obtain exponents β_m in this power law for which the sequence (β_m), m ∈ ℕ, is strictly decreasing. This explains experimental findings of Smith and Devine (1985), Hilberg (1988), and Meyer (1989a,b). Our results should be compared with a heuristic finding of Rousseau, who states that the law of Zipf-Mandelbrot is valid for multi-word phrases. He, however, uses other (less classical) assumptions than we do.
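The claim in the synopsis can be illustrated numerically with a small toy model. The sketch below is not the article's proof. It assumes, purely for illustration, that single words follow an exact Zipf law with exponent 1 over a 50-word vocabulary and that the words of a phrase are drawn independently, so a phrase probability is the product of its word probabilities. The function names (`zipf_probs`, `phrase_probs`, `fitted_exponent`), the vocabulary size, and the rank windows are all hypothetical choices, not taken from the article.

```python
import math


def zipf_probs(n_words, beta=1.0):
    """Single-word probabilities P(r) = C / r**beta for ranks r = 1..n_words."""
    weights = [1.0 / r ** beta for r in range(1, n_words + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def phrase_probs(word_probs, m):
    """Probabilities of all m-word phrases, sorted from most to least likely.

    ASSUMPTION (illustrative only): the m words are independent, so each
    phrase probability is the product of its word probabilities.
    """
    probs = [1.0]
    for _ in range(m):
        probs = [p * q for p in probs for q in word_probs]
    return sorted(probs, reverse=True)


def fitted_exponent(ranked, max_rank):
    """Least-squares power-law exponent fitted to ranks 1..max_rank.

    Fits log P = a - beta * log r and returns beta.
    """
    xs = [math.log(r) for r in range(1, max_rank + 1)]
    ys = [math.log(p) for p in ranked[:max_rank]]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return -cov / var


words = zipf_probs(50)
for m, max_rank in ((1, 50), (2, 200), (3, 200)):
    ranked = phrase_probs(words, m)
    beta_m = fitted_exponent(ranked, max_rank)
    print(f"m={m}: exponent fitted over ranks 1..{max_rank} = {beta_m:.3f}")
```

In this toy setting the exponent fitted for m = 1 recovers the single-word exponent, while the values fitted for m = 2 and m = 3 come out successively smaller, which is in the spirit of the decreasing sequence described in the synopsis. The exact figures depend on the vocabulary size and the rank window, so they should not be read as the article's values of β_m.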