An accurate approximation to the distribution of the length of the longest matching word between two random DNA sequences
✍ Scribed by R.F. Mott; T.B.L. Kirkwood; R.N. Curnow
- Publisher: Springer
- Year: 1990
- Tongue: English
- Weight: 514 KB
- Volume: 52
- Category: Article
- ISSN: 1522-9602
✦ Synopsis
An accurate approximation is derived to the distribution of the length of the longest matching word present between two random DNA sequences of finite length, using only elementary probability arguments. The distribution is shown to be consistent with previous asymptotic results for the mean and variance of longest common words. The application of the distribution to assessing the statistical significance of sequence similarities is considered. It is shown how the distribution can be modified to take account of non-independence of neighbouring bases in real sequences.
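The quantity described in the synopsis can be explored empirically. The following is a minimal Monte Carlo sketch, not the approximation derived in the article: it estimates the distribution of the longest matching word length by simulation, assuming independent, equiprobable bases (the simplest "random sequence" model). The function names `longest_common_word`, `random_dna`, and `empirical_distribution` are hypothetical and chosen for illustration only.

```python
import random


def longest_common_word(a: str, b: str) -> int:
    """Return the length of the longest contiguous word present in both a and b."""
    best = 0
    # prev[j] holds the length of the common word ending at the previous
    # character of a and at b[j - 1].
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best:
                    best = curr[j]
        prev = curr
    return best


def random_dna(length: int, rng: random.Random) -> str:
    """Draw a sequence of independent, equiprobable bases (an assumed model)."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def empirical_distribution(n: int, m: int, trials: int, seed: int = 0) -> dict:
    """Estimate P(longest matching word length = k) for sequence lengths n and m."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(trials):
        k = longest_common_word(random_dna(n, rng), random_dna(m, rng))
        counts[k] = counts.get(k, 0) + 1
    return {k: c / trials for k, c in sorted(counts.items())}
```

Calling, for example, `empirical_distribution(100, 100, trials=200)` returns a dictionary mapping each observed word length to its relative frequency. An observed match between two real sequences could then be compared against the upper tail of this simulated distribution, although the equiprobable, independent-base model ignores the neighbour dependence that the synopsis says real sequences exhibit.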