The Shannon entropy is a standard measure for the order state of symbol sequences, such as, for example, DNA sequences. In order to incorporate correlations between symbols, the entropy of n-mers (consecutive strands of n symbols) has to be determined. Here, an assay is presented to estimate such hi
Robustness of the Estimator of the Index of Dispersion for DNA Sequences
Scribed by Rasmus Nielsen
- Publisher: Elsevier Science
- Year: 1997
- Language: English
- File size: 118 KB
- Volume: 7
- Category: Article
- ISSN: 1055-7903
Synopsis
If substitutions in DNA sequences follow a Poisson process, the ratio of the variance in the number of substitutions to the mean number of substitutions (the index of dispersion) should equal 1. In this paper, the robustness of the commonly applied estimator of the index of dispersion in replacement sites and silent sites to various assumptions regarding DNA evolution is explored using simulation methods. The estimate of the index of dispersion may be strongly biased if the assumptions of the model of substitution are violated. However, the results of this study support the conclusions of studies by Gillespie and Ohta that the process of substitution in replacement sites is overdispersed. This result contradicts those of a recent study and shows that the high index of dispersion for replacement sites is not an artifact caused by the method of estimation.
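The Poisson expectation stated in the synopsis can be checked numerically. The sketch below is only an illustration: it computes the plain variance-to-mean ratio of simulated substitution counts, not the estimator evaluated in the paper. The function names, the rates (5.0, and a mixture of 2.0 and 8.0), and the sample size are all assumptions chosen for the example.

```python
import random
from statistics import mean, variance


def index_of_dispersion(counts):
    """Sample variance divided by sample mean of substitution counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean number of substitutions is zero")
    return variance(counts) / m


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) count: events of a rate-lam process in unit time."""
    count = 0
    t = rng.expovariate(lam)  # waiting time to the first event
    while t < 1.0:
        count += 1
        t += rng.expovariate(lam)  # waiting time to the next event
    return count


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical lineages that all share one substitution rate (Poisson process).
poisson_counts = [poisson_sample(5.0, rng) for _ in range(20000)]

# Hypothetical lineages whose rate differs from lineage to lineage; the
# resulting counts are a Poisson mixture and therefore overdispersed.
mixed_counts = [poisson_sample(rng.choice([2.0, 8.0]), rng) for _ in range(20000)]

r_poisson = index_of_dispersion(poisson_counts)
r_mixed = index_of_dispersion(mixed_counts)
print(f"constant rate: R = {r_poisson:.2f}")
print(f"variable rate: R = {r_mixed:.2f}")
```

With a constant rate the ratio should come out close to 1. For the mixture, the variance is the mean rate plus the variance of the rate (5 + 9 = 14 against a mean of 5), so the ratio should be near 2.8, which is the kind of overdispersion the synopsis refers to.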
SIMILAR VOLUMES
The __h__-index (Hirsch, 2005) is robust, remaining relatively unaffected by errors in the long tails of the citations-rank distribution, such as typographic errors that short-change frequently cited articles and create bogus additional records. This robustness, and the ease with which
We consider Sturmian sequences and provide an explicit formula for the index of such a sequence in terms of the continued fraction expansion coefficients of its slope.
In the literature, there are basically two kinds of resampling methods for least squares estimation in linear models; the E-type (the efficient ones like the classical bootstrap), which is more efficient when error variables are homogeneous, and the R-type (the robust ones like the jackknife), which