The paper described three previously undetected effects, due to biases and non-independence, that can arise in statistical tests for associations between character states in cross-species data. One kind, which we call the family problem, is general to all known methods. In phylogenetic data, the anc
Statistical Tests for Discrete Cross-species Data
Scribed by Alan Grafen; Mark Ridley
- Publisher: Elsevier Science
- Year: 1996
- Language: English
- File size: 219 KB
- Volume: 183
- Category: Article
- ISSN: 0022-5193
Synopsis
Four methods have been proposed that can be used to test for associations between the states of discrete characters in cross-species data and that do not suffer from non-independence due to overcounting of data points. The tests are those of Ridley (1983), Burt (1989), Grafen (1989), and a new test called the ICDE test. The aim of the paper is to measure the Type I error rates of these methods with simulated null distributions of discrete characters. The null data are generated by a model of discrete character evolution, using three shapes of phylogeny: tetratomous, dichotomous, and realistic. Ridley's and Burt's tests are both reasonably valid with the realistic phylogeny but biased with the tetratomous and dichotomous phylogenies. Grafen's phylogenetic regression is reasonably valid with all tree shapes. One version of the ICDE test was valid, the other less so. The invalid results are explained in terms of two kinds of statistical non-independence that arise in discrete data: non-independence due to the reconstruction of character states by parsimony, and the "family problem", in which similar patterns are found in null data in many separate radiations because all the radiations began from the same ancestral state.
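The general idea of measuring a Type I error rate on simulated null data can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' simulation model and not an implementation of any of the four tests: it assumes a balanced dichotomous tree, a symmetric two-state character that flips with a fixed probability on every branch, and a naive chi-square test that treats each species as an independent data point (the overcounting that the four methods are designed to avoid). The names `evolve_tips`, `chi_square_2x2`, `type_i_error_rate`, and all parameter values are hypothetical.

```python
import random


def evolve_tips(depth, p_change, rng, state=0):
    """Tip states of a balanced dichotomous tree with `depth` levels.

    Assumed model: on each branch the binary state flips with
    probability `p_change`; the root starts in state 0.
    """
    if depth == 0:
        return [state]
    tips = []
    for _ in range(2):  # two daughter branches per node
        child = state ^ 1 if rng.random() < p_change else state
        tips.extend(evolve_tips(depth - 1, p_change, rng, child))
    return tips


def chi_square_2x2(x, y):
    """Pearson chi-square statistic for the 2x2 table of two binary vectors.

    Returns 0.0 when a margin is empty and the statistic is undefined.
    """
    n = len(x)
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)
    b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)
    c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)
    d = n - a - b - c
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def type_i_error_rate(n_sims=2000, depth=6, p_change=0.1, seed=1):
    """Fraction of null simulations in which the naive test rejects at 5%."""
    rng = random.Random(seed)
    critical = 3.841  # 5% critical value of chi-square with 1 degree of freedom
    rejections = 0
    for _ in range(n_sims):
        # The two characters evolve independently, so the null hypothesis is true.
        x = evolve_tips(depth, p_change, rng)
        y = evolve_tips(depth, p_change, rng)
        if chi_square_2x2(x, y) > critical:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    print("naive species-level test:", type_i_error_rate())
```

A valid test would reject in about 5% of null simulations. Under these assumptions, setting `p_change=0.5` makes the tip states behave like independent coin flips, so the naive test should sit near its nominal level, while small values of `p_change` leave related species similar and tend to inflate the rejection rate.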
SIMILAR VOLUMES
This paper considered the relative merits of the P-value and the mid-P-value. It is shown that inference based on the mid-P-value is in a certain sense on firmer ground. In particular the expected mid-P-value does not change under an irrelevant breakup in the test statistic.
A statistical foundation is given to the problem of hypothesizing and testing geometric properties of image data heuristically derived by Kanatani (CVGIP: Image Understanding 54 (1991), 333-348). Points and lines in the image are represented by "N-vectors" and their reliability is eval
In this paper we consider the amount of undetected replication in HIV infection diagnoses as reported to the Public Health Laboratory Service AIDS Centre, Colindale, London. These diagnoses are usually reported with the date of birth of the individual but no names held on the database. The PHLS cann