Zipf's Law in Importance of Genes for Cancer Classification Using Microarray Data
Scribed by WENTIAN LI; YANING YANG
- Publisher: Elsevier Science
- Year: 2002
- Language: English
- Size: 264 KB
- Volume: 219
- Category: Article
- ISSN: 0022-5193
Synopsis
Using a measure of how differentially expressed a gene is in two biochemically/phenotypically different conditions, we can rank all genes in a microarray dataset. We have shown that the fall-off of this measure (normalized maximum likelihood in a classification model such as logistic regression) as a function of rank is typically a power-law function. In other similar ranked plots, this power-law function is known as Zipf's law, which is observed in many natural and social phenomena. The presence of this power-law function rules out an intrinsic cutoff point between the "important" genes and the "irrelevant" genes. We have shown that similar power-law functions are also present in permuted datasets, and we provide an explanation based on the well-known chi-square (χ²) distribution of likelihood ratios. We discuss the implications of this Zipf's law for gene selection in microarray data analysis, as well as other characterizations of the ranked likelihood plots, such as the rate of fall-off of the likelihood.
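The rank-versus-measure relationship described in the synopsis can be illustrated with a minimal sketch. It assumes only that each gene has a positive score and that a power law score ≈ c · rank^(−α) is fitted by ordinary least squares on log-log axes. The function name `fit_zipf`, the synthetic scores, and the choice of a least-squares fit are illustrative assumptions, not the authors' procedure or data.

```python
import numpy as np


def fit_zipf(scores):
    """Fit score ~ c * rank**(-alpha) by least squares on log-log axes.

    Hypothetical helper for illustration; scores must all be positive.
    Returns (alpha, c).
    """
    # Rank 1 is the largest score, so sort in descending order.
    ranked = np.sort(np.asarray(scores, dtype=float))[::-1]
    ranks = np.arange(1, ranked.size + 1, dtype=float)
    # A power law is a straight line in log-log space:
    # log(score) = log(c) - alpha * log(rank)
    slope, intercept = np.polyfit(np.log(ranks), np.log(ranked), 1)
    return -slope, np.exp(intercept)


# Synthetic scores that follow an exact power law (assumed values),
# shuffled so the function has to do the ranking itself.
rng = np.random.default_rng(0)
true_alpha, true_c = 0.5, 3.0
scores = true_c * np.arange(1, 2001, dtype=float) ** (-true_alpha)
rng.shuffle(scores)

alpha, c = fit_zipf(scores)
print(f"fitted exponent alpha = {alpha:.3f}, prefactor c = {c:.3f}")
```

On real data the fitted line would normally be compared against the ranked plot itself, since a straight-line fit on log-log axes does not by itself establish that a power law holds.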