An algorithm for rule discovery in databases is described which is based on the reasoning strategies of human diagnosticians. It differs from other algorithms in its hypothesis-driven approach and primarily qualitative assessment of rule interest. Upper and lower bounds are established for the value
Data mining
✍ Scribed by L. Adrienne Cupples; Julia Bailey; Kevin C. Cartier; Catherine T. Falk; Kuang-Yu Liu; Yuanqing Ye; Robert Yu; Heping Zhang; Hongyu Zhao
- Publisher: John Wiley and Sons
- Year: 2005
- Tongue: English
- Weight: 114 KB
- Volume: 29
- Category: Article
- ISSN: 0741-0395
✦ Synopsis
Group 14 used data-mining strategies to evaluate a number of issues, including appropriate diagnosis, haplotype estimation, genetic linkage and association studies, and type I error. Methods ranged from exploratory analyses, to machine learning strategies (neural networks, supervised learning, and tree-based methods), to false discovery rate control of type I errors. The general motivations were to find the "story" in the data and to summarize information from a multitude of measures. Several methods illustrated strategies for better trait definition, using summarization of related traits. In the few studies that sought to identify genes for alcoholism, there was little agreement among the different strategies, likely reflecting the complexities of the disease. Nevertheless, Group 14 found that these methods offered strategies to gain a better understanding of the complex pathways by which disease develops.
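The synopsis mentions false discovery rate control of type I errors without naming a specific procedure. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up procedure (an assumption for illustration, not necessarily what any Group 14 contributor applied), the idea can be written as follows. The function name `benjamini_hochberg`, the `alpha` default, and the sample p-values are all hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected.

    Sketch of the Benjamini-Hochberg step-up procedure, which controls the
    false discovery rate at level alpha for independent tests.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank

    # Step-up rule: reject every hypothesis ranked at or below max_k,
    # even if some of them individually missed their own threshold.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values, e.g. from a set of single-marker association tests.
    p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
    print(benjamini_hochberg(p_vals, alpha=0.05))
```

With these made-up inputs only the two smallest p-values pass, whereas an uncorrected 0.05 cutoff would have flagged five tests. That gap is the kind of type I error inflation such corrections are meant to address.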
📜 SIMILAR VOLUMES
Present-day instrumentation networks already provide immense quantities of data, very little of which provides any insight into the basic physical phenomena that are occurring in the medium measured. In order to exploit fully the information contained in the data, scientists are developing a suite of
Uncertainty management is necessary for real-world applications, especially those used with data mining. The Region Connection Calculus (RCC) and egg-yolk methods have proven useful for the representation of vague regions in spatial data. Rough set theory has been shown to be an effective tool for d