Discovering knowledge from noisy databases using genetic programming

✍ Scribed by Wong, Man Leung; Leung, Kwong Sak; Cheng, Jack C. Y.


Publisher: John Wiley and Sons
Year: 2000
Tongue: English
Weight: 169 KB
Volume: 51
Category: Article
ISSN: 0002-8231


✦ Synopsis


In data mining, we emphasize the need for learning from huge, incomplete, and imperfect data sets. To handle noise in the problem domain, existing learning systems avoid overfitting the imperfect training examples by excluding insignificant patterns. The problem is that these systems use a limiting attribute-value language for representing the training examples and the induced knowledge. Moreover, some important patterns are ignored because they are statistically insignificant. In this article, we present a framework that combines Genetic Programming and Inductive Logic Programming to induce knowledge represented in various knowledge representation formalisms from noisy databases. The framework is based on a formalism of logic grammars, and it can specify the search space declaratively. An implementation of the framework, LOGENPRO (The Logic grammar based GENetic PROgramming system), has been developed.

The performance of LOGENPRO is evaluated on the chess end-game domain. We compare LOGENPRO with FOIL and other learning systems in detail, and find its performance is significantly better than that of the others. This result indicates that the Darwinian principle of natural selection is a plausible noise-handling method that can avoid overfitting and identify important patterns at the same time. Moreover, the system is applied to one real-life medical database. The knowledge discovered provides insights into the medical domains and allows a better understanding of them.


📜 SIMILAR VOLUMES


Using biological knowledge to discover h
✍ Gary K. Chen; Duncan C Thomas 📂 Article 📅 2010 🏛 John Wiley and Sons 🌐 English ⚖ 401 KB

The recent successes of genome-wide association studies (GWAS) have revealed that many of the replicated findings have explained only a small fraction of the heritability of common diseases. One hypothesis that investigators have suggested is that higher order interactions between SNPs

Programs for signal recovery from noisy
✍ V.I. Gelfgat; E.L. Kosarev; E.R. Podolyak 📂 Article 📅 1993 🏛 Elsevier Science 🌐 English ⚖ 617 KB

University of Belfast, N. Ireland (see application form in this issue). Licensing provisions: Persons requesting the program must ... Nature of physical problem: The programs allow reconstruction of nonnegative signals from noisy experimental data distorted by the measuring device [1]. Noise may have a Ga

Data mining of fractured experimental da
✍ Q. Shao; R.C. Rowe; P. York 📂 Article 📅 2008 🏛 John Wiley and Sons 🌐 English ⚖ 144 KB

In the pharmaceutical field, current practice in gaining process understanding by data analysis or knowledge discovery has generally focused on dealing with single experimental databases. This limits the level of knowledge extracted in the situation where data from a number of sources, so called fra