Data mining in hydrology
✍ Scribed by Vladan Babovic
- Publisher: John Wiley and Sons
- Year: 2005
- Language: English
- File size: 85 KB
- Volume: 19
- Category: Article
- ISSN: 0885-6087
- DOI: 10.1002/hyp.5862
✦ Synopsis
Present-day instrumentation networks already provide immense quantities of data, very little of which provides any insight into the basic physical phenomena occurring in the medium measured. To exploit fully the information contained in these data, scientists are developing a suite of techniques to 'mine the knowledge' from data.
The Data Mining Paradigm
The formative period of modern science ran from the late 15th century to the late 18th century. Its new foundations rested on the concept of the physical experiment and on the application of a mathematical apparatus to describe such experiments. The works of Brahe, Kepler, Newton, Leibniz, Euler and Lagrange clearly personify this approach. Prior to these developments, scientific work consisted primarily of collecting the observables, or recording the 'readings of the book of nature itself'.
This modern scientific approach was principally characterized by two stages: a first, in which a set of observations of the physical system was collected, and a second, in which an inductive assertion about the behaviour of the system (a hypothesis) was generated. Observational data represent specific knowledge, whereas a hypothesis represents a generalization of this knowledge that implies and characterizes all such observational data. One may argue that this process of hypothesis generation makes it possible to economize human thought, since it provides more compact ways of describing observations. Evidently the 'information content' is very little changed (or even unchanged) when a body of data is transformed into an equation derived from these data, but the 'expressivity' or 'meaning value' is commonly increased immensely. Since it is just this increase in 'meaning value' that justifies the whole activity of substituting equations for data, there is a natural interest in processes that further promote such 'economies of thought'.
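The idea of substituting an equation for a body of data can be made concrete with a small sketch. The example below is only an illustration under stated assumptions: the observations are synthetic (generated from a hypothetical linear law with added noise, not taken from the article), and ordinary least squares stands in for whatever induction procedure an analyst might actually use. The helper name `fit_line` is likewise invented for this sketch.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Synthetic "observations" (invented for illustration, not real measurements):
# a hypothetical linear law y = 2.0 + 0.5*x plus small measurement noise.
rng = random.Random(0)
xs = [i / 10 for i in range(1000)]
ys = [2.0 + 0.5 * x + rng.gauss(0.0, 0.05) for x in xs]

# The "hypothesis": two coefficients standing in for 1000 (x, y) pairs.
a, b = fit_line(xs, ys)

# How far the compact description strays from any single observation.
max_residual = max(abs(y - (a + b * x)) for x, y in zip(xs, ys))

print(f"{len(xs)} observations summarized by a={a:.3f}, b={b:.3f}")
print(f"largest residual: {max_residual:.3f}")
```

Here a thousand number pairs are replaced by two coefficients and a stated functional form. Little is lost in describing the data, yet the fitted equation says something the raw table does not: how the response changes with the input.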
Today, at the beginning of the 21st century, we are experiencing yet another change in the scientific process as just outlined. This latest scientific approach is one in which information technology is employed to assist the human analyst in the process of hypothesis generation. The computer-assisted analysis of data is sometimes referred to as a process of data mining and knowledge discovery. Data mining and knowledge discovery aims at providing tools to