Data mining, or Knowledge Discovery in Databases (KDD), is of little benefit to commercial enterprises unless it can be carried out efficiently on realistic volumes of data. Operational factors also dictate that KDD should be performed within the context of standard DBMS. Fortunately, relational DBM
Mining Very Large Databases with Parallel Processing
By Alex A. Freitas and Simon H. Lavington
- Publisher: Springer US
- Year: 2000
- Language: English
- Pages: 210
- Series: The Kluwer International Series on Advances in Database Systems 9
- Edition: 1
- Category: Library
Synopsis
Mining Very Large Databases with Parallel Processing addresses the problem of large-scale data mining. It is an interdisciplinary text, describing advances in the integration of three computer science areas, namely 'intelligent' (machine learning-based) data mining techniques, relational databases and parallel processing. The basic idea is to use concepts and techniques of the latter two areas, particularly parallel processing, to speed up and scale up data mining algorithms.
The book is divided into three parts. The first part presents a comprehensive review of intelligent data mining techniques such as rule induction, instance-based learning, neural networks and genetic algorithms. Likewise, the second part presents a comprehensive review of parallel processing and parallel databases. Each of these parts includes an overview of commercially-available, state-of-the-art tools. The third part deals with the application of parallel processing to data mining. The emphasis is on finding generic, cost-effective solutions for realistic data volumes. Two parallel computational environments are discussed, the first excluding the use of commercial-strength DBMS, and the second using parallel DBMS servers.
It is assumed that the reader has knowledge roughly equivalent to a first degree (BSc) in the exact sciences, and so is reasonably familiar with basic concepts of statistics and computer science.
The primary audience for Mining Very Large Databases with Parallel Processing is industry data miners and practitioners in general, who would like to apply intelligent data mining techniques to large amounts of data. The book will also be of interest to academic researchers and postgraduate students, particularly database researchers interested in advanced, intelligent database applications, and artificial intelligence researchers interested in industrial, real-world applications of machine learning.
Table of Contents
Front Matter (pp. i-xiii)
Introduction (pp. 1-4)
Front Matter (p. 5)
Knowledge Discovery Tasks (pp. 7-17)
Knowledge Discovery Paradigms (pp. 19-29)
The Knowledge Discovery Process (pp. 31-40)
Data Mining (pp. 41-50)
Data Mining Tools (pp. 51-57)
Front Matter (p. 59)
Basic Concepts on Parallel Processing (pp. 61-69)
Data Parallelism, Control Parallelism, and Related Issues (pp. 71-78)
Parallel Database Servers (pp. 79-86)
Front Matter (p. 87)
Approaches to Speed Up Data Mining (pp. 89-108)
Parallel Data Mining without DBMS Facilities (pp. 109-142)
Parallel Data Mining with DBMS Facilities (pp. 143-172)
Summary and Some Open Problems (pp. 173-179)
Back Matter (pp. 181-208)
Subjects
Data Structures, Cryptology and Information Theory; Document Preparation and Text Processing
SIMILAR VOLUMES
The latest techniques and principles of parallel and grid database processing. The growth in grid databases, coupled with the utility of parallel query processing, presents an important opportunity to understand and utilize high-performance parallel database processing within a major database manageme
With the unprecedented growth-rate at which data is being collected and stored electronically today in almost all fields of human endeavor, the efficient extraction of useful information from the data available is becoming an increasing scientific challenge and a massive economic need. This book pre