The past decade has seen explosive growth in database technology and the amount of data collected. Advances in data collection, use of bar codes in commercial outlets, and the computerization of business transactions have flooded us with data. We have an unprecedented opportunity to analyze
Special Issue on High-Performance Data Mining
Scribed by Vipin Kumar; Sanjay Ranka; Vineet Singh
- Publisher: Elsevier Science
- Year: 2001
- Tongue: English
- Weight: 70 KB
- Volume: 61
- Category: Article
- ISSN: 0743-7315
Synopsis
There has been an explosive growth in the amount of data collected by commercial and scientific applications. E-commerce companies store click stream data to track consumer behavior on a Web site. Particle accelerators routinely store information regarding millions of events every day for high-energy physics analysis. The size of the data collected in these applications ranges from a few megabytes to tens of gigabytes per day.
There is an unprecedented opportunity to analyze this data to extract more intelligent and useful information. Data mining for interesting, useful, and previously unknown patterns from such large amounts of data is extremely important in these applications. Examples of data mining tasks include generating a description of different kinds of credit card holders, discovering particles with unusual behavior, and finding all the items of a department store that are frequently bought together.
Parallel and distributed processing is an important component of any successful large-scale data mining application for the following reasons: (i) the computation requirements are very large and (ii) the enormity of data or the nature of data collections often requires that the data be stored across multiple storage devices. The purpose of this special issue is to provide an overview of recent research in the area of high-performance data mining, especially for parallel and distributed environments. The six papers selected for inclusion in this special issue cover a broad range of issues in algorithms, software, and systems.
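As a rough illustration of the "frequently bought together" task when the data is spread across several storage devices, the following minimal sketch counts item pairs separately in each partition and then merges the counts. It is not taken from any of the papers in the issue: the function names, the sample baskets, and the `min_support` threshold are all hypothetical, and the partitions are processed one after another here, whereas a real system would run the per-partition step on separate workers.

```python
from collections import Counter
from itertools import combinations


def count_pairs(partition):
    """Count item pairs that co-occur in the baskets of one data partition."""
    counts = Counter()
    for basket in partition:
        # Sort and deduplicate so each pair has one canonical form.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


def frequent_pairs(partitions, min_support):
    """Merge per-partition counts; keep pairs seen at least min_support times."""
    total = Counter()
    for partition in partitions:
        # Local counts simply add up to the global counts.
        total.update(count_pairs(partition))
    return {pair: n for pair, n in total.items() if n >= min_support}


# Hypothetical baskets, split across two partitions ("storage devices").
partitions = [
    [["bread", "milk"], ["bread", "milk", "eggs"]],
    [["milk", "eggs"], ["bread", "milk"]],
]
result = frequent_pairs(partitions, min_support=2)
print(result)
```

Because pair counts are additive, each partition can be scanned independently and only the small count tables need to be combined, which is one reason this kind of task suits parallel and distributed settings.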
The first paper, by Sanjay Goil and Alok Choudhary and titled "PARSIMONY: An Infrastructure for Parallel Multidimensional Analysis and Data Mining," proposes an infrastructure for processing, online analytical processing (OLAP), and
SIMILAR VOLUMES
The program of the 2002 World Congress on Computational Intelligence (WCCI 2002) included six sessions on "Granular Computing and Data Mining," reflecting the growing interest in application of granular computing to data mining. In these sessions, a broad range of issues were addressed. This Special
Rapid advances in technologies such as networking, parallel computing and information management have led to the development of intelligent software components that act autonomously on the behalf of users, can analyse and access a diverse range of information, can react to changes in their environme