Data Mining in Large Sets of Complex Data

✍ Scribed by Robson L. F. Cordeiro, Christos Faloutsos, Caetano Traina Júnior (auth.)

Publisher: Springer-Verlag London
Year: 2013
Language: English
Pages: 123
Series: SpringerBriefs in Computer Science
Edition: 1
Category: Library

✦ Synopsis

The amount and complexity of the data gathered by today's enterprises are growing at an exponential rate. Consequently, the analysis of Big Data is now a central challenge in Computer Science, especially for complex data. For example, given a satellite image database containing tens of Terabytes, how can we find the regions that correspond to native rainforests, deforestation or reforestation? Can it be done automatically? Based on the work discussed in this book, the answer to both questions is a resounding “yes”, and the results can be obtained in just minutes. In fact, results that used to require days or weeks of hard work from human specialists can now be obtained in minutes with high precision.

Data Mining in Large Sets of Complex Data discusses new algorithms that take steps forward from traditional data mining (especially for clustering) by considering large, complex datasets. Other works usually focus on one aspect, either data size or complexity. This work considers both: it enables mining complex data from high-impact applications, such as breast cancer diagnosis, region classification in satellite images, assistance to climate change forecasting, and recommendation systems for the Web and social networks; the data are large, at the Terabyte scale rather than the usual Gigabyte scale; and very accurate results are found in just minutes. Thus, it provides a crucial and well-timed contribution toward the creation of real-time applications that deal with Big Data of high complexity, in which mining on the fly can make an immeasurable difference, such as supporting cancer diagnosis or detecting deforestation.

✦ Table of Contents

Front Matter (pp. i-xi)
Introduction (pp. 1-6)
Related Work and Concepts (pp. 7-20)
Clustering Methods for Moderate-to-High Dimensionality Data (pp. 21-32)
Halite (pp. 33-67)
BoW (pp. 69-92)
QMAS (pp. 93-109)
Conclusion (pp. 111-116)

✦ Subjects

Data Mining and Knowledge Discovery; Database Management

📜 SIMILAR VOLUMES

📁 Mining Sequential Patterns from Large Data Sets

✍ Wei Wang, Jiong Yang (auth.) 📂 Library 📅 2005 🏛 Springer US 🌐 English

The focus of Mining Sequential Patterns from Large Data Sets is on sequential pattern mining. In many applications, such as bioinformatics, web access traces, system utilization logs, etc., the data is naturally in the form of sequences. This information has been of great interest for

📁 Mining Sequential Patterns from Large Data Sets

✍ Wei Wang, Jiong Yang 📂 Library 📅 2005 🏛 Springer 🌐 English

The focus of Mining Sequential Patterns from Large Data Sets is on sequential pattern mining. In many applications, such as bioinformatics, web access traces, system utilization logs, etc., the data is naturally in the form of sequences. This information has been of great interest for analyzing the

📁 Mining sequential patterns from large data sets

✍ Wei Wang, Jiong Yang 📂 Library 📅 2005 🏛 Springer 🌐 English

📁 Intelligent Simulation Tools for Mining Large Scientific Data Sets

✍ Zhao F., Bailey-Kellogg C., Huang X. 📂 Library 📅 1999 🌐 English

This paper describes problems, challenges, and opportunities for intelligent simulation of physical systems. Prototype intelligent simulation tools have been constructed for interpreting massive data sets from physical fields and for designing engineering systems. We identify the characteristics of

📁 Analysis of Large and Complex Data

✍ Adalbert F.X. Wilhelm, Hans A. Kestler (eds.) 📂 Library 📅 2016 🏛 Springer 🌐 English

This book offers a snapshot of the state of the art in classification at the interface between statistics, computer science and application fields. The contributions span a broad spectrum, from theoretical developments to practical applications; they all share a strong computational component. …