Data Mining in Large Sets of Complex Data
Scribed by Robson L. F. Cordeiro, Christos Faloutsos, Caetano Traina Júnior (auth.)
- Publisher
- Springer-Verlag London
- Year
- 2013
- Tongue
- English
- Leaves
- 123
- Series
- SpringerBriefs in Computer Science
- Edition
- 1
- Category
- Library
Synopsis
The amount and complexity of the data gathered by current enterprises are increasing at an exponential rate. Consequently, the analysis of Big Data is nowadays a central challenge in Computer Science, especially for complex data. For example, given a satellite image database containing tens of Terabytes, how can we find regions in order to identify native rainforests, deforestation or reforestation? Can it be done automatically? Based on the work discussed in this book, the answers to both questions are a sound “yes”, and the results can be obtained in just minutes. In fact, results that used to require days or weeks of hard work from human specialists can now be obtained in minutes with high precision. Data Mining in Large Sets of Complex Data discusses new algorithms that take steps forward from traditional data mining (especially for clustering) by considering large, complex datasets. Usually, other works focus on one aspect, either data size or complexity. This work considers both: it enables mining complex data from high-impact applications, such as breast cancer diagnosis, region classification in satellite images, assistance to climate change forecasting, and recommendation systems for the Web and social networks; the data are large, at the Terabyte scale rather than the usual Gigabyte scale; and very accurate results are found in just minutes. Thus, it provides a crucial and well-timed contribution toward the creation of real-time applications that deal with Big Data of high complexity, in which mining on the fly can make an immeasurable difference, such as supporting cancer diagnosis or detecting deforestation.
Table of Contents
Front Matter (pp. i-xi)
Introduction (pp. 1-6)
Related Work and Concepts (pp. 7-20)
Clustering Methods for Moderate-to-High Dimensionality Data (pp. 21-32)
Halite (pp. 33-67)
BoW (pp. 69-92)
QMAS (pp. 93-109)
Conclusion (pp. 111-116)
Subjects
Data Mining and Knowledge Discovery; Database Management
SIMILAR VOLUMES
The focus of Mining Sequential Patterns from Large Data Sets is on sequential pattern mining. In many applications, such as bioinformatics, web access traces, system utilization logs, etc., the data is naturally in the form of sequences. This information has been of great interest for analyzing the
This paper describes problems, challenges, and opportunities for intelligent simulation of physical systems. Prototype intelligent simulation tools have been constructed for interpreting massive data sets from physical fields and for designing engineering systems. We identify the characteristics of
This book offers a snapshot of the state-of-the-art in classification at the interface between statistics, computer science and application fields. The contributions span a broad spectrum, from theoretical developments to practical applications; they all share a strong computational component. Th