
Pattern Recognition Algorithms for Data Mining (Chapman & Hall/CRC Computer Science & Data Analysis)

✍ Scribed by Sankar K. Pal, Pabitra Mitra


Publisher
Chapman and Hall/CRC
Year
2004
Tongue
English
Leaves
218
Series
Chapman & Hall/CRC Computer Science & Data Analysis
Edition
1
Category
Library


✦ Synopsis


Pattern Recognition Algorithms for Data Mining treats data mining from a pattern recognition perspective. The book presents real-life data sets from various domains, such as geographic information systems, remote sensing imagery, and population census data, to demonstrate the use of new methodologies. Classical approaches are covered alongside granular computing, which integrates fuzzy sets, artificial neural networks, and genetic algorithms for efficient knowledge discovery. The authors then compare the granular computing and rough-fuzzy approaches with the more classical methods and demonstrate why they are more efficient.

✦ Table of Contents


Pattern Recognition Algorithms for Data Mining: Scalability, Knowledge Discovery, and Soft Granular Computing......Page 1
Contents......Page 4
Foreword......Page 9
Preface......Page 14
List of Tables......Page 17
List of Figures......Page 19
1.1 Introduction......Page 22
1.2 Pattern Recognition in Brief......Page 24
1.2.2 Feature selection/extraction......Page 25
1.2.3 Classification......Page 26
1.3 Knowledge Discovery in Databases (KDD)......Page 28
1.4.1 Data mining tasks......Page 31
1.4.3 Applications of data mining......Page 33
1.5.1 Database perspective......Page 35
1.5.3 Pattern recognition perspective......Page 36
1.5.4 Research issues and challenges......Page 37
1.6.1 Data reduction......Page 38
1.6.2 Dimensionality reduction......Page 39
1.6.4 Data partitioning......Page 40
1.6.6 Efficient search algorithms......Page 41
1.7 Significance of Soft Computing in KDD......Page 42
1.8 Scope of the Book......Page 43
2.1 Introduction......Page 49
2.2.1 Condensed nearest neighbor rule......Page 52
2.2.2 Learning vector quantization......Page 53
2.3 Multiscale Representation of Data......Page 54
2.4 Nearest Neighbor Density Estimate......Page 57
2.5 Multiscale Data Condensation Algorithm......Page 58
2.6 Experimental Results and Comparisons......Page 60
2.6.2 Test of statistical significance......Page 61
2.6.3 Classification: Forest cover data......Page 67
2.6.4 Clustering: Satellite image data......Page 68
2.6.5 Rule generation: Census data......Page 69
2.7 Summary......Page 72
3.1 Introduction......Page 79
3.2 Feature Extraction......Page 80
3.3 Feature Selection......Page 82
3.3.1 Filter approach......Page 83
3.4 Feature Selection Using Feature Similarity (FSFS)......Page 84
3.4.1 Feature similarity measures......Page 85
3.4.1.2 Least square regression error (e)......Page 86
3.4.1.3 Maximal information compression index (λ2)......Page 87
3.4.2 Feature selection through clustering......Page 88
3.5.1 Supervised indices......Page 91
3.5.2 Unsupervised indices......Page 92
3.5.3 Representation entropy......Page 93
3.6.1 Comparison: Classification and clustering performance......Page 94
3.6.2 Redundancy reduction: Quantitative study......Page 99
3.6.3 Effect of cluster size......Page 100
3.7 Summary......Page 102
4.1 Introduction......Page 103
4.2 Support Vector Machine......Page 106
4.3 Incremental Support Vector Learning with Multiple Points......Page 108
4.4 Statistical Query Model of Learning......Page 109
4.4.2 Confidence factor of support vector set......Page 110
4.5 Learning Support Vectors with Statistical Queries......Page 111
4.6.1 Classification accuracy and training time......Page 114
4.6.3 Margin distribution......Page 117
4.7 Summary......Page 121
5.1 Introduction......Page 123
5.2 Soft Granular Computing......Page 125
5.3 Rough Sets......Page 126
5.3.2 Indiscernibility and set approximation......Page 127
5.3.3 Reducts......Page 128
5.3.4 Dependency rule generation......Page 130
5.4 Linguistic Representation of Patterns and Fuzzy Granulation......Page 131
5.5 Rough-fuzzy Case Generation Methodology......Page 134
5.5.1 Thresholding and rule generation......Page 135
5.5.2 Mapping dependency rules to cases......Page 137
5.5.3 Case retrieval......Page 138
5.6 Experimental Results and Comparison......Page 140
5.7 Summary......Page 141
6.1 Introduction......Page 143
6.2 Clustering Methodologies......Page 144
6.3.2 BIRCH: Balanced iterative reducing and clustering using hierarchies......Page 146
6.3.3 DBSCAN: Density-based spatial clustering of applications with noise......Page 147
6.3.4 STING: Statistical information grid......Page 148
6.4 CEMMiSTRI: Clustering using EM, Minimal Spanning Tree and Rough-fuzzy Initialization......Page 149
6.4.1 Mixture model estimation via EM algorithm......Page 150
6.4.2 Rough set initialization of mixture parameters......Page 151
6.4.3 Mapping reducts to mixture parameters......Page 152
6.4.4 Graph-theoretic clustering of Gaussian components......Page 153
6.5 Experimental Results and Comparison......Page 155
6.6 Multispectral Image Segmentation......Page 159
6.6.4 Experimental results and comparison......Page 161
6.7 Summary......Page 167
7.1 Introduction......Page 168
7.2 Self-Organizing Maps (SOM)......Page 169
7.2.1 Learning......Page 170
7.3 Incorporation of Rough Sets in SOM (RSOM)......Page 171
7.3.2 Mapping rough set rules to network weights......Page 172
7.4.1 Extraction methodology......Page 173
7.4.2 Evaluation indices......Page 174
7.5 Experimental Results and Comparison......Page 175
7.5.1 Clustering and quantization error......Page 176
7.5.2 Performance of rules......Page 181
7.6 Summary......Page 182
8.1 Introduction......Page 183
8.2 Ensemble Classifiers......Page 185
8.3.1.1 Apriori......Page 188
8.3.1.4 Dynamic itemset counting......Page 190
8.4 Classification Rules......Page 191
8.5.1.2 Output representation......Page 193
8.5.2 Rough set knowledge encoding......Page 194
8.6.1 Algorithm......Page 196
8.6.1.1 Steps......Page 197
8.6.2.1 Chromosomal representation......Page 200
8.6.2.4 Choice of fitness function......Page 201
8.7.1 Rule extraction methodology......Page 202
8.7.2 Quantitative measures......Page 206
8.8 Experimental Results and Comparison......Page 207
8.8.1 Classification......Page 208
8.8.2 Rule extraction......Page 210
8.8.2.1 Rules for staging of cervical cancer with binary feature inputs......Page 215
8.9 Summary......Page 217


📜 SIMILAR VOLUMES


Microarray Image Analysis: An Algorithmic Approach
✍ Karl Fraser, Zidong Wang, Xiaohu Liu 📂 Library 📅 2010 🏛 Chapman and Hall/CRC 🌐 English

To harness the high-throughput potential of DNA microarray technology, it is crucial that the analysis stages of the process are decoupled from the requirements of operator assistance. Microarray Image Analysis: An Algorithmic Approach presents an automatic system for microarray image processing to…

Textual Data Science with R
✍ Mónica Bécue-Bertaut 📂 Library 📅 2019 🏛 Chapman and Hall/CRC 🌐 English

Textual Statistics with R comprehensively covers the main multidimensional methods in textual statistics, supported by a specially written package in R. Methods discussed include correspondence analysis, clustering, and multiple factor analysis for contingency tables. Each method is…

Semisupervised Learning for Computational Linguistics
✍ Steven Abney 📂 Library 📅 2007 🌐 English

The rapid advancement in the theoretical understanding of statistical and machine learning methods for semisupervised learning has made it difficult for nonspecialists to keep up to date in the field. Providing a broad, accessible treatment of the theory as well as linguistic applications, Semisupervised Learning for Computational Linguistics…

Semisupervised Learning for Computational Linguistics
✍ Steven Abney 📂 Library 📅 2007 🏛 Chapman and Hall/CRC 🌐 English

We're finally getting to the point where computational linguists will start to see their field in book titles. In the past one would have to piggyback off another discipline to get the information they needed. This book is a must for anyone learning anything statistical in the NLP field. I took…

R Programming for Bioinformatics
✍ Robert Gentleman 📂 Library 📅 2008 🏛 Chapman and Hall/CRC 🌐 English

R itself is well equipped with documentation, which ships with every distribution of R and the R add-on packages. But a good overall picture of how R is actually used in today's bioinformatics software development was always missing. If you are interested in bioinformatics software development, and…

Design and Modeling for Computer Experiments
✍ Kai-Tai Fang, Runze Li, Agus Sudjianto 📂 Library 📅 2005 🏛 Chapman and Hall CRC 🌐 English

Computer simulations based on mathematical models have become ubiquitous across the engineering disciplines and throughout the physical sciences. Successful use of a simulation model, however, requires careful interrogation of the model through systematic computer experiments. While specific theoret…