Model-Based Clustering, Classification, and Density Estimation Using mclust in R

✍ Scribed by Luca Scrucca, Chris Fraley, T. Brendan Murphy, and Adrian E. Raftery

Publisher: CRC Press
Year: 2023
Tongue: English
Leaves: 269
Category: Library

✦ Synopsis

Model-Based Clustering, Classification, and Density Estimation Using mclust in R

Model-based clustering and classification methods provide a systematic statistical approach to clustering, classification, and density estimation via mixture modeling. The model-based framework allows the problems of choosing or developing an appropriate clustering or classification method to be understood within the context of statistical modeling. The mclust package for the statistical environment R is a widely adopted platform implementing these model-based strategies. The package includes both summary and visual functionality, complementing procedures for estimating and choosing models.

Key features of the book

An introduction to the model-based approach and the mclust R package
A detailed description of mclust and the underlying modeling strategies
An extensive set of examples, color plots, and figures along with the R code for reproducing them
Supported by a companion website, including the R code to reproduce the examples and figures presented in the book, errata, and other supplementary material

Model-Based Clustering, Classification, and Density Estimation Using mclust in R is accessible to quantitatively trained students and researchers with a basic understanding of statistical methods, including inference and computing. In addition to serving as a reference manual for mclust, the book will be particularly useful to those wishing to employ these model-based techniques in research or applications in statistics, data science, clinical research, social science, and many other disciplines.

✦ Table of Contents

Cover
Half Title
Series Page
Title Page
Copyright Page
Dedication
Contents
List of Figures
List of Tables
List of Examples
Preface
1. Introduction
1.1. Model-Based Clustering and Finite Mixture Modeling
1.2. mclust
1.3. Overview
1.4. Organization of the Book
2. Finite Mixture Models
2.1. Finite Mixture Models
2.1.1. Maximum Likelihood Estimation and the EM Algorithm
2.1.2. Issues in Maximum Likelihood Estimation
2.2. Gaussian Mixture Models
2.2.1. Parsimonious Covariance Decomposition
2.2.2. EM Algorithm for Gaussian Mixtures
2.2.3. Initialization of EM Algorithm
2.2.4. Maximum A Posteriori (MAP) Classification
2.3. Model Selection
2.3.1. Information Criteria
2.3.2. Likelihood Ratio Testing
2.4. Resampling-Based Inference
3. Model-Based Clustering
3.1. Gaussian Mixture Models for Cluster Analysis
3.2. Clustering in mclust
3.3. Model Selection
3.3.1. BIC
3.3.2. ICL
3.3.3. Bootstrap Likelihood Ratio Testing
3.4. Resampling-Based Inference in mclust
3.5. Clustering Univariate Data
3.6. Model-Based Agglomerative Hierarchical Clustering
3.6.1. Agglomerative Clustering for Large Datasets
3.7. Initialization in mclust
3.8. EM Algorithm in mclust
3.9. Further Considerations
4. Mixture-Based Classification
4.1. Classification as Supervised Learning
4.2. Gaussian Mixture Models for Classification
4.2.1. Prediction
4.2.2. Estimation
4.3. Classification in mclust
4.4. Evaluating Classifier Performance
4.4.1. Evaluating Predicted Classes: Classification Error
4.4.2. Evaluating Class Probabilities: Brier Score
4.4.3. Estimating Classifier Performance: Test Set and Resampling-Based Validation
4.4.4. Cross-Validation in mclust
4.5. Classification with Unequal Costs of Misclassification
4.6. Classification with Unbalanced Classes
4.7. Classification of Univariate Data
4.8. Semi-Supervised Classification
5. Model-Based Density Estimation
5.1. Density Estimation
5.2. Finite Mixture Modeling for Density Estimation with mclust
5.3. Univariate Density Estimation
5.3.1. Diagnostics for Univariate Density Estimation
5.4. Density Estimation in Higher Dimensions
5.5. Density Estimation for Bounded Data
5.6. Highest Density Regions
6. Visualizing Gaussian Mixture Models
6.1. Displays for Univariate Data
6.2. Displays for Bivariate Data
6.3. Displays for Higher Dimensional Data
6.3.1. Coordinate Projections
6.3.2. Random Projections
6.3.3. Discriminant Coordinate Projections
6.4. Visualizing Model-Based Clustering and Classification on Projection Subspaces
6.4.1. Projection Subspaces for Visualizing Cluster Separation
6.4.2. Incorporating Variation in Covariances
6.4.3. Projection Subspaces for Classification
6.4.4. Relationship to Other Methods
6.5. Using ggplot2 with mclust
6.6. Using Color-Blind-Friendly Palettes
7. Miscellanea
7.1. Accounting for Noise and Outliers
7.2. Using a Prior for Regularization
7.2.1. Adding a Prior in mclust
7.3. Non-Gaussian Clusters from GMMs
7.3.1. Combining Gaussian Mixture Components for Clustering
7.3.2. Identifying Connected Components in GMMs
7.4. Simulation from Mixture Densities
7.5. Large Datasets
7.6. High-Dimensional Data
7.7. Missing Data
Bibliography
Index

📜 SIMILAR VOLUMES

📁 Model-Based Clustering, Classification, and Density Estimation Using mclust in R

✍ Luca Scrucca, Chris Fraley, T. Brendan Murphy, Adrian E. Raftery 📂 Library 📅 2023 🏛 CRC Press 🌐 English

📁 Model-Based Clustering and Classification for Data Science: With Applications in R

✍ Charles Bouveyron; Gilles Celeux; T. Brendan Murphy; Adrian E. Raftery 📂 Library 📅 2019 🏛 Cambridge University Press 🌐 English

Cluster analysis finds groups in data automatically. Most methods have been heuristic and leave open such central questions as: how many clusters are there? Which method should I use? How should I handle outliers? Classification assigns new observations to groups given previously classified observat

📁 Model-Based Clustering and Classification for Data Science: With Applications in R

✍ Charles Bouveyron, Gilles Celeux, T. Brendan Murphy, Adrian E. Raftery 📂 Library 📅 2019 🏛 Cambridge University Press 🌐 English

📁 Model-Based Clustering and Classification for Data Science: With Applications in R

✍ Bouveyron C 📂 Library 📅 2019 🏛 Cambridge University Press 🌐 English

📁 Functional Estimation for Density, Regression Models and Processes

✍ Odile Pons 📂 Library 📅 2011 🏛 World Scientific 🌐 English

This volume discusses the extended stochastic integral (ESI) (or Skorokhod-Hitsuda integral) and its relation to the logarithmic derivative of differentiable measure along the vector or operator field. In addition, the theory of surface measures and the theory of heat potentials in infinite-dimensio

📁 Functional Estimation for Density, Regression Models and Processes

✍ Odile Pons 📂 Library 📅 2023 🏛 World Scientific Pub Co Inc 🌐 English

Nonparametric kernel estimators apply to the statistical analysis of independent or dependent sequences of random variables and for samples of continuous or discrete processes. The optimization of these procedures is based on the choice of a bandwidth that minimizes an estimation error and the