Multiple Imputation of Missing Data in Practice: Basic Theory and Analysis Strategies (Chapman & Hall/CRC Interdisciplinary Statistics)

✍ By Yulei He, Guangyu Zhang, Chiu-Hsieh Hsu

Publisher: Chapman and Hall/CRC
Year: 2021
Language: English
Pages: 495
Edition: 1
Category: Library

✦ Synopsis

Multiple Imputation of Missing Data in Practice: Basic Theory and Analysis Strategies provides a comprehensive introduction to the multiple imputation approach to missing data problems that are often encountered in data analysis. Over the past 40 years or so, multiple imputation has developed rapidly in both theory and application. It is now the most versatile, popular, and effective missing-data strategy used by researchers and practitioners across different fields. There is a strong need in the research and practitioner communities to better understand multiple imputation.

Accessible to a broad audience, this book explains statistical concepts of missing data problems and the associated terminology. It focuses on how to address missing data problems using multiple imputation. It describes the basic theory behind multiple imputation and many commonly used models and methods. These ideas are illustrated by examples from a wide variety of missing data problems. Real data from studies with different designs and features (e.g., cross-sectional data, longitudinal data, complex surveys, survival data, and studies subject to measurement error) are used to demonstrate the methods. So that readers not only know how to use the methods but also understand why multiple imputation works and how to choose appropriate methods, simulation studies are used to assess the performance of the multiple imputation methods. Example datasets and sample programming code are either included in the book or available at a GitHub site (https://github.com/he-zhang-hsu/multiple_imputation_book).

Key Features

Provides an overview of statistical concepts that are useful for better understanding missing data problems and multiple imputation analysis

Provides a detailed discussion of multiple imputation models and methods targeted to different types of missing data problems (e.g., univariate and multivariate missing data problems, missing data in survival analysis, longitudinal data, and complex surveys)

Explores measurement error problems with multiple imputation

Discusses analysis strategies for multiple imputation diagnostics

Discusses data production issues when the goal of multiple imputation is to release datasets for public use, as done by organizations that process and manage large-scale surveys with nonresponse problems

For some examples, illustrative datasets and sample programming code from popular statistical packages (e.g., SAS, R, WinBUGS) are included in the book. For others, they are available at a GitHub site (https://github.com/he-zhang-hsu/multiple_imputation_book).

✦ Table of Contents

Cover
Half Title
Title Page
Copyright Page
Dedication
Contents
Foreword
Preface
1. Introduction
1.1. A Motivating Example
1.2. Definition of Missing Data
1.3. Missing Data Patterns
1.4. Missing Data Mechanisms
1.5. Structure of the Book
2. Statistical Background
2.1. Introduction
2.2. Frequentist Theory
2.2.1. Sampling Experiment
2.2.2. Model, Parameter, and Estimation
2.2.3. Hypothesis Testing
2.2.4. Resampling Methods: The Bootstrap Approach
2.3. Bayesian Analysis
2.3.1. Rudiments
2.3.2. Prior Distribution
2.3.3. Bayesian Computation
2.3.4. Asymptotic Equivalence between Frequentist and Bayesian Estimates
2.4. Likelihood-based Approaches to Missing Data Analysis
2.5. Ad Hoc Missing Data Methods
2.6. Monte Carlo Simulation Study
2.7. Summary
3. Multiple Imputation Analysis: Basics
3.1. Introduction
3.2. Basic Idea
3.2.1. Bayesian Motivation
3.2.2. Basic Combining Rules and Their Justifications
3.2.3. Why Does Multiple Imputation Work?
3.3. Statistical Inference on Multiply Imputed Data
3.3.1. Scalar Inference
3.3.2. Multi-parameter Inference
3.3.3. How to Choose the Number of Imputations
3.4. How to Create Multiple Imputations
3.4.1. Bayesian Imputation Algorithm
3.4.2. Proper Multiple Imputation
3.4.3. Alternative Strategies
3.5. Practical Implementation
3.6. Summary
4. Multiple Imputation for Univariate Missing Data: Parametric Methods
4.1. Overview
4.2. Imputation for Continuous Data Based on Normal Linear Models
4.3. Imputation for Noncontinuous Data Based on Generalized Linear Models
4.3.1. Generalized Linear Models
4.3.2. Imputation for Binary Data
4.3.2.1. Logistic Regression Model Imputation
4.3.2.2. Discriminant Analysis Imputation
4.3.2.3. Rounding
4.3.2.4. Data Separation
4.3.3. Imputation for Nonbinary Categorical Data
4.3.4. Imputation for Other Types of Data
4.4. Imputation for a Missing Covariate in a Regression Analysis
4.5. Summary
5. Multiple Imputation for Univariate Missing Data: Robust Methods
5.1. Overview
5.2. Data Transformation
5.2.1. Transforming or Not?
5.2.2. How to Apply Transformation in Multiple Imputation
5.3. Imputation Based on Smoothing Methods
5.3.1. Basic Idea
5.3.2. Practical Use
5.4. Adjustments for Continuous Data with Range Restrictions
5.5. Predictive Mean Matching
5.5.1. Hot-Deck Imputation
5.5.2. Basic Idea and Procedure
5.5.3. Predictive Mean Matching for Noncontinuous Data
5.5.4. Additional Discussion
5.6. Inclusive Imputation Strategy
5.6.1. Basic Idea
5.6.2. Dual Modeling Strategy
5.6.2.1. Propensity Score
5.6.2.2. Calibration Estimation and Doubly Robust Estimation
5.6.2.3. Imputation Methods
5.7. Summary
6. Multiple Imputation for Multivariate Missing Data: The Joint Modeling Approach
6.1. Introduction
6.2. Imputation for Monotone Missing Data
6.3. Multivariate Continuous Data
6.3.1. Multivariate Normal Models
6.3.2. Models for Nonnormal Continuous Data
6.4. Multivariate Categorical Data
6.4.1. Log-Linear Models
6.4.2. Latent Variable Models
6.5. Mixed Categorical and Continuous Variables
6.5.1. One Continuous Variable and One Binary Variable
6.5.2. General Location Models
6.5.3. Latent Variable Models
6.6. Missing Outcome and Covariates in a Regression Analysis
6.6.1. General Strategy
6.6.2. Conditional Modeling Framework
6.6.3. Using WinBUGS
6.6.3.1. Background
6.6.3.2. Missing Interactions and Squared Terms of Covariates in
6.6.3.3. Imputation Using Flexible Distributions
6.7. Summary
7. Multiple Imputation for Multivariate Missing Data: The Fully Conditional Specification Approach
7.1. Introduction
7.2. Basic Idea
7.3. Specification of Conditional Models
7.4. Handling Complex Data Features
7.4.1. Data Subject to Bounds or Restricted Ranges
7.4.2. Data Subject to Skips
7.5. Implementation
7.5.1. General Algorithm
7.5.2. Software
7.5.2.1. Using WinBUGS
7.6. Subtle Issues
7.6.1. Compatibility
7.6.2. Performance under Model Misspecifications
7.7. A Practical Example
7.8. Summary
8. Multiple Imputation in Survival Data Analysis
8.1. Introduction
8.2. Imputation for Censored Event Times
8.2.1. Theoretical Basis
8.2.2. Parametric Imputation
8.2.3. Semiparametric Imputation
8.2.4. Merits
8.3. Survival Analysis with Missing Covariates
8.3.1. Overview
8.3.2. Joint Modeling
8.3.3. Fully Conditional Specification
8.3.4. Semiparametric Methods
8.4. Summary
9. Multiple Imputation for Longitudinal Data
9.1. Introduction
9.2. Mixed Models for Longitudinal Data
9.3. Imputation Based on Mixed Models
9.3.1. Why Use Mixed Models?
9.3.2. General Imputation Algorithm
9.3.3. Examples
9.4. Wide Format Imputation
9.5. Multilevel Data
9.6. Summary
10. Multiple Imputation Analysis for Complex Survey Data
10.1. Introduction
10.2. Design-Based Inference for Survey Data
10.3. Imputation Strategies for Complex Survey Data
10.3.1. General Principles
10.3.1.1. Incorporating the Survey Sampling Design
10.3.1.2. Assuming Missing at Random
10.3.1.3. Using Fully Conditional Specification
10.3.2. Modeling Options
10.4. Some Examples from the Literature
10.5. Database Construction and Release
10.5.1. Data Editing
10.5.2. Documentation and Release
10.6. Summary
11. Multiple Imputation for Data Subject to Measurement Error
11.1. Introduction
11.2. Rationale
11.3. Imputation Strategies
11.3.1. True Values Partially Observed
11.3.1.1. Basic Setup
11.3.1.2. Direct Imputation
11.3.1.3. Accommodating a Specific Analysis
11.3.1.4. Using Fully Conditional Specification
11.3.1.5. Predictors under Detection Limits
11.3.2. True Values Fully Unobserved
11.4. Data Harmonization Using Bridge Studies
11.5. Combining Information from Multiple Data Sources
11.6. Imputation for a Composite Variable
11.7. Summary
12. Multiple Imputation Diagnostics
12.1. Overview
12.2. Imputation Model Development
12.2.1. Inclusion of Variables
12.2.2. Specifying Imputation Models
12.3. Comparison between Observed and Imputed Values
12.3.1. Comparison on Marginal Distributions
12.3.2. Comparison on Conditional Distributions
12.3.2.1. Basic Idea
12.3.2.2. Using Propensity Score
12.4. Checking Completed Data
12.4.1. Posterior Predictive Checking
12.4.2. Comparing Completed Data with Their Replicates
12.5. Assessing the Fraction of Missing Information
12.5.1. Relating the Fraction of Missing Information with Model Predictability
12.6. Prediction Accuracy
12.7. Comparison among Different Missing Data Methods
12.8. Summary
13. Multiple Imputation Analysis for Nonignorable Missing Data
13.1. Introduction
13.2. The Implication of Missing Not at Random
13.3. Using Inclusive Imputation Strategy to Rescue
13.4. Missing Not at Random Models
13.4.1. Selection Models
13.4.2. Pattern Mixture Models
13.4.3. Shared Parameter Models
13.5. Analysis Strategies
13.5.1. Direct Imputation
13.5.2. Sensitivity Analysis
13.6. Summary
14. Some Advanced Topics
14.1. Overview
14.2. Uncongeniality in Multiple Imputation Analysis
14.3. Combining Analysis Results from Multiply Imputed Datasets: Further Considerations
14.3.1. Normality Assumption in Question
14.3.2. Beyond Sufficient Statistics
14.3.3. Complicated Completed-Data Analyses: Variable Selection
14.4. High-Dimensional Data
14.5. Final Thoughts
Bibliography
Authors Index
Subject Index

📜 SIMILAR VOLUMES

📁 Missing Data in Longitudinal Studies: Strategies for Bayesian Modeling and Sensitivity Analysis (Chapman & Hall CRC Monographs on Statistics & Applied Probability)

✍ Michael J. Daniels, Joseph W. Hogan 📂 Library 📅 2008 🌐 English

Drawing from the authors’ own work and from the most recent developments in the field, Missing Data in Longitudinal Studies: Strategies for Bayesian Modeling and Sensitivity Analysis describes a comprehensive Bayesian approach for drawing inference from incomplete data in longitudinal studies. To il

📁 Visualizing Data Patterns with Micromaps (Chapman & Hall CRC Interdisciplinary Statistics)

✍ Daniel B. Carr, Linda Williams Pickle 📂 Library 📅 2010 🏛 Chapman and Hall/CRC 🌐 English

After more than 15 years of development drawing on research in cognitive psychology, statistical graphics, computer science, and cartography, micromap designs are becoming part of mainstream statistical visualizations. Bringing together the research of two leaders in this field, Visualizing Data Pat

📁 Bayesian Analysis for Population Ecology (Chapman & Hall CRC Interdisciplinary Statistics)

✍ Ruth King, Byron Morgan, Olivier Gimenez, Steve Brooks 📂 Library 📅 2009 🏛 Chapman and Hall/CRC 🌐 English

Novel Statistical Tools for Conserving and Managing Populations. By gathering information on key demographic parameters, scientists can often predict how populations will develop in the future and relate these parameters to external influences, such as global warming. Because of their ability to eas

📁 Statistics of Medical Imaging (Chapman & Hall/CRC Interdisciplinary Statistics)

✍ Tianhu Lei 📂 Library 📅 2011 🏛 Chapman and Hall/CRC 🌐 English

Statistical investigation into technology not only provides a better understanding of the intrinsic features of the technology (analysis), but also leads to an improved design of the technology (synthesis). Physical principles and mathematical procedures of medical imaging technologies have

📁 Spatial Statistics for Data Science: Theory and Practice with R (Chapman & Hall/CRC Data Science Series)

✍ Paula Moraga 📂 Library 📅 2023 🏛 Chapman and Hall/CRC 🌐 English

📁 Applied Categorical and Count Data Analysis (Chapman & Hall/CRC Texts in Statistical Science)

✍ Wan Tang, Hua He, Xin M. Tu 📂 Library 🏛 Chapman and Hall/CRC 🌐 English

Developed from the authors’ graduate-level biostatistics course, Applied Categorical and Count Data Analysis, Second Edition explains how to perform the statistical analysis of discrete data, including categorical and count outcomes. The authors have been teaching