Bayesian Nonparametrics for Causal Inference and Missing Data
by Michael J. Daniels, Antonio Linero, and Jason Roy
- Publisher
- CRC Press/Chapman & Hall
- Year
- 2023
- Language
- English
- Pages
- 263
- Series
- Chapman & Hall/CRC Monographs on Statistics and Applied Probability
- Category
- Library
Synopsis
Bayesian Nonparametrics for Causal Inference and Missing Data provides an overview of flexible Bayesian nonparametric (BNP) methods for modeling joint or conditional distributions and functional relationships, and their interplay with causal inference and missing data. The book emphasizes the importance of making untestable assumptions to identify estimands of interest, such as the missing at random assumption for missing data and unconfoundedness for causal inference in observational studies. Unlike parametric methods, the BNP approach can accommodate possible violations of these assumptions and minimize concerns about model misspecification. The overall strategy is to first specify BNP models for the observed data and then specify additional uncheckable assumptions to identify the estimands of interest.
The book is divided into three parts. Part I develops the key concepts in causal inference and missing data and reviews relevant concepts in Bayesian inference. Part II introduces the fundamental BNP tools required to address causal inference and missing data problems. Part III shows how the BNP approach can be applied in a variety of case studies. The datasets in the case studies come from electronic health records data, survey data, cohort studies, and randomized clinical trials.
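The overall strategy described above (fit a Bayesian model to the observed data, then apply identifying assumptions via the g-formula) can be illustrated with a minimal sketch. The book develops BNP models such as BART and Dirichlet process mixtures for this step; the stand-in below uses a simple conjugate linear outcome model purely for brevity, and all variable names and the simulated data are illustrative, not from the book.

```python
# Hedged sketch of Bayesian g-computation for a point treatment.
# A simple linear outcome model stands in for the BNP models (BART, DPMs)
# the book actually uses; the data are simulated for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)                              # confounder
a = rng.binomial(1, 1 / (1 + np.exp(-x)))           # treatment depends on x
y = 1.0 + 2.0 * a + 1.5 * x + rng.normal(size=n)    # true ATE = 2

# Step 1: posterior for outcome-model coefficients E[Y | A, X]
# (flat prior, noise variance taken as known = 1 for simplicity).
design = np.column_stack([np.ones(n), a, x])
xtx_inv = np.linalg.inv(design.T @ design)
beta_hat = xtx_inv @ design.T @ y
draws = rng.multivariate_normal(beta_hat, xtx_inv, size=1000)

# Step 2: g-formula per posterior draw -- average the predicted outcome
# over the observed confounder distribution with A set to 1, then to 0.
d1 = np.column_stack([np.ones(n), np.ones(n), x])   # everyone treated
d0 = np.column_stack([np.ones(n), np.zeros(n), x])  # everyone untreated
ate_draws = (d1 @ draws.T - d0 @ draws.T).mean(axis=0)

# Posterior mean and 95% credible interval for the ATE (should be near 2).
print(ate_draws.mean(), np.quantile(ate_draws, [0.025, 0.975]))
```

The same two-step structure carries over when the outcome model is a BNP prior: only Step 1 changes, while Step 2 remains averaging predictions over the empirical covariate distribution within each posterior draw.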
Features
• Thorough discussion of both BNP and its interplay with causal inference and missing data
• How to use BNP and g-computation for causal inference and non-ignorable missingness
• How to derive and calibrate sensitivity parameters to assess sensitivity to deviations from uncheckable causal and/or missingness assumptions
• Detailed case studies illustrating the application of BNP methods to causal inference and missing data
• R code and/or packages to implement BNP in causal inference and missing data problems
The book is primarily aimed at researchers and graduate students from statistics and biostatistics. It will also serve as a useful practical reference for mathematically sophisticated epidemiologists and medical researchers.
Table of Contents
Cover
Half Title
Series Page
Title Page
Copyright Page
Dedication
Contents
Preface
I. Overview of Bayesian inference in causal inference and missing data and identifiability
1. Overview of causal inference
1.1. Introduction
1.1.1. Types of Causal Effects
1.1.2. Identifiability and Causal Assumptions
1.2. The g-Formula
1.2.1. Time-Dependent Confounding
1.2.2. Bayesian Nonparametrics and the g-Formula
1.3. Propensity Scores
1.3.1. Covariate Balance
1.3.2. Conditioning on the Propensity Score
1.3.3. Positivity and Overlap
1.4. Marginal Structural Models
1.5. Principal Stratification
1.6. Causal Mediation
1.7. Summary
2. Overview of missing data
2.1. Introduction
2.2. Overview of Missing Data
2.2.1. What is "Missing Data?"
2.2.2. Full vs. Observed Data
2.2.3. Notation and Data Structures
2.2.4. Processes Leading to Missing Data
2.3. Defining Estimands with Missing Data
2.4. Classification of Missing Data Mechanisms
2.4.1. Missing Completely at Random (MCAR)
2.4.2. Missing at Random (MAR)
2.4.3. Missing Not at Random (MNAR)
2.4.4. Everywhere MAR and MCAR
2.4.5. Identifiability of Estimands under MCAR, MAR, and MNAR
2.4.6. Deciding between MCAR, MAR, and MNAR
2.5. Ignorable versus Non-Ignorable Missingness
2.6. Types of Non-Ignorable Models
2.6.1. Selection Models
2.6.2. Pattern Mixture Models
2.6.3. Shared Parameter Models
2.6.4. Observed Data Modeling Strategies
2.7. Summary and a Look Forward
3. Overview of Bayesian inference for missing data and causal inference
3.1. The Posterior Distribution
3.2. Priors and Identifiability
3.2.1. Priors in General
3.2.2. Priors for Unidentified Parameters
3.2.3. Priors for the Distribution of Covariates
3.3. Computation of the Posterior
3.3.1. An Overview of Markov Chain Monte Carlo
3.3.2. Gibbs Sampling
3.3.3. The Metropolis-Hastings Algorithm
3.3.4. Slice Sampling
3.3.5. Hamiltonian Monte Carlo
3.3.6. Drawing Inferences from MCMC Output
3.4. Model Selection/Checking
3.4.1. Model Selection
3.4.2. Model Checking
3.5. Data Augmentation
3.6. Bayesian g-Computation
3.7. Summary
4. Identifiability and sensitivity analysis
4.1. Calibration of Sensitivity Parameters
4.2. Identifiability
4.2.1. Sensitivity to the Ignorability Assumption for Causal Inference with a Point Treatment
4.2.2. Sensitivity to the Sequential Ignorability Assumption for Causal Mediation
4.2.3. Monotonicity Assumptions for Principal Stratification
4.3. Monotone Restrictions
4.3.1. Pattern Mixture Alternatives to MAR
4.3.2. The Non-Future Dependence Assumption
4.3.3. Completing the NFD Specification
4.3.4. g-Computation for Interior Family Restrictions
4.3.5. g-Computation for the NFD Restriction
4.3.6. Differential Reasons for Dropout
4.4. Non-Monotone Restrictions
4.4.1. The Partial Missing at Random Assumption
4.4.2. Generic Non-Monotone Restrictions
4.4.3. Computation of Treatment Effects under Non-Monotone Missingness
4.4.4. Strategies for Introducing Sensitivity Parameters
4.5. Summary
II. Bayesian nonparametrics for causal inference and missing data
5. Bayesian decision trees and their ensembles
5.1. Motivation: The Need for Priors on Functions
5.1.1. Nonparametric Binary Regression and Semiparametric Gaussian Regression
5.1.2. Running Example: Medical Expenditure Data
5.2. From Basis Expansions to Tree Ensembles
5.3. Bayesian Additive Regression Trees
5.3.1. Decision Trees
5.3.2. Priors over Decision Trees
5.3.3. Ensembles of Decision Trees
5.3.4. Prior Specification for Bayesian Additive Regression Trees
5.3.5. Posterior Computation for Bayesian Additive Regression Trees
5.3.6. Non-Bayesian Approaches
5.4. Bayesian Additive Regression Trees Applied to Causal Inference
5.4.1. Estimating the Outcome Regression Function
5.4.2. Regularization-Induced Confounding and Bayesian Causal Forests
5.5. BART Models for Other Data Types
5.6. Summary
6. Dirichlet process mixtures and extensions
6.1. Motivation for Dirichlet Process Mixtures
6.2. Dirichlet Process Priors
6.3. Dirichlet Process Mixtures (DPMs)
6.3.1. Posterior Computations
6.3.2. DPMs for Causal Inference and Missing Data
6.3.3. Shahbaba and Neal DPM
6.3.4. Priors on Parameters of the Base Measure
6.4. Enriched Dirichlet Process Mixtures (EDPMs)
6.4.1. EDPM for Causal Inference and Missing Data
6.4.2. Posterior Computations
6.4.2.1. MCMC
6.4.2.2. Post-Processing Steps (after MCMC): g-Computation
6.5. Summary
7. Gaussian process priors and dependent Dirichlet processes
7.1. Motivation: Alternate Priors for Functions and Nonparametric Modeling of Conditional Distributions
7.2. Gaussian Process Priors
7.2.1. Normal Outcomes
7.2.2. Binary or Count Outcomes
7.2.3. Priors on GP Parameters
7.2.4. Posterior Computations
7.2.5. GP for Causal Inference
7.3. Dependent Dirichlet Process Priors
7.3.1. Sampling Algorithms
7.3.2. DDP+GP for Causal Inference
7.3.3. Considerations for Choosing between Various DP Mixture Models
7.4. Summary
III. Case studies
8. Causal inference on quantiles using propensity scores
8.1. EHR Data and Questions of Interest
8.2. Methods
8.3. Analysis
8.4. Conclusions
9. Causal inference with a point treatment using an EDPM model
9.1. Hepatic Safety of Therapies for HIV/HCV Coinfection
9.2. Methods
9.3. Analysis
9.4. Conclusions
10. DDP+GP for causal inference using marginal structural models
10.1. Changes in Neurocognitive Function among Individuals with HIV
10.2. Methods
10.3. Analysis
10.4. Conclusions
11. DPMs for dropout in longitudinal studies
11.1. Schizophrenia Clinical Trial
11.2. Methods
11.3. Posterior Computation
11.4. Analysis
11.5. Conclusions
12. DPMs for non-monotone missingness
12.1. The Breast Cancer Prevention Trial (BCPT)
12.2. Methods
12.3. Posterior Computation
12.4. Analysis
12.5. Conclusions
13. Causal mediation using DPMs
13.1. STRIDE Project
13.2. Methods
13.3. Analysis
13.4. Conclusions
14. Causal mediation using BART
14.1. Motivation
14.2. Methods
14.3. Regularization-Induced Confounding and the Prior on Selection Bias
14.4. Results
14.5. Conclusions
15. Causal analysis of semicompeting risks using a principal stratification estimand and DDP+GP
15.1. Brain Cancer Clinical Trial
15.2. Methods
15.3. Analysis
15.4. Conclusions
Bibliography
Index