
Bayesian Statistical Modeling With Stan, R, and Python

✍ Scribed by Kentaro Matsuura


Publisher
Springer
Year
2023
Tongue
English
Leaves
395
Category
Library


✦ Synopsis


This book provides a highly practical introduction to Bayesian statistical modeling with Stan, which has become one of the most popular probabilistic programming languages. The book is divided into four parts. The first part reviews the theoretical background of modeling and Bayesian inference and presents a modeling workflow that makes modeling more of an engineering discipline than an art. The second part covers the use of Stan, CmdStanR, and CmdStanPy, from first steps through basic regression analyses. The third part introduces a number of probability distributions, nonlinear models, and hierarchical (multilevel) models, which are essential to mastering statistical modeling. It also describes a wide range of frequently used modeling techniques, such as handling censoring, outliers, missing data, parameter constraints, and speeding up computation, and discusses how to improve MCMC convergence. Finally, the fourth part examines advanced topics for real-world data: longitudinal data analysis, state space models, spatial data analysis, Gaussian processes, Bayesian optimization, dimensionality reduction, model selection, and information criteria, demonstrating that Stan can solve each of these problems in as little as 30 lines of code.
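As a flavor of the MCMC machinery the synopsis refers to, below is a minimal, self-contained sketch (not taken from the book; the data, seed, and step size are illustrative assumptions) of a random-walk Metropolis sampler estimating the mean of normal data under a flat prior. Stan's NUTS sampler automates and greatly improves on this basic recipe.

```python
import math
import random
import statistics

# Toy setup: 200 observations from Normal(5, 2); sigma assumed known.
random.seed(1)
data = [random.gauss(5.0, 2.0) for _ in range(200)]
sigma = 2.0

def log_likelihood(mu):
    # Gaussian log-likelihood, up to an additive constant.
    return -sum((x - mu) ** 2 for x in data) / (2.0 * sigma ** 2)

def metropolis(n_iter=5000, step=0.5, mu_init=0.0, warmup=1000):
    mu = mu_init
    draws = []
    for _ in range(n_iter):
        proposal = mu + random.gauss(0.0, step)
        log_ratio = log_likelihood(proposal) - log_likelihood(mu)
        # Accept with probability min(1, posterior ratio);
        # with a flat prior the posterior ratio equals the likelihood ratio.
        if log_ratio >= 0.0 or random.random() < math.exp(log_ratio):
            mu = proposal
        draws.append(mu)
    return draws[warmup:]  # discard warmup draws, as Stan does

draws = metropolis()
posterior_mean = statistics.fmean(draws)
print(f"posterior mean of mu: {posterior_mean:.2f}")  # close to 5
```

In Stan the same model would be a few lines of declarative code, and the book's workflow chapters show how to call it from R (CmdStanR) or Python (CmdStanPy) rather than hand-writing a sampler.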

✦ Table of Contents


Preface
About This Book
Chapter Structure
Prerequisite Background Knowledge
The Terminology and Symbols Used in This Book
The Source Code Used in the Book
Contents
Part I Background of Modeling and Bayesian Inference
1 Overview of Statistical Modeling
1.1 What is Statistical Modeling?
1.2 Purposes of Statistical Modeling
1.3 Preparation for Data Analysis
1.3.1 Before Data Collection
1.3.2 After Collecting Data
1.4 Recommended Statistical Modeling Workflow
1.5 Role of Domain Knowledge
1.6 How to Represent Models
1.7 Model Selection Using Information Criteria
Reference
2 Overview of Bayesian Inference
2.1 Problems of Traditional Statistics
2.2 Likelihood and Maximum Likelihood Estimation (MLE)
2.3 Bayesian Inference and MCMC
2.4 Bayesian Confidence Interval, Bayesian Predictive Distribution, and Bayesian Prediction Interval
2.5 Relationship Between MLE and Bayesian Inference
2.6 Selection of Prior Distributions in This Book
References
Part II Introduction to Stan
3 Overview of Stan
3.1 Probabilistic Programming Language
3.2 Why Stan?
3.3 Why R and Python?
3.4 Preparation of Stan, CmdStanR, and CmdStanPy
3.5 Basic Grammar and Syntax of Stan
3.5.1 Block Structure
3.5.2 Basic Grammar and Syntax
3.5.3 Coding Style Guide
3.6 lp__ and target in Stan
References
4 Simple Linear Regression
4.1 Statistical Modeling Workflow Before Parameter Inference
4.1.1 Set Up Purposes
4.1.2 Check Data Distribution
4.1.3 Describe Model Formula
4.1.4 Maximum Likelihood Estimation Using R
4.1.5 Implement the Model with Stan
4.2 Bayesian Inference Using NUTS (MCMC)
4.2.1 Estimate Parameters from R or Python
4.2.2 Summarize the Estimation Result
4.2.3 Save the Estimation Result
4.2.4 Adjust the Settings of MCMC
4.2.5 Draw the MCMC Sample
4.2.6 Joint Posterior Distributions and Marginalized Posterior Distributions
4.2.7 Bayesian Confidence Intervals and Bayesian Prediction Intervals
4.3 transformed parameters Block and generated quantities Block
4.4 Other Inference Methods Besides NUTS
4.4.1 Bayesian Inference with ADVI
4.4.2 MAP Estimation with L-BFGS
4.5 Supplementary Information and Exercises
4.5.1 Exercises
Reference
5 Basic Regressions and Model Checking
5.1 Multiple Linear Regression
5.1.1 Set Up Purposes
5.1.2 Check Data Distribution
5.1.3 Imagine Data Generating Mechanisms
5.1.4 Describe Model Formula
5.1.5 Implement the Model
5.1.6 Estimate Parameters
5.1.7 Interpret Results
5.2 Check Models
5.2.1 Posterior Predictive Check (PPC)
5.2.2 Posterior Residual Check (PRC)
5.2.3 Scatterplot Matrix of MCMC Sample
5.3 Binomial Logistic Regression
5.3.1 Set Up Purposes
5.3.2 Check Data Distribution
5.3.3 Imagine Data Generating Mechanisms
5.3.4 Describe Model Formula
5.3.5 Implement the Model
5.3.6 Interpret Results
5.4 Logistic Regression
5.4.1 Set Up Purposes
5.4.2 Check Data Distribution
5.4.3 Imagine Data Generating Mechanisms
5.4.4 Describe Model Formula
5.4.5 Implement Models
5.4.6 PPC
5.5 Poisson Regression
5.5.1 Imagine Data Generating Mechanisms
5.5.2 Describe Model Formula
5.5.3 Implement the Model
5.5.4 Interpret Results
5.6 Expression Using Matrix Operation
5.7 Supplementary Information and Exercises
5.7.1 Exercises
Part III Essential Techniques for Mastering Statistical Modeling
6 Introduction to Probability Distributions
6.1 Notations
6.2 Uniform Distribution
6.3 Bernoulli Distribution
6.4 Binomial Distribution
6.5 Beta Distribution
6.6 Categorical Distribution
6.7 Multinomial Distribution
6.8 Dirichlet Distribution
6.9 Exponential Distribution
6.10 Poisson Distribution
6.11 Gamma Distribution
6.12 Normal Distribution
6.13 Lognormal Distribution
6.14 Multivariate Normal Distribution
6.15 Cauchy Distribution
6.16 Student-t Distribution
6.17 Double Exponential Distribution (Laplace Distribution)
6.18 Exercise
References
7 Issues of Regression
7.1 Log Transformation
7.2 Nonlinear Model
7.2.1 Exponential Function
7.2.2 Emax Function
7.2.3 Sigmoid Emax Function
7.2.4 Other Functions
7.3 Interaction
7.4 Multicollinearity
7.5 Model Misspecification
7.6 Variable Selection
7.7 Censoring
7.8 Outlier
References
8 Hierarchical Model
8.1 Introduction to Hierarchical Models
8.1.1 Set Up Purposes and Check Data Distribution
8.1.2 Without Considering Group Difference
8.1.3 Groups Have Varying Intercepts and Slopes
8.1.4 Hierarchical Model
8.1.5 Model Comparison
8.1.6 Equivalent Representation of Hierarchical Models
8.2 Hierarchical Model with Multiple Layers
8.2.1 Set Up Purposes and Check Data Distribution
8.2.2 Imagine Data Generating Mechanisms and Describe Model Formula
8.2.3 Implement the Model
8.3 Hierarchical Model for Nonlinear Model
8.3.1 Set Up Purposes and Check Data Distribution
8.3.2 Imagine Data Generating Mechanisms and Describe Model Formula
8.3.3 Implement Models
8.3.4 Interpret Results
8.4 Missing Data
8.5 Hierarchical Model for Logistic Regression Model
8.5.1 Set Up Purposes
8.5.2 Imagine Data Generating Mechanisms
8.5.3 Describe Model Formula
8.5.4 Implement Models
8.5.5 Interpreting Results
8.6 Exercises
References
9 How to Improve MCMC Convergence
9.1 Removing Nonidentifiable Parameters
9.1.1 Parameter Identifiability
9.1.2 Individual Difference
9.1.3 Label Switching
9.1.4 Multinomial Logistic Regression
9.1.5 The Tortoise and the Hare
9.2 Use Weakly Informative Priors to Restrict the Posterior Distributions
9.2.1 Weakly Informative Prior for Parameters in (−∞, ∞)
9.2.2 Weakly Informative Prior for Parameters with Positive Values
9.2.3 Weakly Informative Prior for Parameters in Range [0, 1]
9.2.4 Weakly Informative Prior for Covariance Matrix
9.3 Loosen Posterior Distribution by Reparameterization
9.3.1 Neal’s Funnel
9.3.2 Reparameterization of Hierarchical Models
9.3.3 Reparameterization of Multivariate Normal Distribution
9.4 Other Cases
9.5 Supplementary Information
References
10 Discrete Parameters
10.1 Techniques to Handle Discrete Parameters
10.1.1 log_sum_exp Function
10.1.2 Marginalizing Out Discrete Parameters
10.1.3 Using Mathematical Relationships
10.2 Mixture of Normal Distributions
10.3 Zero-Inflated Distribution
10.3.1 Set Up Purposes and Check Data Distribution
10.3.2 Imagine Data Generating Mechanisms
10.3.3 Describe Model Formula
10.3.4 Implement Models
10.3.5 Interpret Results
10.4 Supplementary Information and Exercises
10.4.1 Exercises
Reference
Part IV Advanced Topics for Real-World Data Analysis
11 Time Series Data Analysis with State Space Model
11.1 Introduction to State Space Models
11.1.1 Set Up Purposes
11.1.2 Check Data Distribution
11.1.3 Imagine Data Generating Mechanisms
11.1.4 Describe Model Formula
11.1.5 Implement the Model
11.1.6 Interpret the Results
11.2 Extending System Model
11.2.1 Trend Component
11.2.2 Regression Component
11.2.3 Seasonal Component
11.2.4 Switch Component
11.2.5 Pulse Component
11.2.6 Stationary AR Component
11.2.7 Reparameterization of Component
11.3 Extending the Observation Model
11.3.1 Outliers
11.3.2 Binary Values
11.3.3 Count Data
11.3.4 Vector
11.4 State Space Model with Missing Data
11.4.1 Observations at Certain Time Points are Missing
11.4.2 Time Intervals are not the Same (Unequal Intervals)
11.4.3 Vector
11.5 (Example 1) Difference Between Two Time Series
11.6 (Example 2) Changes in Body Weight and Body Fat
11.7 (Example 3) The Transition of Tennis Players’ Capabilities
11.8 (Example 4) Decomposition of Sales Data
11.9 Supplementary Materials and Exercises
11.9.1 Exercises
References
12 Spatial Data Analysis Using Gaussian Markov Random Fields and Gaussian Processes
12.1 Equivalence Between State Space Model and One-Dimensional GMRF
12.1.1 Posterior Probability of the State Space Model
12.1.2 The Equivalence Between the Temporal and Spatial Structures
12.2 (Example 1) Data on One-Dimensional Location
12.3 (Example 2) Fix the “Age Heaping”
12.4 Two-Dimensional GMRF
12.5 (Example 3) Geospatial Data on the Map
12.6 (Example 4) Data on Two-Dimensional Grid
12.7 Introduction to GP
12.7.1 Implementation of GP (1)
12.7.2 Implementation of GP (2)
12.7.3 Prediction with GP
12.7.4 Other Kernel Functions
12.8 (Example 5) Data on One-Dimensional Location
12.9 (Example 6) Data on Two-Dimensional Grid
12.10 Inducing Variable Method
12.11 Supplementary Information and Exercises
12.11.1 Exercises
References
13 Usages of MCMC Samples from Posterior and Predictive Distributions
13.1 Simulation-Based Sample Size Calculation
13.2 Bayesian Decision Theory
13.3 Thompson Sampling and Bayesian Optimization
13.3.1 Thompson Sampling
13.3.2 Bayesian Optimization
References
14 Other Advanced Topics
14.1 Survival Analysis
14.2 Matrix Decomposition and Dimensionality Reduction
14.2.1 Matrix Decomposition
14.2.2 Dimensionality Reduction
14.3 Model Selection Based on Information Criteria
14.3.1 Introduction of Generalization Error and WAIC
14.3.2 Simulation Study to Evaluate Information Criteria
14.3.3 WAIC in a Hierarchical Model
14.4 Supplementary Information and Exercises
14.4.1 Exercises
References
Appendix Differences from BUGS Language


📜 SIMILAR VOLUMES



Bayesian Models for Astrophysical Data: Using R, JAGS, Python, and Stan
✍ Joseph M. Hilbe, Rafael S. de Souza, Emille E. O. Ishida 📂 Library 📅 2017 🏛 Cambridge University Press 🌐 English

This comprehensive guide to Bayesian methods in astronomy enables hands-on work by supplying complete R, JAGS, Python, and Stan code, to use directly or to adapt. It begins by examining the normal model from both frequentist and Bayesian perspectives and then progresses to a full range of Bayesian generalized linear models.

Statistical Rethinking: A Bayesian Course with Examples in R and Stan
✍ Richard McElreath 📂 Library 📅 2020 🏛 CRC Press 🌐 English

Statistical Rethinking: A Bayesian Course with Examples in R and Stan builds your knowledge of and confidence in making inferences from data. Reflecting the need for scripting in today's model-based statistics, the book pushes you to perform step-by-step calculations that are usually automated.