Variational Methods for Machine Learning with Applications to Deep Networks
✍ Authors: Lucas Pinheiro Cinelli, Matheus Araújo Marins, Eduardo Antônio Barros da Silva, Sérgio Lima Netto
- Publisher: Springer
- Year: 2021
- Language: English
- Pages: 173
- Category: Library
✦ Synopsis
This book provides a straightforward look at the concepts, algorithms, and advantages of Bayesian Deep Learning and Deep Generative Models. Starting from the model-based approach to Machine Learning, the authors motivate Probabilistic Graphical Models and show how Bayesian inference naturally lends itself to this framework. They present detailed explanations of the main modern algorithms for variational approximations in Bayesian inference for neural networks, each algorithm in this selected set developing a distinct aspect of the theory. The book builds well-known deep generative models, such as the Variational Autoencoder, from the ground up, along with their subsequent theoretical developments. By also exposing the main issues of these algorithms, together with different methods to mitigate them, the book supplies the necessary knowledge on generative models for the reader to handle a wide range of data types: sequential or not, continuous or not, labelled or not. The book is self-contained, promptly covering all necessary theory so that the reader does not have to search for additional information elsewhere.
- Offers a concise, self-contained resource, covering everything from the basic concepts to the algorithms of Bayesian Deep Learning;
- Presents Statistical Inference concepts, offering a set of elucidative examples, practical aspects, and pseudo-codes;
- Every chapter includes hands-on examples and exercises and a website features lecture slides, additional examples, and other support material.
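As a small taste of the material covered in Chapters 3 and 5, the KL regularization term of the ELBO has a well-known closed form when the approximate posterior is a diagonal Gaussian and the prior is standard normal. The sketch below (plain NumPy; function names are illustrative, not taken from the book) computes that term and applies the reparameterization trick used to train Variational Autoencoders:

```python
import numpy as np

def gaussian_kl(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ),
    the regularization term of the VAE's ELBO."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

def reparameterize(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
    so gradients can flow through the sampling step to mu and log_var."""
    eps = rng.standard_normal(np.shape(mu))
    return mu + np.exp(0.5 * log_var) * eps

# A posterior that matches the prior incurs zero KL penalty:
print(gaussian_kl(np.zeros(4), np.zeros(4)))  # -> 0.0

# Draw a latent sample from q(z) = N(mu, diag(sigma^2)):
rng = np.random.default_rng(0)
z = reparameterize(np.array([0.5, -0.5]), np.array([0.0, 0.0]), rng)
```

The KL expression follows directly from integrating the log-density ratio of two Gaussians; the book derives it as part of the ELBO decomposition for the Variational Autoencoder.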
✦ Table of Contents
Preface
Contents
Acronyms
1 Introduction
1.1 Historical Context
1.2 On the Notation
References
2 Fundamentals of Statistical Inference
2.1 Models
2.1.1 Parametric Models
2.1.1.1 Location-Scale Families
2.1.2 Nonparametric Models
2.1.3 Latent Variable Models
2.1.4 De Finetti's Representation Theorem
2.1.5 The Likelihood Function
2.2 Exponential Family
2.2.1 Sufficient Statistics
2.2.2 Definition and Properties
2.3 Information Measures
2.3.1 Fisher Information
2.3.2 Entropy
2.3.2.1 Conditional Entropy
2.3.2.2 Differential Entropy
2.3.3 Kullback-Leibler Divergence
2.3.4 Mutual Information
2.4 Bayesian Inference
2.4.1 Bayesian vs. Classical Approach
2.4.2 The Posterior Predictive Distribution
2.4.3 Hierarchical Modeling
2.5 Conjugate Prior Distributions
2.5.1 Definition and Motivation
2.5.2 Conjugate Prior Examples
2.6 Point Estimation
2.6.1 Method of Moments
2.6.2 Maximum Likelihood Estimation
2.6.3 Maximum a Posteriori Estimation
2.6.4 Bayes Estimation
2.6.5 Expectation-Maximization
2.6.5.1 EM Example
2.7 Closing Remarks
References
3 Model-Based Machine Learning and Approximate Inference
3.1 Model-Based Machine Learning
3.1.1 Probabilistic Graphical Models
3.1.1.1 Directed Acyclic Graphs
3.1.1.2 Undirected Graphs
3.1.1.3 The Power of Graphical Models
3.1.2 Probabilistic Programming
3.2 Approximate Inference
3.2.1 Variational Inference
3.2.1.1 The Evidence Lower Bound
3.2.1.2 Information Theoretic View on the ELBO
3.2.1.3 The Mean-Field Approximation
3.2.1.4 Coordinate Ascent Variational Inference
3.2.1.5 Stochastic Variational Inference
3.2.1.6 VI Issues
3.2.1.7 VI Example
3.2.2 Assumed Density Filtering
3.2.2.1 Minimizing the Forward KL Divergence
3.2.2.2 Moment Matching in the Exponential Family
3.2.2.3 ADF Issues
3.2.2.4 ADF Example
3.2.3 Expectation Propagation
3.2.3.1 Recasting ADF as a Product of Approximate Factors
3.2.3.2 Operations in the Exponential Family
3.2.3.3 Power EP
3.2.3.4 EP Issues
3.2.3.5 EP Example
3.2.4 Further Practical Extensions
3.2.4.1 Black Box Variational Inference
3.2.4.2 Black Box α Minimization
3.2.4.3 Automatic Differentiation Variational Inference
3.3 Closing Remarks
References
4 Bayesian Neural Networks
4.1 Why BNNs?
4.2 Assessing Uncertainty Quality
4.2.1 Predictive Log-Likelihood
4.2.2 Calibration
4.2.3 Downstream Applications
4.3 Bayes by Backprop
4.3.1 Practical VI
4.4 Probabilistic Backprop
4.4.1 Incorporating the Hyper-Priors p(λ) and p(γ)
4.4.2 Incorporating the Priors on the Weights p(w | λ)
4.4.2.1 Update Equations for αλ and βλ
4.4.2.2 Update Equations for μ and σ²
4.4.3 Incorporating the Likelihood Factors p(y | W, X, γ)
4.4.3.1 The Normalizing Factor
4.5 MC Dropout
4.5.1 Dropout
4.5.2 A Bayesian View
4.6 Fast Natural Gradient
4.6.1 Vadam
4.7 Comparing the Methods
4.7.1 1-D Toy Example
4.7.2 UCI Data Sets
4.7.2.1 Boston Housing
4.7.2.2 Concrete Compressive Strength
4.7.2.3 Energy Efficiency
4.7.2.4 Kin8nm
4.7.2.5 Condition Based Maintenance of Naval Propulsion Plants
4.7.2.6 Combined Cycle Power Plant
4.7.2.7 Wine Quality
4.7.2.8 Yacht Hydrodynamics
4.7.3 Experimental Setup
4.7.3.1 Hyper-Parameter Search with Bayesian Optimization (BO)
4.7.4 Training Configuration
4.7.5 Analysis
4.8 Further References
4.9 Closing Remarks
References
5 Variational Autoencoder
5.1 Motivations
5.2 Evaluating Generative Networks
5.3 Variational Autoencoders
5.3.1 Conditional VAE
5.3.2 β-VAE
5.4 Importance Weighted Autoencoder
5.5 VAE Issues
5.5.1 Inexpressive Posterior
5.5.1.1 Full Covariance Gaussian
5.5.1.2 Auxiliary Latent Variables
5.5.1.3 Normalizing Flow
5.5.2 The Posterior Collapse
5.5.3 Latent Distributions
5.5.3.1 Continuous Relaxation
5.5.3.2 Vector Quantization
5.6 Experiments
5.6.1 Data Sets
5.6.1.1 MNIST
5.6.1.2 Fashion-MNIST
5.6.2 Experimental Setup
5.6.3 Results
5.7 Application: Generative Models on Semi-supervised Learning
5.8 Closing Remarks
5.9 Final Words
References
A Support Material
A.1 Gradient Estimators
A.2 Update Formula for CAVI
A.3 Generalized Gauss–Newton Approximation
A.4 Natural Gradient and the Fisher Information Matrix
A.5 Gaussian Gradient Identities
A.6 t-Student Distribution
References
Index
📜 SIMILAR VOLUMES
- Rank-Based Methods for Shrinkage and Selection — a practical, hands-on guide to the theory and methodology of statistical estimation based on rank, a key tool of robust statistics in contemporary mathematics and applied statistical methods.
- A guide for early-career researchers and faculty entering deep learning and machine learning for classification, helping them study, formulate, and design research goals around the latest image and data classification studies.
- Diagnostic Biomedical Signal and Image Processing Applications with Deep Learning Methods — comprehensive research on both medical imaging and medical signal analysis, discussing classification, segmentation, detection, tracking, and retrieval applications.
- A volume evaluating the role of innovative machine learning and deep learning methods in power system planning, operation, and control, with cutting-edge case studies from around the world.