Machine Learning Fundamentals: A Concise Introduction
Author: Hui Jiang
Publisher: Cambridge University Press
Year: 2022
Language: English
Pages: 423
Edition: New
Category: Library
No payment or registration required. For personal study only.
Synopsis
This lucid, accessible introduction to supervised machine learning presents core concepts in a focused and logical way that is easy for beginners to follow. The author assumes basic knowledge of calculus, linear algebra, probability, and statistics, but no prior exposure to machine learning. Coverage includes widely used traditional methods such as SVMs, boosted trees, HMMs, and LDAs, as well as popular deep learning methods such as convolutional neural networks, attention, transformers, and GANs. Organized in a coherent presentation framework that emphasizes the big picture, the text introduces each method clearly and concisely "from scratch," based on the fundamentals. All methods and algorithms are described in a clean and consistent style, with a minimum of unnecessary detail. Numerous case studies and concrete examples demonstrate how the methods can be applied in a variety of contexts.
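By way of illustration only (this is not an excerpt from the book), the following minimal Python sketch shows the kind of supervised workflow the synopsis describes, fitting a kernel SVM of the sort covered in Chapter 6. It assumes scikit-learn is installed; the synthetic dataset and all parameter values are arbitrary choices for demonstration.

# Minimal supervised-learning sketch (illustrative, not from the book):
# train a nonlinear SVM via the kernel trick on synthetic data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic binary classification data: 200 samples, 2 informative features.
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# RBF-kernel SVM, i.e., a soft-margin SVM made nonlinear by the kernel trick.
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))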
Table of Contents
Front matter
Copyright
Contents
Preface
Notation
1 Introduction
1.1 What Is Machine Learning?
1.2 Basic Concepts in Machine Learning
1.2.1 Classification versus Regression
1.2.2 Supervised versus Unsupervised Learning
1.2.3 Simple versus Complex Models
1.2.4 Parametric versus Nonparametric Models
1.2.5 Overfitting versus Underfitting
1.2.6 Bias-Variance Trade-Off
1.3 General Principles in Machine Learning
1.3.1 Occam's Razor
1.3.2 No-Free-Lunch Theorem
1.3.3 Law of the Smooth World
1.3.4 Curse of Dimensionality
1.4 Advanced Topics in Machine Learning
1.4.1 Reinforcement Learning
1.4.2 Meta-Learning
1.4.3 Causal Inference
1.4.4 Other Advanced Topics
Exercises
2 Mathematical Foundation
2.1 Linear Algebra
2.1.1 Vectors and Matrices
2.1.2 Linear Transformation as Matrix Multiplication
2.1.3 Basic Matrix Operations
2.1.4 Eigenvalues and Eigenvectors
2.1.5 Matrix Calculus
2.2 Probability and Statistics
2.2.1 Random Variables and Distributions
2.2.2 Expectation: Mean, Variance, and Moments
2.2.3 Joint, Marginal, and Conditional Distributions
2.2.4 Common Probability Distributions
2.2.5 Transformation of Random Variables
2.3 Information Theory
2.3.1 Information and Entropy
2.3.2 Mutual Information
2.3.3 KL Divergence
2.4 Mathematical Optimization
2.4.1 General Formulation
2.4.2 Optimality Conditions
2.4.3 Numerical Optimization Methods
Exercises
3 Supervised Machine Learning (in a Nutshell)
3.1 Overview
3.2 Case Studies
4 Feature Extraction
4.1 Feature Extraction: Concepts
4.1.1 Feature Engineering
4.1.2 Feature Selection
4.1.3 Dimensionality Reduction
4.2 Linear Dimension Reduction
4.2.1 Principal Component Analysis
4.2.2 Linear Discriminant Analysis
4.3 Nonlinear Dimension Reduction (I): Manifold Learning
4.3.1 Locally Linear Embedding
4.3.2 Multidimensional Scaling
4.3.3 Stochastic Neighborhood Embedding
4.4 Nonlinear Dimension Reduction (II): Neural Networks
4.4.1 Autoencoder
4.4.2 Bottleneck Features
Lab Project I
Exercises
DISCRIMINATIVE MODELS
5 Statistical Learning Theory
5.1 Formulation of Discriminative Models
5.2 Learnability
5.3 Generalization Bounds
5.3.1 Finite Model Space: |H|
5.3.2 Infinite Model Space: VC Dimension
Exercises
6 Linear Models
6.1 Perceptron
6.2 Linear Regression
6.3 Minimum Classification Error
6.4 Logistic Regression
6.5 Support Vector Machines
6.5.1 Linear SVM
6.5.2 Soft SVM
6.5.3 Nonlinear SVM: The Kernel Trick
6.5.4 Solving Quadratic Programming
6.5.5 Multiclass SVM
Lab Project II
Exercises
7 Learning Discriminative Models in General
7.1 A General Framework to Learn Discriminative Models
7.1.1 Common Loss Functions in Machine Learning
7.1.2 Regularization Based on Lp Norm
7.2 Ridge Regression and LASSO
7.3 Matrix Factorization
7.4 Dictionary Learning
Lab Project III
Exercises
8 Neural Networks
8.1 Artificial Neural Networks
8.1.1 Basic Formulation of Artificial Neural Networks
8.1.2 Mathematical Justification: Universal Approximator
8.2 Neural Network Structures
8.2.1 Basic Building Blocks to Connect Layers
8.2.2 Case Study I: Fully Connected Deep Neural Networks
8.2.3 Case Study II: Convolutional Neural Networks
8.2.4 Case Study III: Recurrent Neural Networks (RNNs)
8.2.5 Case Study IV: Transformer
8.3 Learning Algorithms for Neural Networks
8.3.1 Loss Function
8.3.2 Automatic Differentiation
8.3.3 Optimization Using Stochastic Gradient Descent
8.4 Heuristics and Tricks for Optimization
8.4.1 Other SGD Variant Optimization Methods: ADAM
8.4.2 Regularization
8.4.3 Fine-Tuning Tricks
8.5 End-to-End Learning
8.5.1 Sequence-to-Sequence Learning
Lab Project IV
Exercises
9 Ensemble Learning
9.1 Formulation of Ensemble Learning
9.1.1 Decision Trees
9.2 Bagging
9.2.1 Random Forests
9.3 Boosting
9.3.1 Gradient Boosting
9.3.2 AdaBoost
9.3.3 Gradient Tree Boosting
Lab Project V
Exercises
GENERATIVE MODELS
10 Overview of Generative Models
10.1 Formulation of Generative Models
10.2 Bayesian Decision Theory
10.2.1 Generative Models for Classification
10.2.2 Generative Models for Regression
10.3 Statistical Data Modeling
10.3.1 Plug-In MAP Decision Rule
10.4 Density Estimation
10.4.1 Maximum-Likelihood Estimation
10.4.2 Maximum-Likelihood Classifier
10.5 Generative Models (in a Nutshell)
10.5.1 Generative versus Discriminative Models
Exercises
11 Unimodal Models
11.1 Gaussian Models
11.2 Multinomial Models
11.3 Markov Chain Models
11.4 Generalized Linear Models
11.4.1 Probit Regression
11.4.2 Poisson Regression
11.4.3 Log-Linear Models
Exercises
12 Mixture Models
12.1 Formulation of Mixture Models
12.1.1 Exponential Family (e-Family)
12.1.2 Formal Definition of Mixture Models
12.2 Expectation-Maximization Method
12.2.1 Auxiliary Function: Eliminating Log-Sum
12.2.2 Expectation-Maximization Algorithm
12.3 Gaussian Mixture Models
12.3.1 K-Means Clustering for Initialization
12.4 Hidden Markov Models
12.4.1 HMMs: Mixture Models for Sequences
12.4.2 Evaluation Problem: Forward-Backward Algorithm
12.4.3 Decoding Problem: Viterbi Algorithm
12.4.4 Training Problem: Baum-Welch Algorithm
Lab Project VI
Exercises
13 Entangled Models
13.1 Formulation of Entangled Models
13.1.1 Framework of Entangled Models
13.1.2 Learning of Entangled Models in General
13.2 Linear Gaussian Models
13.2.1 Probabilistic PCA
13.2.2 Factor Analysis
13.3 Non-Gaussian Models
13.3.1 Independent Component Analysis (ICA)
13.3.2 Independent Factor Analysis (IFA)
13.3.3 Hybrid Orthogonal Projection and Estimation (HOPE)
13.4 Deep Generative Models
13.4.1 Variational Autoencoders (VAE)
13.4.2 Generative Adversarial Nets (GAN)
Exercises
14 Bayesian Learning
14.1 Formulation of Bayesian Learning
14.1.1 Bayesian Inference
14.1.2 Maximum a Posteriori Estimation
14.1.3 Sequential Bayesian Learning
14.2 Conjugate Priors
14.2.1 Maximum-Marginal-Likelihood Estimation
14.3 Approximate Inference
14.3.1 Laplace's Method
14.3.2 Variational Bayesian (VB) Methods
14.4 Gaussian Processes
14.4.1 Gaussian Processes as Nonparametric Priors
14.4.2 Gaussian Processes for Regression
14.4.3 Gaussian Processes for Classification
Exercises
15 Graphical Models
15.1 Concepts of Graphical Models
15.2 Bayesian Networks
15.2.1 Conditional Independence
15.2.2 Representing Generative Models as Bayesian Networks
15.2.3 Learning Bayesian Networks
15.2.4 Inference Algorithms
15.2.5 Case Study I: Naive Bayes Classifier
15.2.6 Case Study II: Latent Dirichlet Allocation
15.3 Markov Random Fields
15.3.1 Formulation: Potential and Partition Functions
15.3.2 Case Study III: Conditional Random Fields
15.3.3 Case Study IV: Restricted Boltzmann Machines
Exercises
Appendix
A Other Probability Distributions
Bibliography
Index
SIMILAR VOLUMES
AN INTRODUCTION TO MACHINE LEARNING THAT INCLUDES THE FUNDAMENTAL TECHNIQUES, METHODS, AND APPLICATIONS
Machine Learning: A Concise Introduction offers a comprehensive introduction to the core concepts, approaches, and applications of machine learning. The author, an expert in…
The emphasis of the book is on the question of why: only if it is understood why an algorithm is successful can it be properly applied and the results trusted. Algorithms are often taught side by side without showing the similarities and differences between them. This book addresses the commonalities…
"Machine Learning is known by many different names, and is used in many areas of science. It is also used for a variety of applications, including spam filtering, optical character recognition, search engines, computer vision, NLP, advertising, fraud detection, robotics, data prediction, astronomy.