๐”– Scriptorium
✦   LIBER   ✦


Regularized System Identification: Learning Dynamic Models from Data

โœ Scribed by Gianluigi Pillonetto


Publisher
Springer International Publishing AG
Year
2022
Tongue
English
Leaves
394
Series
Communications and Control Engineering
Edition
1
Category
Library

⬇  Acquire This Volume

No coin nor oath required. For personal study only.

✦ Synopsis


This open access book provides a comprehensive treatment of recent developments in kernel-based identification that are of interest to anyone engaged in learning dynamic systems from data. The reader is led step by step into an understanding of a novel paradigm that leverages the power of machine learning without losing sight of the system-theoretical principles of black-box identification. The authors' reformulation of the identification problem in the light of regularization theory not only offers new insight into classical questions but also paves the way to new and powerful algorithms for a variety of linear and nonlinear problems. Regression methods such as regularization networks and support vector machines are the basis of techniques that extend the function-estimation problem to the estimation of dynamic models.

Many examples, including real-world applications, illustrate the comparative advantages of the new nonparametric approach with respect to classic parametric prediction error methods. The challenges it addresses lie at the intersection of several disciplines, so Regularized System Identification will be of interest to a variety of researchers and practitioners in the areas of control systems, machine learning, statistics, and data science.
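As a taste of the kernel-based approach the book develops, the following minimal Python sketch estimates a finite impulse response with a quadratic, kernel-weighted penalty. It is an illustration, not code from the book: the TC-type kernel, the function names tc_kernel and regularized_fir, and the hyperparameter values (c, alpha, sigma2) are all assumptions chosen for readability.

import numpy as np

def tc_kernel(n, c=1.0, alpha=0.9):
    """TC ("tuned/correlated") kernel: K[i, j] = c * alpha**max(i, j).
    It encodes a smooth, exponentially decaying impulse-response prior."""
    idx = np.arange(n)
    return c * alpha ** np.maximum.outer(idx, idx)

def regularized_fir(u, y, n=50, sigma2=0.1, c=1.0, alpha=0.9):
    """Kernel-regularized FIR estimate:
        g_hat = argmin_g ||y - Phi g||^2 + sigma2 * g' K^{-1} g
              = K Phi' (Phi K Phi' + sigma2 I)^{-1} y."""
    N = len(y)
    Phi = np.zeros((N, n))        # regression matrix of lagged inputs
    for k in range(n):
        Phi[k:, k] = u[:N - k]    # zero initial conditions assumed
    K = tc_kernel(n, c, alpha)
    G = Phi @ K @ Phi.T + sigma2 * np.eye(N)
    return K @ Phi.T @ np.linalg.solve(G, y)

# Toy usage: recover a decaying impulse response from 200 noisy samples.
rng = np.random.default_rng(0)
g_true = 0.8 ** np.arange(50)
u = rng.standard_normal(200)
y = np.convolve(u, g_true)[:200] + 0.1 * rng.standard_normal(200)
g_hat = regularized_fir(u, y)     # approximates g_true

The penalty g' K^{-1} g is ridge regression (Chap. 3) with the identity matrix replaced by a kernel encoding smoothness and exponential stability, the theme developed in Chaps. 5-7; in practice the hyperparameters would be tuned, e.g., by marginal likelihood maximization (Sect. 7.2.1).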

✦ Table of Contents


Preface
Acknowledgements
Contents
Abbreviations and Notation
Notation
Abbreviations
1 Bias
1.1 The Stein Effect
1.1.1 The James–Stein Estimator
1.1.2 Extensions of the James–Stein Estimator
1.2 Ridge Regression
1.3 Further Topics and Advanced Reading
1.4 Appendix: Proof of Theorem 1.1
References
2 Classical System Identification
2.1 The State-of-the-Art Identification Setup
2.2 𝓜: Model Structures
2.2.1 Linear Time-Invariant Models
2.2.2 Nonlinear Models
2.3 𝓘: Identification Methods—Criteria
2.3.1 A Maximum Likelihood (ML) View
2.4 Asymptotic Properties of the Estimated Models
2.4.1 Bias and Variance
2.4.2 Properties of the PEM Estimate as N → ∞
2.4.3 Trade-Off Between Bias and Variance
2.5 𝓧: Experiment Design
2.6 𝓥: Model Validation
2.6.1 Falsifying Models: Residual Analysis
2.6.2 Comparing Different Models
2.6.3 Cross-Validation
References
3 Regularization of Linear Regression Models
3.1 Linear Regression
3.2 The Least Squares Method
3.2.1 Fundamentals of the Least Squares Method
3.2.2 Mean Squared Error and Model Order Selection
3.3 Ill-Conditioning
3.3.1 Ill-Conditioned Least Squares Problems
3.3.2 Ill-Conditioning in System Identification
3.4 Regularized Least Squares with Quadratic Penalties
3.4.1 Making an Ill-Conditioned LS Problem Well Conditioned
3.4.2 Equivalent Degrees of Freedom
3.5 Regularization Tuning for Quadratic Penalties
3.5.1 Mean Squared Error and Expected Validation Error
3.5.2 Efficient Sample Reuse
3.5.3 Expected In-Sample Validation Error
3.6 Regularized Least Squares with Other Types of Regularizers
3.6.1 ℓ₁-Norm Regularization
3.6.2 Nuclear Norm Regularization
3.7 Further Topics and Advanced Reading
3.8 Appendix
3.8.1 Fundamentals of Linear Algebra
3.8.2 Proof of Lemma 3.1
3.8.3 Derivation of Predicted Residual Error Sum of Squares (PRESS)
3.8.4 Proof of Theorem 3.7
3.8.5 A Variant of the Expected In-Sample Validation Error and Its Unbiased Estimator
References
4 Bayesian Interpretation of Regularization
4.1 Preliminaries
4.2 Incorporating Prior Knowledge via Bayesian Estimation
4.2.1 Multivariate Gaussian Variables
4.2.2 The Gaussian Case
4.2.3 The Linear Gaussian Model
4.2.4 Hierarchical Bayes: Hyperparameters
4.3 Bayesian Interpretation of the James–Stein Estimator
4.4 Full and Empirical Bayes Approaches
4.5 Improper Priors and the Bias Space
4.6 Maximum Entropy Priors
4.7 Model Approximation via Optimal Projection
4.8 Equivalent Degrees of Freedom
4.9 Bayesian Function Reconstruction
4.10 Markov Chain Monte Carlo Estimation
4.11 Model Selection Using Bayes Factors
4.12 Further Topics and Advanced Reading
4.13 Appendix
4.13.1 Proof of Theorem 4.1
4.13.2 Proof of Theorem 4.2
4.13.3 Proof of Lemma 4.1
4.13.4 Proof of Theorem 4.3
4.13.5 Proof of Theorem 4.6
4.13.6 Proof of Proposition 4.3
4.13.7 Proof of Theorem 4.8
References
5 Regularization for Linear System Identification
5.1 Preliminaries
5.2 MSE and Regularization
5.3 Optimal Regularization for FIR Models
5.4 Bayesian Formulation and BIBO Stability
5.5 Smoothness and Contractivity: Time- and Frequency-Domain Interpretations
5.5.1 Maximum Entropy Priors for Smoothness and Stability: From Splines to Dynamical Systems
5.6 Regularization and Basis Expansion
5.7 Hankel Nuclear Norm Regularization
5.8 Historical Overview
5.8.1 The Distributed Lag Estimator: Prior Means and Smoothing
5.8.2 Frequency-Domain Smoothing and Stability
5.8.3 Exponential Stability and Stochastic Embedding
5.9 Further Topics and Advanced Reading
5.10 Appendix
5.10.1 Optimal Kernel
5.10.2 Proof of Lemma 5.1
5.10.3 Proof of Theorem 5.5
5.10.4 Proof of Corollary 5.1
5.10.5 Proof of Lemma 5.2
5.10.6 Proof of Theorem 5.6
5.10.7 Proof of Lemma 5.5
5.10.8 Forward Representations of Stable-Splines Kernels
References
6 Regularization in Reproducing Kernel Hilbert Spaces
6.1 Preliminaries
6.2 Reproducing Kernel Hilbert Spaces
6.2.1 Reproducing Kernel Hilbert Spaces Induced by Operations on Kernels
6.3 Spectral Representations of Reproducing Kernel Hilbert Spaces
6.3.1 More General Spectral Representation
6.4 Kernel-Based Regularized Estimation
6.4.1 Regularization in Reproducing Kernel Hilbert Spaces and the Representer Theorem
6.4.2 Representer Theorem Using Linear and Bounded Functionals
6.5 Regularization Networks and Support Vector Machines
6.5.1 Regularization Networks
6.5.2 Robust Regression via Huber Loss
6.5.3 Support Vector Regression
6.5.4 Support Vector Classification
6.6 Kernel Examples
6.6.1 Linear Kernels, Regularized Linear Regression and System Identification
6.6.2 Kernels Given by a Finite Number of Basis Functions
6.6.3 Feature Map and Feature Space
6.6.4 Polynomial Kernels
6.6.5 Translation Invariant and Radial Basis Kernels
6.6.6 Spline Kernels
6.6.7 The Bias Space and the Spline Estimator
6.7 Asymptotic Properties
6.7.1 The Regression Function/Optimal Predictor
6.7.2 Regularization Networks: Statistical Consistency
6.7.3 Connection with Statistical Learning Theory
6.8 Further Topics and Advanced Reading
6.9 Appendix
6.9.1 Fundamentals of Functional Analysis
6.9.2 Proof of Theorem 6.1
6.9.3 Proof of Theorem 6.10
6.9.4 Proof of Theorem 6.13
6.9.5 Proofs of Theorems 6.15 and 6.16
6.9.6 Proof of Theorem 6.21
References
7 Regularization in Reproducing Kernel Hilbert Spaces for Linear System Identification
7.1 Regularized Linear System Identification in Reproducing Kernel Hilbert Spaces
7.1.1 Discrete-Time Case
7.1.2 Continuous-Time Case
7.1.3 More General Use of the Representer Theorem for Linear System Identification
7.1.4 Connection with Bayesian Estimation of Gaussian Processes
7.1.5 A Numerical Example
7.2 Kernel Tuning
7.2.1 Marginal Likelihood Maximization
7.2.2 Stein's Unbiased Risk Estimator
7.2.3 Generalized Cross-Validation
7.3 Theory of Stable Reproducing Kernel Hilbert Spaces
7.3.1 Kernel Stability: Necessary and Sufficient Conditions
7.3.2 Inclusions of Reproducing Kernel Hilbert Spaces in More General Lebesgue Spaces
7.4 Further Insights into Stable Reproducing Kernel Hilbert Spaces
7.4.1 Inclusions Between Notable Kernel Classes
7.4.2 Spectral Decomposition of Stable Kernels
7.4.3 Mercer Representations of Stable Reproducing Kernel Hilbert Spaces and of Regularized Estimators
7.4.4 Necessary and Sufficient Stability Condition Using Kernel Eigenvectors and Eigenvalues
7.5 Minimax Properties of the Stable Spline Estimator
7.5.1 Data Generator and Minimax Optimality
7.5.2 Stable Spline Estimator
7.5.3 Bounds on the Estimation Error and Minimax Properties
7.6 Further Topics and Advanced Reading
7.7 Appendix
7.7.1 Derivation of the First-Order Stable Spline Norm
7.7.2 Proof of Proposition 7.1
7.7.3 Proof of Theorem 7.5
7.7.4 Proof of Theorem 7.7
7.7.5 Proof of Theorem 7.9
References
8 Regularization for Nonlinear System Identification
8.1 Nonlinear System Identification
8.2 Kernel-Based Nonlinear System Identification
8.2.1 Connection with Bayesian Estimation of Gaussian Random Fields
8.2.2 Kernel Tuning
8.3 Kernels for Nonlinear System Identification
8.3.1 A Numerical Example
8.3.2 Limitations of the Gaussian and Polynomial Kernel
8.3.3 Nonlinear Stable Spline Kernel
8.3.4 Numerical Example Revisited: Use of the Nonlinear Stable Spline Kernel
8.4 Explicit Regularization of Volterra Models
8.5 Other Examples of Regularization in Nonlinear System Identification
8.5.1 Neural Networks and Deep Learning Models
8.5.2 Static Nonlinearities and Gaussian Process (GP)
8.5.3 Block-Oriented Models
8.5.4 Hybrid Models
8.5.5 Sparsity and Variable Selection
References
9 Numerical Experiments and Real World Cases
9.1 Identification of Discrete-Time Output Error Models
9.1.1 Monte Carlo Studies with a Fixed Output Error Model
9.1.2 Monte Carlo Studies with Different Output Error Models
9.1.3 Real Data: A Robot Arm
9.1.4 Real Data: A Hairdryer
9.2 Identification of ARMAX Models
9.2.1 Monte Carlo Experiment
9.2.2 Real Data: Temperature Prediction
9.3 Multi-task Learning and Population Approaches
9.3.1 Kernel-Based Multi-task Learning
9.3.2 Numerical Example: Real Pharmacokinetic Data
References
Index


📜 SIMILAR VOLUMES


Dynamic Mode Decomposition: Data-Driven Modeling of Complex Systems
โœ J. Nathan Kutz, Steven L. Brunton, Bingni W. Brunton, Joshua L. Proctor ๐Ÿ“‚ Library ๐Ÿ“… 2016 ๐Ÿ› SIAM-Society for Industrial and Applied Mathematic ๐ŸŒ English

Data-driven dynamical systems is a burgeoning field: it connects how measurements of nonlinear dynamical systems and/or complex systems can be used with well-established methods in dynamical systems theory. This is a critically important new direction because the governing equations of many problems…