Symbolic regression (SR) is one of the most powerful machine learning techniques that produces transparent models, searching the space of mathematical expressions for a model that represents the relationship between the predictors and the dependent variable without the need of taking assump
Symbolic Regression
Scribed by Gabriel Kronberger, Bogdan Burlacu, Michael Kommenda, Stephan M. Winkler, Michael Affenzeller
- Publisher: CRC Press
- Year: 2025
- Language: English
- Pages: 308
- Category: Library
Table of Contents
Cover
Half Title
Title Page
Copyright Page
Contents
Preface
Symbols and Notation
1. Introduction
2. Basics of Supervised Learning
2.1. Introduction
2.2. Regression
2.2.1. Linear Models
2.2.2. Nonlinear Models
2.2.3. Error Measures
2.3. Classification
2.4. Time Series Prediction
2.5. Model Selection
2.6. Cross-validation
2.7. Further Reading
3. Basics of Symbolic Regression
3.1. Example: Identification of a Polynomial
3.1.1. Data Collection and Preprocessing
3.1.2. Establishing a Baseline
3.1.3. Modeling Approach
3.1.4. Modeling Results
3.2. Example: Discovery of Laws of Physics from Data
3.3. Example: Approximation of the Gamma Function
3.4. Extending Symbolic Regression to Classification
3.4.1. Model Structures for Symbolic Classification
3.4.2. Evaluation of Symbolic Classification Models
3.5. Further Reading
4. Evolutionary Computation and Genetic Programming
4.1. General Concepts
4.1.1. Genotype, Phenotype, and Semantics
4.1.2. Diversity and Evolvability
4.1.3. Buffering, Redundancy, and Neutrality
4.2. Population Initialization
4.2.1. Operators
4.3. Fitness Calculation
4.4. Parent Selection
4.4.1. Operators
4.4.2. Selection Pressure
4.5. Bloat and Introns
4.6. Crossover and Mutation
4.7. Power of the Hypothesis Space
4.8. GP Dynamics
4.8.1. Fitness
4.8.2. Variable Relevance
4.8.3. Model Complexity
4.8.4. Diversity
4.9. Algorithmic Extensions
4.9.1. Brood Selection and Offspring Selection
4.9.2. Age-layered Population Structures
4.9.3. Multi-objective GP
4.9.4. Alternative Encodings: Linear and Graph GP
4.9.5. Restricting Expressions: Syntax and Types
4.9.6. Semantics-aware GP
4.10. Conclusions
4.11. Further Reading
5. Model Validation, Inspection, Simplification, and Selection
5.1. Model Validation
5.1.1. Visual Tools
5.1.2. Explaining Models
5.1.3. Model Interpretability
5.2. Model Selection
5.2.1. Criteria for Model Selection
5.2.2. Hold-out Set for Validation
5.2.3. Cross-validation
5.2.4. Akaike's Information Criterion
5.2.5. Bayesian Information Criterion
5.2.6. Minimum Description Length Principle
5.2.7. Comparison of Model Selection Criteria
5.3. Model Simplification
5.3.1. Nested Models
5.3.2. Removal of Subexpressions
5.4. Example: Boston Housing
5.4.1. Data Preprocessing
5.4.2. Model Generation and Selection for Median Values of Homes
5.4.3. Model Generation and Selection for NOX Concentrations
5.5. Conclusions
5.6. Further Reading
6. Advanced Techniques
6.1. Integration of Knowledge
6.1.1. Example Applications
6.1.2. Knowledge Integration Methods
6.1.3. Knowledge Integration via Customized Fitness Evaluation
6.1.4. Shape Constraints
6.1.5. Knowledge Integration via the Hypothesis Space
6.2. Optimization of Coefficients
6.2.1. Linear Scaling
6.2.2. Nonlinear Optimization of Coefficients
6.3. Prediction Intervals
6.3.1. Prediction Intervals for Linear Models
6.3.2. Approximate Prediction Intervals for Nonlinear Models
6.3.3. Bayesian Prediction Intervals
6.4. Modeling System Dynamics
6.4.1. Basics of Differential Equations
6.4.2. Finding Differential Equations with Symbolic Regression
6.5. Non-numeric Data
6.6. Non-evolutionary Symbolic Regression
6.6.1. Fast Function Extraction
6.6.2. Sparse Identification of Nonlinear Dynamics (SINDy)
6.6.3. Prioritized Grammar Enumeration
6.6.4. Exhaustive Symbolic Regression
6.6.5. Grammar-guided Exhaustive Equation Search
6.6.6. Equation Learner
6.6.7. Deep Symbolic Regression
6.6.8. AI Feynman
7. Examples and Applications
7.1. Yacht Hydrodynamics
7.2. Industrial Chemical Processes
7.2.1. Correlation Analysis
7.2.2. Experimental Setup
7.2.3. Results for the Chemical Dataset
7.2.4. Results for the Tower Dataset
7.3. Interatomic Potentials
7.4. Friction
7.5. Lithium-ion Batteries
7.5.1. NASA PCoE Battery Datasets
7.5.2. First Version of the State-of-charge Model
7.5.3. Extended Version of the State-of-charge Model
7.5.4. Predicting the Discharge Voltage Curve
7.6. Biomedical Problems
7.6.1. Identification of Virtual Tumor Markers
7.7. Function Approximation
7.8. Atmospheric CO2 Concentration
7.9. Flow Stress
7.10. Dynamics of Simple Mechanical Systems
7.10.1. Oscillator
7.10.2. Pendulum
7.10.3. Double Oscillator
7.10.4. Double Pendulum
7.11. Conclusions
8. Conclusion
8.1. Unique Selling Points of Symbolic Regression
8.2. Limitations and Caveats
9. Appendix
9.1. Benchmarks
9.2. Open-source Software for Genetic Programming
9.2.1. HeuristicLab
9.2.2. Operon
9.2.3. PySR
9.2.4. DEAP
9.2.5. FEAT
9.2.6. ECJ
9.2.7. GPTIPS
9.3. Commercial Software for Genetic Programming
9.3.1. Evolved Analytics DataModeler
9.3.2. Eureqa Formulize
Bibliography
Index