This book provides comprehensive coverage of Machine Learning (ML) methods that have proven useful in the process industry for dynamic process modeling. Step-by-step instructions, supported with industry-relevant case studies, show (using Python) how to develop solutions for process modeling, process
Machine Learning in Python for Process Systems Engineering: Achieving operational excellence using process data
Scribed by Ankur Kumar, Jesus Flores-Cerrillo
- Year
- 2022
- Tongue
- English
- Leaves
- 352
- Category
- Library
Synopsis
Perhaps you are reading this book because you too have been inspired by the capabilities
of machine learning and would like to use it to solve problems faced by your
organization. However, you might be struggling to find a definitive guide that can help you
decide which specific methodology to choose among the myriad of available
methodologies. You may have come across a nice research article that showcases an
interesting process systems application of an ML method. However, you might be facing
difficulties trying to understand the intricate details of the algorithm. We won't be surprised
if you have struggled to find a data-science book that caters to the needs of a process
systems engineer, considers the unique characteristics of industrial process systems, and
uses industrial-scale process systems for illustrations. We, the authors, have been in that
phase. A process engineer will arguably find it more relevant and useful to learn principal
component analysis (PCA) by working through a process monitoring application (the most
popular application area of PCA in the process industry) and learning how to compute the
monitoring metrics. Similar arguments could be made for several other popular ML
methods. There is a gap in available machine learning resources for industrial
practitioners, and this book attempts to fill this gap.
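To make the PCA monitoring example above concrete, here is a minimal sketch of how the two commonly used monitoring metrics, Hotelling's T² and the squared prediction error (SPE, also called the Q statistic), might be computed. It is not taken from the book: the synthetic data, the function names `fit_pca_monitor` and `monitoring_metrics`, the choice of two retained components, and the empirical 99th-percentile control limits are all assumptions made for illustration.

```python
import numpy as np

def fit_pca_monitor(X_train, n_components):
    """Fit a PCA monitoring model on normal-operation data (rows = samples)."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    Z = (X_train - mean) / std                      # center and scale each variable
    # SVD of the scaled data gives the loadings and the covariance eigenvalues
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    eigvals = s**2 / (Z.shape[0] - 1)
    return {
        "mean": mean,
        "std": std,
        "P": Vt[:n_components].T,                   # loading matrix (variables x components)
        "lam": eigvals[:n_components],              # variances of the retained scores
    }

def monitoring_metrics(model, X):
    """Return Hotelling's T^2 and SPE (Q statistic) for each row of X."""
    Z = (X - model["mean"]) / model["std"]
    T = Z @ model["P"]                              # scores in the retained subspace
    t2 = np.sum(T**2 / model["lam"], axis=1)        # variation inside the PCA model
    residual = Z - T @ model["P"].T                 # part not explained by the model
    spe = np.sum(residual**2, axis=1)
    return t2, spe

# Synthetic "normal operation" data (assumption): 2 latent factors, 5 sensors
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 5))
X_train = latent @ mixing + 0.1 * rng.normal(size=(500, 5))

model = fit_pca_monitor(X_train, n_components=2)
t2_train, spe_train = monitoring_metrics(model, X_train)

# Empirical percentile limits are an assumption; other limit choices exist
t2_limit = np.percentile(t2_train, 99)
spe_limit = np.percentile(spe_train, 99)

# Hypothetical faulty sample: a bias added to the first sensor
x_fault = X_train[:1].copy()
x_fault[0, 0] += 5 * model["std"][0]
t2_fault, spe_fault = monitoring_metrics(model, x_fault)
alarm = (t2_fault > t2_limit) | (spe_fault > spe_limit)
```

A new sample raises an alarm when either metric exceeds its limit. T² tracks unusual movement within the correlation structure learned from normal data, while SPE tracks departures from that structure.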
In one sense, we wrote this book for our younger selves: a book that we wish had existed
when we started experimenting with machine learning techniques. Drawing from our
years of experience in developing data-driven industrial solutions, this book has been
written with a focus on de-cluttering the world of machine learning, giving a
comprehensive exposition of ML tools that have proven useful in the process industry,
providing step-by-step elucidation of implementation details, cautioning against the
pitfalls and listing various tips & tricks that we have encountered over the years, and using
datasets from industrial-scale process systems for illustrations. We strongly believe in
"learning by doing" and therefore we encourage readers to work through the in-chapter
illustrations as they follow along with the text. For the reader's assistance, Jupyter notebooks with
complete code implementations are available for download. We have chosen Python as
the coding language for the book as it is convenient to use, has a large collection of ML
libraries, and is the de facto standard language for ML. No prior experience with Python
is assumed. The book has been designed to teach machine learning from scratch and
upon completion, the reader will feel comfortable using ML techniques.
Table of Contents
Preface
Part 1 Introduction and Fundamentals
• Chapter 1 Machine Learning for Process Systems Engineering
o 1.1 What are Process Systems
▪ 1.1.1 Characteristics of process data
o 1.2 What is Machine Learning
▪ 1.2.1 Machine learning workflow
▪ 1.2.2 Type of machine learning systems
o 1.3 Machine Learning Applications in Process Industry
▪ 1.3.1 Decision hierarchy levels in a process plant
▪ 1.3.2 Application areas
o 1.4 ML Solution Deployment
o 1.5 The Future of Process Data Science
• Chapter 2 The Scripting Environment
o 2.1 Introduction to Python
o 2.2 Introduction to Spyder and Jupyter
o 2.3 Python Language: Basics
o 2.4 Scientific Computing Packages: Basics
▪ 2.4.1 Numpy
▪ 2.4.2 Pandas
o 2.5 Typical ML Script
• Chapter 3 Machine Learning Model Development: Workflow and Best Practices
o 3.1 ML Model Development Workflow
o 3.2 Data Pre-processing: Data Transformation
▪ 3.2.1 (Robust) Data centering & scaling
▪ 3.2.2 Feature extraction
▪ 3.2.3 Feature engineering
▪ 3.2.4 Workflow automation via pipelines
o 3.3 Model Evaluation
▪ 3.3.1 Regression metrics
▪ 3.3.2 Classification metrics
▪ 3.3.3 Holdout method / cross-validation
▪ 3.3.4 Residual analysis
o 3.4 Model Tuning
▪ 3.4.1 Overfitting & underfitting
▪ 3.4.2 Train/validation/test split
▪ 3.4.3 K-fold cross-validation
▪ 3.4.4 Regularization
▪ 3.4.5 Hyperparameter optimization via GridSearchCV
• Chapter 4 Data Pre-processing: Cleaning Process Data
o 4.1 Signal De-noising
▪ 4.1.1 Moving window average filter
▪ 4.1.2 SG filter
o 4.2 Variable Selection/Feature Selection
▪ 4.2.1 Filter methods
▪ 4.2.2 Wrapper methods
▪ 4.2.3 Embedded methods
o 4.3 Outlier Handling
▪ 4.3.1 Univariate methods
▪ 4.3.2 Multivariate methods
▪ 4.3.3 Data-mining methods
o 4.4 Handling Missing Data
Part 2 Classical Machine Learning Methods
• Chapter 5 Dimension Reduction and Latent Variable Methods (Part 1)
o 5.1 PCA: An Introduction
▪ 5.1.1 Mathematical background
▪ 5.1.2 Dimensionality reduction for polymer manufacturing process
o 5.2 Process Monitoring via PCA for Polymer Manufacturing Process
▪ 5.2.1 Process monitoring/fault detection indices
▪ 5.2.2 Fault detection
▪ 5.2.3 Fault diagnosis
o 5.3 Variants of Classical PCA
▪ 5.3.1 Dynamic PCA
▪ 5.3.2 Multiway PCA
▪ 5.3.3 Kernel PCA
o 5.4 PLS: An Introduction
▪ 5.4.1 Mathematical background
o 5.5 Soft Sensing via PLS for Pulp & Paper Manufacturing Process
o 5.6 Process Monitoring via PLS for Polyethylene Manufacturing Process
▪ 5.6.1 Fault detection indices
▪ 5.6.2 Fault detection
o 5.7 Variants of Classical PLS
• Chapter 6 Dimension Reduction and Latent Variable Methods (Part 2)
o 6.1 ICA: An Introduction
▪ 6.1.1 Mathematical background
▪ 6.1.2 Complex chemical process: Tennessee Eastman Process
▪ 6.1.3 Deciding number of ICs
o 6.2 Process Monitoring via ICA for Tennessee Eastman Process
▪ 6.2.1 Fault detection indices
▪ 6.2.2 Fault detection
o 6.3 FDA: An Introduction
▪ 6.3.1 Mathematical background
▪ 6.3.2 Dimensionality reduction for Tennessee Eastman Process
o 6.4 Fault Classification via FDA for Tennessee Eastman Process
• Chapter 7 Support Vector Machines & Kernel-based Learning
o 7.1 SVMs: An Introduction
▪ 7.1.1 Mathematical background
▪ 7.1.2 Hard margin vs soft margin classification
o 7.2 The Kernel Trick for Nonlinear Data
▪ 7.2.1 Mathematical background
o 7.3 SVDD: An Introduction
▪ 7.3.1 Mathematical background
▪ 7.3.2 OC-SVM vs SVDD
▪ 7.3.3 Bandwidth parameter and SVDD illustration
o 7.4 Process Fault Detection via SVDD
o 7.5 SVR: An Introduction
▪ 7.5.1 Mathematical background
o 7.6 Soft Sensing via SVR in a Polymer Processing Plant
o 7.7 Soft Sensing via SVR for Debutanizer Column in a Petroleum Refinery
• Chapter 8 Finding Groups in Process Data: Clustering & Mixture Modeling
o 8.1 Clustering: An Introduction
▪ 8.1.1 Multimode semiconductor manufacturing process
o 8.2 Centroid-based Clustering: K-Means
▪ 8.2.1 Determining the number of clusters via elbow method
▪ 8.2.2 Silhouette analysis for quantifying clusters quality
▪ 8.2.3 Pros and cons
o 8.3 Density-based Clustering: DBSCAN
▪ 8.3.3 Pros and cons
o 8.4 Probabilistic Clustering: Gaussian Mixtures
▪ 8.4.1 Mathematical background
▪ 8.4.2 Determining the number of clusters
o 8.5 Multimode Process Monitoring via GMM for Semiconductor Manufacturing Process
▪ 8.5.1 Fault detection indices
▪ 8.5.2 Fault detection
• Chapter 9 Decision Trees & Ensemble Learning
o 9.1 Decision Trees: An Introduction
▪ 9.1.1 Mathematical background
o 9.2 Random Forests: An Introduction
▪ 9.2.1 Mathematical background
o 9.3 Soft Sensing via Random Forest in Concrete Construction Industry
▪ 9.3.1 Feature importances
o 9.4 Introduction to Ensemble Learning
▪ 9.4.1 Bagging
▪ 9.4.2 Boosting
o 9.5 Effluent Quality Prediction in Wastewater Treatment Plant via XGBoost
• Chapter 10 Other Useful Classical ML Techniques
o 10.1 KDE: An Introduction
▪ 10.1.1 Mathematical background
▪ 10.1.2 Deciding KDE hyperparameters
o 10.2 Determining Monitoring Metric Control Limit via KDE
o 10.3 kNN: An Introduction
▪ 10.3.1 Mathematical background
▪ 10.3.2 Deciding kNN hyperparameters
▪ 10.3.3 Applications of kNN for process systems
o 10.4 Process Fault Detection via kNN for Semiconductor Manufacturing Process
o 10.5 Combining ML Techniques
Part 3 Artificial Neural Networks & Deep Learning
• Chapter 11 Feedforward Neural Networks
o 11.1 ANN: An Introduction
▪ 11.1.1 Deep learning
▪ 11.1.2 TensorFlow
o 11.2 Process Modeling via FFNN for Combined Cycle Power Plant
o 11.3 Mathematical Background
▪ 11.3.1 Activation functions
▪ 11.3.2 Loss functions & cost functions
▪ 11.3.3 Gradient descent optimization
▪ 11.3.4 Epochs & batch-size
▪ 11.3.5 Backpropagation
▪ 11.3.6 Vanishing/Exploding gradients
o 11.4 Nonlinearity in Neural Nets (Width vs Depth)
o 11.5 Neural Net Hyperparameter Optimization
o 11.6 Strategies for Improved Network Training
▪ 11.6.1 Early stopping
▪ 11.6.2 Regularization
▪ 11.6.3 Initialization
▪ 11.6.4 Batch normalization
o 11.7 Soft Sensing via FFNN for Debutanizer Column in a Petroleum Refinery
o FFNN Modeling Guidelines
• Chapter 12 Recurrent Neural Networks
o 12.1 RNN: An Introduction
▪ 12.1.1 RNN outputs
▪ 12.1.2 LSTM networks
o 12.2 System Identification via LSTM RNN for SISO Heater System
o 12.3 Mathematical Background
o 12.4 Stacked/Deep RNNs
o 12.5 Fault Classification via LSTM for Tennessee Eastman Process
o 12.6 Predictive Maintenance using LSTM Networks
▪ 12.6.1 Failure prediction using LSTM
▪ 12.6.2 Remaining useful life (RUL) prediction using LSTM
• Chapter 13 Reinforcement Learning
o 13.1 Reinforcement Learning: An Introduction
▪ 13.1.1 RL for process control
o 13.2 RL Terminology & Mathematical Concepts
▪ 13.2.1 Environment and Markov decision process
▪ 13.2.2 Reward and return
▪ 13.2.3 Policy
▪ 13.2.4 Value function
▪ 13.2.5 Bellman equation
o 13.3 Fundamentals of Q-learning
o 13.4 Deep RL & Actor-Critic Framework
▪ 13.4.1 Deep Q-learning
▪ 13.4.2 Policy gradient methods
▪ 13.4.3 Actor-Critic framework
o 13.5 Deep Deterministic Policy Gradient (DDPG)
▪ 13.5.1 Replay memory
▪ 13.5.2 Target networks
▪ 13.5.3 OU process as exploration noise
o 13.6 DDPG RL Agent as Level Controller
Part 4 Deploying ML Solutions Over Web
• Chapter 14 Process Monitoring Web Application
o 14.1 Process Monitoring Web App: Introduction
o 14.2 A Simple "Hello World" Web App
o 14.3 Embedding ML Models into Web Apps
o 14.4 Building Front-end User Interface
Appendix
Dataset Descriptions