Machine Learning for Streaming Data with Python: Rapidly build practical online machine learning solutions

โœ Scribed by Joos Korstanje


Publisher: Packt
Year: 2022
Tongue: English
Leaves: 258
Category: Library


✦ Synopsis


Apply machine learning to streaming data with the help of practical examples, and deal with the challenges that surround streaming.
Key Features
Work on streaming use cases that are not taught in most data science courses
Gain experience with state-of-the-art tools for streaming data
Mitigate various challenges while handling streaming data

✦ Table of Contents


Cover
Title Page
Copyright and Credits
Contributors
Table of Contents
Preface
Part 1: Introduction and Core Concepts of Streaming Data
Chapter 1: An Introduction to Streaming Data
Technical requirements
Setting up a Python environment
A short history of data science
Working with streaming data
Streaming data versus batch data
Advantages of streaming data
Examples of successful implementation of streaming analytics
Challenges of streaming data
How to get started with streaming data
Common use cases for streaming data
Streaming versus big data
Real-time data formats and importing an example dataset in Python
Summary
Further reading
Chapter 2: Architectures for Streaming and Real-Time Machine Learning
Technical requirements
Python environment
Defining your analytics as a function
Understanding microservices architecture
Communicating between services through APIs
Demystifying the HTTP protocol
The GET request
The POST request
JSON format for communication between systems
RESTful APIs
Building a simple API on AWS
API Gateway in AWS
Lambda in AWS
Data-generating process on a local machine
Implementing the example
More architectural considerations
Other AWS services and other services in general that have the same functionality
Big data tools for real time streaming
Calling a big data environment in real time
Summary
Further reading
Chapter 3: Data Analysis on Streaming Data
Technical requirements
Python environment
Descriptive statistics on streaming data
Why are descriptive statistics different on streaming data?
Introduction to sampling theory
Comparing population and sample
Population parameters and sample statistics
Sampling distribution
Sample size calculations and confidence level
Rolling descriptive statistics from streaming
Exponential weight
Tracking convergence as an additional KPI
Overview of the main descriptive statistics
The mean
The median
The mode
Standard deviation
Variance
Quartiles and interquartile range
Correlations
Real-time visualizations
Opening the dashboard
Comparing Plotly's Dash and other real-time visualization tools
Building basic alerting systems
Alerting systems on extreme values
Alerting systems on process stability (mean and median)
Alerting systems on constant variability (std and variance)
Basic alerting systems using statistical process control
Summary
Further reading
Part 2: Exploring Use Cases for Data Streaming
Chapter 4: Online Learning with River
Technical requirements
Python environment
What is online machine learning?
How is online learning different from regular learning?
Advantages of online learning
Challenges of online learning
Types of online learning
Using River for online learning
Training an online model with River
Improving the model evaluation
Building a multiclass classifier using one-vs-rest
Summary
Further reading
Chapter 5: Online Anomaly Detection
Technical requirements
Python environment
Defining anomaly detection
Are outliers a problem?
Exploring use cases of anomaly detection
Fraud detection in financial institutions
Anomaly detection on your log data
Fault detection in manufacturing and production lines
Hacking detection in computer networks (cyber security)
Medical risks in health data
Predictive maintenance and sensor data
Comparing anomaly detection and imbalanced classification
The problem of imbalanced data
The F1 score
SMOTE oversampling
Anomaly detection versus classification
Algorithms for detecting anomalies in River
The use of thresholders in River anomaly detection
Anomaly detection algorithm 1 – One-Class SVM
Anomaly detection algorithm 2 – Half-Space-Trees
Going further with anomaly detection
Summary
Further reading
Chapter 6: Online Classification
Technical requirements
Python environment
Defining classification
Identifying use cases of classification
Use case 1 – email spam classification
Use case 2 – face detection in phone camera
Use case 3 – online marketing ad selection
Overview of classification algorithms in River
Classification algorithm 1 – LogisticRegression
Classification algorithm 2 – Perceptron
Classification algorithm 3 – AdaptiveRandomForestClassifier
Classification algorithm 4 – ALMAClassifier
Classification algorithm 5 – PAClassifier
Evaluating benchmark results
Summary
Further reading
Chapter 7: Online Regression
Technical requirements
Python environment
Defining regression
Use cases of regression
Use case 1 – Forecasting
Use case 2 – Predicting the number of faulty products in manufacturing
Overview of regression algorithms in River
Regression algorithm 1 – LinearRegression
Regression algorithm 2 – HoeffdingAdaptiveTreeRegressor
Regression algorithm 3 – SGTRegressor
Regression algorithm 4 – SRPRegressor
Summary
Further reading
Chapter 8: Reinforcement Learning
Technical requirements
Python environment
Defining reinforcement learning
Comparing online and offline reinforcement learning
A more detailed overview of feedback loops in reinforcement learning
The main steps of a reinforcement learning model
Making the decisions
Updating the decision rules
Exploring Q-learning
The goal of Q-learning
Parameters of the Q-learning algorithm
Deep Q-learning
Using reinforcement learning for streaming data
Use cases of reinforcement learning
Use case one – trading system
Use case two – social network ranking system
Use case three – a self-driving car
Use case four – chatbots
Use case five – learning games
Implementing reinforcement learning in Python
Summary
Further reading
Part 3: Advanced Concepts and Best Practices around Streaming Data
Chapter 9: Drift and Drift Detection
Technical requirements
Python environment
Defining drift
Three types of drift
Introducing model explicability
Measuring drift
Measuring data drift
Measuring concept drift
Measuring drift in Python
A basic intuitive approach to measuring drift
Measuring drift with robust tools
Counteracting drift
Offline learning with retraining strategies against drift
Online learning against drift
Summary
Further reading
Chapter 10: Feature Transformation and Scaling
Technical requirements
Python environment
Challenges of data preparation with streaming data
Scaling data for streaming
Introducing scaling
Adapting scaling to a streaming context
Transforming features in a streaming context
Introducing PCA
Mathematical definition of PCA
Regular PCA in Python
Incremental PCA for streaming
Summary
Further reading
Chapter 11: Catastrophic Forgetting
Technical requirements
Python environment
Introducing catastrophic forgetting
Catastrophic forgetting in online models
Detecting catastrophic forgetting
Using Python to detect catastrophic forgetting
Model explicability versus catastrophic forgetting
Explaining models using linear coefficients
Explaining models using dendrograms
Explaining models using variable importance
Summary
Further reading
Chapter 12: Conclusion and Best Practices
Going further
Summary
Index
Other Books You May Enjoy


📜 SIMILAR VOLUMES


Practical Machine Learning for Streaming
✍ Sayan Putatunda 📂 Library 📅 2021 🏛 Apress 🌐 English

Design, develop, and validate machine learning models with streaming data using the Scikit-Multiflow framework. This book is a quick start guide for data scientists and machine learning engineers looking to implement machine learning models for streaming data with Python to generate real-time...
