Machine Learning with the Elastic Stack: Gain valuable insights from your data with Elastic Stack's machine learning features

✍ Scribed by Rich Collier; Camilla Montonen; Bahaaldine Azarmi

Publisher: Packt Publishing
Year: 2021
Tongue: English
Leaves: 450
Edition: 2
Category: Library

✦ Synopsis

Elastic Stack, previously known as the ELK stack, is a log analysis solution that helps users ingest, process, and analyze search data effectively. With the addition of machine learning, a key commercial feature, the Elastic Stack makes this process even more efficient. This updated second edition of Machine Learning with the Elastic Stack provides a comprehensive overview of Elastic Stack's machine learning features for both time series data analysis and classification, regression, and outlier detection.

The book starts by explaining machine learning concepts in an intuitive way. You'll then perform time series analysis on different types of data, such as log files, network flows, application metrics, and financial data. As you progress through the chapters, you'll deploy machine learning within Elastic Stack for logging, security, and metrics. Finally, you'll discover how data frame analysis opens up a whole new set of use cases that machine learning can help you with.

By the end of this Elastic Stack book, you'll have hands-on machine learning and Elastic Stack experience, along with the knowledge you need to incorporate machine learning in your distributed search and data analysis platform.

✦ Table of Contents

Cover
Title Page
Copyright and Credits
Contributors
Table of Contents
Preface
Section 1 – Getting Started with Machine Learning with Elastic Stack
Chapter 1: Machine Learning for IT
Overcoming the historical challenges in IT
Dealing with the plethora of data
The advent of automated anomaly detection
Unsupervised versus supervised ML
Using unsupervised ML for anomaly detection
Defining unusual
Learning what's normal
Probability models
Learning the models
De-trending
Scoring of unusualness
The element of time
Applying supervised ML to data frame analytics
The process of supervised learning
Summary
Chapter 2: Enabling and Operationalization
Technical requirements
Enabling Elastic ML features
Enabling ML on a self-managed cluster
Enabling ML in the cloud – Elasticsearch Service
Understanding operationalization
ML nodes
Jobs
Bucketing data in a time series analysis
Feeding data to Elastic ML
The supporting indices
Anomaly detection orchestration
Anomaly detection model snapshots
Summary
Section 2 – Time Series Analysis – Anomaly Detection and Forecasting
Chapter 3: Anomaly Detection
Technical requirements
Elastic ML job types
Dissecting the detector
The function
The field
The partition field
The by field
The over field
The "formula"
Exploring the count functions
Other counting functions
Detecting changes in metric values
Metric functions
Understanding the advanced detector functions
rare
Frequency rare
Information content
Geographic
Time
Splitting analysis along categorical features
Setting the split field
The difference between splitting using partition and by_field
Understanding temporal versus population analysis
Categorization analysis of unstructured messages
Types of messages that are good candidates for categorization
The process used by categorization
Analyzing the categories
Categorization job example
When to avoid using categorization
Managing Elastic ML via the API
Summary
Chapter 4: Forecasting
Technical requirements
Contrasting forecasting with prophesying
Forecasting use cases
Forecasting theory of operation
Single time series forecasting
Looking at forecast results
Multiple time series forecasting
Summary
Chapter 5: Interpreting Results
Technical requirements
Viewing the Elastic ML results index
Anomaly scores
Bucket-level scoring
Normalization
Influencer-level scoring
Influencers
Record-level scoring
Results index schema details
Bucket results
Record results
Influencer results
Multi-bucket anomalies
Multi-bucket anomaly example
Multi-bucket scoring
Forecast results
Querying for forecast results
Results API
Results API endpoints
Getting the overall buckets API
Getting the categories API
Custom dashboards and Canvas workpads
Dashboard "embeddables"
Anomalies as annotations in TSVB
Customizing Canvas workpads
Summary
Chapter 6: Alerting on ML Analysis
Technical requirements
Understanding alerting concepts
Anomalies are not necessarily alerts
In real-time alerting, timing matters
Building alerts from the ML UI
Defining sample anomaly detection jobs
Creating alerts against the sample jobs
Simulating some real-time anomalous behavior
Receiving and reviewing the alerts
Creating an alert with a watch
Understanding the anatomy of the legacy default ML watch
Custom watches can offer some unique functionality
Summary
Chapter 7: AIOps and Root Cause Analysis
Technical requirements
Demystifying the term "AIOps"
Understanding the importance and limitations of KPIs
Moving beyond KPIs
Organizing data for better analysis
Custom queries for anomaly detection datafeeds
Data enrichment on ingest
Leveraging the contextual information
Analysis splits
Statistical influencers
Bringing it all together for RCA
Outage background
Correlation and shared influencers
Summary
Chapter 8: Anomaly Detection in Other Elastic Stack Apps
Technical requirements
Anomaly detection in Elastic APM
Enabling anomaly detection for APM
Viewing the anomaly detection job results in the APM UI
Creating ML Jobs via the data recognizer
Anomaly detection in the Logs app
Log categories
Log anomalies
Anomaly detection in the Metrics app
Anomaly detection in the Uptime app
Anomaly detection in the Elastic Security app
Prebuilt anomaly detection jobs
Anomaly detection jobs as detection alerts
Summary
Section 3 – Data Frame Analysis
Chapter 9: Introducing Data Frame Analytics
Technical requirements
Learning how to use transforms
Why are transforms useful?
The anatomy of a transform
Using transforms to analyze e-commerce orders
Exploring more advanced pivot and aggregation configurations
Discovering the difference between batch and continuous transforms
Analyzing social media feeds using continuous transforms
Using Painless for advanced transform configurations
Introducing Painless
Working with Python and Elasticsearch
A brief tour of the Python Elasticsearch clients
Summary
Further reading
Chapter 10: Outlier Detection
Technical requirements
Discovering the four techniques used for outlier detection
Understanding feature influence
How does outlier detection differ from anomaly detection?
Applying outlier detection in practice
Evaluating outlier detection with the Evaluate API
Hyperparameter tuning for outlier detection
Summary
Chapter 11: Classification Analysis
Technical requirements
Classification: from data to a trained model
Feature engineering
Evaluating the model
Taking your first steps with classification
Classification under the hood: gradient boosted decision trees
Introduction to decision trees
Gradient boosted decision trees
Hyperparameters
Interpreting results
Summary
Further reading
Chapter 12: Regression
Technical requirements
Using regression analysis to predict house prices
Using decision trees for regression
Summary
Further reading
Chapter 13: Inference
Technical requirements
Examining, exporting, and importing your trained models with the Trained Models API
A tour of the Trained Models API
Exporting and importing trained models with the Trained Models API and Python
Understanding inference processors and ingest pipelines
Handling missing or corrupted data in ingest pipelines
Using inference processor configuration options to gain more insight into your predictions
Importing external models into Elasticsearch using eland
Learning about supported external models in eland
Training a scikit-learn DecisionTreeClassifier and importing it into Elasticsearch using eland
Summary
Appendix: Anomaly Detection Tips
Technical requirements
Understanding influencers in split versus non-split jobs
Using one-sided functions to your advantage
Ignoring time periods
Ignoring an upcoming (known) window of time
Ignoring an unexpected window of time, after the fact
Using custom rules and filters to your advantage
Creating custom rules
Benefiting from custom rules for a "top-down" alerting philosophy
Anomaly detection job throughput considerations
Avoiding the over-engineering of a use case
Using anomaly detection on runtime fields
Summary
Why subscribe?
About Packt
Other Books You May Enjoy
Index

📜 SIMILAR VOLUMES

📁 Machine Learning with the Elastic Stack: Gain valuable insights from your data with Elastic Stack's machine learning features

✍ Rich Collier; Camilla Montonen; Bahaaldine Azarmi 📂 Library 📅 2021 🏛 Packt Publishing 🌐 English

📁 Machine Learning with the Elastic Stack

✍ Rich Collier 📂 Library 📅 2019 🏛 Packt Publishing 🌐 English

📁 Machine Learning with the Elastic Stack: Expert techniques to integrate machine learning with distributed search and analytics

✍ Collier, Rich;Azarmi, Bahaaldine 📂 Library 📅 2018;2019 🏛 Packt Publishing 🌐 English

Leverage Elastic Stack's machine learning features to gain valuable insight from your data Key Features • Combine machine learning with the analytic capabilities of Elastic Stack • Analyze large volumes of search data and gain actionable insight from them • Use external anal

📁 Data Science Projects with Python: A case study approach to gaining valuable insights from real data with machine learning

✍ Stephen Klosterman 📂 Library 📅 2021 🏛 Packt Publishing 🌐 English

Gain hands-on experience in Python programming with industry-standard machine learning tools using pandas, scikit-learn, and XGBoost Key Features • Think critically about data by exploring and cleaning it • Choose an appropriate machine learning model and train it on your data • Communicate da

📁 Data Science Projects with Python: A case study approach to gaining valuable insights from real data with machine learning, 2nd Edition

✍ Stephen Klosterman 📂 Library 📅 2021 🏛 Packt Publishing 🌐 English

Gain hands-on experience in Python programming with industry-standard machine learning tools using pandas, scikit-learn, and XGBoost Key Features • Think critically about data by exploring and cleaning it • Choose an appropriate machine lear

📁 Data Science Projects with Python: A case study approach to gaining valuable insights from real data with machine learning, 2nd Edition

✍ Stephen Klosterman 📂 Library 🏛 Packt Publishing 🌐 English

Gain hands-on experience of Python programming with industry-standard machine learning techniques using pandas, scikit-learn, and XGBoost Key Features • Think critically about data and use it to form and test a hypothesis