Feature Store for Machine Learning: Curate, discover, share and serve ML features at scale
Scribed by Jayanth Kumar M J
- Publisher
- Packt Publishing
- Year
- 2022
- Tongue
- English
- Leaves
- 281
- Category
- Library
Synopsis
Learn how to leverage feature stores to make the most of your machine learning models
Key Features
- Understand the significance of feature stores in the ML life cycle
- Discover how features can be shared, discovered, and re-used
- Learn to make features available for online models during inference
Book Description
A feature store is one of the storage layers in machine learning (ML) operations, where data scientists and ML engineers can store transformed and curated features for ML models. This makes the features available for model training, inference (batch and online), and reuse in other ML pipelines. Knowing how to utilize feature stores to their fullest potential can save you a lot of time and effort, and this book will teach you everything you need to know to get started.
Feature Store for Machine Learning is for data scientists who want to learn how to use feature stores to share and reuse each other's work and expertise. You'll be able to implement practices that help eliminate reprocessing of data, make models reproducible, and reduce duplication of work, thus shortening the time to production of the ML model. While this ML book offers some theoretical groundwork for developers who are just getting to grips with feature stores, there's plenty of practical know-how for those ready to put their knowledge to work. With a hands-on approach to implementation and associated methodologies, you'll get up and running in no time.
By the end of this book, you'll have understood why feature stores are essential and how to use them in your ML projects, both on your local system and on the cloud.
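The core idea in the description above, one store that serves the same curated features to training, batch inference, and online inference, can be sketched in a few lines. This is a toy in-memory illustration, not code from the book and not the API of Feast or any other product; the class name, method names, and the `orders_30d` feature are all hypothetical.

```python
from datetime import datetime


class ToyFeatureStore:
    """In-memory illustration only; not the API of any real feature store."""

    def __init__(self):
        # Offline store: full history per entity, used for training and batch inference.
        self._offline = {}  # entity_id -> list of (timestamp, features), sorted by time
        # Online store: only the latest feature values per entity, for fast lookups.
        self._online = {}

    def ingest(self, entity_id, timestamp, features):
        """Append a feature row to the offline history, keeping it time-ordered."""
        rows = self._offline.setdefault(entity_id, [])
        rows.append((timestamp, dict(features)))
        rows.sort(key=lambda row: row[0])

    def get_historical_features(self, entity_id, as_of):
        """Return the latest row at or before `as_of` (a point-in-time lookup)."""
        rows = [f for ts, f in self._offline.get(entity_id, []) if ts <= as_of]
        return rows[-1] if rows else None

    def materialize(self):
        """Copy the most recent row of each entity into the online store."""
        for entity_id, rows in self._offline.items():
            if rows:
                self._online[entity_id] = rows[-1][1]

    def get_online_features(self, entity_id):
        """Return the latest materialized features, or None if unknown."""
        return self._online.get(entity_id)


store = ToyFeatureStore()
store.ingest("customer_1", datetime(2022, 1, 1), {"orders_30d": 2})
store.ingest("customer_1", datetime(2022, 2, 1), {"orders_30d": 5})

# Training reads feature values as they were at the time of the label.
training_row = store.get_historical_features("customer_1", as_of=datetime(2022, 1, 15))

# Serving reads the latest values after they are synced to the online store.
store.materialize()
serving_row = store.get_online_features("customer_1")
```

Because both reads come from the same ingested rows, the training and serving paths cannot drift apart in how a feature is computed, which is the kind of skew the book lists among the problems feature stores address.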
What you will learn
- Understand the significance of feature stores in a machine learning pipeline
- Become well-versed with how to curate, store, share and discover features using feature stores
- Explore the different components and capabilities of a feature store
- Discover how to use feature stores with batch and online models
- Accelerate your model life cycle and reduce costs
- Deploy your first feature store for production use cases
Who this book is for
If you have a solid grasp of machine learning basics, but need a comprehensive overview of feature stores to start using them, then this book is for you. Data/machine learning engineers and data scientists who build machine learning models for production systems in any domain, those supporting data engineers in productionizing ML models, and platform engineers who build data science (ML) platforms for the organization will also find plenty of practical advice in the later chapters of this book.
Table of Contents
- An Overview of the Machine Learning Life Cycle
- What Problems Do Feature Stores Solve?
- Feature Store Fundamentals, Terminology, and Usage
- Adding Feature Store to ML Models
- Model Training and Inference
- Model to Production and Beyond
- Feast Alternatives and ML Best Practices
- Use Case – Customer Churn Prediction
Table of Contents
Cover
Title Page
Copyright and Credits
Contributors
Table of Contents
Preface
Section 1 – Why Do We Need a Feature Store?
Chapter 1: An Overview of the Machine Learning Life Cycle
Technical requirements
The ML life cycle in practice
Problem statement (plan and create)
Data (preparation and cleaning)
Model
Package, release, and monitor
An ideal world versus the real world
Reusability and sharing
Everything in a notebook
The most time-consuming stages of ML
Figuring out the dataset
Data exploration and feature engineering
Modeling to production and monitoring
Summary
Chapter 2: What Problems Do Feature Stores Solve?
Importance of features in production
Ways to bring features to production
Batch model pipeline
Online model pipeline
Common problems with the approaches used for bringing features to production
Re-inventing the wheel
Feature re-calculation
Feature discoverability and sharing
Training vs Serving skew
Model reproducibility
Low latency
Feature stores to the rescue
Standardizing ML with a feature store
Feature store avoids reprocessing data
Features are discoverable and sharable with the feature store
Serving features at low latency with feature stores
Philosophy behind feature stores
Summary
Further reading
Section 2 – A Feature Store in Action
Chapter 3: Feature Store Fundamentals, Terminology, and Usage
Technical requirements
Introduction to Feast and installation
Feast terminology and definitions
Feast initialization
Feast usage
Register feature definitions
Browsing the feature store
Adding an entity and FeatureView
Generate training data
Load features to the online store
Feast behind the scenes
Data flow in Feast
Summary
Further reading
Chapter 4: Adding Feature Store to ML Models
Technical requirements
Creating Feast resources in AWS
Amazon S3 for storing data
AWS Redshift for an offline store
Creating an IAM user to access the resources
Feast initialization for AWS
Exploring the ML life cycle with Feast
Problem statement (plan and create)
Data (preparation and cleaning)
Model (feature engineering)
Summary
References
Chapter 5: Model Training and Inference
Prerequisites
Technical requirements
Model training with the feature store
Dee's model training experiments
Ram's model training experiments
Model packaging
Batch model inference with Feast
Online model inference with Feast
Syncing the latest features from the offline to the online store
Packaging the online model as a REST endpoint with Feast code
Handling changes to the feature set during development
Step 1 – Change feature definitions
Step 2 – Add/update schema in the Glue/Lake Formation console
Step 3 – Update notebooks with the changes
Summary
Further reading
Chapter 6: Model to Production and Beyond
Technical requirements
Setting up Airflow for orchestration
S3 bucket for Airflow metadata
Amazon MWAA environment for orchestration
Productionizing the batch model pipeline
Productionizing an online model pipeline
Orchestration of a feature engineering job
Deploying the model as a SageMaker endpoint
Beyond model production
Feature drift monitoring and model retraining
Model reproducibility and prediction issues
A headstart for the next model
Changes to feature definition after production
Summary
Section 3 – Alternatives, Best Practices, and a Use Case
Chapter 7: Feast Alternatives and ML Best Practices
Technical requirements
The available feature stores on the market
The Tecton Feature Store
Databricks Feature Store
Google's Vertex AI Feature Store
The Hopsworks Feature Store
SageMaker Feature Store
Feature management with SageMaker Feature Store
Resources to use SageMaker
Generating features
Defining the feature group
Feature ingestion
Getting records from an online store
Querying historical data with Amazon Athena
Cleaning up a SageMaker feature group
ML best practices
Data validation at source
Breaking down ML pipeline and orchestration
Tracking data lineage and versioning
The feature repository
Experiment tracking, model versioning, and the model repository
Feature and model monitoring
Miscellaneous
Summary
Chapter 8: Use Case – Customer Churn Prediction
Technical requirements
Infrastructure setup
Introduction to the problem and the dataset
Data processing and feature engineering
Feature group definitions and feature ingestion
Model training
Model prediction
Feature monitoring
Model monitoring
Summary
Index
Other Books You May Enjoy
SIMILAR VOLUMES
Feature engineering is a crucial step in the machine-learning pipeline, yet this topic is rarely examined on its own. With this practical book, you'll learn techniques for extracting and transforming features--the numeric representations of raw data--into formats for machine-learning models. Each ch
"Feature engineering plays a vital role in big data analytics. Machine learning and data mining algorithms cannot work without data. Little can be achieved if there are few features to represent the underlying data objects, and the quality of results of those algorithms largely depends on the qualit
Discover how to build commercial-quality, anatomy-based CG characters using Maya with "Maya Feature Creature Creations, Second Edition." In today's competitive entertainment market, animated movies and video games require superior graphics and realistic characters, making it imperative that 3D artis