Effective Machine Learning Teams

โœ Scribed by David Tan and Ada Leung


Publisher: O'Reilly Media, Inc.
Year: 2023
Tongue: English
Leaves: 193
Category: Library

✦ Synopsis


Gain the valuable skills and techniques you need to accelerate the delivery of machine learning solutions. With this practical guide, data scientists and ML engineers will learn how to bridge the gap between data science and software engineering in a practical and simple way. David Tan and Ada Leung from Thoughtworks show you how to apply time-tested software engineering skills and Lean delivery practices that will improve your effectiveness in ML projects.

✦ Table of Contents


Preface
Who Is This Book For
How This Book Is Organized
Part 1: Engineering Practices
Part 2: Product and Delivery Practices
Some Parting Thoughts
Conventions Used in This Book
Using Code Examples
O’Reilly Online Learning
How to Contact Us
1. Challenges and Better Paths in Delivering Machine Learning Solutions
Machine Learning: Promises and Disappointments
Continued Optimism in Machine Learning
Why ML Projects Fail
Macro-level view: barriers to success
Micro-level view: everyday impediments to success
Lifecycle of a story in a low effectiveness environment
Lifecycle of a story in a high effectiveness environment
Is There a Better Way? How Lean and Systems Thinking Can Help
But First, You Can’t “MLOps” Your Problems Away
See the Whole: A Systems Thinking Lens for Effective ML Delivery
Using Lean to Improve ML Delivery Systems
What is Lean, and why should ML practitioners care?
Product
Prototype testing
Discovery
Delivery
Vertically sliced work
Vertically sliced teams, or cross functional teams
Ways of working
Measuring delivery metrics
Engineering
Automated testing
Refactoring
Code editor effectiveness
Continuous delivery for machine learning (CD4ML)
Machine learning
Framing ML problems
ML systems design
Responsible AI
ML governance
Data
Closing the data collection loop
Reducing data distribution shifts
Data security and privacy
Conclusion
An Invitation to Journey with Us
2. Effective Dependency Management: Principles and Tools
What if Our Code Worked Everywhere, Every Time?
A Better Way: Check Out and Go
Principles for Effective Dependency Management
Reproducible environments
Production-like development environments from day one
Application-level environment isolation
OS-level environment isolation
Tools for Dependency Management
Managing OS-level dependencies (with Docker)
Misconception 1: Docker is over-complicated and unnecessary
Misconception 2: I don’t need Docker because I already use X (e.g. conda)
Misconception 3: Docker will have a significant performance impact
Managing application-level dependencies (with Poetry)
A Crash Course on Docker and batect
What are Containers?
Where Will We Use Docker?
Reduce the Number of Moving Parts in Docker with batect
Benefit 1: Simpler command-line interface
Benefit 2: Local-CI symmetry
Benefit 3: Faster builds with caches
How to use batect in your projects
Conclusion
3. Effective Dependency Management in Practice
In Context: ML Development Workflow
What Exactly Are We Containerizing?
Hands-on Exercise: Reproducible Development Environments, Aided by Containers
1. Check out and go: Installing prerequisite dependencies
2. Create our local development environment
3. Start our local development environment
4. Serving the ML model locally as a web API
5. Configure our code editor
6. Training model on the cloud
7. Deploying model web API
Secure Dependency Management
Remove Unnecessary Dependencies
Automate checks for security vulnerabilities
Conclusion
Further Reading
4. Automated Testing: Move Fast Without Breaking Things
Automated Tests: The Foundation for Iterating Quickly and Reliably
Starting with Why: Benefits of Test Automation
If Automated Testing is so Important, Why Aren’t We Doing It?
Reason 1: We think writing automated tests slows us down
Reason 2: “We have CI/CD”
Reason 3: We just don’t know how to fully test ML systems
Building Blocks for a Comprehensive Test Strategy
The What: Identifying Components For Testing
Software logic
ML models
The How: Structure of a Test
Characteristics of a Good Test and Pitfalls to Avoid
Tests should be independent and idempotent
Tests should fail fast and fail loudly
Tests should check behavior, not implementation
Tests should be runnable locally
Tests must be part of feature development
Tests let us “catch bugs once”
Software Tests
Unit Tests
Designing unit-testable code
How do I write a unit test?
Training Smoke Tests
How do I write these tests?
API Tests
How do I write these tests?
Recommended practice: Assert on “the whole elephant”
Post-deployment Tests
How do I write these tests?
Conclusion
5. Automated Testing: ML Model Tests
Model Tests
The Necessity of Model Tests
Challenges of Testing ML Models
Fitness Functions for ML Models
Model Metrics Tests (Global and Stratified)
How do I write these tests?
Advantages and limitations of metrics tests
Behavioral Tests
Complementary Practices for Model Tests
Error Analysis and Visualization
Learn from Production by Closing the Data Collection Loop
Open-closed Test Design
Exploratory Testing
Means to Improve the Model
Designing for Failures
Monitoring in Production
Bringing It All Together
Conclusion
Next Steps: Applying What You’ve Learned
Make incremental improvements
Demonstrate value
6. Supercharging Your Code Editor with Simple Techniques
Why Should I Care? The Benefits (and Surprising Simplicity) of Knowing our IDE
If It’s so Important, Why Haven’t I Learned It Yet?
The Plan: Getting Productive In Two Stages
Stage 1: Configuring our IDE
Install IDE and basic navigation shortcuts
Create a virtual environment
Configure virtual environment: PyCharm
Configure virtual environment: VS Code
Testing that we’ve configured everything correctly
Stage 2: The Star of the Show – Keyboard Shortcuts
Coding
Code completion suggestions
Inline documentation / Parameter information
Auto fix errors
Linting
Move / copy lines
Refactoring
Rename variable
Extract variable / method / function
Reformat code
Navigating code without getting lost
Opening things (files, classes, methods, functions) by name
Navigating the flow of code
Screen real estate management
That’s it: You Did It!
Guidelines for setting up a code repository for your team
Additional tools and techniques
Conclusion


📜 SIMILAR VOLUMES


Effective Machine Learning Teams: Best P
โœ David Tan, Ada Leung, David Colls ๐Ÿ“‚ Library ๐Ÿ“… 2024 ๐Ÿ› O'Reilly Media ๐ŸŒ English

<p><span>Gain the valuable skills and techniques you need to accelerate the delivery of machine learning solutions. With this practical guide, data scientists and ML engineers will learn how to bridge the gap between data science and Lean software delivery in a practical and simple way. David Tan an

Effective Machine Learning Teams: Best P
โœ David Tan ๐Ÿ“‚ Library ๐Ÿ“… 2024 ๐Ÿ› O'Reilly Media ๐ŸŒ English

<p>Gain the valuable skills and techniques you need to accelerate the delivery of machine learning solutions. With this practical guide, data scientists, ML engineers, and their leaders will learn how to bridge the gap between data science and Lean product delivery in a practical and simple way. Dav

Effective Machine Learning Teams: Best P
โœ David Tan, Ada Leung, David Colls ๐Ÿ“‚ Library ๐Ÿ“… 2024 ๐Ÿ› O'Reilly Media ๐ŸŒ English

Gain the valuable skills and techniques you need to accelerate the delivery of machine learning solutions. With this practical guide, data scientists and ML engineers will learn how to bridge the gap between data science and Lean software delivery in a practical and simple way. David Tan and Ada Leu

Effective Amazon Machine Learning
โœ Alexis Perrier ๐Ÿ“‚ Library ๐Ÿ“… 2017 ๐Ÿ› Packt Publishing ๐ŸŒ English

Key Features โ€ข Create great machine learning models that combine the power of algorithms with interactive tools without worrying about the underlying complexity โ€ข Learn the What's next? of machine learningโ€•machine learning on the cloudโ€•with this unique guide โ€ข Create web services that allow you t

Agile Machine Learning: Effective Machin
โœ Eric Carter, Matthew Hurst ๐Ÿ“‚ Library ๐Ÿ“… 2019 ๐Ÿ› Apress ๐ŸŒ English

<p><p>Build resilient applied machine learning teams that deliver better data products through adapting the guiding principles of the Agile Manifesto.</p><p>Bringing together talented people to create a great applied machine learning team is no small feat. With developers and data scientists both co

Agile Machine Learning: Effective Machin
โœ Eric Carter; Matthew Hurst ๐Ÿ“‚ Library ๐Ÿ“… 2019 ๐Ÿ› Apress ๐ŸŒ English

Build resilient applied machine learning teams that deliver better data products through adapting the guiding principles of the Agile Manifesto. Bringing together talented people to create a great applied machine learning team is no small feat. With developers and data scientists both contributing e