š”– Scriptorium
✦   LIBER   ✦


Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits: A practical guide to implementing supervised and unsupervised machine learning algorithms in Python

āœ Scribed by Tarek Amr


Publisher
Packt Publishing Ltd
Year
2020
Tongue
English
Leaves
368
Category
Library

⬇  Acquire This Volume

No coin nor oath required. For personal study only.

✦ Synopsis


Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise, and use it to solve real-world machine learning problems.

Key Features
- Delve into machine learning with this comprehensive guide to scikit-learn and scientific Python
- Master the art of data-driven problem-solving with hands-on examples
- Foster your theoretical and practical knowledge of supervised and unsupervised machine learning algorithms

Book Description
Machine learning is applied everywhere, from business to research and academia, while scikit-learn is a versatile library that is popular among machine learning practitioners. This book serves as a practical guide for anyone looking to provide hands-on machine learning solutions with scikit-learn and Python toolkits.

The book begins with an explanation of machine learning concepts and fundamentals, and strikes a balance between theoretical concepts and their applications. Each chapter covers a different set of algorithms and shows you how to use them to solve real-life problems. You’ll also learn about various key supervised and unsupervised machine learning algorithms using practical examples. Whether it is an instance-based learning algorithm, Bayesian estimation, a deep neural network, a tree-based ensemble, or a recommendation system, you’ll gain a thorough understanding of its theory and learn when to apply it.

As you advance, you’ll learn how to deal with unlabeled data and when to use different clustering and anomaly detection algorithms. By the end of this machine learning book, you’ll have learned how to take a data-driven approach to provide end-to-end machine learning solutions. You’ll also have discovered how to formulate the problem at hand, prepare the required data, and evaluate and deploy models in production.

What you will learn
- Understand when to use supervised, unsupervised, or reinforcement learning algorithms
- Find out how to collect and prepare your data for machine learning tasks
- Tackle imbalanced data and optimize your algorithm for the bias-variance tradeoff
- Apply supervised and unsupervised algorithms to overcome various machine learning challenges
- Employ best practices for tuning your algorithm’s hyperparameters
- Discover how to use neural networks for classification and regression
- Build, evaluate, and deploy your machine learning solutions to production

Who this book is for
This book is for data scientists, machine learning practitioners, and anyone who wants to learn how machine learning algorithms work and to build different machine learning models using the Python ecosystem. The book will help you take your knowledge of machine learning to the next level by grasping its ins and outs and tailoring it to your needs. Working knowledge of Python and a basic understanding of the underlying mathematical and statistical concepts are required.
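The split-train-evaluate workflow described above can be sketched in a few lines. This is a minimal illustration only, not code from the book, assuming scikit-learn is installed; it uses the Iris dataset and a decision tree, the combination the early chapters work through.

```python
# Minimal supervised-learning workflow: load data, hold out a test
# set, fit a decision tree, and evaluate on unseen samples.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load the Iris dataset as feature matrix X and label vector y
X, y = load_iris(return_X_y=True)

# Keep a test set separate so evaluation reflects unseen data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Fit a shallow decision tree and score it on the held-out samples
clf = DecisionTreeClassifier(max_depth=3, random_state=42)
clf.fit(X_train, y_train)
print(f"Test accuracy: {accuracy_score(y_test, clf.predict(X_test)):.2f}")
```

The same fit/predict/score pattern carries through the rest of the book's estimators, from linear models to ensembles.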

✦ Table of Contents


Cover
Title Page
Copyright and Credits
About Packt
Contributors
Table of Contents
Preface
Section 1: Supervised Learning
Chapter 1: Introduction to Machine Learning
Understanding machine learning
Types of machine learning algorithms
Supervised learning
Classification versus regression
Supervised learning evaluation
Unsupervised learning
Reinforcement learning
The model development life cycle
Understanding a problem
Splitting our data
Finding the best manner to split the data
Making sure the training and the test datasets are separate
Development set
Evaluating our model
Deploying in production and monitoring
Iterating
When to use machine learning
Introduction to scikit-learn
It plays well with the Python data ecosystem
Practical level of abstraction
When not to use scikit-learn
Installing the packages you need
Introduction to pandas
Python's scientific computing ecosystem conventions
Summary
Further reading
Chapter 2: Making Decisions with Trees
Understanding decision trees
What are decision trees?
Iris classification
Loading the Iris dataset
Splitting the data
Training the model and using it for prediction
Evaluating our predictions
Which features were more important?
Displaying the internal tree decisions
How do decision trees learn?
Splitting criteria
Preventing overfitting
Predictions
Getting a more reliable score
What to do now to get a more reliable score
ShuffleSplit
Tuning the hyperparameters for higher accuracy
Splitting the data
Trying different hyperparameter values
Comparing the accuracy scores
Visualizing the tree's decision boundaries
Feature engineering
Building decision tree regressors
Predicting people's heights
Regressor's evaluation
Setting sample weights
Summary
Chapter 3: Making Decisions with Linear Equations
Understanding linear models
Linear equations
Linear regression
Estimating the amount paid to the taxi driver
Predicting house prices in Boston
Data exploration
Splitting the data
Calculating a baseline
Training the linear regressor
Evaluating our model's accuracy
Showing feature coefficients
Scaling for more meaningful coefficients
Adding polynomial features
Fitting the linear regressor with the derived features
Regularizing the regressor
Training the lasso regressor
Finding the optimum regularization parameter
Finding regression intervals
Getting to know additional linear regressors
Using logistic regression for classification
Understanding the logistic function
Plugging the logistic function into a linear model
Objective function
Regularization
Solvers
Configuring the logistic regression classifier
Classifying the Iris dataset using logistic regression
Understanding the classifier's decision boundaries
Getting to know additional linear classifiers
Summary
Chapter 4: Preparing Your Data
Imputing missing values
Setting missing values to 0
Setting missing values to the mean
Using informed estimations for missing values
Encoding non-numerical columns
One-hot encoding
Ordinal encoding
Target encoding
Homogenizing the columns' scale
The standard scaler
The MinMax scaler
RobustScaler
Selecting the most useful features
VarianceThreshold
Filters
f-regression and f-classif
Mutual information
Comparing and using the different filters
Evaluating multiple features at a time
Summary
Chapter 5: Image Processing with Nearest Neighbors
Nearest neighbors
Loading and displaying images
Image classification
Using a confusion matrix to understand the model's mistakes
Picking a suitable metric
Setting the correct K
Hyperparameter tuning using GridSearchCV
Using custom distances
Using nearest neighbors for regression
More neighborhood algorithms
Radius neighbors
Nearest centroid classifier
Reducing the dimensions of our image data
Principal component analysis
Neighborhood component analysis
Comparing PCA to NCA
Picking the most informative components
Using the centroid classifier with PCA
Restoring the original image from its components
Finding the most informative pixels
Summary
Chapter 6: Classifying Text Using Naive Bayes
Splitting sentences into tokens
Tokenizing with string split
Tokenizing using regular expressions
Using placeholders before tokenizing
Vectorizing text into matrices
Vector space model
Bag of words
Different sentences, same representation
N-grams
Using characters instead of words
Capturing important words with TF-IDF
Representing meanings with word embedding
Word2Vec
Understanding Naive Bayes
The Bayes rule
Calculating the likelihood naively
Naive Bayes implementations
Additive smoothing
Classifying text using a Naive Bayes classifier
Downloading the data
Preparing the data
Precision, recall, and F1 score
Pipelines
Optimizing for different scores
Creating a custom transformer
Summary
Section 2: Advanced Supervised Learning
Chapter 7: Neural Networks – Here Comes Deep Learning
Getting to know MLP
Understanding the algorithm's architecture
Training the neural network
Configuring the solvers
Classifying items of clothing
Downloading the Fashion-MNIST dataset
Preparing the data for classification
Experiencing the effects of the hyperparameters
Learning not too quickly and not too slowly
Picking a suitable batch size
Checking whether more training samples are needed
Checking whether more epochs are needed
Choosing the optimum architecture and hyperparameters
Adding your own activation function
Untangling the convolutions
Extracting features by convolving
Reducing the dimensionality of the data via max pooling
Putting it all together
MLP regressors
Summary
Chapter 8: Ensembles – When One Model Is Not Enough
Answering the question: why ensembles?
Combining multiple estimators via averaging
Boosting multiple biased estimators
Downloading the UCI Automobile dataset
Dealing with missing values
Differentiating between numerical features and categorical ones
Splitting the data into training and test sets
Imputing the missing values and encoding the categorical features
Using random forest for regression
Checking the effect of the number of trees
Understanding the effect of each training feature
Using random forest for classification
The ROC curve
Using bagging regressors
Preparing a mixture of numerical and categorical features
Combining KNN estimators using a bagging meta-estimator
Using gradient boosting to predict automobile prices
Plotting the learning deviance
Comparing the learning rate settings
Using different sample sizes
Stopping earlier and adapting the learning rate
Regression ranges
Using AdaBoost ensembles
Exploring more ensembles
Voting ensembles
Stacking ensembles
Random tree embedding
Summary
Chapter 9: The Y is as Important as the X
Scaling your regression targets
Estimating multiple regression targets
Building a multi-output regressor
Chaining multiple regressors
Dealing with compound classification targets
Converting a multi-class problem into a set of binary classifiers
Estimating multiple classification targets
Calibrating a classifier's probabilities
Calculating the precision at k
Summary
Chapter 10: Imbalanced Learning – Not Even 1% Win the Lottery
Getting the click prediction dataset
Installing the imbalanced-learn library
Predicting the CTR
Weighting the training samples differently
The effect of the weighting on the ROC
Sampling the training data
Undersampling the majority class
Oversampling the minority class
Combining data sampling with ensembles
Equal opportunity score
Summary
Section 3: Unsupervised Learning and More
Chapter 11: Clustering – Making Sense of Unlabeled Data
Understanding clustering
K-means clustering
Creating a blob-shaped dataset
Visualizing our sample data
Clustering with K-means
The silhouette score
Choosing the initial centroids
Agglomerative clustering
Tracing the agglomerative clustering's children
The adjusted Rand index
Choosing the cluster linkage
DBSCAN
Summary
Chapter 12: Anomaly Detection – Finding Outliers in Data
Unlabeled anomaly detection
Generating sample data
Detecting anomalies using basic statistics
Using percentiles for multi-dimensional data
Detecting outliers using EllipticEnvelope
Outlier and novelty detection using LOF
Novelty detection using LOF
Detecting outliers using isolation forest
Summary
Chapter 13: Recommender System – Getting to Know Their Taste
The different recommendation paradigms
Downloading surprise and the dataset
Downloading the KDD Cup 2012 dataset
Processing and splitting the dataset
Creating a random recommender
Using KNN-inspired algorithms
Using baseline algorithms
Using singular value decomposition
Extracting latent information via SVD
Comparing the similarity measures for the two matrices
Click prediction using SVD
Deploying machine learning models in production
Summary
Other Books You May Enjoy
Index


📜 SIMILAR VOLUMES



Hands-on Supervised Learning with Python
āœ Gnana Lakshmi T C, Madeleine Shang šŸ“‚ Library šŸ“… 2020 šŸ› BPB Publications 🌐 English

Hands-On ML problem solving and creating solutions using Python. Key Features: Introduction to Python Programming; Python for Machine Learning; Introduction to Machine Learning; Introduction to Predictive Modelling, Supervised and Unsupervised A

Supervised machine learning with Python:
āœ Smith, Taylor šŸ“‚ Library šŸ“… 2019 šŸ› Packt Publishing 🌐 English

Teach your machine to think for itself! Key Features: Delve into supervised learning and grasp how a machine learns from data; Implement popular machine learning algorithms from scratch, developing a deep understanding along the way; Explore some of the most popular scien

Hands-On Unsupervised Learning with Pyth
āœ Bonaccorso, Giuseppe šŸ“‚ Library šŸ“… 2019 šŸ› Packt 🌐 English

Unsupervised learning is an increasingly important branch of data science, the goal of which is to train models that can learn the structure of a dataset and provide the user with helpful pieces of information about new samples. In many different business sectors (such as marketing, business intelli

Machine Learning: Step-by-Step Guide To
āœ Rudolph Russell šŸ“‚ Library šŸ“… 2018 šŸ› CreateSpace Independent Publishing Platform 🌐 English

MACHINE LEARNING - PYTHON. Buy the Paperback version of this book, and get the Kindle eBook version included for FREE! Do You Want to Become An Expert Of Machine Learning?? Start Getting this Book and Follow My Step by Step Explanat