š”– Scriptorium
✦   LIBER   ✦


Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits: A practical guide to implementing supervised and unsupervised machine learning algorithms in Python

āœ Scribed by Tarek Amr


Publisher
Packt Publishing Ltd
Year
2020
Tongue
English
Leaves
368
Category
Library

⬇  Acquire This Volume

No coin nor oath required. For personal study only.

✦ Synopsis


Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise, and use it to solve real-world machine learning problems.

Key Features
- Delve into machine learning with this comprehensive guide to scikit-learn and scientific Python
- Master the art of data-driven problem-solving with hands-on examples
- Foster your theoretical and practical knowledge of supervised and unsupervised machine learning algorithms

Book Description
Machine learning is applied everywhere, from business to research and academia, while scikit-learn is a versatile library that is popular among machine learning practitioners. This book serves as a practical guide for anyone looking to provide hands-on machine learning solutions with scikit-learn and Python toolkits.

The book begins with an explanation of machine learning concepts and fundamentals, and strikes a balance between theoretical concepts and their applications. Each chapter covers a different set of algorithms and shows you how to use them to solve real-life problems. You’ll also learn about various key supervised and unsupervised machine learning algorithms using practical examples. Whether it is an instance-based learning algorithm, Bayesian estimation, a deep neural network, a tree-based ensemble, or a recommendation system, you’ll gain a thorough understanding of its theory and learn when to apply it.

As you advance, you’ll learn how to deal with unlabeled data and when to use different clustering and anomaly detection algorithms. By the end of this machine learning book, you’ll have learned how to take a data-driven approach to provide end-to-end machine learning solutions. You’ll also have discovered how to formulate the problem at hand, prepare the required data, and evaluate and deploy models in production.

What you will learn
- Understand when to use supervised, unsupervised, or reinforcement learning algorithms
- Find out how to collect and prepare your data for machine learning tasks
- Tackle imbalanced data and optimize your algorithm for the bias-variance tradeoff
- Apply supervised and unsupervised algorithms to overcome various machine learning challenges
- Employ best practices for tuning your algorithm’s hyperparameters
- Discover how to use neural networks for classification and regression
- Build, evaluate, and deploy your machine learning solutions to production

Who this book is for
This book is for data scientists, machine learning practitioners, and anyone who wants to learn how machine learning algorithms work and to build different machine learning models using the Python ecosystem. The book will help you take your knowledge of machine learning to the next level by grasping its ins and outs and tailoring it to your needs. Working knowledge of Python and a basic understanding of the underlying mathematical and statistical concepts are required.
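The split-train-evaluate workflow described above can be sketched in a few lines. This is a minimal illustration only, not code from the book, assuming scikit-learn is installed; it uses the Iris dataset and a decision tree, the combination the early chapters work through.

```python
# Minimal supervised-learning workflow: load data, hold out a test
# set, fit a decision tree, and evaluate on unseen samples.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load the Iris dataset as feature matrix X and label vector y
X, y = load_iris(return_X_y=True)

# Keep a test set separate so evaluation reflects unseen data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Fit a shallow decision tree and score it on the held-out samples
clf = DecisionTreeClassifier(max_depth=3, random_state=42)
clf.fit(X_train, y_train)
print(f"Test accuracy: {accuracy_score(y_test, clf.predict(X_test)):.2f}")
```

The same fit/predict/score pattern carries through the rest of the book's estimators, from linear models to ensembles.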

✦ Table of Contents


Cover
Title Page
Copyright and Credits
About Packt
Contributors
Table of Contents
Preface
Section 1: Supervised Learning
Chapter 1: Introduction to Machine Learning
Understanding machine learning
Types of machine learning algorithms
Supervised learning
Classification versus regression
Supervised learning evaluation
Unsupervised learning
Reinforcement learning
The model development life cycle
Understanding a problem
Splitting our data
Finding the best manner to split the data
Making sure the training and the test datasets are separate
Development set
Evaluating our model
Deploying in production and monitoring
Iterating
When to use machine learning
Introduction to scikit-learn
It plays well with the Python data ecosystem
Practical level of abstraction
When not to use scikit-learn
Installing the packages you need
Introduction to pandas
Python's scientific computing ecosystem conventions
Summary
Further reading
Chapter 2: Making Decisions with Trees
Understanding decision trees
What are decision trees?
Iris classification
Loading the Iris dataset
Splitting the data
Training the model and using it for prediction
Evaluating our predictions
Which features were more important?
Displaying the internal tree decisions
How do decision trees learn?
Splitting criteria
Preventing overfitting
Predictions
Getting a more reliable score
What to do now to get a more reliable score
ShuffleSplit
Tuning the hyperparameters for higher accuracy
Splitting the data
Trying different hyperparameter values
Comparing the accuracy scores
Visualizing the tree's decision boundaries
Feature engineering
Building decision tree regressors
Predicting people's heights
Regressor's evaluation
Setting sample weights
Summary
Chapter 3: Making Decisions with Linear Equations
Understanding linear models
Linear equations
Linear regression
Estimating the amount paid to the taxi driver
Predicting house prices in Boston
Data exploration
Splitting the data
Calculating a baseline
Training the linear regressor
Evaluating our model's accuracy
Showing feature coefficients
Scaling for more meaningful coefficients
Adding polynomial features
Fitting the linear regressor with the derived features
Regularizing the regressor
Training the lasso regressor
Finding the optimum regularization parameter
Finding regression intervals
Getting to know additional linear regressors
Using logistic regression for classification
Understanding the logistic function
Plugging the logistic function into a linear model
Objective function
Regularization
Solvers
Configuring the logistic regression classifier
Classifying the Iris dataset using logistic regression
Understanding the classifier's decision boundaries
Getting to know additional linear classifiers
Summary
Chapter 4: Preparing Your Data
Imputing missing values
Setting missing values to 0
Setting missing values to the mean
Using informed estimations for missing values
Encoding non-numerical columns
One-hot encoding
Ordinal encoding
Target encoding
Homogenizing the columns' scale
The standard scaler
The MinMax scaler
RobustScaler
Selecting the most useful features
VarianceThreshold
Filters
f-regression and f-classif
Mutual information
Comparing and using the different filters
Evaluating multiple features at a time
Summary
Chapter 5: Image Processing with Nearest Neighbors
Nearest neighbors
Loading and displaying images
Image classification
Using a confusion matrix to understand the model's mistakes
Picking a suitable metric
Setting the correct K
Hyperparameter tuning using GridSearchCV
Using custom distances
Using nearest neighbors for regression
More neighborhood algorithms
Radius neighbors
Nearest centroid classifier
Reducing the dimensions of our image data
Principal component analysis
Neighborhood component analysis
Comparing PCA to NCA
Picking the most informative components
Using the centroid classifier with PCA
Restoring the original image from its components
Finding the most informative pixels
Summary
Chapter 6: Classifying Text Using Naive Bayes
Splitting sentences into tokens
Tokenizing with string split
Tokenizing using regular expressions
Using placeholders before tokenizing
Vectorizing text into matrices
Vector space model
Bag of words
Different sentences, same representation
N-grams
Using characters instead of words
Capturing important words with TF-IDF
Representing meanings with word embedding
Word2Vec
Understanding Naive Bayes
The Bayes rule
Calculating the likelihood naively
Naive Bayes implementations
Additive smoothing
Classifying text using a Naive Bayes classifier
Downloading the data
Preparing the data
Precision, recall, and F1 score
Pipelines
Optimizing for different scores
Creating a custom transformer
Summary
Section 2: Advanced Supervised Learning
Chapter 7: Neural Networks – Here Comes Deep Learning
Getting to know MLP
Understanding the algorithm's architecture
Training the neural network
Configuring the solvers
Classifying items of clothing
Downloading the Fashion-MNIST dataset
Preparing the data for classification
Experiencing the effects of the hyperparameters
Learning not too quickly and not too slowly
Picking a suitable batch size
Checking whether more training samples are needed
Checking whether more epochs are needed
Choosing the optimum architecture and hyperparameters
Adding your own activation function
Untangling the convolutions
Extracting features by convolving
Reducing the dimensionality of the data via max pooling
Putting it all together
MLP regressors
Summary
Chapter 8: Ensembles – When One Model Is Not Enough
Answering the question: why ensembles?
Combining multiple estimators via averaging
Boosting multiple biased estimators
Downloading the UCI Automobile dataset
Dealing with missing values
Differentiating between numerical features and categorical ones
Splitting the data into training and test sets
Imputing the missing values and encoding the categorical features
Using random forest for regression
Checking the effect of the number of trees
Understanding the effect of each training feature
Using random forest for classification
The ROC curve
Using bagging regressors
Preparing a mixture of numerical and categorical features
Combining KNN estimators using a bagging meta-estimator
Using gradient boosting to predict automobile prices
Plotting the learning deviance
Comparing the learning rate settings
Using different sample sizes
Stopping earlier and adapting the learning rate
Regression ranges
Using AdaBoost ensembles
Exploring more ensembles
Voting ensembles
Stacking ensembles
Random tree embedding
Summary
Chapter 9: The Y is as Important as the X
Scaling your regression targets
Estimating multiple regression targets
Building a multi-output regressor
Chaining multiple regressors
Dealing with compound classification targets
Converting a multi-class problem into a set of binary classifiers
Estimating multiple classification targets
Calibrating a classifier's probabilities
Calculating the precision at k
Summary
Chapter 10: Imbalanced Learning – Not Even 1% Win the Lottery
Getting the click prediction dataset
Installing the imbalanced-learn library
Predicting the CTR
Weighting the training samples differently
The effect of the weighting on the ROC
Sampling the training data
Undersampling the majority class
Oversampling the minority class
Combining data sampling with ensembles
Equal opportunity score
Summary
Section 3: Unsupervised Learning and More
Chapter 11: Clustering – Making Sense of Unlabeled Data
Understanding clustering
K-means clustering
Creating a blob-shaped dataset
Visualizing our sample data
Clustering with K-means
The silhouette score
Choosing the initial centroids
Agglomerative clustering
Tracing the agglomerative clustering's children
The adjusted Rand index
Choosing the cluster linkage
DBSCAN
Summary
Chapter 12: Anomaly Detection – Finding Outliers in Data
Unlabeled anomaly detection
Generating sample data
Detecting anomalies using basic statistics
Using percentiles for multi-dimensional data
Detecting outliers using EllipticEnvelope
Outlier and novelty detection using LOF
Novelty detection using LOF
Detecting outliers using isolation forest
Summary
Chapter 13: Recommender System – Getting to Know Their Taste
The different recommendation paradigms
Downloading surprise and the dataset
Downloading the KDD Cup 2012 dataset
Processing and splitting the dataset
Creating a random recommender
Using KNN-inspired algorithms
Using baseline algorithms
Using singular value decomposition
Extracting latent information via SVD
Comparing the similarity measures for the two matrices
Click prediction using SVD
Deploying machine learning models in production
Summary
Other Books You May Enjoy
Index


📜 SIMILAR VOLUMES



Hands-on Supervised Learning with Python
āœ Gnana Lakshmi T C, Madeleine Shang šŸ“‚ Library šŸ“… 2020 šŸ› BPB Publications 🌐 English

Hands-On ML problem solving and creating solutions using Python. Key Features: Introduction to Python Programming; Python for Machine Learning; Introduction to Machine Learning; Introduction to Predictive Modelling, Supervised and Unsupervised A

Supervised machine learning with Python:
āœ Smith, Taylor šŸ“‚ Library šŸ“… 2019 šŸ› Packt Publishing 🌐 English

Teach your machine to think for itself! Key Features: Delve into supervised learning and grasp how a machine learns from data; Implement popular machine learning algorithms from scratch, developing a deep understanding along the way; Explore some of the most popular scien

Hands-On Unsupervised Learning with Pyth
āœ Bonaccorso, Giuseppe šŸ“‚ Library šŸ“… 2019 šŸ› Packt 🌐 English

Unsupervised learning is an increasingly important branch of data science, the goal of which is to train models that can learn the structure of a dataset and provide the user with helpful pieces of information about new samples. In many different business sectors (such as marketing, business intelli

Machine Learning: Step-by-Step Guide To
āœ Rudolph Russell šŸ“‚ Library šŸ“… 2018 šŸ› CreateSpace Independent Publishing Platform 🌐 English

MACHINE LEARNING - PYTHON. Buy the Paperback version of this book, and get the Kindle eBook version included for FREE! Do You Want to Become An Expert Of Machine Learning?? Start Getting this Book and Follow My Step by Step Explanat