Key Features
- Master popular machine learning models including k-nearest neighbors, random forests, logistic regression, k-means, naive Bayes, and artificial neural networks
- Learn how to build and evaluate performance of efficient models using scikit-learn
- Practical guide to master your basi
Mastering machine learning with scikit-learn: apply effective learning algorithms to real-world problems using scikit-learn
Written by Hackeling, Gavin
- Publisher: Packt Publishing
- Year: 2014
- Language: English
- Pages: 263
- Series: Community experience distilled
- Category: Library
Synopsis
Apply effective learning algorithms to real-world problems using scikit-learn
About This Book
- Design and troubleshoot machine learning systems for common tasks including regression, classification, and clustering
- Acquaint yourself with popular machine learning algorithms, including decision trees, logistic regression, and support vector machines
- A practical example-based guide to help you gain expertise in implementing and evaluating machine learning systems using scikit-learn
Who This Book Is For
If you are a software developer who wants to learn how machine learning models work and how to apply them effectively, this book is for you. Familiarity with machine learning fundamentals and Python will be helpful, but is not essential.
What You Will Learn
- Review fundamental concepts including supervised and unsupervised experiences, common tasks, and performance metrics
- Predict the values of continuous variables using linear regression
- Create representations of documents and images that can be used in machine learning models
- Categorize documents and text messages using logistic regression and support vector machines
- Classify images by their subjects
- Discover hidden structures in data using clustering and visualize complex data using decomposition
- Evaluate the performance of machine learning systems in common tasks
- Diagnose and redress problems with models due to bias and variance
In Detail
This book examines machine learning models including logistic regression, decision trees, and support vector machines, and applies them to common problems such as categorizing documents and classifying images. It begins with the fundamentals of machine learning, introducing you to the supervised-unsupervised spectrum, the uses of training and test data, and evaluating models. You will learn how to use generalized linear models in regression problems, as well as solve problems with text and categorical features.
You will be acquainted with the use of logistic regression, regularization, and the various loss functions that are used by generalized linear models. The book will also walk you through an example project that prompts you to label the most uncertain training examples. You will also use an unsupervised Hidden Markov Model to predict stock prices.
By the end of the book, you will be an expert in scikit-learn and will be well versed in machine learning.
Table of Contents
Mastering Machine Learning with scikit-learn
Table of Contents
Credits
About the Author
About the Reviewers
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. The Fundamentals of Machine Learning
Learning from experience
Machine learning tasks
Training data and test data
Performance measures, bias, and variance
An introduction to scikit-learn
Installing scikit-learn on Windows
Verifying the installation
Installing pandas and matplotlib
Summary
Simple linear regression
Evaluating the fitness of a model with a cost function
Solving ordinary least squares for simple linear regression
Evaluating the model
Multiple linear regression
Polynomial regression
Regularization
Exploring the data
Fitting and evaluating the model
Fitting models with gradient descent
Summary
Extracting features from categorical variables
The bag-of-words representation
Stop-word filtering
Stemming and lemmatization
Extending bag-of-words with TF-IDF weights
Space-efficient feature vectorizing with the hashing trick
Extracting features from pixel intensities
Extracting points of interest as features
SIFT and SURF
Data standardization
Summary
Binary classification with logistic regression
Spam filtering
Binary classification performance metrics
Accuracy
Precision and recall
Calculating the F1 measure
ROC AUC
Tuning models with grid search
Multi-class classification
Multi-class classification performance metrics
Multi-label classification and problem transformation
Multi-label classification performance metrics
Summary
Decision trees
Training decision trees
Selecting the questions
Information gain
Gini impurity
Decision trees with scikit-learn
Tree ensembles
The advantages and disadvantages of decision trees
Summary
6. Clustering with K-Means
Clustering with the K-Means algorithm
Local optima
The elbow method
Evaluating clusters
Image quantization
Clustering to learn features
Summary
An overview of PCA
Variance, Covariance, and Covariance Matrices
Eigenvectors and eigenvalues
Dimensionality reduction with Principal Component Analysis
Using PCA to visualize high-dimensional data
Face recognition with PCA
Summary
8. The Perceptron
Activation functions
The perceptron learning algorithm
Binary classification with the perceptron
Document classification with the perceptron
Limitations of the perceptron
Summary
Kernels and the kernel trick
Maximum margin classification and support vectors
Classifying handwritten digits
Classifying characters in natural images
Summary
Nonlinear decision boundaries
Multilayer perceptrons
Forward propagation
Backpropagation
Approximating XOR with Multilayer perceptrons
Classifying handwritten digits
Summary
Index
Subjects
Computer Science;Technical
SIMILAR VOLUMES
This book examines machine learning models including logistic regression, decision trees, and support vector machines, and applies them to common problems such as categorizing documents and classifying images. It begins with the fundamentals of machine learning, introducing you to the supervised-uns
If you are a software developer who wants to learn how machine learning models work and how to apply them effectively, this book is for you. Familiarity with machine learning fundamentals and Python will be helpful, but is not essential.