The Art of Feature Engineering: Essentials for Machine Learning
Scribed by Pablo Duboue
- Publisher
- Cambridge University Press
- Year
- 2020
- Tongue
- English
- Leaves
- 287
- Edition
- 1
- Category
- Library
Synopsis
When machine learning engineers work with data sets, they may find the results aren't as good as they need. Instead of improving the model or collecting more data, they can use the feature engineering process to help improve results by modifying the data's features to better capture the nature of the problem. This practical guide to feature engineering is an essential addition to any data scientist's or machine learning engineer's toolbox, providing new ideas on how to improve the performance of a machine learning solution. Beginning with the basic concepts and techniques, the text builds up to a unique cross-domain approach that spans data on graphs, texts, time series, and images, with fully worked out case studies. Key topics include binning, out-of-fold estimation, feature selection, dimensionality reduction, and encoding variable-length data. The full source code for the case studies is available on a companion website as Python Jupyter notebooks.
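Two of the key topics named in the synopsis, binning and out-of-fold estimation, can be illustrated with a minimal sketch. This is not taken from the book or its companion notebooks; the function names `equal_width_bins` and `out_of_fold_target_mean`, the equal-width binning rule, and the round-robin fold assignment are all assumptions chosen for illustration.

```python
from statistics import mean


def equal_width_bins(values, n_bins):
    """Map each value to a bin index in [0, n_bins - 1] using equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls in a single bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def out_of_fold_target_mean(categories, targets, n_folds):
    """Encode each row's category by the mean target seen in the OTHER folds.

    Rows are assigned to folds round-robin (index modulo n_folds); this is a
    hypothetical choice, and a real pipeline would usually shuffle first.
    """
    n = len(categories)
    global_mean = mean(targets)
    encoded = [0.0] * n
    for fold in range(n_folds):
        sums, counts = {}, {}
        # Accumulate per-category statistics from rows outside the current fold.
        for i in range(n):
            if i % n_folds != fold:
                c = categories[i]
                sums[c] = sums.get(c, 0.0) + targets[i]
                counts[c] = counts.get(c, 0) + 1
        # Encode the held-out rows using only those statistics.
        for i in range(n):
            if i % n_folds == fold:
                c = categories[i]
                # Fall back to the global mean for categories unseen elsewhere.
                encoded[i] = sums[c] / counts[c] if c in counts else global_mean
    return encoded
```

The point of the out-of-fold variant is that a row's own target never contributes to its encoded feature, which reduces the risk of leaking the label into the feature.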
Table of Contents
Contents
Preface
PART ONE FUNDAMENTALS
1 Introduction
1.1 Feature Engineering
1.2 Evaluation
1.3 Cycles
1.4 Analysis
1.5 Other Processes
1.6 Discussion
1.7 Learning More
2 Features, Combined: Normalization, Discretization and Outliers
2.1 Normalizing Features
2.2 Discretization and Binning
2.3 Descriptive Features
2.4 Dealing with Outliers
2.5 Advanced Techniques
2.6 Learning More
3 Features, Expanded: Computable Features, Imputation and Kernels
3.1 Computable Features
3.2 Imputation
3.3 Decomposing Complex Features
3.4 Kernel-Induced Feature Expansion
3.5 Learning More
4 Features, Reduced: Feature Selection, Dimensionality Reduction and Embeddings
4.1 Feature Selection
4.2 Regularization and Embedded Feature Selection
4.3 Dimensionality Reduction
4.4 Learning More
5 Advanced Topics: Variable-Length Data and Automated Feature Engineering
5.1 Variable-Length Feature Vectors
5.2 Instance-Based Engineering
5.3 Deep Learning and Feature Engineering
5.4 Automated Feature Engineering
5.5 Learning More
PART TWO CASE STUDIES
6 Graph Data
6.1 WikiCities Dataset
6.2 Exploratory Data Analysis (EDA)
6.3 First Feature Set
6.4 Second Feature Set
6.5 Final Feature Sets
6.6 Learning More
7 Timestamped Data
7.1 WikiCities: Historical Features
7.2 Time Lagged Features
7.3 Sliding Windows
7.4 Third Featurization: EMA
7.5 Historical Data as Data Expansion
7.6 Time Series
7.7 Learning More
8 Textual Data
8.1 WikiCities: Text
8.2 Exploratory Data Analysis
8.3 Numeric Tokens Only
8.4 Bag-of-Words
8.5 Stop Words and Morphological Features
8.6 Features in Context
8.7 Skip Bigrams and Feature Hashing
8.8 Dimensionality Reduction and Embeddings
8.9 Closing Remarks
8.10 Learning More
9 Image Data
9.1 WikiCities: Satellite Images
9.2 Exploratory Data Analysis
9.3 Pixels as Features
9.4 Automatic Dataset Expansion
9.5 Descriptive Features: Histograms
9.6 Local Feature Detectors: Corners
9.7 Dimensionality Reduction: HOGs
9.8 Closing Remarks
9.9 Learning More
10 Other Domains: Video, GIS and Preferences
10.1 Video
10.2 Geographical Features
10.3 Preferences
Bibliography
Index
Subjects
Machine Learning; Deep Learning; Image Analysis; Python; Feature Engineering; Graph Data Model; Text Analysis; Dimensionality Reduction; Time Series Analysis
SIMILAR VOLUMES
"Feature engineering plays a vital role in big data analytics. Machine learning and data mining algorithms cannot work without data. Little can be achieved if there are few features to represent the underlying data objects, and the quality of results of those algorithms largely depends on the qualit
Feature engineering is a crucial step in the machine-learning pipeline, yet this topic is rarely examined on its own. With this practical book, you'll learn techniques for extracting and transforming features--the numeric representations of raw data--into formats for machine-learning models. Each chap
Intro; Copyright; Table of Contents; Preface; Introduction; Conventions Used in This Book; Using Code Examples; O'Reilly Safari; How to Contact Us; Acknowledgments; Special Thanks from Alice; Special Thanks from Amanda; Chapter 1. The Machine Learning Pipeline; Data; Tasks; Models; Features; Model E