𝔖 Scriptorium
✦   LIBER   ✦

Feature Engineering and Selection: A Practical Approach for Predictive Models (Chapman & Hall/CRC Data Science Series)

✍ Scribed by Max Kuhn, Kjell Johnson


Publisher
Chapman and Hall/CRC
Year
2019
Tongue
English
Leaves
308
Edition
1
Category
Library

✦ Synopsis


The process of developing predictive models includes many stages. Most resources focus on the modeling algorithms but neglect other critical aspects of the modeling process. This book describes techniques for finding the best representations of predictors for modeling and for finding the best subset of predictors for improving model performance. A variety of example data sets are used to illustrate the techniques, along with R programs for reproducing the results.

✦ Table of Contents


Cover
Half Title
Title Page
Copyright Page
Dedication
Table of Contents
Preface
Author Bios
1: Introduction
1.1 A Simple Example
1.2 Important Concepts
1.3 A More Complex Example
1.4 Feature Selection
1.5 An Outline of the Book
1.6 Computing
2: Illustrative Example: Predicting Risk of Ischemic Stroke
2.1 Splitting
2.2 Preprocessing
2.3 Exploration
2.4 Predictive Modeling across Sets
2.5 Other Considerations
2.6 Computing
3: A Review of the Predictive Modeling Process
3.1 Illustrative Example: OkCupid Profile Data
3.2 Measuring Performance
3.3 Data Splitting
3.4 Resampling
3.5 Tuning Parameters and Overfitting
3.6 Model Optimization and Tuning
3.7 Comparing Models Using the Training Set
3.8 Feature Engineering without Overfitting
3.9 Summary
3.10 Computing
4: Exploratory Visualizations
4.1 Introduction to the Chicago Train Ridership Data
4.2 Visualizations for Numeric Data: Exploring Train Ridership Data
4.3 Visualizations for Categorical Data: Exploring the OkCupid Data
4.4 Postmodeling Exploratory Visualizations
4.5 Summary
4.6 Computing
5: Encoding Categorical Predictors
5.1 Creating Dummy Variables for Unordered Categories
5.2 Encoding Predictors with Many Categories
5.3 Approaches for Novel Categories
5.4 Supervised Encoding Methods
5.5 Encodings for Ordered Data
5.6 Creating Features from Text Data
5.7 Factors versus Dummy Variables in Tree-Based Models
5.8 Summary
5.9 Computing
6: Engineering Numeric Predictors
6.1 1:1 Transformations
6.2 1:Many Transformations
6.3 Many:Many Transformations
6.4 Summary
6.5 Computing
7: Detecting Interaction Effects
7.1 Guiding Principles in the Search for Interactions
7.2 Practical Considerations
7.3 The Brute-Force Approach to Identifying Predictive Interactions
7.4 Approaches when Complete Enumeration Is Practically Impossible
7.5 Other Potentially Useful Tools
7.6 Summary
7.7 Computing
8: Handling Missing Data
8.1 Understanding the Nature and Severity of Missing Information
8.2 Models that Are Resistant to Missing Values
8.3 Deletion of Data
8.4 Encoding Missingness
8.5 Imputation Methods
8.6 Special Cases
8.7 Summary
8.8 Computing
9: Working with Profile Data
9.1 Illustrative Data: Pharmaceutical Manufacturing Monitoring
9.2 What Are the Experimental Unit and the Unit of Prediction?
9.3 Reducing Background
9.4 Reducing Other Noise
9.5 Exploiting Correlation
9.6 Impacts of Data Processing on Modeling
9.7 Summary
9.8 Computing
10: Feature Selection Overview
10.1 Goals of Feature Selection
10.2 Classes of Feature Selection Methodologies
10.3 Effect of Irrelevant Features
10.4 Overfitting to Predictors and External Validation
10.5 A Case Study
10.6 Next Steps
10.7 Computing
11: Greedy Search Methods
11.1 Illustrative Data: Predicting Parkinson’s Disease
11.2 Simple Filters
11.3 Recursive Feature Elimination
11.4 Stepwise Selection
11.5 Summary
11.6 Computing
12: Global Search Methods
12.1 Naive Bayes Models
12.2 Simulated Annealing
12.3 Genetic Algorithms
12.4 Test Set Results
12.5 Summary
12.6 Computing
Bibliography
Index


📜 SIMILAR VOLUMES


Feature Engineering and Selection: A Pra
✍ Max Kuhn, Kjell Johnson 📂 Library 📅 2021 🏛 CRC Press 🌐 English

The process of developing predictive models includes many stages. Most resources focus on the modeling algorithms but neglect other critical aspects of the modeling process. This book describes techniques for finding the best representations of predictors for modeling and for finding the best subset o

Feature Engineering and Selection: A Pra
✍ Max Kuhn, Kjell Johnson 📂 Library 📅 2019 🏛 Chapman and Hall/CRC 🌐 English

The process of developing predictive models includes many stages. Most resources focus on the modeling algorithms but neglect other critical aspects of the modeling process. This book describes techniques for finding the best representations of predictors for modeling and for finding the best subse

Explanatory Model Analysis: Explore, Exp
✍ Przemyslaw Biecek, Tomasz Burzykowski 📂 Library 🏛 Chapman and Hall/CRC 🌐 English

Explanatory Model Analysis: Explore, Explain and Examine Predictive Models is a set of methods and tools designed to build better predictive models and to monitor their behaviour in a changing environment. Today, the true bottleneck in predictive modell

Data Science for Sensory and Consumer Sc
✍ Thierry Worch, Julien Delarue, Vanessa Rios De Souza, John Ennis 📂 Library 📅 2023 🏛 Chapman and Hall/CRC 🌐 English

Data Science for Sensory and Consumer Scientists is a comprehensive textbook that provides a practical guide to using data science in the field of sensory and consumer science through real-world applications. It covers key topics including data manipulation, preparation, visual

Design and Modeling for Computer Experim
✍ Kai-Tai Fang, Runze Li, Agus Sudjianto 📂 Library 📅 2005 🏛 Chapman and Hall CRC 🌐 English

Computer simulations based on mathematical models have become ubiquitous across the engineering disciplines and throughout the physical sciences. Successful use of a simulation model, however, requires careful interrogation of the model through systematic computer experiments. While specific theoret