𝔖 Scriptorium
✦   LIBER   ✦

Feature Engineering and Selection: A Practical Approach for Predictive Models

✍ Scribed by Max Kuhn, Kjell Johnson


Publisher
CRC Press
Year
2021
Tongue
English
Leaves
314
Series
Chapman & Hall/CRC Data Science
Edition
1
Category
Library

✦ Synopsis


The process of developing predictive models includes many stages. Most resources focus on the modeling algorithms but neglect other critical aspects of the modeling process. This book describes techniques for finding the best representations of predictors for modeling and for finding the best subset of predictors for improving model performance. A variety of example data sets are used to illustrate the techniques along with R programs for reproducing the results.

✦ Table of Contents


  1. Introduction
    A Simple Example
    Important Concepts
    A More Complex Example
    Feature Selection
    An Outline of the Book
    Computing

  2. Illustrative Example: Predicting Risk of Ischemic Stroke
    Splitting
    Preprocessing
    Exploration
    Predictive Modeling Across Sets
    Other Considerations
    Computing

  3. A Review of the Predictive Modeling Process
    Illustrative Example: OkCupid Profile Data
    Measuring Performance
    Data Splitting
    Resampling
    Tuning Parameters and Overfitting
    Model Optimization and Tuning
    Comparing Models Using the Training Set
    Feature Engineering Without Overfitting
    Summary
    Computing

  4. Exploratory Visualizations
    Introduction to the Chicago Train Ridership Data
    Visualizations for Numeric Data: Exploring Train Ridership Data
    Visualizations for Categorical Data: Exploring the OkCupid Data
    Post Modeling Exploratory Visualizations
    Summary
    Computing

  5. Encoding Categorical Predictors
    Creating Dummy Variables for Unordered Categories
    Encoding Predictors with Many Categories
    Approaches for Novel Categories
    Supervised Encoding Methods
    Encodings for Ordered Data
    Creating Features from Text Data
    Factors versus Dummy Variables in Tree-Based Models
    Summary
    Computing

  6. Engineering Numeric Predictors
    1:1 Transformations
    1:Many Transformations
    Many:Many Transformations
    Summary
    Computing

  7. Detecting Interaction Effects
    Guiding Principles in the Search for Interactions
    Practical Considerations
    The Brute-Force Approach to Identifying Predictive Interactions
    Approaches when Complete Enumeration is Practically Impossible
    Other Potentially Useful Tools
    Summary
    Computing

  8. Handling Missing Data
    Understanding the Nature and Severity of Missing Information
    Models that are Resistant to Missing Values
    Deletion of Data
    Encoding Missingness
    Imputation Methods
    Special Cases
    Summary
    Computing

  9. Working with Profile Data
    Illustrative Data: Pharmaceutical Manufacturing Monitoring
    What are the Experimental Unit and the Unit of Prediction?
    Reducing Background
    Reducing Other Noise
    Exploiting Correlation
    Impacts of Data Processing on Modeling
    Summary
    Computing

  10. Feature Selection Overview
    Goals of Feature Selection
    Classes of Feature Selection Methodologies
    Effect of Irrelevant Features
    Overfitting to Predictors and External Validation
    A Case Study
    Next Steps
    Computing

  11. Greedy Search Methods
    Illustrative Data: Predicting Parkinson’s Disease
    Simple Filters
    Recursive Feature Elimination
    Stepwise Selection
    Summary
    Computing

  12. Global Search Methods
    Naive Bayes Models
    Simulated Annealing
    Genetic Algorithms
    Test Set Results
    Summary
    Computing

✦ Subjects


Predictive Models; Data Visualization; Feature Engineering; Categorical Variables; R; Greedy Algorithms; Search Algorithms; Feature Selection


📜 SIMILAR VOLUMES


Feature Engineering and Selection: A Pra
✍ Max Kuhn, Kjell Johnson 📂 Library 📅 2019 🏛 Chapman and Hall/CRC 🌐 English

The process of developing predictive models includes many stages. Most resources focus on the modeling algorithms but neglect other critical aspects of the modeling process. This book describes techniques for finding the best representations of predictors for modeling and for finding the best subse

Feature Engineering & Selection for Expl
✍ Md Azimul Haque 📂 Library 📅 2023 🏛 Leanpub 🌐 English

I found the root cause of many challenges faced by my students who recently transitioned into data science and machine learning. I have tried to address these issues in my book and would like to dedicate this book to all my students for all the love and respect I have received.

Model Selection and Multimodel Inference
✍ Kenneth P. Burnham, David R. Anderson 📂 Library 📅 2002 🏛 Springer 🌐 English

A unique and comprehensive text on the philosophy of model-based data analysis and strategy for the analysis of empirical data. The book introduces information theoretic approaches and focuses critical attention on a priori modeling and the selection of a good approximating model that best rep