Practical Machine Learning with R: Tutorials and Case Studies
Scribed by Carsten Lange
- Year: 2024
- Tongue: English
- Leaves: 369
- Category: Library
Synopsis
This textbook is a comprehensive guide to machine learning and artificial intelligence tailored for students in business and economics. It takes a hands-on approach to teach machine learning, emphasizing practical applications over complex mathematical concepts. Students are not required to have advanced mathematics knowledge such as matrix algebra or calculus.
The author introduces machine learning algorithms, utilizing the widely used R language for statistical analysis. Each chapter includes examples, case studies, and interactive tutorials to enhance understanding. No prior programming knowledge is needed. The book leverages the tidymodels package, an extension of R, to streamline data processing and model workflows. This package simplifies commands, making the logic of algorithms more accessible by minimizing programming syntax hurdles. The use of tidymodels ensures a unified experience across various machine learning models.
With interactive tutorials that students can download and follow at their own pace, the book provides a practical approach to applying machine learning algorithms to real-world scenarios.
In addition to the interactive tutorials, each chapter includes a Digital Resources section, offering links to articles, videos, data, and sample R code scripts. A companion website further enriches the learning and teaching experience: https://ai.lange-analytics.com.
This book is not just a textbook; it is a dynamic learning experience that empowers students and instructors alike with a practical and accessible approach to machine learning in business and economics.
Key Features
- Unlocks machine learning basics without advanced mathematics: no calculus or matrix algebra required.
- Demonstrates each concept with R code and real-world data for a deep understanding; no prior programming knowledge is needed.
- Bridges the gap between theory and real-world applications with hands-on interactive projects and tutorials in every chapter, guided with hints and solutions.
- Encourages continuous learning with chapter-specific online resources: video tutorials, R scripts, blog posts, and an online community.
- Supports instructors through a companion website that includes customizable materials such as slides and syllabi to fit their specific course needs.
Table of Contents
Cover
Half Title
Title Page
Copyright Page
Dedication
Contents
List of Figures
List of Tables
1. Introduction
1.1. How the Book is Organized
1.2. Using tidymodels for Data Processing and Model Workflows
1.3. Interactive Sections and Digital Resources
1.4. Companion Website
1.5. Acknowledgments
2. Basics of Machine Learning
2.1. Learning Outcomes
2.2. Machine Learning, Artificial Intelligence, and Deep Learning
2.3. Machine Learning Tasks
2.4. Machine Learning Terminology
2.5. Digital Resources
3. Introduction to R and RStudio
3.1. Learning Outcomes
3.2. Install and Set Up R and RStudio
3.3. RStudio the Integrated Development Environment (IDE) for R
3.3.1. The Window Layout in RStudio
3.3.2. RStudio Configuration
3.4. R Packages
3.5. How R Stores Data
3.5.1. R Data Types
3.5.2. R Object Types
3.6. Naming Rules for R Objects
3.7. How R Displays Very Big and Very Small Numbers
3.8. The Structure of R Commands
3.9. Data Wrangling with the tidyverse Package
3.9.1. Select Data
3.9.2. Filter Data
3.9.3. Mutate (Calculate) New Variables in a Data Frame
3.9.4. Linking R Commands Together with Piping
3.10. A Project Using the tidyverse Package
3.10.1. Introduction: Was Chivalry Dead on the Titanic?
3.10.2. Analyzing Titanic Survival Rates for Women and Men
3.11. Exercises: Working with R
3.12. Digital Resources
4. k-Nearest Neighbors: Getting Started
4.1. Learning Outcomes
4.2. R Packages Required for the Chapter
4.3. Preparing the Wine Dataset
4.4. Visualizing the Training Data
4.5. The Idea Behind k-Nearest Neighbors
4.5.1. k-Nearest Neighbors for k=1
4.5.2. k-Nearest Neighbors for k>1
4.6. Scaling Predictor Variables
4.7. Using Tidymodels for k-Nearest Neighbors
4.7.1. The tidymodels Package
4.7.2. Loading and Splitting the Data
4.7.3. Recipe for Data Pre-Processing
4.7.4. Creating a Model-Design
4.7.5. Creating and Training a Workflow
4.7.6. Predicting with a Fitted Workflow Model
4.7.7. Assessing the Predictive Quality with Metrics
4.8. Interpreting a Confusion Matrix
4.9. Project: Predicting Wine Color with Several Chemical Properties
4.10. Project: Recognize Handwritten Numbers
4.10.1. How Images Are Stored
4.10.2. Build Recipe, Model-Design, and Workflow
4.11. When and When Not to Use kNN Models
4.12. Digital Resources
5. Linear Regression: Key Machine Learning Concepts
5.1. Learning Outcomes
5.2. R Packages Required for the Chapter
5.3. The Basic Idea Behind Linear Regression
5.4. Univariate Mockup: Study Time and Grades
5.4.1. Predictions and Errors
5.4.2. Calculate Optimal Parameter Values Based on OLS
5.4.3. Trial-and-Error to Find Optimal Parameters
5.5. Project: Predict House Prices with Multivariate Regression
5.6. When and When Not to Use Linear Regression
5.7. Digital Resources
6. Polynomial Regression: Overfitting/Tuning Explained
6.1. Learning Outcomes
6.2. R Packages Required for the Chapter
6.3. The Problem of Overfitting
6.4. Demonstrating Overfitting with a Polynomial Model
6.5. Tuning the Complexity of a Polynomial Model
6.5.1. Hyper-Parameters vs. Model Parameters
6.5.2. Creating the Tuning Workflow
6.5.3. Validating the Tuning Results
6.5.4. Executing the Tuning Process
6.6. 10-Step Template to Tune with tidymodels
6.7. Project: Tuning a k-Nearest Neighbors Model
6.8. When and When Not to Use Polynomial Regression
6.9. When and When Not to Use Tuning
6.10. Digital Resources
7. Ridge, Lasso, and Elastic-Net: Regularization Explained
7.1. Learning Outcomes
7.2. R Packages Required for the Chapter
7.3. Unregularized Benchmark Model
7.4. The Idea Behind Regularization
7.4.1. Lasso Regularization
7.4.2. Ridge Regularization
7.4.3. Elastic-Net: Combining Lasso and Ridge
7.5. Project: Predicting House Prices with Elastic-Net
7.5.1. The Data
7.5.2. Unregularized Benchmark Model
7.5.3. Regularized Elastic-Net Polynomial Model
7.5.4. Tuning the Elastic-Net Polynomial Model
7.6. When and When Not to Use Ridge and Lasso Models
7.7. Digital Resources
8. Logistic Regression: Handling Imbalanced Data
8.1. Learning Outcomes
8.2. R Packages Required for the Chapter
8.3. The Idea Behind Logistic Regression
8.4. Analyzing Churn with Logistic Regression
8.5. Balancing Data with Downsampling, Upsampling, and SMOTE
8.6. Repeating the Churn Analysis with Balanced Data
8.7. When and When Not to Use Logistic Regression
8.8. Digital Resources
9. Deep Learning: MLP Neural Networks Explained
9.1. Learning Outcomes
9.2. R Packages Required for the Chapter
9.3. Data
9.4. The Idea Behind Neural Network Models
9.4.1. Graphical Representation of a Neural Network Explained
9.4.2. Transforming the Graphic Approach Into a Prediction Equation
9.4.3. Numerical Example: Predicting Prices for Diamonds with a Neural Network
9.4.4. How the Optimizer Improves the Parameters in a Neural Network
9.5. Build a Simplified Neural Network Model
9.6. NNet vs. PyTorch (brulee)
9.6.1. Hyper-Parameters
9.6.2. ReLU Activation Functions
9.7. Using PyTorch to Predict Diamond Prices
9.8. When and When Not to Use Neural Networks
9.9. Digital Resources
10. Tree-Based Models: Bootstrapping Explained
10.1. Learning Outcomes
10.2. R Packages Required for the Chapter
10.3. Decision Trees
10.3.1. The Idea Behind a Decision Tree
10.3.2. The Instability of Decision Trees
10.3.3. Project: Test the Instability of Decision Trees
10.4. Random Forest
10.4.1. The Idea Behind Random Forest
10.4.2. Predicting Vaccination Behavior with Random Forest
10.5. Boosting Trees Algorithms
10.5.1. The Idea Behind Gradient Boosting
10.5.2. Using XGBoost to Predict Vaccination Rates
10.6. When and When Not to Use Tree-Based Models
10.7. Digital Resources
10.7.1. Decision Trees
10.7.2. Random Forest
10.7.3. Boosting Trees Algorithms
11. Interpreting Machine Learning Results
11.1. Learning Outcomes
11.2. R Packages Required for the Chapter
11.3. Categorizing Interpretation Methods
11.4. Data, Model Design, and Workflow-Model
11.5. Visualizing the Impact of Changing Predictor Variables
11.5.1. Ceteris Paribus Plots
11.5.2. Partial Dependence Plot
11.6. Variable Importance for Tree-Based Models
11.6.1. Impurity-Based Variable Importance Plot
11.6.2. Permutation-Based Variable Importance Plot
11.7. SHAP Contribution of Predictor Variables
11.7.1. SHAPLEY Values
11.7.2. From SHAPLEY to SHAP Values
11.7.3. Project: Apply the SHAP Algorithms in a Project
11.8. Local Interpretable Model-agnostic Explanations (LIME)
11.9. When and When Not to Use Interpretation
11.10. Digital Resources
12. Concluding Remarks
Bibliography
Index
SIMILAR VOLUMES
Examine the latest technological advancements in building a scalable machine-learning model with big data using R. This second edition shows you how to work with a machine-learning algorithm and use it to build a ML model from raw data. You will see how to use R programming with TensorFlow
This book implements many common Machine Learning algorithms in equivalent R and Python. The book touches on R and Python implementations of different regression models, classification algorithms including logistic regression, KNN classification, SVMs, b-splines, random forest, boosting etc. Other t
This book presents machine learning as a set of pre-requisites, co-requisites, and post-requisites, focusing on mathematical concepts and engineering applications in advanced welding and cutting processes. It describes a number of advanced welding and cutting processes and then assesses
"The versatile capabilities and large set of add-on packages make R an excellent alternative to many existing and often expensive data mining tools. Exploring this area from the perspective of a practitioner, Data mining with R: learning with case studies uses practical examples to illustrate the po