Statistical Methods for Machine Learning
Scribed by Brownlee J.
- Year: 2019
- Tongue: English
- Leaves: 291
- Category: Library
Table of Contents
Copyright
Contents
Preface
I Introduction
II Statistics
Introduction to Statistics
Statistics is Required Prerequisite
Why Learn Statistics?
What is Statistics?
Further Reading
Summary
Statistics vs Machine Learning
Machine Learning
Predictive Modeling
Statistical Learning
Two Cultures
Further Reading
Summary
Examples of Statistics in Machine Learning
Overview
Problem Framing
Data Understanding
Data Cleaning
Data Selection
Data Preparation
Model Evaluation
Model Configuration
Model Selection
Model Presentation
Model Predictions
Summary
III Foundation
Gaussian and Summary Stats
Tutorial Overview
Gaussian Distribution
Sample vs Population
Test Dataset
Central Tendency
Variance
Describing a Gaussian
Extensions
Further Reading
Summary
Simple Data Visualization
Tutorial Overview
Data Visualization
Introduction to Matplotlib
Line Plot
Bar Chart
Histogram Plot
Box and Whisker Plot
Scatter Plot
Extensions
Further Reading
Summary
Random Numbers
Tutorial Overview
Randomness in Machine Learning
Pseudorandom Number Generators
Random Numbers with Python
Random Numbers with NumPy
When to Seed the Random Number Generator
How to Control for Randomness
Common Questions
Extensions
Further Reading
Summary
Law of Large Numbers
Tutorial Overview
Law of Large Numbers
Worked Example
Implications in Machine Learning
Extensions
Further Reading
Summary
Central Limit Theorem
Tutorial Overview
Central Limit Theorem
Worked Example with Dice
Impact on Machine Learning
Extensions
Further Reading
Summary
IV Hypothesis Testing
Statistical Hypothesis Testing
Tutorial Overview
Statistical Hypothesis Testing
Statistical Test Interpretation
Errors in Statistical Tests
Degrees of Freedom in Statistics
Extensions
Further Reading
Summary
Statistical Distributions
Tutorial Overview
Distributions
Gaussian Distribution
Student's t-Distribution
Chi-Squared Distribution
Extensions
Further Reading
Summary
Critical Values
Tutorial Overview
Why Do We Need Critical Values?
What Is a Critical Value?
How to Use Critical Values
How to Calculate Critical Values
Extensions
Further Reading
Summary
Covariance and Correlation
Tutorial Overview
What is Correlation?
Test Dataset
Covariance
Pearson's Correlation
Extensions
Further Reading
Summary
Significance Tests
Tutorial Overview
Parametric Statistical Significance Tests
Test Data
Student's t-Test
Paired Student's t-Test
Analysis of Variance Test
Repeated Measures ANOVA Test
Extensions
Further Reading
Summary
Effect Size
Tutorial Overview
The Need to Report Effect Size
What Is Effect Size?
How to Calculate Effect Size
Extensions
Further Reading
Summary
Statistical Power
Tutorial Overview
Statistical Hypothesis Testing
What Is Statistical Power?
Power Analysis
Student's t-Test Power Analysis
Extensions
Further Reading
Summary
V Resampling Methods
Introduction to Resampling
Tutorial Overview
Statistical Sampling
Statistical Resampling
Extensions
Further Reading
Summary
Estimation with Bootstrap
Tutorial Overview
Bootstrap Method
Configuration of the Bootstrap
Worked Example
Bootstrap in Python
Extensions
Further Reading
Summary
Estimation with Cross-Validation
Tutorial Overview
k-Fold Cross-Validation
Configuration of k
Worked Example
Cross-Validation in Python
Variations on Cross-Validation
Extensions
Further Reading
Summary
VI Estimation Statistics
Introduction to Estimation Statistics
Tutorial Overview
Problems with Hypothesis Testing
Estimation Statistics
Effect Size
Interval Estimation
Meta-Analysis
Extensions
Further Reading
Summary
Tolerance Intervals
Tutorial Overview
Bounds on Data
What Are Statistical Tolerance Intervals?
How to Calculate Tolerance Intervals
Tolerance Interval for Gaussian Distribution
Extensions
Further Reading
Summary
Confidence Intervals
Tutorial Overview
What is a Confidence Interval?
Interval for Classification Accuracy
Nonparametric Confidence Interval
Extensions
Further Reading
Summary
Prediction Intervals
Tutorial Overview
Why Calculate a Prediction Interval?
What Is a Prediction Interval?
How to Calculate a Prediction Interval
Prediction Interval for Linear Regression
Worked Example
Extensions
Further Reading
Summary
VII Nonparametric Methods
Rank Data
Tutorial Overview
Parametric Data
Nonparametric Data
Ranking Data
Working with Ranked Data
Extensions
Further Reading
Summary
Normality Tests
Tutorial Overview
Normality Assumption
Test Dataset
Visual Normality Checks
Statistical Normality Tests
What Test Should You Use?
Extensions
Further Reading
Summary
Make Data Normal
Tutorial Overview
Gaussian and Gaussian-Like
Sample Size
Data Resolution
Extreme Values
Long Tails
Power Transforms
Use Anyway
Extensions
Further Reading
Summary
5-Number Summary
Tutorial Overview
Nonparametric Data Summarization
Five-Number Summary
How to Calculate the Five-Number Summary
Use of the Five-Number Summary
Extensions
Further Reading
Summary
Rank Correlation
Tutorial Overview
Rank Correlation
Test Dataset
Spearman's Rank Correlation
Kendall's Rank Correlation
Extensions
Further Reading
Summary
Rank Significance Tests
Tutorial Overview
Nonparametric Statistical Significance Tests
Test Dataset
Mann-Whitney U Test
Wilcoxon Signed-Rank Test
Kruskal-Wallis H Test
Friedman Test
Extensions
Further Reading
Independence Test
Tutorial Overview
Contingency Table
Pearson's Chi-Squared Test
Example Chi-Squared Test
Extensions
Further Reading
Summary
VIII Appendix
Getting Help
Statistics on Wikipedia
Statistics Textbooks
Python API Resources
Ask Questions About Statistics
How to Ask Questions
Contact the Author
How to Setup a Workstation for Python
Overview
Download Anaconda
Install Anaconda
Start and Update Anaconda
Further Reading
Summary
Basic Math Notation
Tutorial Overview
The Frustration with Math Notation
Arithmetic Notation
Greek Alphabet
Sequence Notation
Set Notation
Other Notation
Tips for Getting More Help
Further Reading
Summary
IX Conclusions
How Far You Have Come
SIMILAR VOLUMES
A practical guide that will help you understand the Statistical Foundations of any Machine Learning Problem. Key Features:
- Develop a Conceptual and Mathematical understanding of Statistics
- Get an overview of Statistical Applications in Python
- Learn how to perform
This book is open access under a CC BY 4.0 license. This open access book brings together the latest genome base prediction models currently being used by statisticians, breeders and data scientists. It provides an accessible way to understand the theory behind each stat