<h4>Key Features</h4><ul><li>Get to know seven algorithms for your data science needs in this concise, insightful guide</li><li>Ensure you're confident in the basics by learning when and where to use various data science algorithms</li><li>Learn to use machine learning algorithms in a period of just
Data Science Algorithms in a Week: Top 7 algorithms for scientific computing, data analysis, and machine learning, 2nd Edition
Scribed by David Natingga
- Publisher
- Packt Publishing
- Year
- 2018
- Tongue
- English
- Leaves
- 207
- Category
- Library
✦ Synopsis
Build a strong foundation of machine learning algorithms in 7 days
Key Features
- Use Python and its wide array of machine learning libraries to build predictive models
- Learn the basics of the 7 most widely used machine learning algorithms within a week
- Know when and where to apply data science algorithms using this guide
Book Description
Machine learning applications are highly automated and self-modifying, and continue to improve over time with minimal human intervention, as they learn from the trained data. To address the complex nature of various real-world data problems, specialized machine learning algorithms have been developed. Through algorithmic and statistical analysis, these models can be leveraged to gain new knowledge from existing data as well.
Data Science Algorithms in a Week addresses all problems related to accurate and efficient data classification and prediction. Over the course of seven days, you will be introduced to seven algorithms, along with exercises that will help you understand different aspects of machine learning. You will see how to pre-cluster your data to optimize and classify it for large datasets. This book also guides you in predicting data based on existing trends in your dataset. This book covers algorithms such as k-nearest neighbors, Naive Bayes, decision trees, random forest, k-means, regression, and time-series analysis.
By the end of this book, you will understand how to choose machine learning algorithms for clustering, classification, and regression, and know which is best suited to your problem.
What you will learn
- Understand how to identify a data science problem correctly
- Implement well-known machine learning algorithms efficiently using Python
- Classify your datasets using Naive Bayes, decision trees, and random forest with accuracy
- Devise an appropriate prediction solution using regression
- Work with time series data to identify relevant data events and trends
- Cluster your data using the k-means algorithm
Who this book is for
This book is for aspiring data science professionals who are familiar with Python and have a little background in statistics. You'll also find this book useful if you're currently working with data science algorithms in some capacity and want to expand your skill set.
Table of Contents
- Classification using K Nearest Neighbors
- Naive Bayes
- Decision Trees
- Random Forests
- Clustering into K clusters
- Regression
- Time Series Analysis
- Python Reference
- Statistics
- Glossary of Algorithms and Methods in Data Science
✦ Table of Contents
Title Page
Copyright and Credits
Packt Upsell
Contributors
Table of Contents
Preface
Classification Using K-Nearest Neighbors
Mary and her temperature preferences
Implementation of the k-nearest neighbors algorithm
Map of Italy example – choosing the value of k
Analysis
House ownership – data rescaling
Analysis
Text classification – using non-Euclidean distances
Analysis
Text classification – k-NN in higher dimensions
Analysis
Summary
Problems
Mary and her temperature preference problems
Map of Italy – choosing the value of k
House ownership
Analysis
Naive Bayes
Medical tests – basic application of Bayes' theorem
Analysis
Bayes' theorem and its extension
Bayes' theorem
Proof
Extended Bayes' theorem
Proof
Playing chess – independent events
Analysis
Implementation of a Naive Bayes classifier
Playing chess – dependent events
Analysis
Gender classification – Bayes for continuous random variables
Analysis
Summary
Problems
Analysis
Decision Trees
Swim preference – representing data using a decision tree
Information theory
Information entropy
Coin flipping
Definition of information entropy
Information gain
Swim preference – information gain calculation
ID3 algorithm – decision tree construction
Swim preference – decision tree construction by the ID3 algorithm
Implementation
Classifying with a decision tree
Classifying a data sample with the swimming preference decision tree
Playing chess – analysis with a decision tree
Analysis
Classification
Going shopping – dealing with data inconsistencies
Analysis
Summary
Problems
Analysis
Random Forests
Introduction to the random forest algorithm
Overview of random forest construction
Swim preference – analysis involving a random forest
Analysis
Random forest construction
Construction of random decision tree number 0
Construction of random decision tree number 1
Constructed random forest
Classification using random forest
Implementation of the random forest algorithm
Playing chess example
Analysis
Random forest construction
Classification
Going shopping – overcoming data inconsistencies with randomness and measuring the level of confidence
Analysis
Summary
Problems
Analysis
Clustering into K Clusters
Household incomes – clustering into k clusters
K-means clustering algorithm
Picking the initial k-centroids
Computing a centroid of a given cluster
Using the k-means clustering algorithm on the household income example
Gender classification – clustering to classify
Analysis
Implementation of the k-means clustering algorithm
Input data from gender classification
Program output for gender classification data
House ownership – choosing the number of clusters
Analysis
Document clustering – understanding the number of k clusters in a semantic context
Analysis
Summary
Problems
Analysis
Regression
Fahrenheit and Celsius conversion – linear regression on perfect data
Analysis from first principles
Least squares method for linear regression
Analysis using the least squares method in Python
Visualization
Weight prediction from height – linear regression on real-world data
Analysis
Gradient descent algorithm and its implementation
Gradient descent algorithm
Implementation
Visualization – comparison of the least squares method and the gradient descent algorithm
Flight time duration prediction based on distance
Analysis
Ballistic flight analysis – non-linear model
Analysis
Analysis by using the least squares method in Python
Summary
Problems
Analysis
Time Series Analysis
Business profits – analyzing trends
Analysis
Analyzing trends using the least squares method in Python
Visualization
Conclusion
Electronics shop's sales – analyzing seasonality
Analysis
Analyzing trends using the least squares method in Python
Visualization
Analyzing seasonality
Conclusion
Summary
Problems
Analysis
Python Reference
Introduction
Python Hello World example
Comments
Data types
int
float
String
Tuple
List
Set
Dictionary
Flow control
Conditionals
For loop
For loop on range
For loop on list
Break and continue
Functions
Input and output
Program arguments
Reading and writing a file
Statistics
Basic concepts
Bayesian inference
Distributions
Normal distribution
Cross-validation
K-fold cross-validation
A/B testing
Glossary of Algorithms and Methods in Data Science
Other Books You May Enjoy
Index
SIMILAR VOLUMES
"Machine learning applications are highly automated and self-modifying, and they continue to improve over time with minimal human intervention as they learn with more data. To address the complex nature of various real-world data problems, specialized machine learning algorithms have been developed
Machine learning has gained tremendous popularity for its powerful and fast predictions with large datasets. However, the true forces behind its powerful output are the complex algorithms involving substantial statistical analysis that churn large datasets and generate substantial insight. This s
This book describes in detail the fundamental mathematics and algorithms of machine learning (an example of artificial intelligence) and signal processing, two of the most important and exciting technologies in the modern information economy. Taking a gradual approach, it builds up concepts in a sol
Enhance your data science programming and analysis with the Wolfram programming language and Mathematica, an applied mathematical tools suite. This second edition introduces the latest LLM Wolfram capabilities, delves into the exploration of data types in Mathematica, covers key programming concepts