An Introduction to IoT Analytics (Chapman & Hall/CRC Data Science Series)

✍ By Harry G. Perros

Publisher: Chapman and Hall/CRC
Year: 2021
Language: English
Pages: 373
Edition: 1
Category: Library

✦ Synopsis

This book covers techniques that can be used to analyze data from IoT sensors and addresses questions regarding the performance of an IoT system. It strikes a balance between practice and theory, so readers can learn to apply these tools with a good understanding of their inner workings. This is an introductory book for readers who have no familiarity with these techniques.

The techniques presented in An Introduction to IoT Analytics come from the areas of machine learning, statistics, and operations research. The machine learning techniques described can be used to analyze data generated by IoT sensors for clustering, classification, and regression. The statistical techniques described can be used to carry out regression and forecasting of IoT sensor data and dimensionality reduction of data sets. The operations research techniques address the performance of an IoT system by constructing a model of the system under study and then carrying out a what-if analysis. The book also describes simulation techniques.

Key Features

IoT analytics is not just machine learning but also involves other tools, such as forecasting and simulation techniques.

Many diagrams and examples are given throughout the book to fully explain the material presented.

Each chapter concludes with a project designed to help readers better understand the techniques described.

The material in this book has been class tested over several semesters.

Practice exercises are included with solutions provided online at www.routledge.com/9780367686314

Harry G. Perros is a Professor of Computer Science at North Carolina State University, an Alumni Distinguished Graduate Professor, and an IEEE Fellow. He has published extensively in the area of performance modeling of computer and communication systems.

✦ Table of Contents

Cover
Half Title
Series Page
Title Page
Copyright Page
Dedication
Table of Contents
Preface
Author
Chapter 1: Introduction
1.1 The Internet of Things (IoT)
1.2 IoT Application Domains
1.3 IoT Reference Model
1.4 Performance Evaluation and Modeling of IoT Systems
1.5 Machine Learning and Statistical Techniques for IoT
1.6 Overview of the Book
Exercises
References
Chapter 2: Review of Probability Theory
2.1 Random Variables
2.2 Discrete Random Variables
2.2.1 The Binomial Random Variable
2.2.2 The Geometric Random Variable
2.2.3 The Poisson Random Variable
2.2.4 The Cumulative Distribution
2.3 Continuous Random Variables
2.3.1 The Uniform Random Variable
2.3.2 The Exponential Random Variable
2.3.3 Mixtures of Exponential Random Variables
2.3.4 The Normal Random Variable
2.4 The Joint Probability Distribution
2.4.1 The Marginal Probability Distribution
2.4.2 The Conditional Probability
2.5 Expectation and Variance
2.5.1 The Expectation and Variance of Some Random Variables
Exercises
References
Chapter 3: Simulation Techniques
3.1 Introduction
3.2 The Discrete-event Simulation Technique
3.2.1 Recertification of IoT Devices: A Simple Model
3.2.2 Recertification of IoT Devices: A More Complex Model
3.3 Generating Random Numbers
3.3.1 Generating Pseudo-Random Numbers
3.3.2 Generating Random Variates
3.4 Simulation Designs
3.4.1 The Event List
3.4.2 Selecting the Unit Time
3.5 Estimation Techniques
3.5.1 Collecting Endogenously Created Data
3.5.2 Transient-State versus Steady-State Simulation
3.5.3 Estimation of the Confidence Interval of the Mean
3.5.4 Estimation of the Confidence Interval of a Percentile
3.5.5 Estimation of the Confidence Interval of a Probability
3.5.6 Achieving a Required Accuracy
3.6 Validation of a Simulation Model
3.7 Simulation Languages
Exercises
Simulation Project
References
Chapter 4: Hypothesis Testing
4.1 Statistical Hypothesis Testing for a Mean
4.1.1 The p-Value
4.1.2 Hypothesis Testing for the Difference between Two Population Means
4.1.3 Hypothesis Testing for a Proportion
4.1.4 Type I and Type II Errors
4.2 Analysis of Variance (ANOVA)
4.2.1 Degrees of Freedom
Exercises
References
Chapter 5: Multivariable Linear Regression
5.1 Simple Linear Regression
5.2 Multivariable Linear Regression
5.2.1 Significance of the Regression Coefficients
5.2.2 Residual Analysis
5.2.3 R-Squared
5.2.4 Multicollinearity
5.2.5 Data Transformations
5.3 An Example
5.4 Polynomial Regression
5.5 Confidence and Prediction Intervals
5.6 Ridge, Lasso, and Elastic Net Regression
5.6.1 Ridge Regression
5.6.2 Lasso Regression
5.6.3 Elastic Net Regression
Exercises
Regression Project
Data Set Generation
References
Chapter 6: Time Series Forecasting
6.1 A Stationary Time Series
6.1.1 How to Recognize Seasonality
6.1.2 Techniques for Removing Non-Stationary Features
6.2 Moving Average or Smoothing Models
6.2.1 The Simple Average Model
6.2.2 The Exponential Moving Average Model
6.2.3 The Average Age of a Model
6.2.4 Selecting the Best Value for k and a
6.3 The Moving Average MA(q) Model
6.3.1 Derivation of the Mean and Variance of X_t
6.3.2 Derivation of the Autocorrelation Function of the MA(1)
6.3.3 Invertibility of MA(q)
6.4 The Autoregressive Model
6.4.1 The AR(1) Model
6.4.2 Stationarity Condition of AR(p)
6.4.3 Derivation of the Coefficients a_i, i = 1, 2, …, p
6.4.4 Determination of the Order of AR(p)
6.5 The Non-Seasonal ARIMA(p,d,q) Model
6.5.1 Determination of the ARIMA Parameters
6.6 Decomposition Models
6.6.1 Basic Steps for the Decomposition Model
6.7 Forecast Accuracy
6.8 Prediction Intervals
6.9 Vector Autoregression
6.9.1 Fitting a VAR(p)
Exercises
Forecasting Project
Data Set
References
Chapter 7: Dimensionality Reduction
7.1 A Review of Eigenvalues and Eigenvectors
7.2 Principal Component Analysis (PCA)
7.2.1 The PCA Algorithm
7.3 Linear and Multiple Discriminant Analysis
7.3.1 Linear Discriminant Analysis (LDA)
7.3.2 Multiple Discriminant Analysis (MDA)
Exercises
References
Chapter 8: Clustering Techniques
8.1 Distance Metrics
8.2 Hierarchical Clustering
8.2.1 The Hierarchical Clustering Algorithm
8.2.2 Linkage Criteria
8.3 The k-Means Algorithm
8.3.1 The Algorithm
8.3.2 Determining the Number k of Clusters
a. Silhouette Scores
b. Akaike’s Information Criterion (AIC)
8.4 The Fuzzy c-Means Algorithm
8.5 The Gaussian Mixture Decomposition
8.6 The DBSCAN Algorithm
8.6.1 Determining MinPts and ε
8.6.2 Advantages and Disadvantages of DBSCAN
Exercises
Clustering Project
Data Set Generation
References
Chapter 9: Classification Techniques
9.1 The k-Nearest Neighbor (k-NN) Method
9.1.1 Selection of k
9.1.2 Using Kernels with the k-NN Method
9.1.3 Curse of Dimensionality
9.1.4 Voronoi Diagrams
9.1.5 Advantages and Disadvantages of the k-NN Method
9.2 The Naive Bayes Classifier
9.2.1 The Simple Bayes Classifier
9.2.2 The Naive Bayes Classifier
9.2.3 The Gaussian Naive Bayes Classifier
9.2.4 Advantages and Disadvantages
9.2.5 The k-NN Method Using Bayes’ Theorem
9.3 Decision Trees
9.3.1 Regression Trees
9.3.2 Classification Trees
9.3.3 Pre-Pruning and Post-Pruning
9.3.4 Advantages and Disadvantages of Decision Trees
9.3.5 Decision Trees Ensemble Methods
9.4 Logistic Regression
9.4.1 The Binary Logistic Regression
9.4.2 Multinomial Logistic Regression
9.4.3 Ordinal Logistic Regression
Exercises
Classification Project
References
Chapter 10: Artificial Neural Networks
10.1 The Feedforward Artificial Neural Network
10.2 Other Artificial Neural Networks
10.3 Activation Functions
10.4 Calculation of the Output Value
10.5 Selecting the Number of Layers and Nodes
10.6 The Backpropagation Algorithm
10.6.1 The Gradient Descent Algorithm
10.6.2 Calculation of the Gradients
10.7 Stochastic, Batch, Mini-Batch Gradient Descent Methods
10.8 Feature Normalization
10.9 Overfitting
10.9.1 The Early Stopping Method
10.9.2 Regularization
10.9.3 The Dropout Method
10.10 Selecting the Hyper-Parameters
10.10.1 Selecting the Learning Rate γ
10.10.2 Selecting the Regularization Parameter λ
Exercises
Neural Network Project
Data Set Generation
References
Chapter 11: Support Vector Machines
11.1 Some Basic Concepts
11.2 The SVM Algorithm: Linearly Separable Data
11.3 Soft-Margin SVM (C-SVM)
11.4 The SVM Algorithm: Non-Linearly Separable Data
11.5 Other SVM methods
11.6 Multiple Classes
11.7 Selecting the Best Values for C and γ
11.8 ε-Support Vector Regression (ε-SVR)
Exercises
SVM Project
Data Set Generation
References
Chapter 12: Hidden Markov Models
12.1 Markov Chains
12.2 Hidden Markov Models – An Example
12.3 The Three Basic HMM Problems
12.3.1 Problem 1 – The Evaluation Problem
12.3.2 Problem 2 – The Decoding Problem
12.3.3 Problem 3 – The Learning Problem
12.4 Mathematical Notation
12.5 Solution to Problem 1
12.5.1 A Brute Force Solution
12.5.2 The Forward–Backward Algorithm
12.6 Solution to Problem 2
12.6.1 The Heuristic Solution
12.6.2 The Viterbi Algorithm
12.7 Solution to Problem 3
12.8 Selection of the Number of States N
12.9 Forecasting O_{T+t}
12.10 Continuous Observation Probability Distributions
12.11 Autoregressive HMMs
Exercises
HMM Project
Data Set Generation
References
Appendix A: Some Basic Concepts of Queueing Theory
Appendix B: Maximum Likelihood Estimation (MLE)
B.1 The MLE Method
B.2 Relation of MLE to Bayesian Inference
B.3 MLE and the Least Squares Method
B.4 MLE of the Gaussian MA(1)
B.5 MLE of the Gaussian AR(1)
Index

📜 SIMILAR VOLUMES

📁 Big Data Analytics: A Guide to Data Science Practitioners Making the Transition to Big Data (Chapman & Hall/CRC Data Science Series)

✍ Ulrich Matter 📂 Library 📅 2023 🏛 Chapman and Hall/CRC 🌐 English

Successfully navigating the data-driven economy presupposes a certain understanding of the technologies and methods to gain insights from Big Data. This book aims to help data science practitioners to successfully manage the transition to Big Data. Building on familiar content from appl

📁 Microarray Image Analysis: An Algorithmic Approach (Chapman & Hall CRC Computer Science & Data Analysis)

✍ Karl Fraser, Zidong Wang, Xiaohu Liu 📂 Library 📅 2010 🏛 Chapman and Hall/CRC 🌐 English

To harness the high-throughput potential of DNA microarray technology, it is crucial that the analysis stages of the process are decoupled from the requirements of operator assistance. Microarray Image Analysis: An Algorithmic Approach presents an automatic system for microarray image processing to

📁 Big Data Analytics: A Guide to Data Science Practitioners Making the Transition to Big Data (Chapman & Hall/CRC Data Science)

✍ Ulrich Matter 📂 Library 📅 2023 🏛 CRC Press LLC 🌐 English

Successfully navigating the data-driven economy presupposes a certain understanding of the technologies and methods to gain insights from Big Data. This book aims to help data science practitioners to successfully manage the transition to Big Data. Building on familiar content from applied econometr

📁 Basketball Data Science: With Applications in R (Chapman & Hall/CRC Data Science Series)

✍ Paola Zuccolotto, Marica Manisera 📂 Library 📅 2020 🏛 Chapman and Hall/CRC 🌐 English

Using data from one season of NBA games, Basketball Data Science: With Applications in R is the perfect book for anyone interested in learning and applying data analytics in basketball. Whether assessing the spatial performance of an NBA player’s shots or doing an analysis of the

📁 Data Science for Sensory and Consumer Scientists (Chapman & Hall/CRC Data Science Series)

✍ Thierry Worch, Julien Delarue, Vanessa Rios De Souza, John Ennis 📂 Library 📅 2023 🏛 Chapman and Hall/CRC 🌐 English

Data Science for Sensory and Consumer Scientists is a comprehensive textbook that provides a practical guide to using data science in the field of sensory and consumer science through real-world applications. It covers key topics including data manipulation, preparation, visual

📁 An Introduction to Nonparametric Statistics (Chapman & Hall/CRC Texts in Statistical Science)

✍ John E. Kolassa 📂 Library 📅 2020 🏛 Chapman and Hall/CRC 🌐 English

An Introduction to Nonparametric Statistics presents techniques for statistical analysis in the absence of strong assumptions about the distributions generating the data. Rank-based and resampling techniques are heavily represented, but robust techniques are considered as well.