Leverage machine and deep learning models to build applications on real-time data using PySpark. This book is perfect for those who want to learn to use this language to perform exploratory data analysis and solve an array of business challenges.
You'll start by reviewing PySpark fundamentals,
Learn PySpark: Build Python-based Machine Learning and Deep Learning Models
✍ Scribed by Pramod Singh
- Publisher
- Apress
- Year
- 2019
- Tongue
- English
- Leaves
- 214
- Category
- Library
✦ Table of Contents
Contents
Introduction
History
Data Collection
Data Processing
Spark Architecture
Resource Management
Structured Streaming
Programming Language APIs
Local Setup
Databricks
Conclusion
2 Data Processing
Creating Dataframes
Null Values
Subset of a Dataframe
Select
Filter
Aggregations
Collect
User-Defined Functions (UDFs)
Pandas UDF
Joins
Pivoting
Window Functions or Windowed Aggregates
Conclusion
Batch vs. Stream
Stream Processing
Spark Streaming
Structured Streaming
Data Input
Building a Structured App
Operations
Joins
Conclusion
Workflows
Undirected Graphs
Directed Graphs
DAG Overview
Operators
Airflow Using Docker
Creating Your First DAG
Step 2: Defining the Default Arguments
Step 4: Declaring Tasks
Step 5: Mentioning Dependencies
Conclusion
5 MLlib - Machine Learning Library
Calculating Correlations
Chi-Square Test
Binarizer
Principal Component Analysis
Normalizer
Standard Scaling
Min-Max Scaling
MaxAbsScaler
Binning
Step 1: Load the Dataset
Step 2: Explore the Dataframe
Step 3: Data Transformation
Step 5: Model Training
Step 6: Hyperparameter Tuning
Conclusion
Supervised Machine Learning Primer
Binary Classification
Building a Linear Regression Model
Step 2: Read the Dataset
Step 3: Feature Engineering
Step 4: Split the Dataset
Step 5: Build and Train Linear Regression Model
Step 1: Build and Train Generalized Linear Regression Model
Step 2: Evaluate the Model Performance on Test Data
Decision Tree Regression
Step 2: Evaluate the Model Performance on Test Data
Random Forest Regressors
Step 1: Build and Train Random Forest Regressor Model
Step 2: Evaluate the Model Performance on Test Data
Step 1: Build and Train a GBT Regressor Model
Step 2: Evaluate the Model Performance on Test Data
Logistic Regression
Step 1: Read the Dataset
Step 2: Feature Engineering for Model
Step 4: Build and Train the Logistic Regression Model
Step 5: Evaluate Performance on Training Data
Step 6: Evaluate Performance on Test Data
Step 1: Build and Train Decision Tree Classifier Model
Step 2: Evaluate Performance on Test Data
Support Vector Machines Classifiers
Step 2: Evaluate Performance on Test Data
Naive Bayes Classifier
Step 2: Evaluate Performance on Test Data
Step 1: Build and Train the GBT Model
Step 2: Evaluate Performance on Test Data
Step 1: Build and Train the Random Forest Model
Step 2: Evaluate Performance on Test Data
Hyperparameter Tuning and Cross-Validation
Conclusion
Unsupervised Machine Learning Primer
Importing SparkSession and Creating an Object
Reshaping a Dataframe for Clustering
Building Clusters with K-Means
Conclusion
Deep Learning Fundamentals
Human Brain Neuron vs. Artificial Neuron
Hyperbolic Tangent
Rectified Linear Unit
Neuron Computation
Training Process: Neural Network
Building a Multilayer Perceptron Model
Conclusion
Index
📜 SIMILAR VOLUMES
This book from the bestselling and widely acclaimed Python Machine Learning series is a comprehensive guide to machine and deep learning using PyTorch's simple-to-code framework. Purchase of the print or Kindle book includes a free eBook in PDF format.
Unlock modern machine learning and deep learning techniques with Python by using the latest cutting-edge open source Python libraries. About This Book: Second edition of the bestselling book on Machine Learning. A practical approach to key frameworks in data science, machine learning, and deep learnin