Computational and Statistical Methods for Analysing Big Data with Applications

✍ Scribed by Shen Liu, James McGree, Zongyuan Ge, Yang Xie

Publisher: Academic Press; Elsevier
Year: 2016
Language: English
Pages: 195
Edition: 1
Category: Library

✦ Synopsis

Due to the scale and complexity of data sets currently being collected in areas such as health, transportation, environmental science, engineering, information technology, business and finance, modern quantitative analysts are seeking improved and appropriate computational and statistical methods to explore, model and draw inferences from big data. This book aims to introduce suitable approaches for such endeavours, providing applications and case studies for the purpose of demonstration.

Computational and Statistical Methods for Analysing Big Data with Applications starts with an overview of the era of big data. It then goes on to explain the computational and statistical methods that have been commonly applied in the big data revolution. For each of these methods, an example is provided as a guide to its application. Five case studies are presented next, focusing on computer vision with massive training data, spatial data analysis, advanced experimental design methods for big data, big data in clinical medicine, and analysing data collected from mobile devices, respectively. The book concludes with some final thoughts and suggested areas for future research in big data.

Advanced computational and statistical methodologies for analysing big data are developed
Experimental design methodologies are described and implemented to make the analysis of big data more computationally tractable
Case studies are discussed to demonstrate the implementation of the developed methods
Five high-impact areas of application are studied: computer vision, geosciences, commerce, healthcare and transportation
Computing code/programs are provided where appropriate

✦ Table of Contents

Front Cover
Computational and Statistical Methods for Analysing Big Data with Applications
Copyright Page
Contents
List of Figures
List of Tables
Acknowledgment
1 Introduction
1.1 What is big data?
1.1.1 Volume
1.1.2 Velocity
1.1.3 Variety
1.1.4 Another two V's
1.2 What is this book about?
1.3 Who is the intended readership?
References
2 Classification methods
2.1 Fundamentals of classification
2.1.1 Features and training samples
Example: Discriminating owners from non-owners of riding mowers
2.1.2 Probabilities of misclassification and the associated costs
2.1.3 Classification by minimizing the ECM
Example: Medical diagnosis
2.1.4 More than two classes
2.2 Popular classifiers for analysing big data
2.2.1 k-Nearest neighbour algorithm
2.2.2 Regression models
2.2.3 Bayesian networks
2.2.4 Artificial neural networks
2.2.5 Decision trees
2.3 Summary
References
3 Finding groups in data
3.1 Principal component analysis
3.2 Factor analysis
3.3 Cluster analysis
3.3.1 Hierarchical clustering procedures
3.3.2 Nonhierarchical clustering procedures
3.3.3 Deciding on the number of clusters
3.4 Fuzzy clustering
Appendix
R code for principal component analysis and factor analysis
MATLAB code for cluster analysis
References
4 Computer vision in big data applications
4.1 Big datasets for computer vision
4.2 Machine learning in computer vision
4.2.1 Feature engineering
4.2.2 Classifiers
Regression
Support vector machine
Gaussian mixture models
4.3 State-of-the-art methodology: deep learning
4.3.1 A single-neuron model
4.3.2 A multilayer neural network
4.3.3 Training process of multilayer neural networks
Feed-forward pass
Back-propagation pass
4.4 Convolutional neural networks
4.4.1 Pooling
4.4.2 Training a CNN
4.4.3 An example of CNN in image recognition
Overall structure of the network
Data preprocessing
Prevention of overfitting
4.5 A tutorial: training a CNN by ImageNet
4.5.1 Caffe
4.5.2 Architecture of the network
Input layer
Convolutional layer
Pooling layer
LRN layer
Fully-connected layers
Dropout layers
Softmax layer
4.5.3 Training
4.6 Big data challenge: ILSVRC
4.6.1 Performance evaluation
4.6.2 Winners in the history of ILSVRC
4.7 Concluding remarks: a comparison between human brains and computers
Acknowledgements
References
5 A computational method for analysing large spatial datasets
5.1 Introduction to spatial statistics
5.1.1 Spatial dependence
5.1.2 Cross-variable dependence
5.1.3 Limitations of conventional approaches to spatial analysis
5.2 The HOS method
5.2.1 Cross-variable high-order statistics
5.2.2 Searching process
5.2.3 Local CPDF approximation
5.3 MATLAB functions for the implementation of the HOS method
5.3.1 Spatial template and searching process
5.3.2 Higher-order statistics
5.3.3 Coefficients of Legendre polynomials
5.3.4 CPDF approximation
5.4 A case study
References

📜 SIMILAR VOLUMES

📁 Computational and Statistical Methods for Analysing Big Data with Applications

✍ Ge, Zongyuan; Liu, Shen; McGree, James; Xie, Yang 📂 Library 📅 2016 🏛 Academic Press 🌐 English

Due to the scale and complexity of data sets currently being collected in areas such as health, transportation, environmental science, engineering, information technology, business and finance, modern quantitative analysts are seeking improved and appropriate computational and statistical methods

📁 Statistical Methods for Data Analysis: With Applications in Particle Physics

✍ Luca Lista 📂 Library 📅 2023 🏛 Springer 🌐 English

This third edition expands on the original material. Large portions of the text have been reviewed and clarified. More emphasis is devoted to machine learning including more modern concepts and examples. This book provides the reader with the main concepts and tools needed to perform statistical ana

📁 SAS for Data Analysis: Intermediate Statistical Methods (Statistics and Computing)

✍ Mervyn G. Marasinghe, William J. Kennedy 📂 Library 📅 2008 🏛 Springer 🌐 English

This book is intended for use as the textbook in a second course in applied statistics that covers topics in multiple regression and analysis of variance at an intermediate level. Generally, students enrolled in such courses are primarily graduate majors or advanced undergraduate students from

📁 Computational intelligence in business analytics : concepts, methods, and tools for big data applications

✍ Sztandera, Les 📂 Library 📅 2014 🏛 Upper Saddle River, NJ : Pearson Education 🌐 English

1 online resource (1 v.)

📁 Big Data Analytics: Methods and Applications

✍ Saumyadipta Pyne, B.L.S. Prakasa Rao, S.B. Rao (eds.) 📂 Library 📅 2016 🏛 Springer India 🌐 English

This book has a collection of articles written by Big Data experts to describe some of the cutting-edge methods and applications from their respective areas of interest, and provides the reader with a detailed overview of the field of Big Data Analytics as it is practiced today. The chapters cove