Computational and Statistical Methods for Analysing Big Data with Applications
Scribed by Shen Liu, James McGree, Zongyuan Ge, Yang Xie
- Publisher
- Academic Press; Elsevier
- Year
- 2016
- Tongue
- English
- Leaves
- 195
- Edition
- 1
- Category
- Library
Synopsis
Due to the scale and complexity of data sets currently being collected in areas such as health, transportation, environmental science, engineering, information technology, business and finance, modern quantitative analysts are seeking improved and appropriate computational and statistical methods to explore, model and draw inferences from big data. This book aims to introduce suitable approaches for such endeavours, providing applications and case studies for the purpose of demonstration.
Computational and Statistical Methods for Analysing Big Data with Applications starts with an overview of the era of big data. It then goes on to explain the computational and statistical methods that have been commonly applied in the big data revolution. For each of these methods, an example is provided as a guide to its application. Five case studies are presented next, focusing on computer vision with massive training data, spatial data analysis, advanced experimental design methods for big data, big data in clinical medicine, and analysing data collected from mobile devices, respectively. The book concludes with some final thoughts and suggested areas for future research in big data.
- Advanced computational and statistical methodologies for analysing big data are developed
- Experimental design methodologies are described and implemented to make the analysis of big data more computationally tractable
- Case studies are discussed to demonstrate the implementation of the developed methods
- Five high-impact areas of application are studied: computer vision, geosciences, commerce, healthcare and transportation
- Computing code/programs are provided where appropriate
Table of Contents
Front Cover
Computational and Statistical Methods for Analysing Big Data with Applications
Copyright Page
Contents
List of Figures
List of Tables
Acknowledgment
1 Introduction
1.1 What is big data?
1.1.1 Volume
1.1.2 Velocity
1.1.3 Variety
1.1.4 Another two V's
1.2 What is this book about?
1.3 Who is the intended readership?
References
2 Classification methods
2.1 Fundamentals of classification
2.1.1 Features and training samples
Example: Discriminating owners from non-owners of riding mowers
2.1.2 Probabilities of misclassification and the associated costs
2.1.3 Classification by minimizing the ECM
Example: Medical diagnosis
2.1.4 More than two classes
2.2 Popular classifiers for analysing big data
2.2.1 k-Nearest neighbour algorithm
2.2.2 Regression models
2.2.3 Bayesian networks
2.2.4 Artificial neural networks
2.2.5 Decision trees
2.3 Summary
References
3 Finding groups in data
3.1 Principal component analysis
3.2 Factor analysis
3.3 Cluster analysis
3.3.1 Hierarchical clustering procedures
3.3.2 Nonhierarchical clustering procedures
3.3.3 Deciding on the number of clusters
3.4 Fuzzy clustering
Appendix
R code for principal component analysis and factor analysis
MATLAB code for cluster analysis
References
4 Computer vision in big data applications
4.1 Big datasets for computer vision
4.2 Machine learning in computer vision
4.2.1 Feature engineering
4.2.2 Classifiers
Regression
Support vector machine
Gaussian mixture models
4.3 State-of-the-art methodology: deep learning
4.3.1 A single-neuron model
4.3.2 A multilayer neural network
4.3.3 Training process of multilayer neural networks
Feed-forward pass
Back-propagation pass
4.4 Convolutional neural networks
4.4.1 Pooling
4.4.2 Training a CNN
4.4.3 An example of CNN in image recognition
Overall structure of the network
Data preprocessing
Prevention of overfitting
4.5 A tutorial: training a CNN by ImageNet
4.5.1 Caffe
4.5.2 Architecture of the network
Input layer
Convolutional layer
Pooling layer
LRN layer
Fully-connected layers
Dropout layers
Softmax layer
4.5.3 Training
4.6 Big data challenge: ILSVRC
4.6.1 Performance evaluation
4.6.2 Winners in the history of ILSVRC
4.7 Concluding remarks: a comparison between human brains and computers
Acknowledgements
References
5 A computational method for analysing large spatial datasets
5.1 Introduction to spatial statistics
5.1.1 Spatial dependence
5.1.2 Cross-variable dependence
5.1.3 Limitations of conventional approaches to spatial analysis
5.2 The HOS method
5.2.1 Cross-variable high-order statistics
5.2.2 Searching process
5.2.3 Local CPDF approximation
5.3 MATLAB functions for the implementation of the HOS method
5.3.1 Spatial template and searching process
5.3.2 Higher-order statistics
5.3.3 Coefficients of Legendre polynomials
5.3.4 CPDF approximation
5.4 A case study
References
SIMILAR VOLUMES
This third edition expands on the original material. Large portions of the text have been reviewed and clarified. More emphasis is devoted to machine learning including more modern concepts and examples. This book provides the reader with the main concepts and tools needed to perform statistical ana
This book is intended for use as the textbook in a second course in applied statistics that covers topics in multiple regression and analysis of variance at an intermediate level. Generally, students enrolled in such courses are primarily graduate majors or advanced undergraduate students from
This book has a collection of articles written by Big Data experts to describe some of the cutting-edge methods and applications from their respective areas of interest, and provides the reader with a detailed overview of the field of Big Data Analytics as it is practiced today. The chapters cove