Statistics and Data Science: Research School on Statistics and Data Science, RSSDS 2019, Melbourne, VIC, Australia, July 24–26, 2019, Proceedings (Communications in Computer and Information Science)

✍ Scribed by Hien Nguyen (editor)

Publisher: Springer
Year: 2020
Tongue: English
Leaves: 271
Edition: 1st ed. 2019
Category: Library

✦ Synopsis

This book constitutes the proceedings of the Research School on Statistics and Data Science, RSSDS 2019, held in Melbourne, VIC, Australia, in July 2019.

The 11 papers presented in this book were carefully reviewed and selected from 23 submissions. The volume also contains 7 invited talks. The workshop brought together academics, researchers, and industry practitioners of statistics and data science, to discuss numerous advances in the disciplines and their impact on the sciences and society. The topics covered are data analysis, data science, data mining, data visualization, bioinformatics, machine learning, neural networks, statistics, and probability.

✦ Table of Contents

Preface
Organization
Contents
Invited Papers
Symbolic Formulae for Linear Mixed Models
1 Introduction
2 Symbolic Formulae for Linear Models
2.1 Trees Volume: Linear Model
2.2 Herbicide: Categorical Variable
2.3 Specification of Intercept
3 Linear Mixed Models
3.1 lme4
3.2 asreml
4 Motivating Examples for LMMs
4.1 Chicken Weight: Longitudinal Analysis
4.2 Field Trial: Covariance Structure
4.3 Multi-environmental Trial: Separable Structure
5 Discussion
References
code::proof: Prepare for Most Weather Conditions
1 The Kafkaesque Dystopia of DevOps
2 Toolchain Walkthrough
3 Two Research Compendia Case Studies
3.1 The varameta:: Package; a Comparative Analysis
3.2 The simeta:: Package
3.3 Coverage Probability Simulation
3.4 Simulating Meta-analysis Data
3.5 Complexity and Formalised Analysis Structures
4 Research Compendia Toolchain Walkthrough
4.1 DevOps
4.2 Create Compendium Architecture
4.3 Common Steps Across both Packages
5 Testing
5.1 What Is a Test?
5.2 Non-empty Thing of Expected Type
5.3 Test-Driven Development
6 Prepare for Most Weather Conditions
References
Regularized Estimation and Feature Selection in Mixtures of Gaussian-Gated Experts Models
1 Introduction
2 Gaussian-Gated Mixture-of-Experts
2.1 MoE Modeling Framework
2.2 Gaussian-Gated Mixture-of-Experts
2.3 Maximum Likelihood Estimation via the EM Algorithm
2.4 The EM Algorithm for the MoGGE Model
3 Penalized Maximum Likelihood Parameter Estimation
3.1 The EM-Lasso Algorithm for the MoGGE Model
3.2 Algorithm Tuning and Model Selection
4 Experimental Study
4.1 Simulation Study
5 Conclusion and Future Work
References
Flexible Modelling via Multivariate Skew Distributions
1 Introduction
2 Skew Symmetric Distributions
3 CFUSN Distribution
3.1 Restricted Multivariate Skew Normal (rMSN) Distribution
3.2 Unrestricted Multivariate Skew Normal (uMSN) Distribution
4 CFUST Distribution
5 Scale Mixture of CFUSN Distribution
6 CFUSH Distribution
7 Mixtures of CFUST Distributions Versus Mixtures of HTH Distributions
8 Conclusions
References
Estimating Occupancy and Fitting Models with the Two-Stage Approach
1 Introduction
2 Full Likelihood
3 Boundary Solutions
4 Plausible Region
5 Bias
6 Two-Stage Approach and Modelling Occupancy
6.1 Homogeneous Case
6.2 Heterogeneous Case
6.3 GAMs
7 Discussion
References
Component Elimination Strategies to Fit Mixtures of Multiple Scale Distributions
1 Introduction
2 Bayesian Mixtures of Multiple Scale Distributions
2.1 Multiple Scale Mixtures of Gaussians
2.2 Priors on Parameters
2.3 Inference Using Variational Expectation-Maximization
3 Single-Run Number of Component Selection
3.1 Tested Procedures
4 Experiments
4.1 Simulated Data
5 Discussion and Conclusion
6 Supplementary Material
References
An Introduction to Approximate Bayesian Computation
1 Introduction
2 Approximate Bayesian Computation
3 The Energy Statistic
4 Artificial Examples
4.1 Normal Model
4.2 Normal Mixture Model
4.3 Triangle Distribution
5 Application
6 Conclusion
References
Contributing Papers
Truth, Proof, and Reproducibility: There's No Counter-Attack for the Codeless
1 The Technological Shift in Mathematical Inquiry
2 Truth in Mathematics
2.1 Prove It!
2.2 The Steps in the Making of a Proof
2.3 Is Computational Mathematics Mired in Proof Methodology?
3 Testing
3.1 What Is a Test?
3.2 How Good Are We at Good Enough Testing?
3.3 Analysis of Testing Code in R Packages
4 Tempered Uncertainty and Computational Proof
4.1 Coda
References
On Adaptive Gauss-Hermite Quadrature for Estimation in GLMM's
1 Introduction
2 The Logistic Regression with Random Intercept Model, Its Log-Likelihood and Adaptive Gauss-Hermite Quadrature
3 The Teratology Data and Importance Sampling
4 The Performance of Adaptive Gauss-Hermite Quadrature for Cluster 29 of the Teratology Data
5 Discussion
References
Deep Learning with Periodic Features and Applications in Particle Physics
1 Introduction
1.1 Data: Physics Observables from Colliders
2 Periodic Loss and Activation Function
3 Example 1: Predicting the Angle of an Invisible Particle
4 Example 2: Autoencoding Periodic Features
5 Conclusion
References
Copula Modelling of Nurses' Agitation-Sedation Rating of ICU Patients
1 Introduction
1.1 Background
2 Methodology
3 Results
4 Conclusion
References
Predicting the Whole Distribution with Methods for Depth Data Analysis Demonstrated on a Colorectal Cancer Treatment Study
Abstract
1 Introduction
2 Methods
2.1 Data Details
2.2 Modelling Details
2.2.1 Boosting to Assess Variable Selection and Functional Fit
2.2.2 Model Specification for Additive Quantile Regression with Boosting
2.2.3 Recovering the Unconditional Predicted Quantile
2.2.4 Smoothing Count Data – a Technicality
2.2.5 Building the Second Additive Quantile Regression Model Without Boosting
2.2.6 Comparing the AQR Models - with Boosting to Without Boosting
3 Results
3.1 Mean Annual Volume Association with LOS
3.2 Counterfactual Prediction of Change in LOS Contingent on Change in MAV
3.3 Further Results for Patient and Hospital Factors
3.3.1 Laparoscope Use – See Fig. 5
3.3.2 Separation Mode – See Fig. 6
3.3.3 Month of Year – See Fig. 7
3.3.4 Sex – See Fig. 8
3.4 Quantile Crossing
4 Discussion
5 Conclusion
References
Resilient and Deep Network for Internet of Things (IoT) Malware Detection
Abstract
1 Introduction
2 Related Works
3 Proposed Method
3.1 Word2vec Model
3.2 The Proposed CNN Architecture
4 Experiments and Results
4.1 Dataset
4.2 Experimental Environment and Evaluation Metrics
4.3 Experiments
5 Summary and Conclusion
References
Prediction of Neurological Deterioration of Patients with Mild Traumatic Brain Injury Using Machine Learning
Abstract
1 Background
2 Methods
2.1 Data Collection and Preprocessing
2.2 Training and Validation Datasets
2.3 Modeling Methods Using Neural Network and Non-neural Network Algorithms
3 Results
4 Discussion
5 Conclusion
Acknowledgements
References
Spherical Data Handling and Analysis with R package rcosmo
1 Introduction
2 Coordinate Systems for Spherical Data Representation
3 Continuous Geographic Data
4 Point Pattern Data
5 Directional Data
References
On the Parameter Estimation in the Schwartz-Smith's Two-Factor Model
1 Introduction
2 Two-Factor Model
2.1 A Commodity Spot Price Modelling
2.2 Risk-Neutral Approach to Spot Price Modelling
2.3 Risk-Neutral Approach to Pricing of Futures
3 Kalman Filter
4 Simulation Study
5 Conclusions
A Derivations of (1) and (2)
References
Interval Estimators for Inequality Measures Using Grouped Data
1 Introduction
2 Some Inequality Measures
2.1 Gini Index
2.2 Theil Index
2.3 Atkinson Index
2.4 Quantile Ratio Index
3 Density Estimation Methods
3.1 GLD Estimation Method
3.2 Linear Interpolation Method
4 Interval Estimators Using Grouped Data
5 Simulations and Examples
5.1 Simulations
6 Applications
6.1 Example 1: Household Income Reported with Group Means
6.2 Example 2: Comparison of Equalized Disposable Household Income Data
7 Discussion
References
Exact Model Averaged Tail Area Confidence Intervals
1 Introduction
2 Description of the MATA Confidence Interval
3 The New Optimized Weight Function
3.1 Performance of the Optimized Weight Function
4 Empirical Example
5 Computational Method Used to Find the Parameters of the New Optimized Weight Function
6 Can We Do Better if We Optimize the Weight Function for both m and ?
7 Conclusion
References
Author Index

📜 SIMILAR VOLUMES

Data Management Technologies and Applications

📁 Data Management Technologies and Applications (Communications in Computer and Information Science)

✍ Alfredo Cuzzocrea (editor), Oleg Gusikhin (editor), Slimane Hammoudi (editor), C 📂 Library 📅 2023 🏛 Springer 🌐 English

This book constitutes the refereed post-proceedings of the 10th and 11th International Conferences on Data Management Technologies and Applications, DATA 2021 and DATA 2022, which were held virtually due to the COVID-19 crisis on July 6–8, 2021 and in Lisbon, Portugal on Ju

Computational Statistics in Data Science

📁 Computational Statistics in Data Science

✍ Richard A. Levine, Walter W. Piegorsch, Hao Helen Zhang, Thomas C. M. Lee 📂 Library 📅 2022 🏛 Wiley 🌐 English

An essential roadmap to the application of computational statistics in contemporary data science. In Computational Statistics in Data Science, a team of distinguished mathematicians and statisticians delivers an expert compilation of concepts, the

Medical Image Understanding and Analysis

📁 Medical Image Understanding and Analysis: 23rd Conference, MIUA 2019, Liverpool, UK, July 24–26, 2019, Proceedings (Communications in Computer and Information Science)

✍ Yalin Zheng; Bryan M. Williams; Ke Chen 📂 Library 🌐 English

Intelligent Computing and Innovation on Data Science

📁 Intelligent Computing and Innovation on Data Science: Proceedings of ICTIDS 2019

✍ Sheng-Lung Peng, Le Hoang Son, G. Suseendran, D. Balaganesh 📂 Library 📅 2020 🏛 Springer Singapore;Springer 🌐 English

This book covers both basic and high-level concepts relating to the intelligent computing paradigm and data sciences in the context of distributed computing, big data, data sciences, high-performance computing and Internet of Things. It is becoming increasingly important to develop adaptiv

Statistical learning and data science

📁 Statistical learning and data science

✍ Mireille Gettler Summa; et al 📂 Library 📅 2011 🏛 Chapman & Hall/CRC 🌐 English

Data analysis is changing fast. Driven by a vast range of application domains and affordable tools, machine learning has become mainstream. Unsupervised data analysis, including cluster analysis, factor analysis, and low dimensionality mapping methods continually being updated, have reached new heig

R Programming: Mastering Data Science and Statistical Computing

📁 R Programming: Mastering Data Science and Statistical Computing

✍ Nolan, Rama 📂 Library 📅 2024 🏛 Independently published 🌐 English

Unlock Your Data Science Potential with R Programming! Dive into “R Programming: Mastering Data Science and Statistical Computing”, the ultimate guide to one of the most powerful tools in the world of data science. Whether you're a complete beginner or an experienced professional looking to refin