Today, data science is an indispensable tool for any organization, allowing for the analysis and optimization of decisions and strategy. R has become the preferred software for data science, thanks to its open source nature, simplicity, applicability to data analysis, and the abundance of libraries
A Beginner's Guide to Data Exploration and Visualisation with R
Scribed by Elena N Ieno, Alain F Zuur
- Publisher: Highland Statistics Ltd.
- Year: 2015
- Tongue: English
- Leaves: 175
- Edition: 1
- Category: Library
Table of Contents
TOC
Chapter 1
1.1 Speaking the same language
1.2 General points
1.3 Outline of this book
Chapter 2
2.1 What is an outlier?
2.2 Boxplot to identify outliers in one dimension
2.3 Cleveland dotplot to identify outliers in one dimension
2.4 Boxplots or Cleveland dotplots?
2.5 Can we apply a test for outliers?
2.6 Outliers in the two-dimensional space
2.7 Influential observations in regression models
2.8 What to do if you detect potential outliers
2.9 Outliers and multivariate data
2.10 The pros and cons of transformations
Chapter 3
3.1 What is normality?
3.2 Histograms and conditional histograms
3.3 Kernel density plots
3.4 Quantile–quantile plots
3.5 Using tests to check for normality
3.6 Homogeneity of variance
3.7 Using tests to check for homogeneity
Chapter 4
4.1 Simple scatterplots
4.2 Multipanel scatterplots
4.3 Pairplots
4.4 Can we include interactions?
4.5 Design and interaction plots
Chapter 5
5.1 What is collinearity?
5.2 The sample correlation coefficient
5.3 Correlation and outliers
5.4 Correlation matrices
5.5 Correlation and pairplots
5.6 Collinearity due to interactions
5.7 Visualising collinearity with conditional boxplots
5.8 Quantifying collinearity using variance inflation factors
5.9 Generalised VIF values
5.10 Visualising collinearity using PCA biplot
5.11 Causes of collinearity and solutions
5.12 Be stubborn and keep collinear covariates in the model?
5.13 Confounding variables
Chapter 6
6.1 Introduction
6.2 Data exploration
6.3 Statistical analysis using linear regression
6.4 Statistical analysis using a mixed effects model
6.5 Conclusions
6.6 What to present in a paper
Chapter 7
7.1 Importing the data
7.2 Data exploration
7.3 Applying a linear regression model
7.4 Understanding the results
7.5 Trouble
7.6 Conclusions
Chapter 8
8.1 Importing the data
8.2 Coding the data
8.3 Multi-panel graph using xyplot from lattice
8.4 Multi-panel graph using ggplot2
8.5 Conclusions
References
Index
Other Books
SIMILAR VOLUMES
This elementary review covers the basics of working with astronomical data, notably with images, spectra and higher-level (catalog) data. The basic concepts and tools are presented using both application software (DS9 and TOPCAT) and Python. The level of presentation is suitable for undergraduate st
Practical SQL is an approachable and fast-paced guide to SQL (Structured Query Language), the standard programming language for defining, organizing, and exploring data in relational databases. The book focuses on using SQL to find the story your data tells, with the popular open-source datab