Tidyverse Skills for Data Science in R

✍ Scribed by Carrie Wright, Shannon Ellis, Stephanie Hicks, Roger D. Peng


Publisher: Leanpub
Year: 2021
Tongue: English
Leaves: 780
Category: Library


✦ Synopsis


Develop insights from data with tidy tools. Import, wrangle, visualize, and model data with the Tidyverse R packages.

This book is intended for data scientists with some familiarity with the R programming language who are seeking to do Data Science using the Tidyverse family of packages. Through 5 chapters, you will cover importing, wrangling, visualizing, and modeling data using the powerful Tidyverse packages, including the new Tidymodels framework. The Tidyverse packages provide a simple but powerful approach to Data Science which scales from the most basic analyses to massive data deployments. This book covers the entire life cycle of a Data Science project and presents specific tidy tools for each stage.

This course introduces a powerful set of Data Science tools known as the Tidyverse. The Tidyverse has revolutionized the way in which data scientists do almost every aspect of their job. We will cover the simple idea of “tidy data” and how this idea serves to organize data for analysis and modeling. We will also cover how non-tidy data can be transformed to tidy data, the Data Science project life cycle, and the ecosystem of Tidyverse R packages that can be used to execute a Data Science project.
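As a rough illustration of moving from non-tidy to tidy data, the sketch below reshapes a small wide table with `pivot_longer()` from the tidyr package. The table, its column names, and its values are made up for this example and are not taken from the book.

```r
library(tibble)
library(tidyr)

# Hypothetical non-tidy table: one column per year, so the variable
# "year" is stored in the column names instead of in a column.
wide <- tribble(
  ~country, ~`2019`, ~`2020`,
  "A",      10,      12,
  "B",      20,      21
)

# Tidy version: each row is one observation (a country in a year),
# and each variable (country, year, cases) has its own column.
long <- pivot_longer(
  wide,
  cols      = c(`2019`, `2020`),
  names_to  = "year",
  values_to = "cases"
)

print(long)
```

The result has one row per country and year, which is the shape most Tidyverse functions expect as input.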

Functional programming is an approach to programming in which computation is treated as the evaluation of mathematical functions. It is declarative, so code is written as expressions (or declarations) rather than statements. Functional programming is popular because it tends to yield cleaner, shorter code that is elegant yet still understandable. Ultimately, the goal is simpler code that takes less time to debug, test, and maintain.

R at its core is a functional programming language. If you're familiar with the apply() family of functions in base R, you've carried out some functional programming! Here, we'll discuss functional programming and use the purrr package, designed to enhance functional programming in R. By using functional programming, you'll be able to minimize redundancy within your code. In practice, this means identifying the small building blocks your code needs and writing each one as a function. These small functions are then combined into more complex structures to form your final program.
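The building-block idea can be sketched in a few lines of R. This is a minimal example under assumed names: the helper functions `to_celsius()` and `round_1()` and the temperature values are hypothetical, chosen only to show how purrr's `compose()` and `map_dbl()` combine small functions, with a base R `vapply()` call for comparison.

```r
library(purrr)

# Small building blocks: each function does exactly one thing.
to_celsius <- function(f) (f - 32) * 5 / 9
round_1    <- function(x) round(x, 1)

# Combine the blocks into a larger function. By default compose()
# applies its arguments from right to left, so this computes
# round_1(to_celsius(x)).
f_to_c <- compose(round_1, to_celsius)

# Hypothetical input values in degrees Fahrenheit.
temps_f <- c(32, 68, 212)

# map_dbl() applies the function to each element and returns a
# numeric vector, with no explicit loop.
temps_c <- map_dbl(temps_f, f_to_c)

# Base R equivalent from the apply() family.
temps_c_base <- vapply(temps_f, f_to_c, numeric(1))

print(temps_c)
```

Because each step is its own function, a step can be tested or replaced without touching the rest of the pipeline.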

✦ Table of Contents


Introduction to the Tidyverse
  About This Course
  Tidy Data
  From Non-Tidy -> Tidy
  The Data Science Life Cycle
  The Tidyverse Ecosystem
  Data Science Project Organization
  Data Science Workflows
  Case Studies
Importing Data in the Tidyverse
  About This Course
  Tibbles
  Spreadsheets
  CSVs
  TSVs
  Delimited Files
  Exporting Data from R
  JSON
  XML
  Databases
  Web Scraping
  APIs
  Foreign Formats
  Images
  googledrive
  Case Studies
Wrangling Data in the Tidyverse
  About This Course
  Tidy Data Review
  Reshaping Data
  Data Wrangling
  Working With Factors
  Working With Dates and Times
  Working With Strings
  Working With Text
  Functional Programming
  Exploratory Data Analysis
  Case Studies
Visualizing Data in the Tidyverse
  About This Course
  Data Visualization Background
  Plot Types
  Making Good Plots
  Plot Generation Process
  ggplot2: Basics
  ggplot2: Customization
  Tables
  ggplot2: Extensions
  Case Studies
Modeling Data in the Tidyverse
  About This Course
  The Purpose of Data Science
  Types of Data Science Questions
  Data Needs
  Descriptive and Exploratory Analysis
  Inference
  Linear Modeling
  Multiple Linear Regression
  Beyond Linear Regression
  More Statistical Tests
  Hypothesis Testing
  Prediction Modeling
  The tidymodels Ecosystem
  Case Studies
  Summary of tidymodels
About the Authors


📜 SIMILAR VOLUMES


Introduction to Data Science in Biostati
✍ Thomas W. MacFarland 📂 Library 📅 2024 🏛 Springer 🌐 English

Introduction to Data Science in Biostatistics: Using R, the Tidyverse Ecosystem, and APIs defines and explores the term "data science" and discusses the many professional skills and competencies affiliated with the industry. With data science being a leading indicator of interest in STEM fields, the

R in Action: Data analysis and graphics
✍ Robert I. Kabacoff 📂 Library 📅 2022 🏛 Manning 🌐 English

R is the most powerful tool you can use for statistical analysis. This definitive guide smooths R's steep learning curve with practical solutions and real-world applications for commercial environments. In R in Action, Third Edition you will learn how to: • Set up and install R and RStudio • Cl

Exploring Data Science with R and the Ti
✍ Jerry Bonnell, Mitsunori Ogihara 📂 Library 📅 2023 🏛 CRC Press 🌐 English

This book introduces the reader to data science using R and the tidyverse. No prerequisite knowledge is needed in college-level programming or mathematics (e.g., calculus or statistics). The book is self-contained so readers can immediately begin building data science workflows without needing to re

Data Visualization and Exploration with
✍ Eric Pimpler 📂 Library 📅 2018 🏛 Geospatial Training Services 🌐 English

Today, data science is an indispensable tool for any organization, allowing for the analysis and optimization of decisions and strategy. R has become the preferred software for data science, thanks to its open source nature, simplicity, applicability to data analysis, and the abundance of libraries

Statistical Inference via Data Science:
✍ Chester Ismay; Albert Y. Kim 📂 Library 📅 2019 🏛 CRC Press 🌐 English

Statistical Inference via Data Science: A ModernDive into R and the Tidyverse provides a pathway for learning about statistical inference using data science tools widely used in industry, academia, and government. It introduces the tidyverse suite of R packages, including the ggplot2 package for dat
