Humanities Data in R: Exploring Networks, Geospatial Data, Images, and Text (Quantitative Methods in the Humanities and Social Sciences)

✍ Scribed by Taylor Arnold, Lauren Tilton


Publisher: Springer; Second Edition 2024
Year: 2024
Tongue: English
Leaves: 287
Edition: 2
Category: Library


✦ Synopsis


This book teaches readers to integrate data analysis techniques into humanities research practices using the R programming language. Methods for general-purpose visualization and analysis are introduced first, followed by domain-specific techniques for working with networks, text, geospatial data, temporal data, and images. The book is designed to be a bridge between quantitative and qualitative methods, individual and collaborative work, and the humanities and social sciences. The second edition of the text is a significant revision, with almost every aspect of the text rewritten in some way. The most notable difference is the incorporation of new R packages such as ggplot2 and dplyr that center broad data-science concepts.

This second edition of Humanities Data in R does not presuppose prior programming experience. Early chapters take readers from R setup to exploratory data analysis, with one chapter dedicated to each stage of the data-science pipeline (data collection, visualization, manipulation, and relational joins). Following this, text analysis, networks, temporal data, geospatial data, and image analysis each have a dedicated chapter. These chapters are grounded in examples to move readers beyond the intimidation of adding new tools to their research. The final section of the book extends the core material with additional computer science techniques for processing large datasets.

Everything is hands-on: image analysis is explained using digitized photographs from the 1930s, and networks are applied to page links on Wikipedia. After working through these examples with the provided data, code and book website, readers are prepared to apply new methods to their own work. The open source R programming language, with its myriad packages and popularity within the sciences and social sciences, is particularly well-suited to working with humanities data. R packages are also highlighted in an appendix.

The methodology has wide application in classrooms and self-study for the humanities, as well as in linguistics, anthropology, and political science. Outside the classroom, this intersection of humanities and computing is particularly relevant for research and new modes of dissemination across archives, museums, and libraries.

✦ Table of Contents


Preface
Preface to Second Edition
Humanities Data
Supplementary Materials
Acknowledgments
Contents
Part I Core
1 Working with Data in R
1.1 Introduction
1.2 Setup
1.3 Working with R and R Markdown
1.4 Running R Code
1.5 Functions in R
1.6 Loading Data in R
1.7 Datasets
1.8 Formatting R Code
1.9 Extensions
2 EDA I: Grammar of Graphics
2.1 Introduction
2.2 Text Geometry
2.3 Lines and Bars
2.4 Optional Aesthetics
2.5 Scales
2.6 Labels and Themes
2.7 Conventions for Graphics Code
2.8 Extensions
3 EDA II: Organizing Data
3.1 Introduction
3.2 Choosing Rows
3.3 Data and Layers
3.4 Selecting Columns
3.5 Arranging Rows
3.6 Summarize and Group By
3.7 Geometries for Summaries
3.8 Mutate
3.9 Extensions
4 EDA III: Restructuring Data
4.1 Introduction
4.2 Joining by Relation
4.3 Mutating and Filtering Joins
4.4 Pivot Longer
4.5 Pivot Wider
4.6 Patterns for Table Pivots
4.7 Extensions
5 Collecting Data
5.1 Introduction
5.2 Rectangular Data
5.3 Naming Variables
5.4 What Goes in a Cell
5.5 Dates
5.6 Output Format
5.7 Data Dictionary
5.8 Summary of Data Collection Guidelines
5.9 Extensions
Part II Data Types
6 Textual Data
6.1 Introduction
6.2 Working with a Textual Corpus
6.3 Natural Language Processing Pipeline
6.4 Term Frequency-Inverse Document Frequency (TF-IDF)
6.5 Document Distance
6.6 Dimensionality Reduction
6.7 Word Relationships
6.8 Texts in Other Languages
6.9 Extensions
7 Network Data
7.1 Introduction
7.2 Creating a Network Object
7.3 Centrality
7.4 Clusters
7.5 Co-citation Networks
7.6 Directed Networks
7.7 Distance Networks
7.8 Nearest Neighbor Networks
7.9 Extensions
8 Temporal Data
8.1 Introduction
8.2 Temporal Data and Ordering
8.3 Date Objects
8.4 Datetime Objects
8.5 Language and Time Zones
8.6 Manipulating Dates and Datetimes
8.7 Window Functions and Range Joins
8.8 Extensions
9 Spatial Data
9.1 Introduction
9.2 Spatial Points
9.3 Polygons
9.4 Spatial Metrics
9.5 Spatial Joins
9.6 Raster Maps
9.7 Extensions
10 Image Data
10.1 Introduction
10.2 Loading Images
10.3 Pixels and Color
10.4 Computer Vision
10.5 Object Detection
10.6 Face Detection
10.7 Pose Detection
10.8 Embeddings
10.9 Extensions
Part III Additional Methods
11 Programming in R
11.1 Introduction
11.2 Vectors
11.3 Data Types and Lists
11.4 Selecting and Modifying Vectors
11.5 Matrices
11.6 Control Flow
11.7 Functional Programming
11.8 Extensions
12 Data Formats
12.1 Introduction
12.2 Strings
12.3 Regular Expressions
12.4 JSON Data
12.5 XML and HTML Formats
12.6 XML Path Language (XPath)
12.7 Building Datasets Through an API
12.8 Extensions
References
Index


📜 SIMILAR VOLUMES


Humanities Data in R: Exploring Networks
✍ Taylor Arnold, Lauren Tilton 📂 Library 📅 2015 🏛 Springer International Publishing 🌐 English

This pioneering book teaches readers to use R within four core analytical areas applicable to the Humanities: networks, text, geospatial data, and images. This book is also designed to be a bridge: between quantitative and qualitative methods, individual and collaborative work, and the humanities a

Higher Education Policy Analysis Using Q
✍ Marvin Titus 📂 Library 📅 2021 🏛 Springer 🌐 English

This textbook introduces graduate students in education and policy research to data and statistical methods in state-level higher education policy analysis. It also serves as a methodological guide to students, practitioners, and researchers who want a clear approach to conducting higher educa

Multivariate Humanities (Quantitative Me
✍ Pieter M. Kroonenberg 📂 Library 📅 2021 🏛 Springer 🌐 English

This case study-based textbook in multivariate analysis for advanced students in the humanities emphasizes descriptive, exploratory analyses of various types of datasets from a wide range of sub-disciplines, promoting the use of multivariate analysis and illustrating its wide applicability.

Big Data in Computational Social Science
✍ Shu-Heng Chen 📂 Library 📅 2018 🏛 Springer 🌐 English

This edited volume focuses on big data implications for computational social science and humanities from management to usage. The first part of the book covers geographic data, text corpus data, and social media data, and exemplifies their concrete applications in a wide range of fields including an

Empowering Human Dynamics Research with
✍ Atsushi Nara (editor), Ming-Hsiang Tsou (editor) 📂 Library 📅 2021 🏛 Springer 🌐 English

This book discusses theoretical backgrounds, techniques and methodologies, and applications of the current state-of-the-art human dynamics research utilizing social media and geospatial big data. It describes various forms of social media and big data with location information, theory devel

Data Science, Human Science, and Ancient
✍ Sandra Blakely; Megan Daniels 📂 Library 📅 2023 🏛 Lockwood Press 🌐 English

The studies in this volume share a focus on religion in the ancient Mediterranean world: How ritual, myth, spectatorship, and travel reflect the continual interaction of human beings with the richly fictive beings who defined the boundaries of groups, access to the past, and mobility across land and