Statistical Methods for Annotation Analysis
By Silviu Paun, Ron Artstein, and Massimo Poesio
- Publisher: Morgan & Claypool
- Year: 2022
- Language: English
- Pages: 217
- Category: Library
Synopsis
Labelling data is one of the most fundamental activities in science, and has underpinned practice, particularly in medicine, for decades, as well as research in corpus linguistics since at least the development of the Brown corpus. With the shift towards Machine Learning in Artificial Intelligence (AI), the creation of datasets to be used for training and evaluating AI systems, also known in AI as corpora, has become a central activity in the field as well.
Early AI datasets were created on an ad-hoc basis to tackle specific problems. As larger and more reusable datasets were created, requiring greater investment, the need for a more systematic approach to dataset creation arose to ensure increased quality. A range of statistical methods were adopted, often but not exclusively from the medical sciences, to ensure that the labels used were not subjective, or to choose among different labels provided by the coders. A wide variety of such methods is now in regular use. This book is meant to provide a survey of the most widely used among these statistical methods supporting annotation practice.
As far as the authors know, this is the first book attempting to cover the two families of methods in wider use. The first family of methods is concerned with the development of labelling schemes and, in particular, ensuring that such schemes are such that sufficient agreement can be observed among the coders. The second family includes methods developed to analyze the output of coders once the scheme has been agreed upon, particularly although not exclusively to identify the most likely label for an item among those provided by the coders.
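To make the two families concrete, here is a minimal, self-contained sketch (not taken from the book) of one representative method from each: Cohen's kappa, a chance-corrected coefficient of agreement between two coders, and majority voting, the simplest baseline for choosing a label among those provided by the coders. The function names and the toy part-of-speech labels are illustrative assumptions.

```python
# Minimal sketch: chance-corrected agreement (Cohen's kappa) for two coders,
# and majority voting as a simple label-aggregation baseline.
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Agreement between two coders, corrected for chance agreement."""
    n = len(labels_a)
    # Observed agreement: fraction of items on which the coders gave the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    # Expected agreement if each coder labelled at random with their own marginals.
    expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

def majority_label(labels):
    """Most frequent label among the coders' answers for a single item."""
    return Counter(labels).most_common(1)[0][0]

# Toy example: two coders tagging five tokens with parts of speech.
coder_a = ["noun", "verb", "noun", "noun", "adj"]
coder_b = ["noun", "verb", "verb", "noun", "adj"]
print(round(cohens_kappa(coder_a, coder_b), 3))  # observed 0.8, expected 0.36
print(majority_label(["noun", "noun", "verb"]))
```

Kappa is 1 for perfect agreement and 0 when observed agreement equals what chance predicts; the book's chapters on weighted coefficients and probabilistic models generalise both ideas well beyond this two-coder, hard-vote setting.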
The focus of this book is primarily on Natural Language Processing, the area of AI devoted to the development of models of language interpretation and production, but many if not most of the methods discussed here are also applicable to other areas of AI, or indeed, to other areas of Data Science.
Table of Contents
Preface
Acknowledgements
Introduction
Reliability and Validity, and Other Issues
A Very Short Guide to the Probabilistically More Advanced Content in the Book
The Companion Website
Analysing Agreement
Coefficients of Agreement
Introduction and Motivations
Coefficients of Agreement
Agreement, Reliability, and Validity
A Common Notation
The Need for Dedicated Measures of Agreement
Chance-Corrected Coefficients for Measuring Agreement Between Two Coders
More than Two Coders
Krippendorff's Alpha and Other Weighted Agreement Coefficients
Relations Among Coefficients
An Integrated Example
Missing Data
Unitizing or Markable Identification
Bias and Prevalence
Annotator Bias
Prevalence
Appendix: Proofs for Theorems Presented in This Chapter
Annotator Bias and Variance with Multiple Coders
Annotator Bias for Weighted Measures
Using Agreement Measures for CL Annotation Tasks
General Methodological Recommendations
Generating Data to Measure Reproducibility
Establishing Significance
Interpreting the Value of Kappa-Like Coefficients
Agreement and Machine Learning
Labelling Units with a Common and Predefined Set of Categories
Part-of-Speech Tagging
Dialogue Act Tagging
Named Entities
Other Labelling Tasks
Marking Boundaries and Unitizing
Segmentation and Topic Marking
Prosody
Set-Based Labels
Anaphora
Discourse Deixis
Summarization
Word Senses
Summary
Methodology
Choosing a Coefficient
Interpreting the Values
Probabilistic Models of Agreement
Introduction
Easy Items, Difficult Items, and Agreement
Aickin's α
Modelling Stability
Coder Stability: A Discussion
Latent Class Analysis of Agreement Patterns
Varying Panel of Coders
Fixed Panel of Coders
An NLP Case Study
Summary
Analysing and Using Crowd Annotations
Probabilistic Models of Annotation
Introduction
Terminology and a Simple Annotation Model
Modelling Annotator Behaviour
Modelling Item Difficulty
Hierarchical Structures
Adding Features
Modelling Sequence Labelling Tasks
Aggregating Anaphoric Annotations
Aggregation with Variational Autoencoders
Modelling Complex Annotations
Learning from Multi-Annotated Corpora
Introduction
Learning with Soft Labels
Learning Individual Coder Models
Dealing with Noise
Pooling Coder Confusions
Summary
Bibliography
Authors' Biographies
Similar Volumes
The main purpose of this book is to address the statistical issues involved in integrating independent studies. A number of papers and books discuss the mechanics of collecting, coding, and preparing data for a meta-analysis, and we do not deal with these here.
Bayesian Methods for Statistical Analysis is a book on statistical methods for analysing a wide variety of data. It consists of 12 chapters, starting with basic concepts and covering numerous topics, including Bayesian estimation, decision theory, prediction, hypothesis testing, and hierarchical models.