Machine Learning Methods for Stylometry: Authorship Attribution and Author Profiling

✍ Scribed by Jacques Savoy


Publisher
Springer International Publishing; Springer
Year
2020
Tongue
English
Leaves
294
Edition
1st ed.
Category
Library

✦ Synopsis


This book presents methods and approaches used to identify the true author of a doubtful document or text excerpt. It provides a broad introduction to text categorization problems grounded in stylistic features, such as authorship attribution, inferring psychological traits of the author, and detecting fake news. Machine learning models are presented in detail as tools for verifying hypotheses and revealing significant patterns hidden in datasets. Stylometry is a multidisciplinary field combining linguistics with statistics and computer science.

The content is divided into three parts. The first, consisting of the first three chapters, offers a general introduction to stylometry, its potential applications, and its limitations. It also introduces the running example used to illustrate the concepts discussed throughout the remainder of the book. The four chapters of the second part are oriented more toward computer science, with a focus on machine learning models for solving stylometric problems. Several general strategies used to identify, extract, select, and represent stylistic markers are explained. As deep learning is an active field of research, the book also covers neural network models and word embeddings applied to stylometry, together with a general introduction to the deep learning approach to stylometric questions. The third part illustrates the application of these approaches to real cases: an authorship attribution problem, seeking to discover the secret hand behind the nom de plume Elena Ferrante, an Italian writer known worldwide for her My Brilliant Friend saga; author profiling, to identify whether a set of tweets was generated by a bot or a human being and, in the latter case, whether the author is a man or a woman; and an exploration of stylistic variations over time using US political speeches covering a period of ca. 230 years.

A solutions-based approach is adopted throughout the book, and explanations are supported by examples written in R. To complement the main content and the discussions of stylometric models and techniques, examples and datasets are freely available on the author’s GitHub site.

✦ Table of Contents


Front Matter ....Pages i-xix
Front Matter ....Pages 1-2
Introduction to Stylistic Models and Applications (Jacques Savoy)....Pages 3-17
Basic Lexical Concepts and Measurements (Jacques Savoy)....Pages 19-32
Distance-Based Approaches (Jacques Savoy)....Pages 33-51
Front Matter ....Pages 53-54
Evaluation Methodology and Test Corpora (Jacques Savoy)....Pages 55-81
Features Identification and Selection (Jacques Savoy)....Pages 83-108
Machine Learning Models (Jacques Savoy)....Pages 109-151
Advanced Models for Stylometric Applications (Jacques Savoy)....Pages 153-187
Front Matter ....Pages 189-190
Elena Ferrante: A Case Study in Authorship Attribution (Jacques Savoy)....Pages 191-210
Author Profiling of Tweets (Jacques Savoy)....Pages 211-227
Applications to Political Speeches (Jacques Savoy)....Pages 229-249
Conclusion (Jacques Savoy)....Pages 251-253
Back Matter ....Pages 255-286

✦ Subjects


Computer Science; Information Storage and Retrieval; Computational Linguistics; Library Science


📜 SIMILAR VOLUMES


Machine Learning for Authorship Attribut
✍ Farkhund Iqbal, Mourad Debbabi, Benjamin C. M. Fung 📂 Library 📅 2020 🏛 Springer International Publishing; Springer 🌐 English

The book first explores the cybersecurity landscape and the inherent susceptibility of online communication systems such as e-mail, chat conversations, and social media to cybercrime. Common sources and resources of digital crimes, their causes and effects together with the emerging threats fo

Comparative study for Stylometric analys
✍ Raafat, Maryam A. (author); El-Wakil, Rania Abdel-Fattah (author); Atia, Ayman (au 📂 Scientific 📅 2021 🏛 IEEE

A text is a meaningful source of information. Capturing the right patterns in written text gives metrics to measure and infer to what extent the text belongs or is relevant to a specific author. This research aims to introduce a new feature that goes deeper into the language structure. The feat

Machine learning for OpenCV : advanced m
✍ Michael Beyeler 📂 Library 📅 2018 🏛 Packt Publishing 🌐 English

"A practical introduction to the world of machine learning and image processing using OpenCV and Python. Computer vision is one of today's most exciting application fields of machine learning. From self-driving cars to medical diagnosis, computer vision has been widely used in various domains. This

Authorship Attribution
✍ Patrick Juola 📂 Library 📅 2008 🌐 English

Authorship attribution, the science of inferring characteristics of the author from the characteristics of documents written by that author, is a problem with a long history and a wide range of applications. It is an important problem not only in information retrieval but in many other disciplines as

Ensemble Methods for Machine Learning
✍ Gautam Kunapuli 📂 Library 📅 2023 🏛 Manning 🌐 English

Ensemble machine learning combines the power of multiple machine learning approaches, working together to deliver models that are highly performant and highly accurate. Inside Ensemble Methods for Machine Learning you will find: • Methods for classification, regression, and recommendations • So