Cluster Analysis for Corpus Linguistics

✍ Scribed by Hermann Moisl

Publisher: De Gruyter Mouton
Year: 2015
Tongue: English
Leaves: 396
Series: Quantitative Linguistics [QL]; 66
Category: Library

✦ Synopsis

The standard scientific methodology in linguistics is empirical testing of falsifiable hypotheses. As such, the process of hypothesis generation is central: it involves formulating a research question about a domain of interest and stating a hypothesis relative to it. In corpus linguistics the domain is text, and hypothesis generation involves abstracting data from text, analysing the data, and formulating a hypothesis by inference from the results. Traditionally this process has been paper-based, but the advent of electronic text has increasingly rendered that approach obsolete, both because the size of digital corpora is now at or beyond the limit of what can efficiently be handled in the traditional way, and because the complexity of the data abstracted from them can be impenetrable to understanding. Linguists are increasingly turning to mathematical and statistical computational methods for help, and cluster analysis is one such method. It is used across the sciences for hypothesis generation by identifying structure in data that are too large or too complex, or both, to be interpreted by direct inspection. This book aims to show how cluster analysis can be used for hypothesis generation in corpus linguistics, thereby contributing to a quantitative empirical methodology for the discipline.

Describes a range of clustering methods for analysis of data derived from language corpora.
Gives an intuitively accessible account of the mathematical concepts which underlie data creation, data transformation, and cluster analysis.

📜 SIMILAR VOLUMES

📁 Cluster Analysis for Corpus Linguistics

✍ Hermann Moisl 📂 Library 📅 2015 🏛 De Gruyter Mouton 🌐 English

The rapidly growing volume of digital natural language text and the complexity of data abstracted from it have increasingly rendered traditional corpus linguistic analytical methodology obsolete. This book describes a cluster analytic methodology for generating linguistic hypotheses on the basis of

📁 Multifactorial Analysis in Corpus Linguistics (Open Linguistics Series)

✍ Stefan Thomas Gries 📂 Library 📅 2003 🌐 English

📁 Corpus-based analysis and diachronic linguistics

✍ Yuji Kawaguchi, Makoto Minegishi, Wolfgang Viereck 📂 Library 📅 2012 🏛 John Benjamins Publishing Company 🌐 English

📁 Corpus Analysis and Variation in Linguistics

✍ Yuji Kawaguchi, Makoto Minegishi, Jacques Durand 📂 Library 📅 2009 🏛 John Benjamins Publishing Company 🌐 English

In this new edition of TUFS Studies in Linguistics, we aim to showcase the various linguistics research conducted at Tokyo University of Foreign Studies. In this first volume, we

📁 Programming for Corpus Linguistics: How to Do Text Analysis with Java

✍ Oliver Mason 📂 Library 📅 2022 🏛 Edinburgh University Press 🌐 English

The ability to program a computer has become increasingly important in work that involves corpora. Specialised research needs can no longer be met by available software, and purchasing customised programs is usually not an option. This book enables the researcher to write programs for text and co