The theme of this volume was inspired by the theme of the VIIIth International Congress for the Study of Child Language which was held in San Sebastian, Spain, in July 1999. The chapters in this volume are based on papers that were presented at that meeting. They provide a snapshot of the current st
Corpora in Language Acquisition Research: History, Methods, Perspectives (Trends in Language Acquisition Research, Volume 6)
Scribed by Heike Behrens (Editor)
- Year
- 2008
- Tongue
- English
- Leaves
- 266
- Edition
- 6
- Category
- Library
✦ Synopsis
Corpus research forms the backbone of research on children's language development. Leading researchers in the field present a survey of the history of data collection, different types of data, and the treatment of methodological problems. Morphologically and syntactically parsed corpora allow for the concise exploration of formal phenomena, the quick retrieval of errors, and reliability checks. New probabilistic and connectionist computations investigate how children integrate the multiple sources of information available in the input, and new statistical methods compute rates of acquisition as well as error rates dependent on sample size. Sample analyses show how multi-modal corpora are used to investigate the interaction of discourse and linguistic structure, how cross-linguistic generalizations for acquisition can be formulated and tested, and how individual variation can be explored. Finally, ways in which corpus research interacts with computational linguistics and experimental research are presented.
✦ Table of Contents
Corpora in Language Acquisition Research
Editorial page
Title page
LCC data
Table of contents
List of contributors
Preface
2. Building child language corpora: Sampling methods
2.1.1 Diaries
2.1.2 Audio- and video-recorded longitudinal data
2.1.3 Cross-sectional studies
3.1 From diaries and mimeographs to machine-readable corpora
3.3 Establishing databases
3.4 Data maintenance
3.5 Annotation
4. Information retrieval: From manual to automatic analyses
5.2 Institutional responsibilities
6.1 Phonetic and prosodic analyses
6.3 Distributional analyses
6.6 Communicative processes
6.8 Research synthesis and meta-analyses
7. About this volume
1. Introduction
2. Sampling and errors in children's early productions
2.1.2 Small samples fail to capture short-lived errors or errors in low frequency structures
Figure 1. Percentage of Lara's wh-questions with forms of DO/modal auxiliaries that were errors of commission over stage IV.
2.1.3 Small corpora yield unreliable error rates, especially in low frequency structures
Table 1. Rates of inversion error in Lara's wh-questions calculated from samples of different sizes (% of questions).
2.2.1 High frequency items dominate overall error rates
2.2.3 Overall error rates collapse over subsystems
Table 2. Number of verb contexts requiring present tense inflection and percentage rate of agreement error.*
2.3.1.1 Statistical methods for assessing how much data is required
Figure 2. Probability of capturing at least one target during a one week period, given different sampling densities and target frequencies.
2.3.2.1 Statistical methods
2.3.2.2 Combining different types of samples
Table 3. Comparison of descriptive statistics: Manchester corpus children and Lara
3. Sampling and the investigation of productivity
3.1 The effect of sample size on measures of productivity
3.2 The effect of frequency statistics on measures of productivity
3.4 Assessing productivity: A solution
4. Conclusion
Appendix: The use of error codes with the CHAT transcription system and the CHILDES database
1. Introduction
1.1 Noun plurals in acquisition
1.1.2 Challenges to the dual-route
1.2 Complexity in the formation of noun plurals
Table 1. A fragment of the interaction between gender and sonority in Austrian German
2. Language systems
Table 2. Sonority in Dutch
Table 3. Interaction of gender and sonority in Austrian German
2.3 Danish plural formation
Table 4. Interaction of gender and sonority in Danish
2.4 Hebrew plural formation
Table 5. Interaction of gender and sonority in Hebrew
3.3 Danish
Table 6. General word frequencies in types and tokens across the four data-sets
Table 7. Raw frequencies and percentages of nouns and noun plurals in CDS
4.1.1 Dutch
Table 10. Suffix distribution on the basis of word-final phonology: tokens in Dutch CDS
Table 11. Suffix distribution on the basis of item gender and word-final phonology: types in German CDS
Table 12. Suffix distribution on the basis of item gender and word-final phonology: tokens in German CDS
Table 13. Suffix distribution on the basis of item gender and word-final phonology: types in Danish CDS
Table 14. Suffix distribution on the basis of item gender and word-final phonology: tokens in Danish CDS
Table 15. Suffix distribution on the basis of item gender and word-final phonology: types in Hebrew CDS
Table 16. Suffix distribution on the basis of item gender and word-final phonology: tokens in Hebrew CDS
4.2.1 German
Table 17. Suffix distribution on the basis of item gender and word-final phonology: types in German CS
4.2.2 Danish
Table 20. Suffix distribution on the basis of item gender and word-final phonology: tokens in Danish CS
Table 22. Suffix distribution on the basis of item gender and word-final phonology: tokens in Hebrew CS
5. General discussion
5.1 CDS compared with adult directed speech (ADS)
Figure I. Predictability of the plural suffix -en in Dutch ADS and CDS according to the form of the final rhyme (wordtypes)
Figure II. Predictability of the plural suffix -en in Dutch ADS and CDS according to the form of the final rhyme (wordtokens)
6. Conclusions
1. Introduction
1.2 Generativist accounts of auxiliary development
1.3 Usage-based approaches
1.4 Different approaches to accounting for children's auxiliary errors
1.5 Productivity
2. The present study
2.1.2 Data collection
Table 1. Number of multi-verb utterances
2.3 Analyses
Table 2. Age and MLU in words at the start and end of the study
Table 3. Number of frames and the percentage of utterances accounted for by frames
Table 4. Frames produced by at least 5 children and rank order of emergence
Table 5. Frames produced by fewer than 5 children and order of emergence
2.4.3 Evidence for developing schematicity and generalisation
Table 6. The children's non-tag question errors
Table 7. Age at which different structures are attested
Table 8. The first two examples of ellipsis for each child
2.5 Relationship to input
Table 9. Frames used by the mothers in the Manchester CHILDES corpus and not produced by the children in the present study
3.1 Frequency and sampling
3.2 How abstract is the child's knowledge of auxiliaries?
3.3 Using different methodologies
3.4 Individual differences
4. Conclusion
Appendix A. The children's tag questions
Appendix B. Mean rank order of frequency of mothers' frames (Manchester corpus)
1. Introduction
2. The effect of information flow on argument realization in adult speech
3. The effect of information flow on argument realization in child speech
4. Individual accessibility features
4.1 Newness
4.2 Topicality
4.4 Query
4.5 Disambiguation / contrast / interference
4.6 Explicit contrast / emphasis
4.7 Person
4.8 Animacy
4.9 Attention
4.10 Developmental trends
4.11 Summary
5.1 Several features in one coding category
5.2 Threshold approach
5.4 Independent contribution
5.5 Case study of interaction between two features
6.1 Preferred argument structure
6.2 Conversational sequences
6.3 Managing miscommunication
6.4 Summary
7. Experimental studies
7.1 Strengths of production studies
7.2 Difficulties with production studies
7.3 Summary
8. Discussion and conclusion
2. The chicken and egg problem of syntax acquisition
3. Solutions to the chicken and egg problem - innate categories don't help
4. Intra-linguistic cues in the utterance: from statistics to structure
4.1 Measuring potential information in the corpus
4.2 Deriving syntactic structure from the corpus
Table 1. Phonological and prosodic cues found to distinguish grammatical categories in English
5.1 Individual cues in categorisation
5.2 Combined cues for categorisation
6. Combining intra-linguistic cues
7.1 Learning to segment artificial language with multiple cues
7.2 Learning to categorise artificial language with multiple cues
8. How are multiple cues integrated?
Figure 1. Classifications of nouns and verbs based on distributional cues alone (horizontal dotted line), phonological cues alone (vertical dotted line), and combined cues (oblique dashed line)
9. Extra-linguistic cues and language learning
10. Future directions for multiple cue research
10.1 Quantifying new cues
10.2 Cues for different levels of language learning
10.3 Computational and developmental approaches to multiple cues
11. Conclusion
1. Introduction
2. Analysis by transcript scanning
3. Analysis by lexical tracking
4. Measures of morphosyntactic development
5. Generative frameworks
6. Analysis based on automatic morphosyntactic coding
6.1. MOR and FST
6.2. Understanding MOR
6.3 Compounds and complex forms
6.4 Lemmatization
7. Using MOR with a new corpus
8. Affixes and control features
9. MOR for bilingual corpora
11. Difficult decisions
12. Building MOR grammars
13. Chinese MOR
14. GRASP
15. Research using the new infrastructure
16. Next steps
17. Conclusion
1. Introduction
4. Longitudinal case studies
6. Nature of the input and learnability issues
7. Discourse context and the structure of language
9. Areas ripe for further corpus research
11. Converging evidence from corpus and experimental studies
References
Index
The series Trends in Language Acquisition Research
SIMILAR VOLUMES
Experimental Methods in Language Acquisition Research provides students and researchers interested in language acquisition with comprehensible and practical information on the most frequently used methods in language acquisition research. It includes contributions on first and child/adult second lan
This timely text provides a comprehensive overview of the research methods used by the Generative Second Language Acquisition framework. The authors lay out the history and state of the art in the field, explain the theoretical underpinnings of this work, and offer practi
This volume assembles controversial research in restricted areas of SLA and attempts to convey the richness of methodology and variety of thematic scope in European SLA, as discussed at an international workshop: "Current Trends in European Second Language Acquisition."