The Handbook of Computational Linguistics and Natural Language Processing
Edited by Alexander Clark, Chris Fox, and Shalom Lappin
- Publisher
- Wiley-Blackwell
- Year
- 2010
- Language
- English
- Pages
- 801
- Series
- Blackwell Handbooks in Linguistics
- Edition
- 1
- Category
- Library
Synopsis
This comprehensive reference work provides an overview of the concepts, methodologies, and applications in computational linguistics and natural language processing (NLP). It:
- Features contributions by the top researchers in the field, reflecting the work that is driving the discipline forward
- Includes an introduction to the major theoretical issues in these fields, as well as the central engineering applications that the work has produced
- Presents the major developments in an accessible way, explaining the close connection between scientific understanding of the computational properties of natural language and the creation of effective language technologies
- Serves as an invaluable state-of-the-art reference source for computational linguists and software engineers developing NLP applications in the industrial research and development labs of software companies
Table of Contents
Cover......Page 1
Title Page......Page 5
Contents......Page 9
List of Figures......Page 11
List of Tables......Page 16
Notes on Contributors......Page 17
Preface......Page 25
Introduction......Page 27
Part I: Formal Foundations......Page 35
2 Basic Notions......Page 37
3 Language Classes and Linguistic Formalisms......Page 40
4.1 Regular expressions......Page 41
4.2 Properties of regular languages......Page 42
4.3 Finite state automata......Page 43
4.4 Minimization and determinization......Page 45
4.5 Operations on finite state automata......Page 47
4.6 Applications of finite state automata in natural language processing......Page 48
4.7 Regular relations......Page 49
4.8 Finite state transducers......Page 51
4.9 Properties of regular relations......Page 52
5.1 Where regular languages fail......Page 54
5.2 Grammars......Page 55
5.3 Derivation......Page 56
5.4 Derivation trees......Page 58
5.5 Expressiveness......Page 60
5.6 Formal properties of context-free languages......Page 62
6.1 A hierarchy of language classes......Page 63
6.2 The location of natural languages in the hierarchy......Page 64
6.3 Weak and strong generative capacity......Page 65
7 Mildly Context-Sensitive Languages......Page 66
8 Further Reading......Page 67
1 A Brief Review of Complexity Theory......Page 69
1.1 Turing machines and models of computation......Page 70
1.2 Decision problems......Page 74
1.3 Relations between complexity classes......Page 78
1.4 Lower bounds......Page 79
2 Parsing and Recognition......Page 81
2.1 Regular languages......Page 82
2.2 Context-free languages......Page 83
2.3 More expressive grammar frameworks......Page 85
2.4 Model-theoretic semantics......Page 90
3 Complexity and Semantics......Page 91
4 Determining Logical Relationships between Sentences......Page 93
1 Introduction to Statistical Language Modeling......Page 100
1.1.1 Perplexity......Page 101
1.1.2 Task-specific measures......Page 102
1.2 Smoothing......Page 103
1.3 Language model representation in practice......Page 105
2 Structured Language Model......Page 107
2.1 Basic idea and terminology......Page 108
2.1.1 Word sequence and parse encoding......Page 109
2.2 Probabilistic model......Page 112
2.2.2 Modeling tool......Page 113
2.4 Left-to-right perplexity......Page 114
2.5 Separate left-to-right word predictor in the language model......Page 115
2.5.1.1 Headword percolation......Page 116
2.6 Model parameter re-estimation......Page 118
2.6.2 Second stage parameter re-estimation......Page 120
2.7 UPenn Treebank perplexity results......Page 121
2.7.1 Maximum depth factorization of the model......Page 122
2.7.1.1 Non-causal "Perplexity"......Page 123
3 Speech Recognition Lattice Rescoring Using the Structured Language Model......Page 125
4 Richer Syntactic Dependencies......Page 126
6 Conclusion......Page 128
Acknowledgment......Page 129
Notes......Page 130
1 Introduction......Page 131
2 Context-Free Grammars and Recognition......Page 133
3 Context-Free Parsing......Page 137
4 Probabilistic Parsing......Page 140
5 Lexicalized Context-Free Grammars......Page 142
6 Dependency Grammars......Page 146
7 Tree Adjoining Grammars......Page 149
8 Translation......Page 151
9 Further Reading......Page 155
Note......Page 156
Part II: Current Methods......Page 157
1 Introduction......Page 159
2 Maximum Entropy and Exponential Distributions......Page 162
3 Parameter Estimation......Page 164
3.1 Iterative scaling......Page 166
3.2 First-order methods......Page 167
3.3 Second-order methods......Page 168
3.4 Comparing parameter estimation methods......Page 169
4 Regularization......Page 171
5.1 Classification......Page 173
5.2 Sequence models......Page 175
5.3 Parsing models......Page 177
6 Prospects......Page 178
Note......Page 179
1 Introduction......Page 180
2 Memory-Based Language Processing......Page 181
2.1 MBLP: an operationalization of MBL......Page 182
3.1 Morpho-phonology......Page 186
3.3 Text analysis......Page 187
3.5 Generation, language modeling, and translation......Page 188
4 Exemplar-Based Computational Psycholinguistics......Page 189
5 Generalization and Abstraction......Page 191
6.1 Careful abstraction in memory-based learning......Page 193
6.2 Fambl: merging example families......Page 195
6.3 Experiments with Fambl......Page 198
7 Further Reading......Page 204
Notes......Page 205
1 NLP and Classification......Page 206
2 Induction of Decision Trees......Page 207
2.1 The splitting criterion......Page 208
2.3 Feature tests......Page 211
2.4 Pruning......Page 212
2.5.1 Bagging......Page 214
2.5.3 Random forests......Page 215
2.6 Decision trees for probability estimation......Page 216
3 NLP Applications......Page 217
3.2.1 Part-of-speech tagging......Page 218
3.2.3 Parameter estimation......Page 219
3.2.4 Estimation of contextual probabilities with decision trees......Page 220
5 Further Reading......Page 221
Notes......Page 222
1.1 Machine learning in natural language processing and computational linguistics......Page 223
1.2 Grammar induction as a machine learning problem......Page 224
1.3 Supervised learning......Page 225
2 Computational Learning Theory......Page 227
2.1 Summary......Page 232
3 Empirical Learning......Page 234
3.1 Learning word classes......Page 236
3.2 Unsupervised parsing......Page 237
3.3 Accuracy vs. cost in supervised, unsupervised, and semi-supervised learning......Page 240
4 Unsupervised Grammar Induction and Human Language Acquisition......Page 241
Notes......Page 245
2 Background......Page 247
2.1.1 The MLP architecture......Page 248
2.1.2 Learning in MLPs......Page 249
2.1.3 Probability estimation......Page 251
2.2 Recurrent MLPs......Page 252
3.1 Language modeling......Page 255
3.1.2 Neural syntactic language models......Page 256
3.2.1 Constituency parsing......Page 257
3.2.2 Dependency parsing......Page 258
3.2.3 Functional and semantic role parsing......Page 259
3.2.4 Semantic role tagging......Page 260
4 Further Reading......Page 261
Notes......Page 262
1 Introduction......Page 264
2 Review of Selected Annotation Schemes......Page 265
2.1.1 Phrase-structure treebanks......Page 267
2.1.2 Dependency treebanks......Page 268
2.2 Semantic classification, e.g., sense tagging......Page 270
2.2.1 Choosing the sense inventory......Page 271
2.3.1 The Proposition Bank......Page 272
2.3.3 VerbNet......Page 274
2.3.5 ACE relations......Page 275
2.4 TimeBank......Page 276
2.5.2 The Penn Discourse Treebank......Page 278
2.5.3 A comparison of the two approaches......Page 279
2.7 Opinion annotation......Page 281
2.8 Multi-layered annotation projects......Page 282
3.1 The phenomena to be annotated......Page 284
3.2 Choosing a target corpus......Page 285
3.2.1 Isolated sentences......Page 286
3.2.2 Parallel corpora......Page 287
3.3.2 Alternative guideline styles......Page 288
3.3.3 The annotation process......Page 289
3.4 Annotation infrastructure and tools......Page 290
3.4.3 User-friendly annotation tools......Page 291
3.5 Annotation evaluation......Page 292
3.6 Pre-processing......Page 294
4 Conclusion......Page 295
Notes......Page 296
1 Introduction......Page 297
2.1 Automatic and manual evaluations......Page 299
2.3 Intrinsic and extrinsic evaluations......Page 300
2.4 Component and end-to-end evaluations......Page 301
2.5 Inter-annotator agreement and upper bounds......Page 302
2.6 Partitioning of data used in evaluations......Page 303
2.7 Cross validation......Page 304
2.8 Summarizing and comparing performance......Page 305
3 Evaluation Paradigms in Common Evaluation Settings......Page 307
3.1 One output per input......Page 308
3.2 Multiple outputs per input......Page 309
3.3 Text output for each input......Page 310
3.4 Structured outputs......Page 311
3.5 Output values on a scale......Page 312
4.1 Pre-Senseval WSD evaluation......Page 314
4.2 Senseval......Page 315
5 Case Study: Evaluation of Question Answering Systems......Page 316
Notes......Page 319
Part III: Domains of Application......Page 323
1.1......Page 325
1.1.3 Style......Page 326
1.2 Learning from data......Page 327
1.3 Corpora and evaluation......Page 328
2.1 Acoustic features......Page 330
2.2 HMM/GMM framework......Page 332
2.3 Subword modeling......Page 336
2.4 Discriminative training......Page 338
2.5 Speaker adaptation......Page 341
3 Search......Page 344
4 Case Study: The AMI System......Page 347
5.1 Robustness......Page 350
5.2 Multiple knowledge sources......Page 353
5.3 Richer sequence models......Page 354
5.4 Large scale......Page 356
6 Conclusions......Page 357
Notes......Page 358
1 Introduction......Page 359
2 History......Page 362
3 Generative Parsing Models......Page 363
3.1 Collins - Models 1, 2, and 3......Page 364
3.2 Parameter estimation......Page 367
3.3 Parsing algorithm and search......Page 368
4 Discriminative Parsing Models......Page 369
4.1 Conditional log-linear models......Page 370
4.2 Discriminative dependency parsing......Page 371
5 Transition-Based Approaches......Page 375
5.1 Transition-based dependency parsing......Page 377
6 Statistical Parsing with CCG......Page 378
6.1 Combinatory categorial grammar......Page 379
6.2 Log-linear parsing model for CCG......Page 381
6.3 Efficient estimation......Page 383
6.4 Parsing in practice......Page 384
7 Other Work......Page 386
8 Conclusion......Page 387
Notes......Page 388
1.1 General remarks......Page 390
1.2 Morphology......Page 392
1.3 Static and dynamic metaphors......Page 395
2.1 The two problems of word segmentation......Page 396
2.2.1 Olivier......Page 399
2.2.3 Sequitur......Page 400
2.2.4 MDL approaches......Page 401
2.2.5 Hierarchical Bayesian models......Page 404
2.3 Word boundary detectors......Page 405
2.4 Successes and failures in word segmentation......Page 406
3.1 Zellig Harris......Page 407
3.2 Using description length......Page 408
3.3 Work in the field......Page 411
4 Implementing Computational Morphologies......Page 412
4.1 Finite state transducers......Page 413
4.2 Morphophonology......Page 416
5 Conclusions......Page 417
Notes......Page 418
1 Introduction......Page 420
1.1 Outline......Page 421
2 Background......Page 422
2.1 A standard approach......Page 423
2.2 Basic types......Page 426
2.3 Model theory and proof theory......Page 427
2.4 Lexical semantics......Page 428
3.1 Discourse......Page 429
3.1.1 Discourse representation theory......Page 431
3.1.3 Type theoretic approaches......Page 433
3.2.1 Cooper storage......Page 434
3.2.2 Other treatments of scope ambiguity......Page 436
4.1 Type theory......Page 437
4.1.2 First-order sorts......Page 438
4.1.4 Constructive theories and dependent types......Page 439
4.2 Intensionality......Page 440
4.2.2 Other approaches......Page 441
4.3.1 Questions and answers......Page 442
4.3.2 Imperatives......Page 445
4.4 Expressiveness, formal power, and computability......Page 447
5 Corpus-Based and Machine Learning Methods......Page 448
5.1 Latent semantic analysis......Page 449
5.3 Word-sense disambiguation......Page 450
5.4 Textual entailment......Page 451
Notes......Page 453
1 Introduction......Page 455
2 The Challenges of Dialogue......Page 456
2.1.1 Move classification......Page 458
2.1.2 Move characterization: queries and assertions......Page 459
2.1.3 Move characterization: meta-communication......Page 462
2.2.1 Sentential fragments......Page 466
2.2.2 Disfluencies......Page 468
3.1 Basic architecture of dialogue systems......Page 469
3.2.1 Finite state dialogue management......Page 470
3.2.2 Frame-based dialogue management......Page 471
3.2.3 Inference-based dialogue management......Page 472
3.3.1 Query and assertion benchmarks......Page 473
3.3.2 Meta-communication benchmarks......Page 475
3.3.3 Fragment understanding benchmarks......Page 476
3.4 The information state update framework......Page 477
4.1 Type theory with records: the basics......Page 479
4.2 Information states......Page 484
4.4 Move coherence......Page 485
4.5 Querying and assertion......Page 487
4.6 Domain specificity......Page 490
4.7 Meta-communicative interaction......Page 492
4.8 Disfluencies......Page 496
4.9 Sentential fragments......Page 498
4.9.2 Short answers......Page 499
4.9.4 Reprise fragments: intended content reading......Page 500
5.1 Automatic learning of dialogue management......Page 502
5.2 Multiparty dialogue......Page 503
6 Conclusions......Page 504
Acknowledgment......Page 505
Notes......Page 506
1 Introduction......Page 508
2 Computational Models of Human Language Processing......Page 509
2.1 Theories and models......Page 511
2.2 Experimental data......Page 512
3 Symbolic Models......Page 515
3.1 Ambiguity resolution......Page 516
3.2 Working memory......Page 518
4 Probabilistic Models......Page 519
4.1 Lexical processing......Page 520
4.2 Syntactic processing......Page 522
4.4 Information-theoretic models......Page 526
4.5 Probabilistic semantics......Page 527
5.1 Simple recurrent networks......Page 531
6 Hybrid Models......Page 535
7 Concluding Remarks......Page 537
Notes......Page 539
Part IV: Applications......Page 541
1 Introduction......Page 543
3 Name Extraction......Page 544
3.1 Hand coded rules......Page 545
3.2 Supervised learning......Page 546
3.4 Evaluation......Page 547
4 Entity Extraction......Page 548
5 Relation Extraction......Page 549
5.1 Hand coded rules and supervised methods......Page 550
5.2 Weakly supervised and unsupervised methods......Page 551
6 Event Extraction......Page 552
6.1 Hand coded rules......Page 553
6.3 Weakly supervised systems......Page 554
7 Concluding Remarks......Page 555
Notes......Page 556
1 Introduction......Page 557
2.1.1 Data......Page 558
2.1.2 Corpus clean-up, segmentation, and tokenization......Page 560
2.1.3 Word alignment......Page 561
2.1.4.1 Motivation for phrase-based models......Page 563
2.1.4.3 Refined word alignments for phrase extraction......Page 564
2.2 Reordering models......Page 565
2.2.1.2 Language model smoothing......Page 567
2.3.1 Minimum error rate training......Page 568
2.4.1 Translation options selection......Page 569
2.4.2 Future cost estimation......Page 570
2.4.4 n-best list generation......Page 571
2.5 MT evaluation......Page 572
2.6 Re-ranking......Page 573
3.1.1 Model......Page 574
3.1.3.1 Incorporating the language model......Page 575
3.2.2 Unsupervised tree-to-tree models......Page 576
3.2.3 Supervised tree-to-tree models......Page 577
3.2.4 Supervised tree-to-tree and tree-to-string model......Page 578
3.3 Example-based machine translation......Page 579
3.4 Rule-based machine translation......Page 580
3.5.1 Multi-engine MT......Page 582
4 MT Applications......Page 583
4.3 Spoken language translation......Page 584
5 Machine Translation at DCU......Page 585
5.1.1 Adding source-language context into PB-SMT......Page 586
5.2.2 Combining EBMT and PB-SMT chunks......Page 587
5.2.4 Tree-based translation......Page 588
5.2.5 Augmenting PB-SMT with subtree pairs......Page 590
5.3.1 Incorporating supertags into PB-SMT......Page 591
5.5 Future research directions......Page 593
6 Concluding Remarks and Future Directions......Page 594
7 Further Reading......Page 595
Notes......Page 597
1 High-Level Perspective: Making Choices about Language......Page 600
2.1 SumTime: Weather Forecasts......Page 601
2.2 Example NLG system: SkillSum......Page 603
2.3 Other NLG applications......Page 604
3.1 Document planner......Page 605
3.2 Microplanning......Page 607
3.3 Realization......Page 610
4 NLG Evaluation......Page 612
4.2 Non-task-based human evaluations......Page 613
4.3 Metric-based corpus evaluations......Page 614
5.1 Statistical approaches to NLG......Page 615
5.2.1 Language and the world: what do words mean?......Page 617
5.2.2 Data analysis for linguistic communication......Page 618
5.3 NLG outputs, the role of language in human-computer interaction......Page 619
5.3.1 Text and graphics......Page 620
5.3.3 User modeling......Page 621
5.4.1 Motivation and persuasion......Page 622
5.4.3 Entertainment......Page 623
Notes......Page 624
1 Discourse: Basic Notions and Terminology......Page 625
2.1 Text organization......Page 627
2.2 Text segmentation algorithm......Page 628
3.1 Hobbs's theory of coherence relations......Page 631
3.2 Rhetorical structure theory......Page 632
3.3 Centering......Page 633
4.1 Anaphora: linguistic fundamentals......Page 637
4.2 Anaphora resolution......Page 640
4.2.2 Location of the candidates for antecedents......Page 641
4.2.3 The resolution algorithm: factors in anaphora resolution......Page 642
4.3.1.1 Hobbs's naïve algorithm......Page 643
4.3.2.3 Baldwin's CogNIAC......Page 644
4.3.3 Comparing pronoun resolution algorithms......Page 645
5.1 Text organization and discourse segmentation applications......Page 646
5.2 Applications of discourse coherence theories......Page 648
5.3 Anaphora resolution applications......Page 650
6 Further Reading......Page 651
Acknowledgment......Page 653
Notes......Page 654
1 What is Question Answering?......Page 656
1.1 Early question answering systems......Page 657
1.2 Question answering in dialogue systems......Page 658
1.3 Question answering in TREC......Page 659
2.1 Question typing......Page 660
2.2.1 Relevance-based retrieval......Page 662
2.2.2 Pattern-based retrieval......Page 664
2.3 Processing answer candidates......Page 666
2.4 Evaluating performance in QA......Page 667
3.1 Extending the relation between question and corpus......Page 669
3.2 Broadening the range of answerable questions......Page 670
3.3 Relation between user and system......Page 673
3.4 Evaluating extended QA capabilities......Page 677
Note......Page 680
A......Page 681
B......Page 684
C......Page 692
D......Page 701
E......Page 704
F......Page 705
G......Page 708
H......Page 713
J......Page 718
K......Page 720
L......Page 725
M......Page 729
N......Page 737
O......Page 738
P......Page 740
R......Page 745
S......Page 749
T......Page 757
V......Page 760
W......Page 762
Y......Page 765
Z......Page 766
B......Page 768
C......Page 770
D......Page 771
F......Page 772
G......Page 773
H......Page 774
I......Page 775
K......Page 776
L......Page 777
M......Page 778
N......Page 780
P......Page 781
R......Page 782
S......Page 783
T......Page 785
W......Page 786
Z......Page 787
Subject Index......Page 789
SIMILAR VOLUMES
Work with Python and powerful open source tools such as Gensim and spaCy to perform modern text analysis, natural language processing, and computational linguistics algorithms. About This Book: Discover the open source Python text analysis ecosystem, using spaCy, Gensim, scikit-learn, and Keras…
Natural language processing (NLP) is a scientific discipline which is found at the interface of computer science, artificial intelligence, and cognitive psychology. Providing an overview of international work in this interdisciplinary field, this book gives the reader a panoramic view of both early…
This book takes an empirical approach to language processing, based on applying statistical and other machine-learning algorithms to large corpora. Methodology boxes are included in each chapter, and each chapter is built around one or more worked examples demonstrating its main idea.