Representation Learning for Natural Language Processing
Edited by Zhiyuan Liu, Yankai Lin, and Maosong Sun
- Publisher: Springer
- Year: 2023
- Language: English
- Pages: 535
- Edition: 2nd ed. 2023
- Category: Library
No payment or registration required. For personal study only.
Synopsis
This book provides an overview of the recent advances in representation learning theory, algorithms, and applications for natural language processing (NLP), ranging from word embeddings to pre-trained language models. It is divided into four parts. Part I presents representation learning techniques for multiple language entries, including words, sentences, and documents, as well as pre-training techniques. Part II introduces representation techniques related to NLP, covering graphs, cross-modal entries, and robustness. Part III introduces representation techniques for knowledge closely related to NLP, including entity-based world knowledge, sememe-based linguistic knowledge, legal domain knowledge, and biomedical domain knowledge. Lastly, Part IV discusses the remaining challenges and future research directions.
The theories and algorithms of representation learning presented here can also benefit other related domains such as machine learning, social network analysis, the Semantic Web, information retrieval, data mining, and computational biology. This book is intended for advanced undergraduate and graduate students, post-doctoral fellows, researchers, lecturers, and industrial engineers, as well as anyone interested in representation learning and natural language processing.
Compared to the first edition, the second edition (1) provides a more detailed introduction to representation learning in Chapter 1; (2) adds four new chapters introducing pre-trained language models, robust representation learning, legal knowledge representation learning, and biomedical knowledge representation learning; (3) updates recent advances in representation learning throughout all chapters; and (4) corrects some errors in the first edition. Approximately 50% of the content is new relative to the first edition.
This is an open access book.
Table of Contents
Preface
Book Organization
Book Cover
Note for the Second Edition
Prerequisites
Contact Information
Acknowledgments
Acknowledgments for the Second Edition
Acknowledgments for the First Edition
Contents
Contributors
Acronyms
Symbols and Notations
1 Representation Learning and NLP
1.1 Motivation
1.2 Why Representation Learning Is Important for NLP
1.2.1 Multiple Granularities
1.2.2 Multiple Knowledge
1.2.3 Multiple Tasks
1.2.4 Multiple Domains
1.3 Development of Representation Learning for NLP
1.3.1 Symbolic Representation and Statistical Learning
1.3.2 Distributed Representation and Deep Learning
1.3.3 Going Deeper and Larger with Pre-training on Big Data
1.4 Intellectual Origins of Distributed Representation
1.4.1 Representation Debates in Cognitive Neuroscience
1.4.2 Knowledge Representation in AI
1.4.3 Feature Engineering in Machine Learning
1.4.4 Linguistics
1.5 Representation Learning Approaches in NLP
1.5.1 Feature Engineering
1.5.2 Supervised Representation Learning
1.5.3 Self-supervised Representation Learning
1.6 How to Apply Representation Learning to NLP
1.6.1 Input Augmentation
1.6.2 Architecture Reformulation
1.6.3 Objective Regularization
1.6.4 Parameter Transfer
1.7 Advantages of Distributed Representation Learning
1.8 The Organization of This Book
References
2 Word Representation Learning
2.1 Introduction
2.2 Symbolic Word Representation
2.2.1 One-Hot Word Representation
2.2.2 Linguistic KB-based Word Representation
2.2.3 Corpus-based Word Representation
2.3 Distributed Word Representation
2.3.1 Preliminary: Interpreting the Representation
2.3.2 Matrix Factorization-based Word Representation
2.3.3 Word2vec and GloVe
2.3.4 Contextualized Word Representation
2.4 Advanced Topics
2.4.1 Informative Word Representation
2.4.2 Interpretable Word Representation
2.5 Applications
2.5.1 NLP
2.5.2 Cognitive Psychology
2.5.3 History and Social Science
2.6 Summary and Further Readings
References
3 Representation Learning for Compositional Semantics
3.1 Introduction
3.2 Binary Composition
3.2.1 Additive Model
3.2.2 Multiplicative Model
3.3 N-ary Composition
3.4 Summary and Further Readings
References
4 Sentence and Document Representation Learning
4.1 Introduction
4.2 Symbolic Sentence Representation
4.2.1 Bag-of-Words Model
4.2.2 Probabilistic Language Model
4.3 Neural Language Models
4.3.1 Feed-Forward Neural Network
4.3.2 Convolutional Neural Network
4.3.3 Recurrent Neural Network
4.3.4 Transformer
4.3.5 Enhancing Neural Language Models
4.4 From Sentence to Document Representation
4.4.1 Memory-Based Document Representation
4.4.2 Hierarchical Document Representation
4.5 Applications
4.5.1 Text Classification
4.5.2 Information Retrieval
4.5.3 Reading Comprehension
4.5.4 Open-Domain Question Answering
4.5.5 Sequence Labeling
4.5.6 Sequence-to-Sequence Generation
4.6 Summary and Further Readings
References
5 Pre-trained Models for Representation Learning
5.1 Introduction
5.2 Pre-training Tasks
5.2.1 Word-Level Pre-training
5.2.2 Sentence-Level Pre-training
5.3 Model Adaptation
5.3.1 Full-Parameter Fine-Tuning
5.3.2 Delta Tuning
5.3.3 Prompt Learning
5.4 Advanced Topics
5.4.1 Better Model Architecture
5.4.2 Multilingual Representation
5.4.3 Multi-Task Representation
5.4.4 Efficient Representation
5.4.5 Chain-of-Thought Reasoning
5.5 Summary and Further Readings
References
6 Graph Representation Learning
6.1 Introduction
6.2 Symbolic Graph Representation
6.3 Shallow Node Representation Learning
6.3.1 Spectral Clustering
6.3.2 Shallow Neural Networks
6.3.3 Matrix Factorization
6.4 Deep Node Representation Learning
6.4.1 Autoencoder-Based Methods
6.4.2 Graph Convolutional Networks
6.4.3 Graph Attention Networks
6.4.4 Graph Recurrent Networks
6.4.5 Graph Transformers
6.4.6 Extensions
6.5 From Node Representation to Graph Representation
6.5.1 Flat Pooling
6.5.2 Hierarchical Pooling
6.6 Self-Supervised Graph Representation Learning
6.7 Applications
6.8 Summary and Further Readings
References
7 Cross-Modal Representation Learning
7.1 Introduction
7.2 Cross-Modal Capabilities
7.3 Shallow Cross-Modal Representation Learning
7.4 Deep Cross-Modal Representation Learning
7.4.1 Cross-Modal Understanding
7.4.2 Cross-Modal Retrieval
7.4.3 Cross-Modal Generation
7.5 Deep Cross-Modal Pre-training
7.5.1 Input Representations
7.5.2 Model Architectures
7.5.3 Pre-training Tasks
7.5.4 Adaptation Approaches
7.6 Applications
7.7 Summary and Further Readings
References
8 Robust Representation Learning
8.1 Introduction
8.2 Backdoor Robustness
8.2.1 Backdoor Attack on Supervised Representation Learning
8.2.2 Backdoor Attack on Self-Supervised Representation Learning
8.2.3 Backdoor Defense
8.2.4 Toolkits
8.3 Adversarial Robustness
8.3.1 Adversarial Attack
8.3.2 Adversarial Defense
8.3.3 Toolkits
8.4 Out-of-Distribution Robustness
8.4.1 Spurious Correlation
8.4.2 Domain Shift
8.4.3 Subpopulation Shift
8.5 Interpretability
8.5.1 Understanding Model Functionality
8.5.2 Explaining Model Mechanism
8.6 Summary and Further Readings
References
9 Knowledge Representation Learning and Knowledge-Guided NLP
9.1 Introduction
9.2 Symbolic Knowledge and Model Knowledge
9.2.1 Symbolic Knowledge
9.2.2 Model Knowledge
9.2.3 Integrating Symbolic Knowledge and Model Knowledge
9.3 Knowledge Representation Learning
9.3.1 Linear Representation
9.3.2 Translation Representation
9.3.3 Neural Representation
9.3.4 Manifold Representation
9.3.5 Contextualized Representation
9.3.6 Summary
9.4 Knowledge-Guided NLP
9.4.1 Knowledge Augmentation
9.4.2 Knowledge Reformulation
9.4.3 Knowledge Regularization
9.4.4 Knowledge Transfer
9.4.5 Summary
9.5 Knowledge Acquisition
9.5.1 Sentence-Level Relation Extraction
9.5.2 Bag-Level Relation Extraction
9.5.3 Document-Level Relation Extraction
9.5.4 Few-Shot Relation Extraction
9.5.5 Open-Domain Relation Extraction
9.5.6 Contextualized Relation Extraction
9.5.7 Summary
9.6 Summary and Further Readings
References
10 Sememe-Based Lexical Knowledge Representation Learning
10.1 Introduction
10.2 Linguistic and Commonsense Knowledge Bases
10.2.1 WordNet and ConceptNet
10.2.2 HowNet
10.2.3 HowNet and Deep Learning
10.3 Sememe Knowledge Representation
10.3.1 Sememe-Encoded Word Representation
10.3.2 Sememe-Regularized Word Representation
10.4 Sememe-Guided Natural Language Processing
10.4.1 Sememe-Guided Semantic Compositionality Modeling
10.4.2 Sememe-Guided Language Modeling
10.4.3 Sememe-Guided Recurrent Neural Networks
10.5 Automatic Sememe Knowledge Acquisition
10.5.1 Embedding-Based Sememe Prediction
10.5.2 Sememe Prediction with Internal Information
10.5.3 Cross-lingual Sememe Prediction
10.5.4 Connecting HowNet with BabelNet
10.5.5 Summary and Discussion
10.6 Applications
10.6.1 Chinese LIWC Lexicon Expansion
10.6.2 Reverse Dictionary
10.7 Summary and Further Readings
References
11 Legal Knowledge Representation Learning
11.1 Introduction
11.2 Typical Tasks and Real-World Applications
11.3 Legal Knowledge Representation and Acquisition
11.3.1 Legal Textual Knowledge
11.3.2 Legal Structured Knowledge
11.3.3 Discussion
11.4 Knowledge-Guided Legal NLP
11.4.1 Input Augmentation
11.4.2 Architecture Reformulation
11.4.3 Objective Regularization
11.4.4 Parameter Transfer
11.5 Outlook
11.6 Ethical Consideration
11.7 Open Competitions and Benchmarks
11.8 Summary and Further Readings
References
12 Biomedical Knowledge Representation Learning
12.1 Introduction
12.1.1 Perspectives for Biomedical NLP
12.1.2 Role of Knowledge in Biomedical NLP
12.2 Biomedical Knowledge Representation and Acquisition
12.2.1 Biomedical Knowledge from Natural Language
12.2.2 Biomedical Knowledge from Biomedical Language Materials
12.3 Knowledge-Guided Biomedical NLP
12.3.1 Input Augmentation
12.3.2 Architecture Reformulation
12.3.3 Objective Regularization
12.3.4 Parameter Transfer
12.4 Typical Applications
12.4.1 Literature Processing
12.4.2 Retrosynthetic Prediction
12.4.3 Diagnosis Assistance
12.5 Advanced Topics
12.6 Summary and Further Readings
References
13 OpenBMB: Big Model Systems for Large-Scale Representation Learning
13.1 Introduction
13.2 BMTrain: Efficient Training Toolkit for Big Models
13.2.1 Data Parallelism
13.2.2 ZeRO Optimization
13.2.3 Quickstart of BMTrain
13.3 OpenPrompt and OpenDelta: Efficient Tuning Toolkit for Big Models
13.3.1 Serving Multiple Tasks with a Unified Big Model
13.3.2 Quickstart of OpenPrompt
13.3.3 Quickstart of OpenDelta
13.4 BMCook: Efficient Compression Toolkit for Big Models
13.4.1 Model Quantization
13.4.2 Model Distillation
13.4.3 Model Pruning
13.4.4 Model MoEfication
13.4.5 Quickstart of BMCook
13.5 BMInf: Efficient Inference Toolkit for Big Models
13.5.1 Accelerating Big Model Inference
13.5.2 Reducing the Memory Footprint of Big Models
13.5.3 Quickstart of BMInf
13.6 Summary and Further Readings
References
14 Ten Key Problems of Pre-trained Models: An Outlook of Representation Learning
14.1 Pre-trained Models: New Era of Representation Learning
14.2 Ten Key Problems of Pre-trained Models
14.2.1 P1: Theoretical Foundation of Pre-trained Models
14.2.2 P2: Next-Generation Model Architecture
14.2.3 P3: High-Performance Computing of Big Models
14.2.4 P4: Effective and Efficient Adaptation
14.2.5 P5: Controllable Generation with Pre-trained Models
14.2.6 P6: Safe and Ethical Big Models
14.2.7 P7: Cross-Modal Computation
14.2.8 P8: Cognitive Learning
14.2.9 P9: Innovative Applications of Big Models
14.2.10 P10: Big Model Systems Accessible to Users
14.3 Summary
References