Transfer Learning for Natural Language Processing
by Paul Azunre
- Publisher
- Manning Publications
- Year
- 2021
- Language
- English
- Pages
- 266
- Edition
- 1
- Category
- Library
No payment or registration required. For personal study only.
Synopsis
about the technology
Transfer learning enables machine learning models to be initialized with existing prior knowledge. Initially pioneered in computer vision, transfer learning techniques have been revolutionizing natural language processing, with big reductions in the training time and computing power needed for a model to start delivering results. Emerging pretrained language models such as ELMo and BERT have opened up new possibilities for NLP developers working in machine translation, semantic analysis, business analytics, and natural language generation.
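To make the idea concrete, here is a minimal sketch of the pattern described above, using the Hugging Face transformers library that part 3 of the book introduces: the model is initialized from pretrained BERT weights rather than trained from scratch, and only a small classification head has to learn the new task. The checkpoint name and example inputs are illustrative assumptions, not code from the book.

```python
# Minimal transfer learning sketch (illustrative; not from the book):
# start from pretrained BERT weights instead of training from scratch,
# then fine-tune a small classification head on task-specific data.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",  # pretrained prior knowledge
    num_labels=2,         # e.g., spam vs. not spam
)

# Hypothetical inputs; in practice these come from your labeled corpus.
batch = tokenizer(
    ["Win a free prize now!!!", "Meeting moved to 3 pm."],
    padding=True, truncation=True, return_tensors="pt",
)
logits = model(**batch).logits
print(logits.shape)  # torch.Size([2, 2]): one score per class per example
```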
about the book
Transfer Learning for Natural Language Processing is a practical primer to transfer learning techniques capable of delivering huge improvements to your NLP models. Written by DARPA researcher Paul Azunre, this book gets you up to speed with the relevant ML concepts before diving into the cutting-edge advances that are defining the future of NLP. You'll learn how to adapt existing state-of-the-art models into real-world applications, including building a spam email classifier, a movie review sentiment analyzer, an automated fact checker, a question-answering system, and a translation system for low-resource languages.
what's inside
- Fine-tuning pretrained models with new domain data
- Picking the right model to reduce resource usage (see the sketch after this list)
- Transfer learning for neural network architectures
- Foundations for exploring NLP academic literature
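As a rough illustration of the second item above: a distilled model such as DistilBERT retains most of BERT's accuracy with roughly 40% fewer parameters, the kind of trade-off that chapter 9's knowledge distillation material explores. The checkpoint names below are assumptions about current Hugging Face hub models, not code from the book.

```python
# Sketch: compare parameter counts when picking a lighter model
# (checkpoint names assumed from the Hugging Face hub).
from transformers import AutoModel

for name in ["bert-base-uncased", "distilbert-base-uncased"]:
    model = AutoModel.from_pretrained(name)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name}: {n_params / 1e6:.0f}M parameters")
```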
about the reader
For machine learning engineers and data scientists with some experience in NLP.
about the author
Paul Azunre holds a PhD in Computer Science from MIT and has served as a Principal Investigator on several DARPA research programs. He founded Algorine Inc., a research lab dedicated to advancing AI/ML and identifying scenarios where they can have a significant social impact. Paul also co-founded Ghana NLP, an open source initiative focused on using NLP and transfer learning with Ghanaian and other low-resource languages. He frequently contributes to major peer-reviewed international research journals and serves as a program committee member at top conferences in the field.
Table of Contents
contents
preface
acknowledgments
about this book
Who should read this book?
Road map
Software requirements
About the code
liveBook discussion forum
about the author
about the cover illustration
Part 1: Introduction and overview
1 What is transfer learning?
1.1 Overview of representative NLP tasks
1.2 Understanding NLP in the context of AI
1.2.1 Artificial intelligence (AI)
1.2.2 Machine learning
1.2.3 Natural language processing (NLP)
1.3 A brief history of NLP advances
1.3.1 General overview
1.3.2 Recent transfer learning advances
1.4 Transfer learning in computer vision
1.4.1 General overview
1.4.2 Pretrained ImageNet models
1.4.3 Fine-tuning pretrained ImageNet models
1.5 Why is NLP transfer learning an exciting topic to study now?
Summary
2 Getting started with baselines: Data preprocessing
2.1 Preprocessing email spam classification example data
2.1.1 Loading and visualizing the Enron corpus
2.1.2 Loading and visualizing the fraudulent email corpus
2.1.3 Converting the email text into numbers
2.2 Preprocessing movie sentiment classification example data
2.3 Generalized linear models
2.3.1 Logistic regression
2.3.2 Support vector machines (SVMs)
Summary
3 Getting started with baselines: Benchmarking and optimization
3.1 Decision-tree-based models
3.1.1 Random forests (RFs)
3.1.2 Gradient-boosting machines (GBMs)
3.2 Neural network models
3.2.1 Embeddings from Language Models (ELMo)
3.2.2 Bidirectional Encoder Representations from Transformers (BERT)
3.3 Optimizing performance
3.3.1 Manual hyperparameter tuning
3.3.2 Systematic hyperparameter tuning
Summary
Part 2: Shallow transfer learning and deep transfer learning with recurrent neural networks (RNNs)
4 Shallow transfer learning for NLP
4.1 Semisupervised learning with pretrained word embeddings
4.2 Semisupervised learning with higher-level representations
4.3 Multitask learning
4.3.1 Problem setup and a shallow neural single-task baseline
4.3.2 Dual-task experiment
4.4 Domain adaptation
Summary
5 Preprocessing data for recurrent neural network deep transfer learning experiments
5.1 Preprocessing tabular column-type classification data
5.1.1 Obtaining and visualizing tabular data
5.1.2 Preprocessing tabular data
5.1.3 Encoding preprocessed data as numbers
5.2 Preprocessing fact-checking example data
5.2.1 Special problem considerations
5.2.2 Loading and visualizing fact-checking data
Summary
6 Deep transfer learning for NLP with recurrent neural networks
6.1 Semantic Inference for the Modeling of Ontologies (SIMOn)
6.1.1 General neural architecture overview
6.1.2 Modeling tabular data
6.1.3 Application of SIMOn to tabular column-type classification data
6.2 Embeddings from Language Models (ELMo)
6.2.1 ELMo bidirectional language modeling
6.2.2 Application to fake news detection
6.3 Universal Language Model Fine-Tuning (ULMFiT)
6.3.1 Target task language model fine-tuning
6.3.2 Target task classifier fine-tuning
Summary
Part 3: Deep transfer learning with transformers and adaptation strategies
7 Deep transfer learning for NLP with the transformer and GPT
7.1 The transformer
7.1.1 An introduction to the transformers library and attention visualization
7.1.2 Self-attention
7.1.3 Residual connections, encoder-decoder attention, and positional encoding
7.1.4 Application of pretrained encoder-decoder to translation
7.2 The Generative Pretrained Transformer
7.2.1 Architecture overview
7.2.2 Transformers pipelines introduction and application to text generation
7.2.3 Application to chatbots
Summary
8 Deep transfer learning for NLP with BERT and multilingual BERT
8.1 Bidirectional Encoder Representations from Transformers (BERT)
8.1.1 Model architecture
8.1.2 Application to question answering
8.1.3 Application to fill in the blanks and next-sentence prediction tasks
8.2 Cross-lingual learning with multilingual BERT (mBERT)
8.2.1 Brief JW300 dataset overview
8.2.2 Transfer mBERT to monolingual Twi data with the pretrained tokenizer
8.2.3 mBERT and tokenizer trained from scratch on monolingual Twi data
Summary
9 ULMFiT and knowledge distillation adaptation strategies
9.1 Gradual unfreezing and discriminative fine-tuning
9.1.1 Pretrained language model fine-tuning
9.1.2 Target task classifier fine-tuning
9.2 Knowledge distillation
9.2.1 Transfer DistilmBERT to monolingual Twi data with pretrained tokenizer
Summary
10 ALBERT, adapters, and multitask adaptation strategies
10.1 Embedding factorization and cross-layer parameter sharing
10.1.1 Fine-tuning pretrained ALBERT on MDSD book reviews
10.2 Multitask fine-tuning
10.2.1 General Language Understanding Evaluation (GLUE)
10.2.2 Fine-tuning on a single GLUE task
10.2.3 Sequential adaptation
10.3 Adapters
Summary
11 Conclusions
11.1 Overview of key concepts
11.2 Other emerging research trends
11.2.1 RoBERTa
11.2.2 GPT-3
11.2.3 XLNet
11.2.4 BigBird
11.2.5 Longformer
11.2.6 Reformer
11.2.7 T5
11.2.8 BART
11.2.9 XLM
11.2.10 TAPAS
11.3 Future of transfer learning in NLP
11.4 Ethical and environmental considerations
11.5 Staying up-to-date
11.5.1 Kaggle and Zindi competitions
11.5.2 arXiv
11.5.3 News and social media (Twitter)
11.6 Final words
Summary
Appendix A: Kaggle primer
A.1 Free GPUs with Kaggle kernels
A.2 Competitions, discussion, and blog
Appendix B: Introduction to fundamental deep learning tools
B.1 Stochastic gradient descent
B.2 TensorFlow
B.3 PyTorch
B.4 Keras, fast.ai, and Transformers by Hugging Face
index
Similar Books
Build custom NLP models in record time by adapting pre-trained machine learning models to solve specialized problems. Summary: In Transfer Learning for Natural Language Processing you will learn: fine tuning pretrained models with new domain data; picking the right mod…
Humans do a great job of reading text, identifying key ideas, summarizing, making connections, and other tasks that require comprehension and context. Recent advances in deep learning make it possible for computer systems to achieve similar results. Deep Learning for Nat…
This open access book provides an overview of the recent advances in representation learning theory, algorithms and applications for natural language processing (NLP). It is divided into three parts. Part I presents the representation learning techniques for multiple language entries, i…
This book provides an overview of the recent advances in representation learning theory, algorithms, and applications for natural language processing (NLP), ranging from word embeddings to pre-trained language models. It is divided into four parts. Part I presents the representation learning techniq…