𝔖 Scriptorium
✦   LIBER   ✦


Natural Language Processing with Transformers

โœ Scribed by Lewis Tunstall, Leandro von Werra, Thomas Wolf


Publisher: O'Reilly Media, Inc.
Year: 2021
Tongue: English
Leaves: 417
Category: Library

⬇  Acquire This Volume

No coin nor oath required. For personal study only.

✦ Synopsis


Since their introduction in 2017, Transformers have quickly become the dominant architecture for achieving state-of-the-art results on a variety of natural language processing tasks. If you're a data scientist or machine learning engineer, this practical book shows you how to train and scale these large models using Hugging Face Transformers, a Python-based deep learning library.

Transformers have been used to write realistic news stories, improve Google Search queries, and even create chatbots that tell corny jokes. In this guide, authors Lewis Tunstall, Leandro von Werra, and Thomas Wolf use a hands-on approach to teach you how Transformers work and how to integrate them into your applications. You'll quickly learn a variety of tasks they can help you solve.

Build, debug, and optimize Transformer models for core NLP tasks, such as text classification, named entity recognition, and question answering
Learn how Transformers can be used for cross-lingual transfer learning
Apply Transformers in real-world scenarios where labeled data is scarce
Make Transformer models efficient for deployment using techniques such as distillation, pruning, and quantization
Train Transformers from scratch and learn how to scale to multiple GPUs and distributed environments
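The knowledge-distillation technique listed above can be made concrete in a few lines. The sketch below is a framework-free illustration of the core idea, not code from the book: the student model is trained to match the teacher's temperature-softened output distribution, with the loss scaled by T² as in the standard formulation.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: a higher T flattens the distribution,
    exposing the teacher's 'dark knowledge' about non-target classes."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the student's soft predictions to the
    teacher's soft targets, scaled by T**2 so gradient magnitudes
    stay comparable across temperatures."""
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return temperature ** 2 * kl

# A student that matches the teacher exactly incurs zero loss;
# mismatched logits incur a positive loss.
print(distillation_loss([2.0, 0.5], [2.0, 0.5]))      # → 0.0
print(distillation_loss([0.1, 0.9], [2.0, 0.5]) > 0)  # → True
```

In practice (and in the book's chapter on efficient production models) this term is combined with the ordinary cross-entropy on the hard labels and minimized inside a training loop; the pure-Python version here only shows the shape of the objective.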

✦ Table of Contents


  1. Hello Transformers
    The Transformers Origin Story
    The Encoder-Decoder Framework
    Attention Mechanisms
    Transfer Learning in NLP
    Hugging Face Transformers: Bridging the Gap
    A Tour of Transformer Applications
    Text Classification
    Named Entity Recognition
    Question Answering
    Summarization
    Translation
    Text Generation
    The Hugging Face Ecosystem
    The Hugging Face Hub
    Hugging Face Tokenizers
    Hugging Face Datasets
    Hugging Face Accelerate
    Main Challenges With Transformers
    Conclusion
  2. Text Classification
    The Dataset
    A First Look at Hugging Face Datasets
    From Datasets to DataFrames
    Look at the Class Distribution
    How Long Are Our Tweets?
    From Text to Tokens
    Character Tokenization
    Word Tokenization
    Subword Tokenization
    Using Pretrained Tokenizers
    Training a Text Classifier
    Transformers as Feature Extractors
    Fine-tuning Transformers
    Further Improvements
    Conclusion
  3. Transformer Anatomy
    The Transformer
    Transformer Encoder
    Self-Attention
    Feed Forward Layer
    Putting It All Together
    Positional Embeddings
    Bodies and Heads
    Transformer Decoder
    Meet the Transformers
    The Transformer Tree of Life
    The Encoder Branch
    The Decoder Branch
    The Encoder-Decoder Branch
    Conclusion
  4. Question Answering
    Building a Review-Based QA System
    The Dataset
    Extracting Answers from Text
    Using Haystack to Build a QA Pipeline
    Improving Our QA Pipeline
    Evaluating the Retriever
    Evaluating the Reader
    Domain Adaptation
    Evaluating the Whole QA Pipeline
    Going Beyond Extractive QA
    Retrieval Augmented Generation
    Conclusion
  5. Making Transformers Efficient in Production
    Intent Detection as a Case Study
    Creating a Performance Benchmark
    Benchmarking Our Baseline Model
    Making Models Smaller via Knowledge Distillation
    Knowledge Distillation for Fine-tuning
    Knowledge Distillation for Pretraining
    Creating a Knowledge Distillation Trainer
    Choosing a Good Student Initialization
    Finding Good Hyperparameters with Optuna
    Benchmarking Our Distilled Model
    Making Models Faster with Quantization
    Quantization Strategies
    Quantizing Transformers in PyTorch
    Benchmarking Our Quantized Model
    Optimizing Inference with ONNX and the ONNX Runtime
    Optimizing for Transformer Architectures
    Making Models Sparser with Weight Pruning
    Sparsity in Deep Neural Networks
    Weight Pruning Methods
    Creating Masked Transformers
    Creating a Pruning Trainer
    Fine-Pruning With Increasing Sparsity
    Counting the Number of Pruned Weights
    Pruning Once and For All
    Quantizing and Storing in Sparse Format
    Conclusion
  6. Multilingual Named Entity Recognition
    The Dataset
    Multilingual Transformers
    mBERT
    XLM
    XLM-R
    Training a Named Entity Recognition Tagger
    SentencePiece Tokenization
    The Anatomy of the Transformers Model Class
    Bodies and Heads
    Creating Your Own XLM-R Model for Token Classification
    Loading a Custom Model
    Tokenizing and Encoding the Texts
    Performance Measures
    Fine-tuning XLM-RoBERTa
    Error Analysis
    Evaluating Cross-Lingual Transfer
    When Does Zero-Shot Transfer Make Sense?
    Fine-tuning on Multiple Languages at Once
    Building a Pipeline for Inference
    Conclusion
  7. Dealing With Few to No Labels
    Building a GitHub Issues Tagger
    Getting the Data
    Preparing the Data
    Creating Training Sets
    Creating Training Slices
    Implementing a Bayesline
    Working With No Labeled Data
    Zero-Shot Classification
    Working With A Few Labels
    Data Augmentation
    Using Embeddings as a Lookup Table
    Fine-tuning a Vanilla Transformer
    In-context and Few-shot Learning with Prompts
    Leveraging Unlabeled Data
    Fine-tuning a Language Model
    Fine-tuning a Classifier
    Advanced Methods
    Conclusion
  8. Text Generation
    The Challenge With Generating Coherent Text
    Greedy Search Decoding
    Beam Search Decoding
    Sampling Methods
    Which Decoding Method is Best?
    Conclusion
  9. Summarization
    The CNN/DailyMail Dataset
    Text Summarization Pipelines
    Summarization Baseline
    GPT-2
    T5
    BART
    PEGASUS
    Comparing Different Summaries
    Measuring the Quality of Generated Text
    BLEU
    ROUGE
    Evaluating PEGASUS on the CNN/DailyMail Dataset
    Training Your Own Summarization Model
    Evaluating PEGASUS on SAMSum
    Fine-Tuning PEGASUS
    Generating Dialogue Summaries
    Conclusion
  10. Training Transformers from Scratch
    Large Datasets and Where to Find Them
    Challenges with Building a Large Scale Corpus
    Building a Custom Code Dataset
    Working with Large Datasets
    Memory-mapping
    Streaming
    Adding Datasets to the Hugging Face Hub
    A Tale of Pretraining Objectives
    Building a Tokenizer
    The Tokenizer Pipeline
    The Tokenizer Model
    A Tokenization Pipeline for Python
    Training a Tokenizer
    Saving a Custom Tokenizer on the Hub
    Training a Model from Scratch
    Initialize Model
    Data Loader
    Training Loop with Accelerate
    Training Run
    Model Analysis
    Conclusion
  11. Future Directions
    Scaling Transformers
    Scaling Laws
    Challenges With Scaling
    Attention Please!
    Sparse Attention
    Linearized Attention
    Going Beyond Text
    Vision
    Tables
    Multimodal Transformers
    Speech-to-Text
    Vision and Text
    Where To From Here?

📜 SIMILAR VOLUMES


Natural Language Processing with Transformers
✍ Lewis Tunstall, Leandro von Werra, Thomas Wolf 📂 Library 📅 2021 🏛 O'Reilly Media, Inc. 🌐 English

Since their introduction in 2017, Transformers have quickly become the dominant architecture for achieving state-of-the-art results on a variety of natural language processing tasks. If you're a data scientist or machine learning engineer, this practical book shows you how to train and scale these large models…

Natural Language Processing with Transformers
✍ Lewis Tunstall, Leandro von Werra, Thomas Wolf 📂 Library 📅 2022 🏛 O'Reilly Media 🌐 English

Since their introduction in 2017, transformers have quickly become the dominant architecture for achieving state-of-the-art results on a variety of natural language processing tasks. If you're a data scientist or coder, this practical book, now revised in full color, shows you how to train…

Natural Language Processing with Transformers
✍ Lewis Tunstall, Leandro von Werra, Thomas Wolf 📂 Library 📅 2022 🏛 O'Reilly Media 🌐 English

Since their introduction in 2017, transformers have quickly become the dominant architecture for achieving state-of-the-art results on a variety of natural language processing tasks. If you're a data scientist or coder, this practical book shows you how to train and scale these large models…

Natural Language Processing Practical us…
✍ Tony Snake 📂 Library 📅 2022 🏛 Independently published 🌐 English

Learn how you can perform named entity recognition using HuggingFace Transformers and spaCy libraries in Python. Named Entity Recognition (NER) is a typical natural language processing (NLP) task that automatically identifies and recognizes predefined entities in a given text. Entities like person names…