Summary
Natural Language Processing in Action is your guide to creating machines that understand human language using the power of Python with its ecosystem of packages dedicated to NLP and AI. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.
Natural Language Processing in Action
By Hobson Lane and Maria Dyshel
- Publisher: Manning Publications
- Year: 2023
- Language: English
- Pages: 406
- Edition: 2
- Category: Library
Table of Contents
Natural Language Processing in Action, Second Edition MEAP V08
Copyright
Welcome
Brief contents
Chapter 1: Machines that read and write (NLP overview)
1.1 Programming language vs. natural language
1.1.1 Natural Language Understanding (NLU)
1.1.2 Natural Language Generation (NLG)
1.1.3 Plumbing it all together for positive impact
1.2 The magic
1.2.1 Language and thought
1.2.2 Machines that converse
1.2.3 The math
1.3 Applications
1.3.1 Processing programming languages with NLP
1.4 Language through a computer's "eyes"
1.4.1 The language of locks
1.4.2 Regular expressions
1.5 A simple chatbot
1.6 Keyword-based greeting recognizer
1.6.1 Pattern-based intent recognition
1.6.2 Another way
1.7 A brief overflight of hyperspace
1.8 Word order and grammar
1.9 A chatbot natural language pipeline
1.10 Processing in depth
1.11 Natural language IQ
1.12 Review
1.13 Summary
Chapter 2: Tokens of thought (natural language words)
2.1 Tokens of emotion
2.2 What is a token?
2.2.1 Alternative tokens
2.3 Challenges (a preview of stemming)
2.3.1 Tokenization
2.4 Your tokenizer toolbox
2.4.1 The simplest tokenizer
2.4.2 Rule-based tokenization
2.4.3 SpaCy
2.4.4 Tokenizer race
2.5 Wordpiece tokenizers
2.5.1 Clumping characters into sentence pieces
2.6 Vectors of tokens
2.6.1 One-hot Vectors
2.6.2 BOW (Bag-of-Words) Vectors
2.6.3 Dot product
2.7 Challenging tokens
2.7.1 A complicated picture
2.7.2 Extending your vocabulary with n-grams
2.7.3 Normalizing your vocabulary
2.8 Sentiment
2.8.1 VADER: A rule-based sentiment analyzer
2.8.2 Closeness of vectors
2.8.3 Count vectorizing
2.8.4 Naive Bayes
2.9 Review
2.10 Summary
Chapter 3: Math with words (TF-IDF vectors)
3.1 Bag of words
3.2 Vectorizing text
3.2.1 An easier way to vectorize text
3.2.2 Vectorize your code
3.2.3 Vector spaces
3.3 Bag of n-grams
3.3.1 Analyzing this
3.4 Zipf's Law
3.5 Inverse Document Frequency
3.5.1 Return of Zipf
3.5.2 Relevance ranking
3.5.3 Another vectorizer
3.5.4 Alternatives
3.5.5 Okapi BM25
3.6 Using TF-IDF for your bot
3.7 What's next
3.8 Review
3.9 Summary
Chapter 4: Finding meaning in word counts (semantic analysis)
4.1 From word counts to topic scores
4.1.1 The limitations of TF-IDF vectors and lemmatization
4.1.2 Topic vectors
4.1.3 Thought experiment
4.1.4 Algorithms for scoring topics
4.2 The challenge: detecting toxicity
4.2.1 Linear Discriminant Analysis classifier
4.2.2 Going beyond linear
4.3 Reducing dimensions
4.3.1 Enter Principal Component Analysis
4.3.2 Singular Value Decomposition
4.4 Latent Semantic Analysis
4.4.1 Diving into semantic analysis
4.4.2 TruncatedSVD or PCA?
4.4.3 How well does LSA perform for toxicity detection?
4.4.4 Other ways to reduce dimensions
4.5 Latent Dirichlet allocation (LDiA)
4.5.1 The LDiA idea
4.5.2 LDiA topic model for comments
4.5.3 Detecting toxicity with LDiA
4.5.4 A fairer comparison: 32 LDiA topics
4.6 Distance and similarity
4.7 Steering with feedback
4.8 Topic vector power
4.8.1 Semantic search
4.9 Equipping your bot with semantic search
4.10 What's Next?
4.11 Review
4.12 Summary
Chapter 5: Word brain (neural networks)
5.1 Why neural networks?
5.1.1 Neural networks for words
5.1.2 Neurons as feature engineers
5.1.3 Biological neurons
5.1.4 Perceptron
5.1.5 A Python perceptron
5.2 Example logistic neuron
5.2.1 The logistics of clickbait
5.2.2 Sex education
5.2.3 Pronouns and gender vs sex
5.2.4 Sex logistics
5.2.5 A sleek sexy PyTorch neuron
5.3 Skiing down the error surface
5.3.1 Off the chair lift, onto the slope - gradient descent and local minima
5.3.2 Shaking things up: stochastic gradient descent
5.3.3 PyTorch: Neural networks in Python
5.4 Review
5.5 Summary
Chapter 6: Reasoning with word embeddings (word vectors)
6.1 This is your brain on words
6.2 Applications
6.2.1 Search for meaning
6.2.2 Combining word embeddings
6.2.3 Analogy questions
6.2.4 Word2Vec Innovation
6.3 Artificial Intelligence Relies on Embeddings
6.4 Word2Vec
6.4.1 Analogy reasoning
6.4.2 Learning word embeddings
6.4.3 Contextualized embeddings
6.4.4 Learning meaning without a dictionary
6.4.5 Computational tricks of Word2Vec
6.4.6 Using the gensim.word2vec module
6.4.7 Generating your own Word vector representations
6.4.8 Word2Vec vs GloVe (Global Vectors)
6.4.9 fastText
6.4.10 Word2Vec vs LSA
6.4.11 Visualizing word relationships
6.4.12 Making Connections
6.4.13 Unnatural words
6.5 Summary
6.6 Review
Chapter 7: Finding kernels of knowledge in text with Convolutional Neural Networks (CNNs)
7.1 Patterns in sequences of words
7.2 Scale and Translation Invariance
7.3 Convolution
7.3.1 Stencils for natural language text
7.3.2 A bit more stenciling
7.3.3 Correlation vs. convolution
7.3.4 Convolution as a mapping function
7.3.5 Python convolution example
7.3.6 PyTorch 1-D CNN on 4-D embedding vectors
7.3.7 Natural examples
7.4 Morse code
7.4.1 Decoding Morse with convolution
7.5 Building a CNN with PyTorch
7.5.1 Clipping and Padding
7.5.2 Better representation with word embeddings
7.5.3 Transfer learning
7.5.4 Robustifying your CNN with dropout
7.6 PyTorch CNN to process disaster toots
7.6.1 Network architecture
7.6.2 Pooling
7.6.3 Linear layer
7.6.4 Getting fit
7.6.5 Hyperparameter Tuning
7.7 Review
7.8 Summary
Chapter 8: Reduce, reuse, recycle your words (RNNs and LSTMs)
8.1 What are RNNs good for?
8.1.1 RNNs remember everything you tell them
8.1.2 RNNs hide their understanding
8.1.3 RNNs remember everything you tell them
8.2 Predict someone's nationality from only their last name
8.2.1 Build an RNN from scratch
8.2.2 Training an RNN, one token at a time
8.2.3 Understanding the results
8.2.4 Multiclass classifiers vs multi-label taggers
8.3 Backpropagation through time
8.3.1 Initializing the hidden layer in an RNN
8.4 Remembering with recurrent networks
8.4.1 Word-level Language Models
8.4.2 Gated Recurrent Units (GRUs)
8.4.3 Long and Short-Term Memory (LSTM)
8.4.4 Give your RNN a tuneup
8.5 Predicting
8.6 Review
8.7 Summary
Notes
Develop your NLP skills from scratch! This revised bestseller now includes coverage of the latest Python packages, Transformers, the HuggingFace packages, and chatbot frameworks. In Natural Language Processing in Action, Second Edition you will learn how to process, analyze, understand, and generate natural language text.
About the Technology
Recent advances in deep learning empower applications to understand text and speech with extreme accuracy.