Deep Learning Approach for Natural Language Processing, Speech, and Computer Vision: Techniques and Use Cases

✍ Scribed by L. Ashok Kumar, D. Karthika Renuka

Publisher: CRC Press
Year: 2023
Language: English
Pages: 246
Edition: 1
Category: Library

✦ Synopsis

Deep Learning Approach for Natural Language Processing, Speech, and Computer Vision provides an overview of general deep learning methodology and its applications to natural language processing (NLP), speech, and computer vision tasks. It presents the concepts of deep learning in a simplified yet comprehensive manner, with full-fledged examples of deep learning models, and aims to bridge the gap between theory and applications through case studies with code, experiments, and supporting analysis.

Features:

Covers the latest developments in deep learning techniques as applied to audio analysis, computer vision, and NLP
Introduces contemporary applications of deep learning techniques as applied to audio, textual, and visual processing
Explores deep learning frameworks and libraries for NLP, speech, and computer vision in Python
Gives insights into using the tools and libraries in Python for real-world applications
Provides easily accessible tutorials and real-world case studies with code for hands-on experience

This book is aimed at researchers and graduate students in computer engineering and in image, speech, and text processing.

✦ Table of Contents

Cover
Half Title
Title
Copyright
Dedication
Contents
About the Authors
Preface
Acknowledgments
Chapter 1 Introduction
Learning Outcomes
1.1 Introduction
1.1.1 Subsets of Artificial Intelligence
1.1.2 Three Horizons of Deep Learning Applications
1.1.3 Natural Language Processing
1.1.4 Speech Recognition
1.1.5 Computer Vision
1.2 Machine Learning Methods for NLP, Computer Vision (CV), and Speech
1.2.1 Support Vector Machine (SVM)
1.2.2 Bagging
1.2.3 Gradient-boosted Decision Trees (GBDTs)
1.2.4 Naïve Bayes
1.2.5 Logistic Regression
1.2.6 Dimensionality Reduction Techniques
1.3 Tools, Libraries, Datasets, and Resources for the Practitioners
1.3.1 TensorFlow
1.3.2 Keras
1.3.3 Deeplearning4j
1.3.4 Caffe
1.3.5 ONNX
1.3.6 PyTorch
1.3.7 scikit-learn
1.3.8 NumPy
1.3.9 Pandas
1.3.10 NLTK
1.3.11 Gensim
1.3.12 Datasets
1.4 Summary
Bibliography
Chapter 2 Natural Language Processing
Learning Outcomes
2.1 Natural Language Processing
2.2 Generic NLP Pipeline
2.2.1 Data Acquisition
2.2.2 Text Cleaning
2.3 Text Pre-processing
2.3.1 Noise Removal
2.3.2 Stemming
2.3.3 Tokenization
2.3.4 Lemmatization
2.3.5 Stop Word Removal
2.3.6 Parts of Speech Tagging
2.4 Feature Engineering
2.5 Modeling
2.5.1 Start with Simple Heuristics
2.5.2 Building Your Model
2.5.3 Metrics to Build Model
2.6 Evaluation
2.7 Deployment
2.8 Monitoring and Model Updating
2.9 Vector Representation for NLP
2.9.1 One Hot Vector Encoding
2.9.2 Word Embeddings
2.9.3 Bag of Words
2.9.4 TF-IDF
2.9.5 N-gram
2.9.6 Word2Vec
2.9.7 Glove
2.9.8 ElMo
2.10 Language Modeling with n-grams
2.10.1 Evaluating Language Models
2.10.2 Smoothing
2.10.3 Kneser-Ney Smoothing
2.11 Vector Semantics and Embeddings
2.11.1 Lexical Semantics
2.11.2 Vector Semantics
2.11.3 Cosine for Measuring Similarity
2.11.4 Bias and Embeddings
2.12 Summary
Bibliography
Chapter 3 State-of-the-Art Natural Language Processing
Learning Outcomes
3.1 Introduction
3.2 Sequence-to-Sequence Models
3.2.1 Sequence
3.2.2 Sequence Labeling
3.2.3 Sequence Modeling
3.3 Recurrent Neural Networks
3.3.1 Unrolling RNN
3.3.2 RNN-based POS Tagging Use Case
3.3.3 Challenges in RNN
3.4 Attention Mechanisms
3.4.1 Self-attention Mechanism
3.4.2 Multi-head Attention Mechanism
3.4.3 Bahdanau Attention
3.4.4 Luong Attention
3.4.5 Global Attention versus Local Attention
3.4.6 Hierarchical Attention
3.5 Transformer Model
3.5.1 Bidirectional Encoder, Representations, and Transformers (BERT)
3.5.2 GPT3
3.6 Summary
Bibliography
Chapter 4 Applications of Natural Language Processing
Learning Outcomes
4.1 Introduction
4.2 Word Sense Disambiguation
4.2.1 Word Senses
4.2.2 WordNet: A Database of Lexical Relations
4.2.3 Approaches to Word Sense Disambiguation
4.2.4 Applications of Word Sense Disambiguation
4.3 Text Classification
4.3.1 Building the Text Classification Model
4.3.2 Applications of Text Classification
4.3.3 Other Applications
4.4 Sentiment Analysis
4.4.1 Types of Sentiment Analysis
4.5 Spam Email Classification
4.5.1 History of Spam
4.5.2 Spamming Techniques
4.5.3 Types of Spams
4.6 Question Answering
4.6.1 Components of Question Answering System
4.6.2 Information Retrieval-based Factoid Question and Answering
4.6.3 Entity Linking
4.6.4 Knowledge-based Question Answering
4.7 Chatbots and Dialog Systems
4.7.1 Properties of Human Conversation
4.7.2 Chatbots
4.7.3 The Dialog-state Architecture
4.8 Summary
Bibliography
Chapter 5 Fundamentals of Speech Recognition
Learning Outcomes
5.1 Introduction
5.2 Structure of Speech
5.3 Basic Audio Features
5.3.1 Pitch
5.3.2 Timbral Features
5.3.3 Rhythmic Features
5.3.4 MPEG-7 Features
5.4 Characteristics of Speech Recognition System
5.4.1 Pronunciations
5.4.2 Vocabulary
5.4.3 Grammars
5.4.4 Speaker Dependence
5.5 The Working of a Speech Recognition System
5.5.1 Input Speech
5.5.2 Audio Pre-processing
5.5.3 Feature Extraction
5.6 Audio Feature Extraction Techniques
5.6.1 Spectrogram
5.6.2 MFCC
5.6.3 Short-Time Fourier Transform
5.6.4 Linear Prediction Coefficients (LPCC)
5.6.5 Discrete Wavelet Transform (DWT)
5.6.6 Perceptual Linear Prediction (PLP)
5.7 Statistical Speech Recognition
5.7.1 Acoustic Model
5.7.2 Pronunciation Model
5.7.3 Language Model
5.7.4 Conventional ASR Approaches
5.8 Speech Recognition Applications
5.8.1 In Banking
5.8.2 In-Car Systems
5.8.3 Health Care
5.8.4 Experiments by Different Speech Groups for Large-Vocabulary Speech Recognition
5.8.5 Measure of Performance
5.9 Challenges in Speech Recognition
5.9.1 Vocabulary Size
5.9.2 Speaker-Dependent or -Independent
5.9.3 Isolated, Discontinuous, and Continuous Speech
5.9.4 Phonetics
5.9.5 Adverse Conditions
5.10 Open-source Toolkits for Speech Recognition
5.10.1 Frameworks
5.10.2 Additional Tools and Libraries
5.11 Summary
Bibliography
Chapter 6 Deep Learning Models for Speech Recognition
Learning Outcomes
6.1 Traditional Methods of Speech Recognition
6.1.1 Hidden Markov Models (HMMs)
6.1.2 Gaussian Mixture Models (GMMs)
6.1.3 Artificial Neural Network (ANN)
6.1.4 HMM and ANN Acoustic Modeling
6.1.5 Deep Belief Neural Network (DBNN) for Acoustic Modelling
6.2 RNN-based Encoder–Decoder Architecture
6.3 Encoder
6.4 Decoder
6.5 Attention-based Encoder–Decoder Architecture
6.6 Challenges in Traditional ASR and the Motivation for End-to-End ASR
6.7 Summary
Bibliography
Chapter 7 End-to-End Speech Recognition Models
Learning Outcomes
7.1 End-to-End Speech Recognition Models
7.1.1 Definition of End-to-End ASR System
7.1.2 Connectionist Temporal Classification (CTC)
7.1.3 Deep Speech
7.1.4 Deep Speech 2
7.1.5 Listen, Attend, Spell (LAS) Model
7.1.6 JASPER
7.1.7 QuartzNet
7.2 Self-supervised Models for Automatic Speech Recognition
7.2.1 Wav2Vec
7.2.2 Data2Vec
7.2.3 HuBERT
7.3 Online/Streaming ASR
7.3.1 RNN-transducer-Based Streaming ASR
7.3.2 Wav2Letter for Streaming ASR
7.3.3 Conformer Model
7.4 Summary
Bibliography
Chapter 8 Computer Vision Basics
Learning Outcomes
8.1 Introduction
8.1.1 Fundamental Steps for Computer Vision
8.1.2 Fundamental Steps in Digital Image Processing
8.2 Image Segmentation
8.2.1 Steps in Image Segmentation
8.3 Feature Extraction
8.4 Image Classification
8.4.1 Image Classification Using Convolutional Neural Network (CNN)
8.4.2 Convolution Layer
8.4.3 Pooling or Down Sampling Layer
8.4.4 Flattening Layer
8.4.5 Fully Connected Layer
8.4.6 Activation Function
8.5 Tools and Libraries for Computer Vision
8.5.1 OpenCV
8.5.2 MATLAB
8.6 Applications of Computer Vision
8.6.1 Object Detection
8.6.2 Face Recognition
8.6.3 Number Plate Identification
8.6.4 Image-based Search
8.6.5 Medical Imaging
8.7 Summary
Bibliography
Chapter 9 Deep Learning Models for Computer Vision
Learning Outcomes
9.1 Deep Learning for Computer Vision
9.2 Pre-trained Architectures for Computer Vision
9.2.1 LeNet
9.2.2 AlexNet
9.2.3 VGG
9.2.4 Inception
9.2.5 R-CNN
9.2.6 Fast R-CNN
9.2.7 Faster R-CNN
9.2.8 Mask R-CNN
9.2.9 YOLO
9.3 Summary
Bibliography
Chapter 10 Applications of Computer Vision
Learning Outcomes
10.1 Introduction
10.2 Optical Character Recognition
10.2.1 Code Snippets
10.2.2 Result Analysis
10.3 Face and Facial Expression Recognition
10.3.1 Face Recognition
10.3.2 Facial Recognition System
10.3.3 Major Challenges in Recognizing Face Expression
10.3.4 Result Analysis
10.4 Visual-based Gesture Recognition
10.4.1 Framework Used
10.4.2 Code Snippets
10.4.3 Result Analysis
10.4.4 Major Challenges in Gesture Recognition
10.5 Posture Detection and Correction
10.5.1 Framework Used
10.5.2 Squats
10.5.3 Result Analysis
10.6 Summary
Bibliography
Index

📜 SIMILAR VOLUMES

📁 Deep Learning Approach for Natural Language Processing, Speech, and Computer Vision: Techniques and Use Cases

✍ L. Ashok Kumar, D. Karthika Renuka 📂 Library 📅 2023 🏛 CRC Press 🌐 English

Deep Learning Approach for Natural Language Processing, Speech, and Computer Vision provides an overview of general deep learning methodology and its applications to Natural Language Processing (NLP), speech, and Computer Vision tasks. It simplifies and presents the concepts of Deep Learning in a com

📁 Deep Learning Approach for Natural Language Processing, Speech, and Computer Vision

✍ Kumar, L. Ashok; Renuka, D. Karthika 📂 Library 📅 2023 🏛 Taylor & Francis Group 🌐 English

📁 Python Natural Language Processing: Advanced machine learning and deep learning techniques for natural language processing

✍ Jalaj Thanaki 📂 Library 📅 2017 🏛 Packt Publishing 🌐 English

Key Features ● Implement Machine Learning and Deep Learning techniques for efficient natural language processing ● Get started with NLTK and implement NLP in your applications with ease ● Understand and interpret human languages with the power of text analysis via Python Book Description This

📁 Python natural language processing: advanced machine learning and deep learning techniques for natural language processing

✍ Thanaki, Jalaj 📂 Library 📅 2017 🏛 Packt 🌐 English

📁 Deep Learning Approaches for Spoken and Natural Language Processing (Signals and Communication Technology)

✍ Virender Kadyan (editor), Amitoj Singh (editor), Mohit Mittal (editor), Laith Ab 📂 Library 📅 2021 🏛 Springer 🌐 English

This book provides insights into how deep learning techniques impact language and speech processing applications. The authors discuss the promise, limits and the new challenges in deep learning. The book covers the major differences between the various applications of deep learning and the cla

📁 Learning Deep Learning: Theory and Practice of Neural Networks, Computer Vision, Natural Language Processing, and Transformers Using TensorFlow

✍ Magnus Ekman 📂 Library 📅 2021 🏛 Addison-Wesley Professional 🌐 English

NVIDIA's Full-Color Guide to Deep Learning: All You Need to Get Started and Get Results "To enable everyone to be part of this historic revolution requires the democratization of AI knowledge and resources. This book is timely and relevant towards accompl