𝔖 Scriptorium
✦   LIBER   ✦

Deep Learning Approach for Natural Language Processing, Speech, and Computer Vision: Techniques and Use Cases

✍ Scribed by L. Ashok Kumar, D. Karthika Renuka


Publisher: CRC Press
Year: 2023
Tongue: English
Leaves: 246
Edition: 1
Category: Library

✦ Synopsis


Deep Learning Approach for Natural Language Processing, Speech, and Computer Vision provides an overview of general deep learning methodology and its applications to natural language processing (NLP), speech, and computer vision tasks. It presents the concepts of deep learning in a simplified yet comprehensive manner, with full-fledged examples of deep learning models, aiming to bridge the gap between theory and applications through case studies with code, experiments, and supporting analysis.

Features:

    • Covers latest developments in deep learning techniques as applied to audio analysis, computer vision, and NLP
    • Introduces contemporary applications of deep learning techniques as applied to audio, textual, and visual processing
    • Explores deep learning frameworks and libraries for NLP, speech, and computer vision in Python
    • Gives insights into using the tools and libraries in Python for real-world applications
    • Provides easily accessible tutorials and real-world case studies with code to provide hands-on experience

    This book is aimed at researchers and graduate students in computer engineering, image, speech, and text processing.

    ✦ Table of Contents


    Cover
    Half Title
    Title
    Copyright
    Dedication
    Contents
    About the Authors
    Preface
    Acknowledgments
    Chapter 1 Introduction
    Learning Outcomes
    1.1 Introduction
    1.1.1 Subsets of Artificial Intelligence
    1.1.2 Three Horizons of Deep Learning Applications
    1.1.3 Natural Language Processing
    1.1.4 Speech Recognition
    1.1.5 Computer Vision
    1.2 Machine Learning Methods for NLP, Computer Vision (CV), and Speech
    1.2.1 Support Vector Machine (SVM)
    1.2.2 Bagging
    1.2.3 Gradient-boosted Decision Trees (GBDTs)
    1.2.4 Naïve Bayes
    1.2.5 Logistic Regression
    1.2.6 Dimensionality Reduction Techniques
    1.3 Tools, Libraries, Datasets, and Resources for the Practitioners
    1.3.1 TensorFlow
    1.3.2 Keras
    1.3.3 Deeplearning4j
    1.3.4 Caffe
    1.3.5 ONNX
    1.3.6 PyTorch
    1.3.7 scikit-learn
    1.3.8 NumPy
    1.3.9 Pandas
    1.3.10 NLTK
    1.3.11 Gensim
    1.3.12 Datasets
    1.4 Summary
    Bibliography
    Chapter 2 Natural Language Processing
    Learning Outcomes
    2.1 Natural Language Processing
    2.2 Generic NLP Pipeline
    2.2.1 Data Acquisition
    2.2.2 Text Cleaning
    2.3 Text Pre-processing
    2.3.1 Noise Removal
    2.3.2 Stemming
    2.3.3 Tokenization
    2.3.4 Lemmatization
    2.3.5 Stop Word Removal
    2.3.6 Parts of Speech Tagging
    2.4 Feature Engineering
    2.5 Modeling
    2.5.1 Start with Simple Heuristics
    2.5.2 Building Your Model
    2.5.3 Metrics to Build Model
    2.6 Evaluation
    2.7 Deployment
    2.8 Monitoring and Model Updating
    2.9 Vector Representation for NLP
    2.9.1 One Hot Vector Encoding
    2.9.2 Word Embeddings
    2.9.3 Bag of Words
    2.9.4 TF-IDF
    2.9.5 N-gram
    2.9.6 Word2Vec
    2.9.7 GloVe
    2.9.8 ELMo
    2.10 Language Modeling with n-grams
    2.10.1 Evaluating Language Models
    2.10.2 Smoothing
    2.10.3 Kneser-Ney Smoothing
    2.11 Vector Semantics and Embeddings
    2.11.1 Lexical Semantics
    2.11.2 Vector Semantics
    2.11.3 Cosine for Measuring Similarity
    2.11.4 Bias and Embeddings
    2.12 Summary
    Bibliography
    Chapter 3 State-of-the-Art Natural Language Processing
    Learning Outcomes
    3.1 Introduction
    3.2 Sequence-to-Sequence Models
    3.2.1 Sequence
    3.2.2 Sequence Labeling
    3.2.3 Sequence Modeling
    3.3 Recurrent Neural Networks
    3.3.1 Unrolling RNN
    3.3.2 RNN-based POS Tagging Use Case
    3.3.3 Challenges in RNN
    3.4 Attention Mechanisms
    3.4.1 Self-attention Mechanism
    3.4.2 Multi-head Attention Mechanism
    3.4.3 Bahdanau Attention
    3.4.4 Luong Attention
    3.4.5 Global Attention versus Local Attention
    3.4.6 Hierarchical Attention
    3.5 Transformer Model
    3.5.1 Bidirectional Encoder Representations from Transformers (BERT)
    3.5.2 GPT3
    3.6 Summary
    Bibliography
    Chapter 4 Applications of Natural Language Processing
    Learning Outcomes
    4.1 Introduction
    4.2 Word Sense Disambiguation
    4.2.1 Word Senses
    4.2.2 WordNet: A Database of Lexical Relations
    4.2.3 Approaches to Word Sense Disambiguation
    4.2.4 Applications of Word Sense Disambiguation
    4.3 Text Classification
    4.3.1 Building the Text Classification Model
    4.3.2 Applications of Text Classification
    4.3.3 Other Applications
    4.4 Sentiment Analysis
    4.4.1 Types of Sentiment Analysis
    4.5 Spam Email Classification
    4.5.1 History of Spam
    4.5.2 Spamming Techniques
    4.5.3 Types of Spams
    4.6 Question Answering
    4.6.1 Components of Question Answering System
    4.6.2 Information Retrieval-based Factoid Question and Answering
    4.6.3 Entity Linking
    4.6.4 Knowledge-based Question Answering
    4.7 Chatbots and Dialog Systems
    4.7.1 Properties of Human Conversation
    4.7.2 Chatbots
    4.7.3 The Dialog-state Architecture
    4.8 Summary
    Bibliography
    Chapter 5 Fundamentals of Speech Recognition
    Learning Outcomes
    5.1 Introduction
    5.2 Structure of Speech
    5.3 Basic Audio Features
    5.3.1 Pitch
    5.3.2 Timbral Features
    5.3.3 Rhythmic Features
    5.3.4 MPEG-7 Features
    5.4 Characteristics of Speech Recognition System
    5.4.1 Pronunciations
    5.4.2 Vocabulary
    5.4.3 Grammars
    5.4.4 Speaker Dependence
    5.5 The Working of a Speech Recognition System
    5.5.1 Input Speech
    5.5.2 Audio Pre-processing
    5.5.3 Feature Extraction
    5.6 Audio Feature Extraction Techniques
    5.6.1 Spectrogram
    5.6.2 MFCC
    5.6.3 Short-Time Fourier Transform
    5.6.4 Linear Prediction Coefficients (LPCC)
    5.6.5 Discrete Wavelet Transform (DWT)
    5.6.6 Perceptual Linear Prediction (PLP)
    5.7 Statistical Speech Recognition
    5.7.1 Acoustic Model
    5.7.2 Pronunciation Model
    5.7.3 Language Model
    5.7.4 Conventional ASR Approaches
    5.8 Speech Recognition Applications
    5.8.1 In Banking
    5.8.2 In-Car Systems
    5.8.3 Health Care
    5.8.4 Experiments by Different Speech Groups for Large-Vocabulary Speech Recognition
    5.8.5 Measure of Performance
    5.9 Challenges in Speech Recognition
    5.9.1 Vocabulary Size
    5.9.2 Speaker-Dependent or -Independent
    5.9.3 Isolated, Discontinuous, and Continuous Speech
    5.9.4 Phonetics
    5.9.5 Adverse Conditions
    5.10 Open-source Toolkits for Speech Recognition
    5.10.1 Frameworks
    5.10.2 Additional Tools and Libraries
    5.11 Summary
    Bibliography
    Chapter 6 Deep Learning Models for Speech Recognition
    Learning Outcomes
    6.1 Traditional Methods of Speech Recognition
    6.1.1 Hidden Markov Models (HMMs)
    6.1.2 Gaussian Mixture Models (GMMs)
    6.1.3 Artificial Neural Network (ANN)
    6.1.4 HMM and ANN Acoustic Modeling
    6.1.5 Deep Belief Neural Network (DBNN) for Acoustic Modelling
    6.2 RNN-based Encoder–Decoder Architecture
    6.3 Encoder
    6.4 Decoder
    6.5 Attention-based Encoder–Decoder Architecture
    6.6 Challenges in Traditional ASR and the Motivation for End-to-End ASR
    6.7 Summary
    Bibliography
    Chapter 7 End-to-End Speech Recognition Models
    Learning Outcomes
    7.1 End-to-End Speech Recognition Models
    7.1.1 Definition of End-to-End ASR System
    7.1.2 Connectionist Temporal Classification (CTC)
    7.1.3 Deep Speech
    7.1.4 Deep Speech 2
    7.1.5 Listen, Attend, Spell (LAS) Model
    7.1.6 JASPER
    7.1.7 QuartzNet
    7.2 Self-supervised Models for Automatic Speech Recognition
    7.2.1 Wav2Vec
    7.2.2 Data2Vec
    7.2.3 HuBERT
    7.3 Online/Streaming ASR
    7.3.1 RNN-transducer-Based Streaming ASR
    7.3.2 Wav2Letter for Streaming ASR
    7.3.3 Conformer Model
    7.4 Summary
    Bibliography
    Chapter 8 Computer Vision Basics
    Learning Outcomes
    8.1 Introduction
    8.1.1 Fundamental Steps for Computer Vision
    8.1.2 Fundamental Steps in Digital Image Processing
    8.2 Image Segmentation
    8.2.1 Steps in Image Segmentation
    8.3 Feature Extraction
    8.4 Image Classification
    8.4.1 Image Classification Using Convolutional Neural Network (CNN)
    8.4.2 Convolution Layer
    8.4.3 Pooling or Down Sampling Layer
    8.4.4 Flattening Layer
    8.4.5 Fully Connected Layer
    8.4.6 Activation Function
    8.5 Tools and Libraries for Computer Vision
    8.5.1 OpenCV
    8.5.2 MATLAB
    8.6 Applications of Computer Vision
    8.6.1 Object Detection
    8.6.2 Face Recognition
    8.6.3 Number Plate Identification
    8.6.4 Image-based Search
    8.6.5 Medical Imaging
    8.7 Summary
    Bibliography
    Chapter 9 Deep Learning Models for Computer Vision
    Learning Outcomes
    9.1 Deep Learning for Computer Vision
    9.2 Pre-trained Architectures for Computer Vision
    9.2.1 LeNet
    9.2.2 AlexNet
    9.2.3 VGG
    9.2.4 Inception
    9.2.5 R-CNN
    9.2.6 Fast R-CNN
    9.2.7 Faster R-CNN
    9.2.8 Mask R-CNN
    9.2.9 YOLO
    9.3 Summary
    Bibliography
    Chapter 10 Applications of Computer Vision
    Learning Outcomes
    10.1 Introduction
    10.2 Optical Character Recognition
    10.2.1 Code Snippets
    10.2.2 Result Analysis
    10.3 Face and Facial Expression Recognition
    10.3.1 Face Recognition
    10.3.2 Facial Recognition System
    10.3.3 Major Challenges in Recognizing Face Expression
    10.3.4 Result Analysis
    10.4 Visual-based Gesture Recognition
    10.4.1 Framework Used
    10.4.2 Code Snippets
    10.4.3 Result Analysis
    10.4.4 Major Challenges in Gesture Recognition
    10.5 Posture Detection and Correction
    10.5.1 Framework Used
    10.5.2 Squats
    10.5.3 Result Analysis
    10.6 Summary
    Bibliography
    Index


    📜 SIMILAR VOLUMES


    Deep Learning Approach for Natural Langu
    ✍ L. Ashok Kumar, D. Karthika Renuka 📂 Library 📅 2023 🏛 CRC Press 🌐 English

    Deep Learning Approach for Natural Language Processing, Speech, and Computer Vision provides an overview of general deep learning methodology and its applications of Natural Language Processing (NLP), speech and Computer Vision tasks. It simplifies and presents the concepts of Deep Learning in a com

    Deep Learning Approach for Natural Langu
    ✍ L. Ashok Kumar, D. Karthika Renuka 📂 Library 📅 2023 🏛 Taylor & Francis Group 🌐 English

    Deep Learning Approach for Natural Language Processing, Speech, and Computer Vision provides an overview of general deep learning methodology and its applications of natural language processing (NLP), speech and computer vision tasks. It simplifies and presents the concepts of deep learning in a com

    Python Natural Language Processing: Adva
    ✍ Jalaj Thanaki 📂 Library 📅 2017 🏛 Packt Publishing 🌐 English

    Key Features ● Implement Machine Learning and Deep Learning techniques for efficient natural language processing ● Get started with NLTK and implement NLP in your applications with ease ● Understand and interpret human languages with the power of text analysis via Python Book Description This

    Deep Learning Approaches for Spoken and
    ✍ Virender Kadyan (editor), Amitoj Singh (editor), Mohit Mittal (editor), Laith Ab 📂 Library 📅 2021 🏛 Springer 🌐 English

    This book provides insights into how deep learning techniques impact language and speech processing applications. The authors discuss the promise, limits and the new challenges in deep learning. The book covers the major differences between the various applications of deep learning and the cla

    Learning Deep Learning: Theory and Pract
    ✍ Magnus Ekman 📂 Library 📅 2021 🏛 Addison-Wesley Professional 🌐 English

    NVIDIA's Full-Color Guide to Deep Learning: All You Need to Get Started and Get Results. "To enable everyone to be part of this historic revolution requires the democratization of AI knowledge and resources. This book is timely and relevant towards accompl