Analysis and Application of Natural Language and Speech Processing

✍ Scribed by Mourad Abbas

Publisher: Springer
Year: 2023
Language: English
Pages: 217
Series: Signals and Communication Technology
Category: Library

✦ Table of Contents

Preface
Contents
ITAcotron 2: The Power of Transfer Learning in Expressive TTS Synthesis
1 Introduction
2 Background
3 Related Work
4 Aim and Experimental Hypotheses
5 Corpora
6 ITAcotron 2 Synthesis Pipeline
7 Evaluation Approach
8 Results
8.1 Speech Intelligibility and Naturalness
8.2 Speaker Similarity
9 Conclusion and Future Work
Appendix
References
Improving Automatic Speech Recognition for Non-native English with Transfer Learning and Language Model Decoding
1 Introduction
2 Related Work
3 Methods
3.1 Transfer Learning
3.2 CTC Decoding
4 Data
4.1 Corpus Information
4.2 Data Splits
5 Experiments
5.1 Baselines
5.2 Multi-Accent Models
5.3 Accent-Specific Models
5.4 Language Model Decoding
6 Error Analysis
7 Conclusion
References
Kabyle ASR Phonological Error and Network Analysis
1 Introduction
2 Background
2.1 ASR Modeling Units
2.2 Diacritization
2.3 Berber Language Tools
2.4 Phonological Networks
3 The Kabyle Language and Berber Writing Systems
4 Approach
4.1 Mozilla CommonVoice
4.2 Mozilla DeepSpeech
4.3 Transliterator
4.4 Sequence Alignment
5 Experimentation and Results
5.1 Experiments
5.2 Results
5.3 Phonemic Confusion Analysis
5.4 Phonological Network Analysis
6 Discussion
7 Future Work
8 Conclusion
References
ALP: An Arabic Linguistic Pipeline
1 Introduction
2 Ambiguity in Arabic
2.1 Ambiguity in Word Segmentation
2.2 Ambiguity in POS Tagging
2.2.1 Verb Ambiguities: Passive vs Active Voice
2.2.2 Verb Ambiguities: Past vs Present Tense
2.2.3 Verb Ambiguities: Imperative
2.2.4 Noun Ambiguities: Singular vs Plural
2.2.5 Noun Ambiguities: Dual vs Singular
2.2.6 Noun Ambiguities: Dual vs Plural
2.2.7 Noun Ambiguities: Feminine vs Masculine Singular
2.3 Ambiguity in Named Entity Recognition
2.3.1 Inherent Ambiguity in Named Entities
2.3.2 Ellipses
2.4 Ambiguity in Lemmatization
2.5 Ambiguity in Phrase Chunking
3 Pipeline Architecture
3.1 Preprocessing: POS, NER, and Word Segment Tagging
3.1.1 POS Tagging
3.1.2 Named Entity Recognition
3.1.3 Word Segmentation
3.2 Lemmatization
3.2.1 Learning-Based Lemmatizer
3.2.2 Dictionary-Based Lemmatizer
3.2.3 Fusion Lemmatizer
3.3 Base Chunker
4 Annotation Schema
4.1 Annotation of POS Tags
4.2 Annotation of Word Segments
4.3 Annotation of Named Entities
4.4 Annotation of Lemmas
4.5 Annotation of Base Chunks
5 Corpus Annotation
5.1 POS and Name Annotation Method
5.2 Lemma Annotation Method
5.2.1 Dictionary Lemmatizer
5.2.2 Machine Learning Lemmatizer
5.3 Base Chunking Annotation Method
6 Evaluation
6.1 Evaluation of POS Tagging
6.2 Evaluation of NER
6.3 Evaluation of Lemmatization and Base Chunking
7 Conclusion and Future Work
References
Arabic Anaphora Resolution System Using New Features: Pronominal and Verbal Cases
1 Introduction
2 Varieties of Anaphora in Arabic Text
2.1 Verb Anaphora
2.2 Lexical Anaphora
2.3 Pronominal Anaphora
2.3.1 Third-Person Personal Pronouns
2.3.2 Relative Pronouns
2.3.3 Demonstrative Pronouns
2.4 Comparative Anaphora
3 Related Work
4 Arabic Anaphoric Resolution Challenges
4.1 Lack of Diacritical Marks
4.2 Agglutination Phenomenon
4.3 Syntactic Flexibility (Words Free Order)
4.4 Ambiguity of the Referent
4.5 Hidden Referent
4.6 Lack of Annotated Corpora with Anaphoric Links
5 The A3T Architecture
5.1 Preprocessing
5.2 Anaphora and Candidate Identification
5.3 Anaphora Resolving
5.4 Automatic Text Annotation
6 Experiments and Results
7 Discussion
8 Conclusion
References
A Commonsense-Enhanced Document-Grounded Conversational Agent: A Case Study on Task-Based Dialogue
1 Introduction
2 Related Work
2.1 Task- and Goal-Oriented Dialogue
2.2 Dialogue State Tracking and Planning
2.3 Document-Grounded Dialogue
2.4 Commonsense-Enhanced Dialogue
2.5 Dialogue Management
3 Task2Dial
3.1 Data Collection Methodology
4 Dataset Analysis
5 The ChefBot Conversational Agent
6 Conclusions and Future Work
6.1 Future Work and Open Questions
References
BloomQDE: Leveraging Bloom's Taxonomy for Question Difficulty Estimation
1 Introduction
2 Related Work
3 Approach
3.1 Datasets
3.1.1 ARC
3.1.2 SQuAD
3.2 Data Preparation
3.2.1 Keyword Mapping
3.2.2 PoS Tagging
3.2.3 Class Binarization
3.2.4 Test Data
4 Experiments
4.1 Model Training
4.2 Parameter Optimization
4.3 Experimental Results
4.4 Room for Improvement
5 Conclusion and Future Work
References
A Comparative Study on Language Models for Dravidian Languages
1 Introduction
2 Related Work
3 Methodology
3.1 Dataset
3.2 Preprocessing
3.3 Tokenization and Vocabulary
3.4 Experimental Setup
4 Models and Evaluation
4.1 Word Embedding Models
4.2 Contextual Embedding Models
4.2.1 RoBERTa
4.2.2 DeBERTa
4.2.3 ELECTRA
5 Results
5.1 Word Similarity
5.2 News Article Classification
6 Conclusion
7 Future Work
References
Arabic Named Entity Recognition with a CRF Model Based on Transformer Architecture
1 Introduction
2 Background
2.1 AraBERT
2.2 AraELECTRA
2.3 RoBERTa
3 Related Works
3.1 Rule-Based Approach
3.2 Machine Learning Approach
3.3 Deep Learning Approach
3.4 Hybrid Approach
4 Transformer-Based CRF Model
4.1 Proposed Model Architecture
4.2 Linear Layer
4.3 CRF Tagging Algorithm
4.4 Calculating the NLL Function
5 Experiment
5.1 Tagging Types
5.2 Data Samples
5.2.1 ANERcorp Dataset
5.2.2 AQMAR Dataset
5.2.3 CANERCorpus Dataset
5.2.4 Our Arabic Legal Content (ALC) Dataset
5.3 Fine-Tuning Process
6 Results
7 Conclusion
References
Static Fuzzy Bag-of-Words: Exploring Static Universe Matrices for Sentence Embeddings
1 Introduction
2 Related Work
2.1 Word and Sentence Embeddings
2.2 Fuzzy Bag-of-Words and DynaMax for Sentence Embeddings
3 Static Fuzzy Bag-of-Words Model
3.1 Word Embeddings
3.2 Universe Matrix
3.2.1 Clustering
3.2.2 Identity
3.2.3 Multivariate Analysis
3.2.4 Vector Significance
4 Experiments
4.1 Word Embeddings
4.2 Universe Matrices
4.2.1 Clustering
4.2.2 Identity
4.2.3 Multivariate Analysis
4.2.4 Vector Significance
4.3 Data
4.4 Evaluation Approach
5 Results
5.1 Individual SFBoW Results
5.2 Comparison with Other Models
6 Conclusion
References
Index

📜 SIMILAR VOLUMES

Analysis and Application of Natural Lang

📁 Analysis and Application of Natural Language and Speech Processing

✍ Mourad Abbas 📂 Library 📅 2023 🏛 Springer 🌐 English

This book presents recent advances in Natural Language Processing (NLP) and speech technology, a topic attracting increasing interest in a variety of fields through its myriad applications, such as the demand for speech guided touchless technology. The authors present results of recent experimental

Analysis and Application of Natural Lang

📁 Analysis and Application of Natural Language and Speech Processing

✍ Mourad Abbas 📂 Library 📅 2023 🏛 Springer Nature 🌐 English

This book presents recent advances in NLP and speech technology, a topic attracting increasing interest in a variety of fields through its myriad applications, such as the demand for speech guided touchless technology during the Covid-19 pandemic. The authors present results of recent experimental r

Thai Natural Language Processing: Word S

📁 Thai Natural Language Processing: Word Segmentation, Semantic Analysis, and Application

✍ Chalermpol Tapsai, Herwig Unger, Phayung Meesad 📂 Library 📅 2020 🏛 Springer 🌐 English

This book presents comprehensive solutions for readers wanting to develop their own Natural Language Processing projects for the Thai language. Starting from the fundamental principles of Thai, it discusses each step in Natural Language Processing, and the real-world applications. In addit

Speech and Language Processing: An Intro

📁 Speech and Language Processing: An Introduction to Natural Language Processing

✍ Daniel Jurafsky, James H. Martin 📂 Library 📅 2007 🌐 English

Speech and Language Processing: An Intro

📁 Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition

✍ Daniel Jurafsky, James H. Martin 📂 Library 📅 2008 🏛 Prentice Hall 🌐 English

Speech and Language Processing: An Intro

📁 Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition

✍ Daniel Jurafsky, James H. Martin 📂 Library 📅 2000 🏛 Prentice Hall 🌐 English

This book takes an empirical approach to language processing, based on applying statistical and other machine-learning algorithms to large corpora. Methodology boxes are included in each chapter. Each chapter is built around one or more worked examples to demonstrate the main idea of the chapter