Analysis and Application of Natural Language and Speech Processing

โœ Scribed by Mourad Abbas


Publisher
Springer
Year
2023
Tongue
English
Leaves
217
Series
Signals and Communication Technology
Category
Library

✦ Table of Contents


Preface
Contents
ITAcotron 2: The Power of Transfer Learning in Expressive TTS Synthesis
1 Introduction
2 Background
3 Related Work
4 Aim and Experimental Hypotheses
5 Corpora
6 ITAcotron 2 Synthesis Pipeline
7 Evaluation Approach
8 Results
8.1 Speech Intelligibility and Naturalness
8.2 Speaker Similarity
9 Conclusion and Future Work
Appendix
References
Improving Automatic Speech Recognition for Non-native English with Transfer Learning and Language Model Decoding
1 Introduction
2 Related Work
3 Methods
3.1 Transfer Learning
3.2 CTC Decoding
4 Data
4.1 Corpus Information
4.2 Data Splits
5 Experiments
5.1 Baselines
5.2 Multi-Accent Models
5.3 Accent-Specific Models
5.4 Language Model Decoding
6 Error Analysis
7 Conclusion
References
Kabyle ASR Phonological Error and Network Analysis
1 Introduction
2 Background
2.1 ASR Modeling Units
2.2 Diacritization
2.3 Berber Language Tools
2.4 Phonological Networks
3 The Kabyle Language and Berber Writing Systems
4 Approach
4.1 Mozilla CommonVoice
4.2 Mozilla DeepSpeech
4.3 Transliterator
4.4 Sequence Alignment
5 Experimentation and Results
5.1 Experiments
5.2 Results
5.3 Phonemic Confusion Analysis
5.4 Phonological Network Analysis
6 Discussion
7 Future Work
8 Conclusion
References
ALP: An Arabic Linguistic Pipeline
1 Introduction
2 Ambiguity in Arabic
2.1 Ambiguity in Word Segmentation
2.2 Ambiguity in POS Tagging
2.2.1 Verb Ambiguities: Passive vs Active Voice
2.2.2 Verb Ambiguities: Past vs Present Tense
2.2.3 Verb Ambiguities: Imperative
2.2.4 Noun Ambiguities: Singular vs Plural
2.2.5 Noun Ambiguities: Dual vs Singular
2.2.6 Noun Ambiguities: Dual vs Plural
2.2.7 Noun Ambiguities: Feminine vs Masculine Singular
2.3 Ambiguity in Named Entity Recognition
2.3.1 Inherent Ambiguity in Named Entities
2.3.2 Ellipses
2.4 Ambiguity in Lemmatization
2.5 Ambiguity in Phrase Chunking
3 Pipeline Architecture
3.1 Preprocessing: POS, NER, and Word Segment Tagging
3.1.1 POS Tagging
3.1.2 Named Entity Recognition
3.1.3 Word Segmentation
3.2 Lemmatization
3.2.1 Learning-Based Lemmatizer
3.2.2 Dictionary-Based Lemmatizer
3.2.3 Fusion Lemmatizer
3.3 Base Chunker
4 Annotation Schema
4.1 Annotation of POS Tags
4.2 Annotation of Word Segments
4.3 Annotation of Named Entities
4.4 Annotation of Lemmas
4.5 Annotation of Base Chunks
5 Corpus Annotation
5.1 POS and Name Annotation Method
5.2 Lemma Annotation Method
5.2.1 Dictionary Lemmatizer
5.2.2 Machine Learning Lemmatizer
5.3 Base Chunking Annotation Method
6 Evaluation
6.1 Evaluation of POS Tagging
6.2 Evaluation of NER
6.3 Evaluation of Lemmatization and Base Chunking
7 Conclusion and Future Work
References
Arabic Anaphora Resolution System Using New Features: Pronominal and Verbal Cases
1 Introduction
2 Varieties of Anaphora in Arabic Text
2.1 Verb Anaphora
2.2 Lexical Anaphora
2.3 Pronominal Anaphora
2.3.1 Third-Person Personal Pronouns
2.3.2 Relative Pronouns
2.3.3 Demonstrative Pronouns
2.4 Comparative Anaphora
3 Related Work
4 Arabic Anaphoric Resolution Challenges
4.1 Lack of Diacritical Marks
4.2 Agglutination Phenomenon
4.3 Syntactic Flexibility (Words Free Order)
4.4 Ambiguity of the Referent
4.5 Hidden Referent
4.6 Lack of Annotated Corpora with Anaphoric Links
5 The A3T Architecture
5.1 Preprocessing
5.2 Anaphora and Candidate Identification
5.3 Anaphora Resolving
5.4 Automatic Text Annotation
6 Experiments and Results
7 Discussion
8 Conclusion
References
A Commonsense-Enhanced Document-Grounded Conversational Agent: A Case Study on Task-Based Dialogue
1 Introduction
2 Related Work
2.1 Task- and Goal-Oriented Dialogue
2.2 Dialogue State Tracking and Planning
2.3 Document-Grounded Dialogue
2.4 Commonsense-Enhanced Dialogue
2.5 Dialogue Management
3 Task2Dial
3.1 Data Collection Methodology
4 Dataset Analysis
5 The ChefBot Conversational Agent
6 Conclusions and Future Work
6.1 Future Work and Open Questions
References
BloomQDE: Leveraging Bloom's Taxonomy for Question Difficulty Estimation
1 Introduction
2 Related Work
3 Approach
3.1 Datasets
3.1.1 ARC
3.1.2 SQuAD
3.2 Data Preparation
3.2.1 Keyword Mapping
3.2.2 PoS Tagging
3.2.3 Class Binarization
3.2.4 Test Data
4 Experiments
4.1 Model Training
4.2 Parameter Optimization
4.3 Experimental Results
4.4 Room for Improvement
5 Conclusion and Future Work
References
A Comparative Study on Language Models for Dravidian Languages
1 Introduction
2 Related Work
3 Methodology
3.1 Dataset
3.2 Preprocessing
3.3 Tokenization and Vocabulary
3.4 Experimental Setup
4 Models and Evaluation
4.1 Word Embedding Models
4.2 Contextual Embedding Models
4.2.1 RoBERTa
4.2.2 DeBERTa
4.2.3 ELECTRA
5 Results
5.1 Word Similarity
5.2 News Article Classification
6 Conclusion
7 Future Work
References
Arabic Named Entity Recognition with a CRF Model Based on Transformer Architecture
1 Introduction
2 Background
2.1 AraBERT
2.2 AraELECTRA
2.3 RoBERTa
3 Related Works
3.1 Rule-Based Approach
3.2 Machine Learning Approach
3.3 Deep Learning Approach
3.4 Hybrid Approach
4 Transformer-Based CRF Model
4.1 Proposed Model Architecture
4.2 Linear Layer
4.3 CRF Tagging Algorithm
4.4 Calculating the NLL Function
5 Experiment
5.1 Tagging Types
5.2 Data Samples
5.2.1 ANERcorp Dataset
5.2.2 AQMAR Dataset
5.2.3 CANERCorpus Dataset
5.2.4 Our Arabic Legal Content (ALC) Dataset
5.3 Fine-Tuning Process
6 Results
7 Conclusion
References
Static Fuzzy Bag-of-Words: Exploring Static Universe Matrices for Sentence Embeddings
1 Introduction
2 Related Work
2.1 Word and Sentence Embeddings
2.2 Fuzzy Bag-of-Words and DynaMax for Sentence Embeddings
3 Static Fuzzy Bag-of-Words Model
3.1 Word Embeddings
3.2 Universe Matrix
3.2.1 Clustering
3.2.2 Identity
3.2.3 Multivariate Analysis
3.2.4 Vector Significance
4 Experiments
4.1 Word Embeddings
4.2 Universe Matrices
4.2.1 Clustering
4.2.2 Identity
4.2.3 Multivariate Analysis
4.2.4 Vector Significance
4.3 Data
4.4 Evaluation Approach
5 Results
5.1 Individual SFBoW Results
5.2 Comparison with Other Models
6 Conclusion
References
Index


📜 SIMILAR VOLUMES


Analysis and Application of Natural Lang
โœ Mourad Abbas ๐Ÿ“‚ Library ๐Ÿ“… 2023 ๐Ÿ› Springer ๐ŸŒ English

This book presents recent advances in Natural Language Processing (NLP) and speech technology, a topic attracting increasing interest in a variety of fields through its myriad applications, such as the demand for speech-guided touchless technology. The authors present results of recent experimental

Analysis and Application of Natural Lang
โœ Mourad Abbas ๐Ÿ“‚ Library ๐Ÿ“… 2023 ๐Ÿ› Springer Nature ๐ŸŒ English

This book presents recent advances in NLP and speech technology, a topic attracting increasing interest in a variety of fields through its myriad applications, such as the demand for speech-guided touchless technology during the Covid-19 pandemic. The authors present results of recent experimental r

Thai Natural Language Processing: Word S
โœ Chalermpol Tapsai, Herwig Unger, Phayung Meesad ๐Ÿ“‚ Library ๐Ÿ“… 2020 ๐Ÿ› Springer ๐ŸŒ English

<p></p><p>This book presents comprehensive solutions for readers wanting to develop their own Natural Language Processing projects for the Thai language. Starting from the fundamental principles of Thai, it discusses each step in Natural Language Processing, and the real-world applications. In addit

Speech and Language Processing: An Intro
โœ Daniel Jurafsky, James H. Martin ๐Ÿ“‚ Library ๐Ÿ“… 2000 ๐Ÿ› Prentice Hall ๐ŸŒ English

This book takes an empirical approach to language processing, based on applying statistical and other machine-learning algorithms to large corpora.Methodology boxes are included in each chapter. Each chapter is built around one or more worked examples to demonstrate the main idea of the chapter