Multilingual Phone Recognition in Indian Languages (SpringerBriefs in Speech Technology)

✍ Scribed by K. E. Manjunath

Publisher: Springer
Year: 2021
Tongue: English
Leaves: 113
Category: Library

✦ Synopsis

The book presents current research and developments in multilingual speech recognition. The author presents a Multilingual Phone Recognition System (Multi-PRS) built on a common multilingual phone-set derived from International Phonetic Alphabet (IPA) based transcriptions of six Indian languages: Kannada, Telugu, Bengali, Odia, Urdu, and Assamese. The author shows how the performance of Multi-PRS can be improved using tandem features. The book compares Monolingual Phone Recognition Systems (Mono-PRS) with Multi-PRS, and baseline systems with tandem systems. Methods are proposed to predict Articulatory Features (AFs) from spectral features using Deep Neural Networks (DNNs), and multitask learning is explored to improve the prediction accuracy of the AFs. The AFs are then used to improve the performance of Multi-PRS through two methods of combination: lattice rescoring and tandem. Finally, the author develops and evaluates two approaches to multilingual phone recognition: Language Identification followed by Monolingual Phone Recognition (LID-Mono), and recognition based on a common multilingual phone-set (Multi-PRS).

✦ Table of Contents

Preface
Contents
Acronyms
1 Introduction
1.1 Multilingual Phone Recognition
1.2 Articulatory Features for Multilingual Phone Recognition
1.3 Approaches for Multilingual Phone Recognition
1.4 Code-switched Phone Recognition using Multilingual Phone Recognition Systems
1.5 Objective and Scope of the Work
1.6 Proposed Organization of the Book
References
2 Literature Review
2.1 Introduction
2.2 Prior Work on Multilingual Speech Recognition
2.3 Prior Work on Multilingual Speech Recognition using Articulatory Features
2.4 Prior Work on Code-Switched Speech Recognition using Multilingual Speech Recognition Systems
2.5 Summary
References
3 Development and Analysis of Multilingual Phone Recognition System
3.1 Introduction
3.2 Experimental Setup
3.2.1 Multilingual Speech Corpora
3.2.2 Extraction of Mel-frequency Cepstral Coefficients
3.2.3 Training HMMs and DNNs
3.3 Development of Phone Recognition Systems
3.3.1 Development of Monolingual Phone Recognition Systems
3.3.2 Development of Multilingual Phone Recognition Systems
3.3.3 Development of Tandem Multilingual Phone Recognition Systems
3.4 Performance Evaluation of Phone Recognition Systems
3.4.1 Performance Evaluation of Monolingual Phone Recognition Systems
3.4.2 Performance Evaluation of Multilingual Phone Recognition Systems
3.4.3 Performance Evaluation of Tandem Multilingual Phone Recognition Systems
3.5 Discussion of Results
3.5.1 Analysis and Comparison of the Results
3.5.2 Cross-Lingual Analysis
3.6 Summary
References
4 Prediction of Multilingual Articulatory Features
4.1 Introduction
4.2 Articulatory Features Specification
4.3 Articulatory Feature Predictors (AF-Predictors)
4.3.1 Development of Articulatory Feature Predictors
4.3.2 Oracle Articulatory Features
4.3.3 Performance Evaluation of AF-Predictors
4.4 Performance Improvement of AF-Predictors using Multitask Learning (MTL)
4.5 Summary
References
5 Articulatory Features for Multilingual Phone Recognition
5.1 Introduction
5.2 Proposed Approaches for Multilingual Phone Recognition using Articulatory Features
5.2.1 Development of AF-Based Tandem Multilingual Phone Recognition Systems
5.2.2 Fusion of AFs from Multiple AF Groups
5.3 Multitask Learning Based AFs for Multilingual Phone Recognition
5.4 Summary
References
6 Applications of Multilingual Phone Recognition in Code-Switched and Non-code-Switched Scenarios
6.1 Introduction
6.2 Experimental Setup
6.2.1 Multilingual Speech Corpora
6.2.2 Code-Switched Test Set
6.2.3 Training Support Vector Machines (SVMs)
6.2.4 Extraction of i-vectors
6.3 Approaches for Multilingual Phone Recognition
6.3.1 LID-switched Monolingual Phone Recognition (LID-Mono) Approach
6.3.1.1 Development of Language Identification (LID) System
6.3.1.2 Development of Monolingual Phone Recognition Systems (Mono-PRS)
6.3.2 Multilingual Phone Recognition using Common Multilingual Phone-set (Multi-PRS) Approach
6.4 Evaluation and Comparison of LID-Mono and Multi-PRS Approaches
6.4.1 Non-Code-Switched Scenario
6.4.2 Code-Switched Scenario
6.5 Summary
References
7 Summary and Conclusion
7.1 Summary of the Book
7.2 Contributions of the Book
7.3 Future Scope of Work
Reference
A Support Vector Machines
B Hidden Markov Models for Speech Recognition
Reference
C Deep Neural Networks for Speech Recognition
C.1 FeedForward Neural Networks
C.2 Training Deep Neural Networks
C.3 Interfacing DNN with HMM (DNN-HMMs)
References
Index

📜 SIMILAR VOLUMES

📁 The Integration of Phonetic Knowledge in Speech Technology (Text, Speech and Language Technology)

✍ William J. Barry (Editor), Wim A. van Dommelen (Editor) 📂 Library 📅 2005 🌐 English

Continued progress in Speech Technology in the face of ever-increasing demands on the performance levels of applications is a challenge to the whole speech and language science community. Robust recognition and understanding of spontaneous speech in varied environments, good comprehensibility and na

📁 Phonetics and Phonology in Multilingual Language Development

✍ Ulrike Gut; Romana Kopečková; Christina Nelson 📂 Library 📅 2023 🏛 Cambridge University Press 🌐 English

This Element focuses on phonetic and phonological development in multilinguals and presents a novel methodological approach to it within Complex Dynamic Systems Theory (CDST). We will show how the traditional conceptualisations of acquisition with a strong focus on linear, incremental development wi

📁 Fractional Fourier Transform Techniques for Speech Enhancement (SpringerBriefs in Speech Technology)

✍ Prajna Kunche, N. Manikanthababu 📂 Library 📅 2020 🏛 Springer 🌐 English

This book explains speech enhancement in the Fractional Fourier Transform (FRFT) domain and investigates the use of different FRFT algorithms in both single channel and multi-channel enhancement systems, which has proven to be an ideal time frequency analysis tool in many speech signal proc

📁 Pattern recognition in speech and language processing

✍ WU CHOU, BIING HWANG JUANG 📂 Library 📅 2003 🏛 CRC 🌐 English

📁 Pattern Recognition in Speech and Language Processing

✍ Wu Chou, Biing-Hwang Juang 📂 Library 📅 2003 🏛 CRC Press 🌐 English

Over the last 20 years, approaches to designing speech and language processing algorithms have moved from methods based on linguistics and speech science to data-driven pattern recognition techniques. These techniques have been the focus of intense, fast-moving research and have contributed to signi

📁 Pattern Recognition in Speech and Language Processing

📂 Library 🌐 English