Automatic Speech and Speaker Recognition: Advanced Topics

✍ Scribed by L. R. Rabiner, B.-H. Juang, C.-H. Lee (auth.), Chin-Hui Lee, Frank K. Soong, Kuldip K. Paliwal (eds.)

Publisher: Springer US
Year: 1996
Tongue: English
Leaves: 523
Series: The Kluwer International Series in Engineering and Computer Science 355
Edition: 1
Category: Library

✦ Synopsis

Research in the field of automatic speech and speaker recognition has made a number of significant advances in the last two decades, influenced by advances in signal processing, algorithms, architectures, and hardware. These advances include: the adoption of a statistical pattern recognition paradigm; the use of the hidden Markov modeling framework to characterize both the spectral and the temporal variations in the speech signal; the use of a large set of speech utterance examples from a large population of speakers to train the hidden Markov models of some fundamental speech units; the organization of speech and language knowledge sources into a structural finite state network; and the use of dynamic programming-based heuristic search methods to find the best word sequence in the lexical network corresponding to the spoken utterance.
Automatic Speech and Speaker Recognition: Advanced Topics groups together in a single volume a number of important topics on speech and speaker recognition, topics which are of fundamental importance, but not yet covered in detail in existing textbooks. Although no explicit partition is given, the book is divided into five parts: Chapters 1-2 are devoted to technology overviews; Chapters 3-12 discuss acoustic modeling of fundamental speech units and lexical modeling of words and pronunciations; Chapters 13-15 address the issues related to flexibility and robustness; Chapters 16-18 concern the theoretical and practical issues of search; Chapters 19-20 give two examples of algorithmic and implementational aspects for recognition system realization.
Audience: A reference book for speech researchers and graduate students interested in pursuing potential research on the topic. May also be used as a text for advanced courses on the subject.

✦ Table of Contents

Front Matter....Pages i-xvi
An Overview of Automatic Speech Recognition....Pages 1-30
An Overview of Speaker Recognition Technology....Pages 31-56
Maximum Mutual Information Estimation of Hidden Markov Models....Pages 57-81
Bayesian Adaptive Learning and MAP Estimation of HMM....Pages 83-107
Statistical and Discriminative Methods for Speech Recognition....Pages 109-132
Context-Dependent Vector Clustering for Speech Recognition....Pages 133-157
Hidden Markov Network for Precise and Robust Acoustic Modeling....Pages 159-184
From HMMs to Segment Models: Stochastic Modeling for CSR....Pages 185-210
Voice Identification Using Nonparametric Density Matching....Pages 211-232
The Use of Recurrent Neural Networks in Continuous Speech Recognition....Pages 233-258
Hybrid Connectionist Models For Continuous Speech Recognition....Pages 259-283
Automatic Generation of Detailed Pronunciation Lexicons....Pages 285-301
Word Spotting from Continuous Speech Utterances....Pages 303-329
Spectral Dynamics for Speech Recognition Under Adverse Conditions....Pages 331-356
Signal Processing for Robust Speech Recognition....Pages 357-384
Dynamic Programming Search Strategies: From Digit Strings to Large Vocabulary Word Graphs....Pages 385-411
Fast Match Techniques....Pages 413-428
Multiple-Pass Search Strategies....Pages 429-456
Issues in Practical Large Vocabulary Isolated Word Recognition: The IBM Tangora System....Pages 457-479
From Sphinx-II to Whisper — Making Speech Recognition Usable....Pages 481-508
Back Matter....Pages 509-517

✦ Subjects

Signal, Image and Speech Processing; Electrical Engineering

📜 SIMILAR VOLUMES

Automatic Speech and Speaker Recognition

📁 Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods

✍ Joseph Keshet, Samy Bengio 📂 Library 📅 2009 🏛 Wiley 🌐 English

This book discusses large margin and kernel methods for speech and speaker recognition. Speech and Speaker Recognition: Large Margin and Kernel Methods is a collation of research in the recent advances in large margin and kernel methods, as applied to the field of speech and speaker r

Extraction of Prosody for Automatic Spea

📁 Extraction of Prosody for Automatic Speaker, Language, Emotion and Speech Recognition

✍ Leena Mary 📂 Library 📅 2019 🏛 Springer International Publishing 🌐 English

This updated book expands upon prosody for recognition applications of speech processing. It includes importance of prosody for speech processing applications; builds on why prosody needs to be incorporated in speech processing applications; and presents methods for extraction and representati

Automatic Speech Recognition

📁 Automatic Speech Recognition

✍ Steve Renals, Simon King 📂 Library 🌐 English

In William J. Hardcastle, John Laver, and Fiona E. Gibbon, editors, Handbook of Phonetic Sciences, chapter 22 (Wiley Blackwell, 2010, ISBN: 978-1-4051-4590-9), pp. 804-838. Speech recognition, the transcription of an acoustic speech signal into a string of words, is a hard

Human and Automatic Speaker Recognition

📁 Human and Automatic Speaker Recognition over Telecommunication Channels

✍ Laura Fernández Gallardo (auth.) 📂 Library 📅 2016 🏛 Springer Singapore 🌐 English

This work addresses the evaluation of the human and the automatic speaker recognition performances under different channel distortions caused by bandwidth limitation, codecs, and electro-acoustic user interfaces, among other impairments. Its main contribution is the demonstration of the benefi