This book reflects decades of important research on the mathematical foundations of speech recognition. It focuses on underlying statistical techniques such as hidden Markov models, decision trees, the expectation-maximization algorithm, information theoretic goodness criteria, maximum entropy pr
Incorporating Knowledge Sources into Statistical Speech Recognition
Scribed by Wolfgang Minker, Satoshi Nakamura, Konstantin Markov, Sakriani Sakti (auth.)
- Publisher
- Springer US
- Year
- 2009
- Tongue
- English
- Leaves
- 206
- Series
- Lecture Notes in Electrical Engineering 42
- Edition
- 1
- Category
- Library
Synopsis
Incorporating Knowledge Sources into Statistical Speech Recognition offers solutions for enhancing the robustness of a statistical automatic speech recognition (ASR) system by incorporating various additional knowledge sources while keeping the training and recognition effort feasible.
The authors provide an efficient general framework for incorporating knowledge sources into state-of-the-art statistical ASR systems. The framework, called GFIKS (graphical framework to incorporate additional knowledge sources), is built on the concept of the Bayesian network (BN). It allows probabilistic relationships among different information sources to be learned, various kinds of knowledge sources to be incorporated, and a probabilistic function of the model to be formulated.
Incorporating Knowledge Sources into Statistical Speech Recognition demonstrates how the statistical speech recognition system may incorporate additional information sources by utilizing GFIKS at different levels of ASR. The incorporation of various knowledge sources, including background noises, accent, gender and wide phonetic knowledge information, in modeling is discussed theoretically and analyzed experimentally.
Table of Contents
Front Matter
Introduction and Book Overview
Statistical Speech Recognition
Graphical Framework to Incorporate Knowledge Sources
Speech Recognition Using GFIKS
Conclusions and Future Directions
Back Matter
Subjects
Electrical Engineering; Computer Communication Networks; Communications Engineering, Networks; Acoustics; Signal, Image and Speech Processing
SIMILAR VOLUMES
This book discusses the contribution of articulatory and excitation source information in discriminating sound units. The authors focus on excitation source component of speech -- and the dynamics of various articulators during speech production -- for enhancement of speech recognition (SR) perfo
This book presents a statistical parametric speech synthesis (SPSS) framework for developing a speech synthesis system where the desired speech is generated from the parameters of vocal tract and excitation source. Throughout the book, the authors discuss novel source modeling techniques to en
Chapters in the first part of the book cover all the essential speech processing techniques for building robust, automatic speech recognition systems: the representation for speech signals and the methods for speech-features extraction, acoustic and language modeling, efficient algorithms for search
Automatic Speech Recognition (ASR) on Linux is becoming easier. Several packages are available for users as well as developers. This document describes the basics of speech recognition and describes some of the available software.
A complete overview of distant automatic speech recognition. The performance of conventional Automatic Speech Recognition (ASR) systems degrades dramatically as soon as the microphone is moved away from the mouth of the speaker. This is due to a broad variety of effects such as b