𝔖 Bobbio Scriptorium
✦   LIBER   ✦

Bayesian network structures and inference techniques for automatic speech recognition

โœ Scribed by Geoffrey Zweig


Publisher: Elsevier Science
Year: 2003
Tongue: English
Weight: 350 KB
Volume: 17
Category: Article
ISSN: 0885-2308


✦ Synopsis


This paper describes the theory and implementation of Bayesian networks in the context of automatic speech recognition. Bayesian networks provide a succinct and expressive graphical language for factoring joint probability distributions, and we begin by presenting the structures that are appropriate for speech recognition training and decoding. This approach is notable because it expresses all the details of a speech recognition system in a uniform way, using only the concepts of random variables and conditional probabilities. A powerful set of computational routines complements the representational utility of Bayesian networks, and the second part of this paper describes these algorithms in detail. We present a novel view of inference in general networks, where inference is done via a change of variables that renders the network tree-structured and amenable to a very simple form of inference. We present the technique in terms of straightforward dynamic programming recursions analogous to the HMM α-β computation, and then extend it to handle deterministic constraints amongst variables in an extremely efficient manner. The paper concludes with a sequence of experimental results that show the range of effects that can be modeled, and that significant reductions in error rate can be expected from intelligently factored state representations.


📜 SIMILAR VOLUMES


Dynamic Bayesian networks for multi-band
โœ Khalid Daoudi; Dominique Fohr; Christophe Antoine ๐Ÿ“‚ Article ๐Ÿ“… 2003 ๐Ÿ› Elsevier Science ๐ŸŒ English โš– 292 KB

This paper presents a new approach to multi-band automatic speech recognition which has the advantage of overcoming many limitations of classical multi-band systems. The principle of this new approach is to build a speech model in the time-frequency domain using the formalism of dynamic Bayesian networ