
Tree-structured vector quantization for speech recognition

✍ Scribed by M. Barszcz; W. Chen; G. Boulianne; P. Kenny


Publisher: Elsevier Science
Year: 2000
Tongue: English
Weight: 90 KB
Volume: 14
Category: Article
ISSN: 0885-2308


✦ Synopsis


We describe some new methods for constructing discrete acoustic phonetic hidden Markov models (HMMs) using tree quantizers having very large numbers (16-64 K) of leaf nodes and tree-structured smoothing techniques. We consider two criteria for constructing tree quantizers (minimum distortion and minimum entropy) and three types of smoothing (mixture smoothing, smoothing by adding 1 and Gaussian smoothing). We show that these methods are capable of achieving recognition accuracies which are generally comparable to those obtained with Gaussian mixture HMMs at a computational cost which is only marginally greater than that of conventional discrete HMMs. We present some evidence of superior performance in situations where the number of HMM distributions to be estimated is small compared with the amount of training data. We also show how our methods can accommodate feature vectors of much higher dimensionality than are traditionally used in speech recognition.


📜 SIMILAR VOLUMES


Adaptive Vector Quantization for Speech
✍ John Leis; Sridha Sridharan 📂 Article 📅 1999 🏛 Elsevier Science 🌐 English ⚖ 300 KB

We address the problem of speech compression at very low rates, with the short-term spectrum compressed to less than 20 bits per frame. Current techniques apply structured vector quantization (VQ) to the short-term synthesis filter coefficients to achieve rates of the order of 24 to 26 bits per fram

Speaker-independent speech recognition b
✍ Tetsuo Kosaka; Shoichi Matsunaga; Shigeki Sagayama 📂 Article 📅 1996 🏛 Elsevier Science 🌐 English ⚖ 231 KB

We have already proposed the application of tree-structured speaker clustering to supervised speaker adaptation. This paper proposes its application to unsupervised speaker adaptation and speaker-independent (SI) speech recognition. This clustering involves the selection of a speaker cluster from amo

The use of tree-trellis search for large
✍ Eng-Fong Huang; Frank K. Soong; Hsiao-Chuan Wang 📂 Article 📅 1994 🏛 Elsevier Science 🌐 English ⚖ 483 KB

In this paper, we propose the use of a tree-trellis search scheme for the task of large vocabulary Mandarin polysyllabic word recognition. Usually, the task of large vocabulary word recognition is computationally intractable by a whole-word based approach. We convert this task into a tree network sear