We describe some new methods for constructing discrete acoustic phonetic hidden Markov models (HMMs) using tree quantizers having very large numbers (16-64 K) of leaf nodes and tree-structured smoothing techniques. We consider two criteria for constructing tree quantizers (minimum distortion and min
Adaptive Vector Quantization for Speech Spectrum Coding
Scribed by John Leis; Sridha Sridharan
- Publisher: Elsevier Science
- Year: 1999
- Language: English
- File size: 300 KB
- Volume: 9
- Category: Article
- ISSN: 1051-2004
Synopsis
We address the problem of speech compression at very low rates, with the short-term spectrum compressed to less than 20 bits per frame. Current techniques apply structured vector quantization (VQ) to the short-term synthesis filter coefficients to achieve rates of the order of 24 to 26 bits per frame. In this paper we show that temporal correlations in the VQ index stream can be introduced by dynamic codebook ordering, and that these correlations can be exploited by lossless coding approaches to reduce the number of bits per frame of the VQ scheme. The use of lossless coding ensures that no additional distortion is introduced, unlike other interframe techniques. We then detail two constructive algorithms which are able to exploit this redundancy. The first method is a delayed-decision approach, which dynamically adapts the VQ codebook to allow for efficient entropy coding of the index stream. The second is based on a vector subcodebook approach and does not incur any additional delay. Experimental results are presented for both methods to validate the approach. © 1999 Academic Press
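As a rough illustration of why reordering a codebook can help a lossless coder, the sketch below applies a move-to-front rule to a stream of VQ indices and compares empirical entropies. Move-to-front is a stand-in chosen here for simplicity; it is not claimed to be the delayed-decision or subcodebook algorithm of the paper. The function names, the codebook size of 16, and the synthetic index stream are all assumptions made for the example.

```python
import math
from collections import Counter


def move_to_front_ranks(indices, codebook_size):
    """Replace each VQ index by its position in a dynamically reordered list.

    Recently used codewords move to the front, so a slowly varying index
    stream maps to a rank stream dominated by small values.
    (Hypothetical helper; not the paper's ordering rule.)
    """
    order = list(range(codebook_size))
    ranks = []
    for idx in indices:
        r = order.index(idx)   # current position of this codeword
        ranks.append(r)
        order.pop(r)           # move the codeword to the front
        order.insert(0, idx)
    return ranks


def inverse_move_to_front(ranks, codebook_size):
    """Recover the original indices, showing the mapping is lossless."""
    order = list(range(codebook_size))
    indices = []
    for r in ranks:
        idx = order.pop(r)
        indices.append(idx)
        order.insert(0, idx)
    return indices


def empirical_entropy_bits(symbols):
    """Zeroth-order empirical entropy in bits per symbol."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


# Synthetic stream (an assumption): each of 16 codewords held for 8 frames.
indices = [k for k in range(16) for _ in range(8)]
ranks = move_to_front_ranks(indices, 16)

print("bits/index before reordering:", empirical_entropy_bits(indices))
print("bits/index after reordering: ", empirical_entropy_bits(ranks))
```

On this synthetic stream the raw indices are uniformly distributed, while the rank stream is heavily skewed toward zero, so a zeroth-order entropy coder would need fewer bits per frame. Because the rank mapping is invertible, the decoder recovers the same indices and no extra distortion is added. Real speech index streams will behave less favourably than this idealized case.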
SIMILAR VOLUMES
A novel hybrid DPCM/CVQ (classified vector quantization) method is proposed to encode and decode video telephony sequences. The CVQ coding method is based on the detection of human faces within a video signal. This detection will improve three aspects of video coding efficiency. First, knowledge of
The side-match finite-state vector quantization (SMVQ) schemes improve performance over ordinary vector quantization by exploiting the correlations between neighboring vectors within the image. In this paper, we propose a neural network side-match finite-state vector quantization (NN-SMVQ) scheme that combines th
The line spectrum pair (LSP) is one of the most popular and efficient parameters for representing the short-time spectrum of a speech signal. About 34 bits/frame are needed for direct scalar quantization of LSP parameters to maintain good quality. Based on the spectral-weighted Euclidean distance of L