
Predictor codebook for speaker-independent speech recognition

✍ Scribed by Takeshi Kawabata


Book ID: 104591609
Publisher: John Wiley and Sons
Year: 1994
Tongue: English
Weight: 752 KB
Volume: 25
Category: Article
ISSN: 0882-1666


✦ Synopsis


Abstract

This paper discusses a method for handling the diverse dynamic features of speech: the dynamic features are represented by spectrum predictors, and a codebook is constructed whose elements are predictors. The effectiveness of the method for speaker‐independent speech recognition is examined. Three predictor structures are considered: the forward predictor, the backward predictor, and the interpolator. The predictor codebook is constructed by a predictor quantization procedure, which is a slight modification of the LBG algorithm. Performance at the phoneme recognition level is evaluated using two kinds of statistical evaluation quantities and the phoneme recognition rate. The results show that the predictor codebook achieves a high phoneme separation capability and robustness against speaker variation. Performance at the continuous speech recognition level is evaluated by incorporating the method into a phrase recognition system. In both evaluations, the codebook with backward predictors as its elements exhibited the highest performance.
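
The abstract does not give the details of the predictor quantization procedure, so the following is only a toy sketch of the general idea of an LBG-style codebook whose elements are predictors. It assumes scalar features, a forward predictor of the form cur ≈ a * prev, and squared prediction error as the distortion; the function names (`fit_predictor`, `train_predictor_codebook`) and the parameters `n_iter` and `eps` are hypothetical and are not taken from the paper.

```python
def fit_predictor(pairs):
    """Least-squares scalar a minimizing sum((cur - a * prev) ** 2)."""
    num = sum(prev * cur for prev, cur in pairs)
    den = sum(prev * prev for prev, _ in pairs)
    return num / den if den > 0 else 0.0


def prediction_error(a, prev, cur):
    """Squared error of predicting cur from prev with coefficient a."""
    return (cur - a * prev) ** 2


def train_predictor_codebook(pairs, size, n_iter=20, eps=0.01):
    """LBG-style training; size is assumed to be a power of two."""
    # Start from a single predictor fitted to all (prev, cur) pairs.
    codebook = [fit_predictor(pairs)]
    while len(codebook) < size:
        # Split every predictor into two slightly perturbed copies.
        codebook = [a + d for a in codebook for d in (eps, -eps)]
        for _ in range(n_iter):
            # Assign each pair to the predictor with the smallest error.
            cells = [[] for _ in codebook]
            for prev, cur in pairs:
                k = min(range(len(codebook)),
                        key=lambda i: prediction_error(codebook[i], prev, cur))
                cells[k].append((prev, cur))
            # Re-fit each predictor on its cell; keep it if the cell is empty.
            codebook = [fit_predictor(c) if c else a
                        for a, c in zip(codebook, cells)]
    return codebook


if __name__ == "__main__":
    # Synthetic data generated by two different dynamics (a = 0.5 and a = 2.0).
    data = [(float(x), 0.5 * x) for x in range(1, 11)]
    data += [(float(x), 2.0 * x) for x in range(1, 11)]
    print(sorted(train_predictor_codebook(data, size=2)))
```

The only change relative to ordinary LBG vector quantization in this sketch is that each codeword is a predictor, so the nearest-codeword rule uses prediction error and the centroid update becomes a least-squares fit. A backward predictor or an interpolator would change which frames are predicted from which, and that is not modeled here.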


📜 SIMILAR VOLUMES


Speaker-independent speech recognition b
✍ Tetsuo Kosaka; Shoichi Matsunaga; Shigeki Sagayama 📂 Article 📅 1996 🏛 Elsevier Science 🌐 English ⚖ 231 KB

We have already proposed the application of tree-structured speaker clustering to supervised speaker adaptation. This paper proposes its application to unsupervised speaker adaptation and speaker-independent (SI) speech recognition. This clustering involves the selection of a speaker cluster from amo

A speaker-independent word recognition b
✍ Hiroshi Matsuura; Tsuneo Nitta 📂 Article 📅 1994 🏛 John Wiley and Sons 🌐 English ⚖ 833 KB

Matrix quantization (MQ) is a method which directly quantizes the spectrum‐time pattern. However, it has a problem in that the quantization error is relatively large compared to vector quantization (VQ), since the dimension is large and the pattern variation is less. From such a vi

N-Best-based unsupervised speaker adapta
✍ Tomoko Matsui; Sadaoki Furui 📂 Article 📅 1998 🏛 Elsevier Science 🌐 English ⚖ 251 KB

This paper proposes an instantaneous speaker adaptation method that uses N-best decoding for continuous mixture-density hidden-Markov-model-based speech-recognition systems. This method is effective even for speakers whose decoding using speaker-independent (SI) models is error-prone and for whom sp