๐”– Bobbio Scriptorium
โœฆ   LIBER   โœฆ

N-Best-based unsupervised speaker adaptation for speech recognition

โœ Scribed by Tomoko Matsui; Sadaoki Furui


Publisher
Elsevier Science
Year
1998
Tongue
English
Weight
251 KB
Volume
12
Category
Article
ISSN
0885-2308

No coin nor oath required. For personal study only.

✦ Synopsis


This paper proposes an instantaneous speaker adaptation method that uses N-best decoding for continuous-mixture-density hidden-Markov-model-based speech-recognition systems. The method is effective even for speakers whose decoding using speaker-independent (SI) models is error-prone and for whom speaker adaptation techniques are truly needed. In addition, smoothed estimation and utterance verification are introduced into the method. The smoothed estimation is based on the likelihood values, under the adapted models, of the word sequences obtained by N-best decoding, and it improves performance for error-prone speakers; the utterance verification technique reduces the amount of computation required. Performance evaluation using connected-digit (four-digit string) recognition experiments performed over actual telephone lines showed a 36.4% reduction in the error rates of speakers whose decoding using SI models is error-prone.
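The smoothed estimation described above can be sketched in outline: each of the N-best hypothesized transcriptions yields its own adapted parameter estimate, and the final parameters are a weighted combination of these estimates, with weights proportional to the likelihoods of the hypotheses under the adapted models. The sketch below is a minimal illustration of that weighting step only; the function name, the data layout, and the use of a plain weighted average are assumptions for illustration, not the paper's exact formulation.

```python
import math

def smoothed_adaptation(nbest):
    """Likelihood-weighted smoothing over N-best hypotheses (hypothetical sketch).

    nbest: list of (log_likelihood, adapted_params) pairs, where adapted_params
    is the parameter vector (e.g. mixture means, flattened) obtained by adapting
    the SI model to one hypothesized word sequence.
    Returns a single smoothed parameter vector: the average of the per-hypothesis
    estimates, each weighted in proportion to exp(log_likelihood).
    """
    # Subtract the maximum log-likelihood before exponentiating, for stability.
    max_ll = max(ll for ll, _ in nbest)
    weights = [math.exp(ll - max_ll) for ll, _ in nbest]
    total = sum(weights)

    dim = len(nbest[0][1])
    smoothed = [0.0] * dim
    for w, (_, params) in zip(weights, nbest):
        for d in range(dim):
            smoothed[d] += (w / total) * params[d]
    return smoothed
```

With this weighting, a hypothesis that the adapted models score much higher dominates the estimate, while near-ties are averaged, which is what makes the scheme robust when the first-best hypothesis is wrong for error-prone speakers.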


📜 SIMILAR VOLUMES


Automatic selection of phonetically dist
โœ Jia-lin Shen; Hsin-min Wang; Ren-yuan Lyu; Lin-shan Lee ๐Ÿ“‚ Article ๐Ÿ“… 1999 ๐Ÿ› Elsevier Science ๐ŸŒ English โš– 163 KB

This paper presents an approach of automatic selection of phonetically distributed sentence sets for speaker adaptation, and applies the concept to the task of Mandarin speech recognition with very large vocabulary. This is a different approach to the adaptation data selection problem. A computer al

Interpolation of n-gram and mutual-infor
โœ Z. GuoDong; L. KimTeng ๐Ÿ“‚ Article ๐Ÿ“… 1999 ๐Ÿ› Elsevier Science ๐ŸŒ English โš– 191 KB

While n-gram modeling is simple and dominant in speech recognition, it can only capture the short-distance context dependency within an n-word window where currently the largest practical n for natural language is three. However, many of the context dependencies in natural language occur beyond a th