
String-based minimum verification error (SB-MVE) training for speech recognition

Scribed by Mazin G. Rahim; Chin-Hui Lee


Publisher: Elsevier Science
Year: 1997
Language: English
File size: 284 KB
Volume: 11
Category: Article
ISSN: 0885-2308


Synopsis


In recent years, demand has grown for speech recognition technology in real-world applications such as name dialling and message retrieval. In deploying these systems, we have learned that the performance achieved in a laboratory environment cannot be duplicated in actual service. Two major causes of this problem have been identified. The first is a lack of robustness when the acoustic conditions in testing differ from those in training. The second is a lack of flexibility in handling spontaneous speech input, which often contains extraneous speech in addition to the desired key-phrase segments. This paper focuses on one aspect of achieving flexible speech recognition: improving the ability to cope with naturally spoken utterances through discriminative utterance verification. We propose an algorithm for training utterance verification systems based on the minimum verification error (MVE) training framework. Experimental results on speaker-independent, telephone-based connected digits show a significant improvement in verification accuracy when the discriminant function used in MVE training is made consistent with the confidence measure used in utterance verification. At a 10% rejection rate, for example, the proposed method reduces the string error rate by a further 22.7% over our previously reported results, in which MVE-based discriminative training was not incorporated.


SIMILAR VOLUMES


Prototype-based minimum classification e
Erik McDermott; Shigeru Katagiri | Article | 1994 | Elsevier Science | English | 642 KB

In previous work we reported high classification rates for learning vector quantization (LVQ) networks trained to classify phoneme tokens shifted in time. It has since been shown that the framework of minimum classification error (MCE) and generalized probabilistic descent (GPD) can treat LVQ as a s