✦ LIBER ✦

A neural fuzzy training approach for improving speech recognition

✍ Scribed by Yasuhiro Komori; Shigeki Sagayama; Alexander H. Waibel

Publisher: John Wiley and Sons
Year: 1993
Tongue: English
Weight: 951 KB
Volume: 24
Category: Article
ISSN: 0882-1666
DOI: 10.1002/scj.4690240808

✦ Synopsis

Abstract

This paper proposes a new training method for phoneme identification neural networks, called “neural fuzzy training.” In the proposed training, nondeterministic (fuzzy) class information is assigned to the training signal, in contrast to the traditional method, where deterministic class information is assigned.

This study aims to realize a robust neural network, thereby improving the cumulative recognition rate of phoneme identification and avoiding overtraining. The proposed neural fuzzy training is realized by backpropagation. In conventional training, deterministic phoneme class information is assigned to the training signal of the neural network as the value 1 or 0. In the proposed training, however, fuzzy class information is assigned to the training signal for each training sample as a likelihood value between 0 and 1.

In the proposed training method, the likelihood is calculated by a monotonically decreasing function (such as exp(−α · d²)) of the distance d between the training sample and the closest sample belonging to each phoneme class. The proposed method has the problem that it is computationally expensive, since the training signal is determined by calculating the distances to all training samples. To solve this problem, representative samples are defined in each phoneme class, and the likelihood for each phoneme class is determined by calculating the distance between the representative sample and the training sample.
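The likelihood assignment described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy two-dimensional vectors, the value of α, and the choice of the class mean as the single representative sample are all assumptions made for the example (the abstract does not say how representatives are chosen).

```python
import math


def fuzzy_targets(sample, class_samples, alpha=1.0):
    """Return a likelihood in (0, 1] for each class.

    class_samples maps a class label to a list of reference vectors.
    The likelihood is exp(-alpha * d^2), where d is the Euclidean
    distance from `sample` to the closest reference of that class.
    """
    targets = {}
    for label, refs in class_samples.items():
        d2 = min(
            sum((a - b) ** 2 for a, b in zip(sample, ref))
            for ref in refs
        )
        targets[label] = math.exp(-alpha * d2)
    return targets


def class_means(class_samples):
    """Reduce each class to one representative vector.

    Using the mean vector is an assumption for illustration only.
    """
    reps = {}
    for label, refs in class_samples.items():
        n = len(refs)
        reps[label] = [[sum(col) / n for col in zip(*refs)]]
    return reps


# Hypothetical toy data: two classes, two 2-D samples each.
train = {
    "b": [[0.0, 0.0], [0.2, 0.0]],
    "d": [[1.0, 0.0], [1.0, 0.4]],
}
x = [0.0, 0.0]  # a training sample that belongs to class "b"

# Full version: distances to every training sample.
full = fuzzy_targets(x, train, alpha=1.0)

# Simplified version: distances to one representative per class.
cheap = fuzzy_targets(x, class_means(train), alpha=1.0)

print(full)
print(cheap)
```

In the full version, a sample's own class receives likelihood 1 because the sample is its own nearest neighbor, so the conventional 1-or-0 target is recovered for well-separated classes. The simplified version costs one distance computation per class instead of one per training sample.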

By this simplification of the likelihood calculation, the computational cost of determining the training signal is reduced considerably. To demonstrate the usefulness of neural fuzzy training, experiments are conducted on /b, d, g, m, n, N/ identification, 18-consonant identification, and phrase recognition using TDNN‐LR. The ATR database is used in the experiments. In the phoneme identification experiments, speech samples extracted using hand labels are used. The TDNN is trained using speech samples uttered in word style, and the evaluation is performed using speech samples uttered in phrase style and in sentence style.

In the phrase recognition experiment using TDNN‐LR, the TDNN is trained using hand-labeled speech samples uttered in word style. The evaluation is performed using speech samples uttered in phrase style. In each experiment, an improvement from fuzzy training is observed. In particular, in the phrase recognition experiment using TDNN‐LR, the top recognition rate improves from 71.2 percent to 80.9 percent, and the top-5 recognition rate improves from 92.8 percent to 96.0 percent. Furthermore, neural fuzzy training also proved to be a fast training method.
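The top and top-5 figures above are cumulative recognition rates. A short sketch of how such a rate is commonly computed follows, assuming the usual definition: the fraction of utterances whose correct answer appears among the first N ranked hypotheses. The function name and the toy hypotheses are hypothetical and are not data from the paper.

```python
def top_n_rate(ranked_hypotheses, references, n):
    """Fraction of items whose reference is within the top n hypotheses."""
    hits = sum(
        1
        for hyps, ref in zip(ranked_hypotheses, references)
        if ref in hyps[:n]
    )
    return hits / len(references)


# Hypothetical ranked recognizer outputs (best candidate first).
ranked = [
    ["ba", "da", "ga"],
    ["da", "ba", "ga"],
    ["ga", "na", "ma"],
    ["na", "ma", "ba"],
]
refs = ["ba", "ba", "ma", "ga"]

top1 = top_n_rate(ranked, refs, 1)
top3 = top_n_rate(ranked, refs, 3)
print(top1, top3)
```

The rate can only stay the same or grow as N increases, which is why a top-5 rate is always at least as high as the top rate.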

📜 SIMILAR VOLUMES

Parallel implementation of Artificial Neural Network training for speech recognition

✍ Stefano Scanzio; Sandro Cumani; Roberto Gemello; Franco Mana; P. Laface 📂 Article 📅 2010 🏛 Elsevier Science 🌐 English ⚖ 524 KB

A neuro-fuzzy approach to speech recognition without time alignment

✍ Mu-Chun Su; Ching-Tang Hsieh; Chieh-Ching Chin 📂 Article 📅 1998 🏛 Elsevier Science 🌐 English ⚖ 622 KB

Several successful approaches to speech recognition have been proposed. Most of them involve time alignment which requires substantial computation and considerable memory storage. In this paper, we present a neuro-fuzzy approach to speech recognition without time alignment. This approach is a powerf

An improved maximum model distance approach for HMM-based speech recognition systems

✍ Q.H He; S Kwong; K.F Man; K.S Tang 📂 Article 📅 2000 🏛 Elsevier Science 🌐 English ⚖ 143 KB

This paper proposes an improved maximum model distance (IMMD) approach for HMM-based speech recognition systems based on our previous work [S. Kwong, Q.H. He, K.F. Man, K.S. Tang. A maximum model distance approach for HMM-based speech recognition, Pattern Recognition 31 (3) (1998) 219–229]. It define

Improvement of noisy speech recognition using a proportional alignment decoding algorithm in the training phase

✍ Wei-Wen Hung; Hsiao-Chuan Wang 📂 Article 📅 1998 🏛 Elsevier Science 🌐 English ⚖ 670 KB

Modelling the state duration of hidden Markov models (HMMs) can effectively improve the accuracy in decoding the state sequence of an utterance and result in an improvement of speech recognition accuracy. However, when a speech signal is contaminated by ambient noise, the decoded state sequence may

Computation of temporal pattern primitives in a neural net for speech recognition

✍ Paul Mueller 📂 Article 📅 1988 🏛 Elsevier Science 🌐 English ⚖ 120 KB

A hybrid fuzzy and neural approach for DRAM price forecasting

✍ T. Chen 📂 Article 📅 2011 🏛 Elsevier Science 🌐 English ⚖ 318 KB