A key problem in the use of context-dependent hidden Markov models is the need to balance the desired model complexity with the amount of available training data. This paper describes a method which uses a simple agglomerative algorithm to cluster and tie acoustically similar states. The main proper
Continuously variable duration hidden Markov models for automatic speech recognition
β Scribed by S.E. Levinson
- Publisher
- Elsevier Science
- Year
- 1986
- Tongue
- English
- Weight
- 655 KB
- Volume
- 1
- Category
- Article
- ISSN
- 0885-2308
No coin nor oath required. For personal study only.
β¦ Synopsis
During tile past decade, the applicability of hidden Markov models (HMM) to various facets of speech analysi s has been demonstrated in several different experiments. These investigations all rest on the assumption that speech is a quasi-stationary process whose stationary intervals can be identified with the occupancy of a single state of an appropriate HMM. In the traditional form of the HMM, the probability of duration of a state decreases exponentially with time. This behavior does not provide an adequate representation of the temporal structure of speech.
The solution proposed here is to replace the probability distributions of duration with continuous probability density functions to form a continuously variable duration hidden Markov model (CVDHMM). The gamma distribution is ideally suited to specification. of the durational density since it is one-sided and only has two parameters which, together, define both mean and variance. The main result is a derivation and proof of convergence of re-estimation formulae for all the parameters of the CVDHMM. It is interesting to note that if the state durations are gamma-distributed, one of the formulae is non-algebraic but, fortuitously, has properties such that it is easily and rapidly solved numerically to any desired degree of accuracy. Other results are presented including the performance of the formulae on simulated data.
π SIMILAR VOLUMES
This paper describes a set of modeling techniques for detecting a small vocabulary of keywords in running conversational speech. The techniques are applied in the context of a hidden Markov model (HMM) based continuous speech recognition (CSR) approach to keyword spotting. The word spotting task is
This paper describes, and evaluates on a large scale, the lattice based framework for discriminative training of large vocabulary speech recognition systems based on Gaussian mixture hidden Markov models (HMMs). This paper concentrates on the maximum mutual information estimation (MMIE) criterion wh