Acoustic model clustering based on syllable structure
Written by Izhak Shafran; Mari Ostendorf
- Publisher: Elsevier Science
- Year: 2003
- Language: English
- File size: 218 KB
- Volume: 17
- Category: Article
- ISSN: 0885-2308
Synopsis
Current speech recognition systems perform poorly on conversational speech compared to read speech, arguably due to the large acoustic variability inherent in conversational speech. Our hypothesis is that there are systematic effects in local context, associated with syllabic structure, that are not being captured in the current acoustic models. Such variation may be modeled using a broader definition of context than in traditional systems, which restrict context to the neighboring phonemes. In this paper, we study the use of word- and syllable-level context conditioning in recognizing conversational speech. We describe a method to extend standard tree-based clustering to incorporate a large number of features, and we report results on the Switchboard task which indicate that syllable structure outperforms pentaphones and incurs less computational cost. It has been hypothesized that previous work on using syllable models for recognition of English was limited because it ignored the phenomenon of resyllabification (change of syllable structure at word boundaries), but our analysis shows that accounting for resyllabification does not impact recognition performance.
SIMILAR VOLUMES
Document clustering, the grouping of documents into several clusters, has been recognized as a means for improving efficiency and effectiveness of information retrieval and text mining. With the growing importance of electronic media for storing and exchanging large textual databases, document clust
Fuzzy multiset is applicable as a model of information retrieval because it has the mathematical structure that expresses the number and the degree of attribution of an element simultaneously. Therefore, fuzzy multisets can be used also as a suitable model for document clustering. This paper aims at
A method will be presented which allows a construction of an acoustic source model based on the analysis of microphone array measurements during a train pass-by. The conventional array beam-forming technique is used as a kind of pre-processing or first-order analysis to calculate in a second step the