✦ LIBER ✦

Auditory scene analysis based on time-frequency integration of shared FM and AM (II): Optimum time-domain integration and stream sound reconstruction

✍ Scribed by Mototsugu Abe; Shigeru Ando

Book ID: 104591131
Publisher: John Wiley and Sons
Year: 2002
Tongue: English
Weight: 326 KB
Volume: 33
Category: Article
ISSN: 0882-1666
DOI: 10.1002/scj.1160


✦ Synopsis

Abstract

In the preceding paper, we proposed a method for auditory scene analysis in which the instantaneous frequency, frequency change rate, and amplitude change rate in time‐frequency space are intensified into a multipeak probability density distribution by a voting method, and mixed sounds are thereby grouped into streams. In this paper, as the main point of the second half of the method, we introduce the assumption that the stream parameters vary slowly according to known dynamics, and we propose an integration method on the time axis in which the probability density distribution of the stream parameters is optimally estimated as a time series by a nonparametric Kalman filter. This realizes mechanisms of higher auditory scene analysis such as improving the accuracy of the stream parameters, interpolating and connecting breaks in the streams, and introducing a priori knowledge into stream selection. Moreover, a system that separates and reconstructs the sounds corresponding to the streams is constructed, and the proposed technique is verified by fundamental experiments on synthesized sounds, musical sounds, and voices. © 2002 Wiley Periodicals, Inc. Syst Comp Jpn, 33(10): 83–94, 2002; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/scj.1160
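
The time-axis integration described in the abstract can be illustrated with a generic grid-based Bayes filter, which is one common way to propagate a whole probability density rather than a single Gaussian estimate. The sketch below is not the authors' implementation. It assumes a single stream parameter discretized into bins, random-walk dynamics modeled by a Gaussian smoothing kernel, and per-frame voting distributions used directly as likelihoods. All names (`gaussian_kernel`, `predict`, `update`, `track`) and all numeric settings are hypothetical.

```python
import numpy as np


def gaussian_kernel(sigma_bins, radius):
    """Normalized Gaussian kernel; an assumed model of slow parameter drift."""
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma_bins) ** 2)
    return kernel / kernel.sum()


def predict(density, kernel):
    """Prediction step: diffuse the density according to the assumed dynamics."""
    predicted = np.convolve(density, kernel, mode="same")
    return predicted / predicted.sum()


def update(predicted, likelihood, floor=1e-6):
    """Update step: Bayes rule on the grid.

    The small floor keeps the density alive when a frame carries no votes,
    so the prediction alone bridges a break in the stream.
    """
    posterior = predicted * (likelihood + floor)
    return posterior / posterior.sum()


def track(likelihoods, kernel):
    """Run predict/update over all frames, starting from a uniform density."""
    n_bins = likelihoods.shape[1]
    density = np.full(n_bins, 1.0 / n_bins)
    history = []
    for likelihood in likelihoods:
        density = update(predict(density, kernel), likelihood)
        history.append(density)
    return np.array(history)


# Synthetic demonstration (made-up data): one peak drifting slowly upward,
# with frames 8-11 carrying no votes to imitate a break in the stream.
n_frames, n_bins = 20, 64
bins = np.arange(n_bins)
true_bins = 20 + np.arange(n_frames) // 2
likelihoods = np.exp(-0.5 * ((bins[None, :] - true_bins[:, None]) / 1.5) ** 2)
likelihoods[8:12] = 0.0

history = track(likelihoods, gaussian_kernel(sigma_bins=1.0, radius=4))
estimated_bins = history.argmax(axis=1)
print(estimated_bins)
```

In this sketch the density stays multipeaked whenever the votes are, and a priori knowledge could be introduced by replacing the uniform initial density with an informative one. Extending the grid to several parameters (for example frequency and its change rate) would follow the same predict/update pattern at higher cost.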