
Auditory scene analysis based on time-frequency integration of shared FM and AM (II): Optimum time-domain integration and stream sound reconstruction

✍ Scribed by Mototsugu Abe; Shigeru Ando


Book ID: 104591131
Publisher: John Wiley and Sons
Year: 2002
Tongue: English
Weight: 326 KB
Volume: 33
Category: Article
ISSN: 0882-1666


✦ Synopsis


In the preceding paper, we proposed a method for auditory scene analysis in which the instantaneous frequency, frequency change rate, and amplitude change rate in time-frequency space are consolidated by a voting method into a multipeak probability density distribution, so that mixed sounds are grouped into streams. In this paper, as the main point of the second half of the method, we introduce the assumption that the stream parameters vary slowly according to known dynamics, and we propose an integration method on the time axis in which the probability density distribution of the stream parameters is optimally estimated as a time series by a nonparametric Kalman filter. This realizes mechanisms of higher-level auditory scene analysis, such as improving the accuracy of the stream parameters, interpolating and connecting breaks in streams, and introducing a priori knowledge into stream selection. Moreover, we construct a system that separates and reconstructs the sounds corresponding to the streams, and we verify the proposed technique in fundamental experiments on synthesized sounds, musical sounds, and voices. © 2002 Wiley Periodicals, Inc. Syst Comp Jpn, 33(10): 83–94, 2002; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/scj.1160
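
The time-axis integration described in the synopsis can be illustrated with a small grid-based (histogram) Bayes filter. This is only a sketch under stated assumptions, not the authors' formulation, because the abstract does not give the filter equations. Assumed here: a single stream parameter (for example, a frequency bin index), slow variation modeled as a random walk with a Gaussian kernel, and a per-frame likelihood standing in for the voting result. The names `gaussian_kernel`, `predict`, `update`, `track`, and `bump` are hypothetical.

```python
import math

def gaussian_kernel(sigma_bins, radius):
    """Normalized discrete Gaussian (assumed model of slow parameter drift)."""
    w = [math.exp(-0.5 * (k / sigma_bins) ** 2) for k in range(-radius, radius + 1)]
    total = sum(w)
    return [x / total for x in w]

def predict(density, kernel):
    """Prediction step: diffuse the density with the dynamics kernel."""
    n, r = len(density), len(kernel) // 2
    out = [0.0] * n
    for i, p in enumerate(density):
        for k, w in enumerate(kernel):
            j = min(max(i + k - r, 0), n - 1)  # clamp at the edges so mass is conserved
            out[j] += p * w
    return out

def update(predicted, likelihood, floor=1e-12):
    """Update step: multiply by the frame likelihood and renormalize."""
    post = [p * (l + floor) for p, l in zip(predicted, likelihood)]
    total = sum(post)
    return [x / total for x in post]

def track(frames, n_bins, sigma_bins=1.5):
    """Filter a sequence of likelihoods; None marks a break in the stream."""
    kernel = gaussian_kernel(sigma_bins, radius=int(3 * sigma_bins) + 1)
    density = [1.0 / n_bins] * n_bins  # flat prior: no a priori knowledge
    estimates = []
    for likelihood in frames:
        density = predict(density, kernel)
        if likelihood is not None:
            density = update(density, likelihood)
        # With no observation, the prediction alone carries the stream forward.
        estimates.append(max(range(n_bins), key=density.__getitem__))
    return estimates, density

def bump(center, n_bins, sigma=1.0):
    """Toy single-peak likelihood standing in for a voting result."""
    return [math.exp(-0.5 * ((i - center) / sigma) ** 2) for i in range(n_bins)]

if __name__ == "__main__":
    n_bins = 32
    frames = [bump(10, n_bins), bump(11, n_bins), None, bump(12, n_bins)]
    estimates, density = track(frames, n_bins)
    print(estimates)
```

In this toy run the third frame has no observation, so the estimate for that frame comes from the prediction step alone. That is one simple way to picture the interpolation across breaks mentioned above. A non-flat initial `density` would play the role of a priori knowledge in stream selection.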