
Local Normalization and Delayed Decision Making in Speaker Detection and Tracking

✍ Scribed by Johan Koolwaaij; Lou Boves


Publisher: Elsevier Science
Year: 2000
Tongue: English
Weight: 268 KB
Volume: 10
Category: Article
ISSN: 1051-2004


✦ Synopsis


This paper describes A2RT's speaker detection and tracking system and its performance on the 1999 NIST speaker recognition evaluation data. The system is not a chain of concatenated modules (for instance, silence-speech detection, then handset and gender detection, and finally speaker detection or tracking) in which each module builds on the hard decisions of the previous ones. Instead, it applies the principle of delayed decision making and postpones all hard decisions until the final stage of the detection process. This paper focuses on two important locality issues in detecting or tracking speakers in a telephone conversation, where the speaker change frequency is usually high. First, channel estimation needs segments that are sufficiently long yet homogeneous; several kinds of local channel normalization are compared in this paper. Second, local estimation of speaker likelihoods critically depends on the segmentation of the conversation. Our experiments show that a global level of segmentation improves speaker tracking performance, whereas a more detailed segmentation is needed for speaker detection, because likelihood computation over clusters of segments depends on the purity of the segments. Furthermore, choosing the appropriate type of channel normalization can give a small but consistent improvement in speaker tracking performance.
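
To make the idea of local channel normalization concrete, the following is a minimal sketch, not the system described in the paper. It assumes that normalization is done by subtracting the mean feature vector within each segment (in the spirit of cepstral mean subtraction, a common channel compensation technique); the synopsis does not say which normalization variants were compared. The function name `local_mean_normalize` and the half-open `(start, end)` segment convention are hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def local_mean_normalize(
    frames: Sequence[Sequence[float]],
    segments: Sequence[Tuple[int, int]],
) -> List[List[float]]:
    """Subtract the per-segment mean from every frame in that segment.

    Assumption (not from the paper): `segments` holds half-open
    (start, end) frame index ranges, and the channel is treated as
    constant within one segment. Frames outside every segment are
    copied unchanged.
    """
    out = [list(f) for f in frames]
    for start, end in segments:
        block = frames[start:end]
        if not block:
            continue  # empty or out-of-range segment: nothing to normalize
        dim = len(block[0])
        # Mean feature vector of this segment only (the "local" estimate).
        mean = [sum(f[d] for f in block) / len(block) for d in range(dim)]
        for i in range(start, start + len(block)):
            out[i] = [frames[i][d] - mean[d] for d in range(dim)]
    return out


# Two toy segments with different offsets, standing in for different channels.
frames = [[1.0, 2.0], [3.0, 4.0], [10.0, 0.0], [14.0, 4.0]]
normalized = local_mean_normalize(frames, [(0, 2), (2, 4)])
```

The sketch also shows the trade-off named in the synopsis: a segment that is too short gives a noisy mean estimate, while a segment that mixes speakers or channels gives a mean that fits neither.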


📜 SIMILAR VOLUMES


Multiple Speaker Tracking and Detection:
✍ Kemal Sönmez; Larry Heck; Mitchel Weintraub 📂 Article 📅 2000 🏛 Elsevier Science 🌐 English ⚖ 128 KB

We describe SRI's speaker tracking and detection system in the NIST 1998 Speaker Detection and Tracking Development Evaluation. The system is designed for tracking switchboard conversations and uses a two-speaker and silence hidden Markov model (HMM) with a minimum state duration constraint and Gauss

Approaches to Speaker Detection and Trac
✍ Robert B. Dunn; Douglas A. Reynolds; Thomas F. Quatieri 📂 Article 📅 2000 🏛 Elsevier Science 🌐 English ⚖ 257 KB

Two approaches to detecting and tracking speakers in multispeaker audio are described. Both approaches use an adapted Gaussian mixture model, universal background model (GMM-UBM) speaker detection system as the core speaker recognition engine. In one approach, the individual log-likelihood ratio sco

The ELISA Systems for the NIST'99 Evalua
📂 Article 📅 2000 🏛 Elsevier Science 🌐 English ⚖ 145 KB

This article presents the text-independent speaker detection and tracking systems developed by the members of the ELISA Consortium for the NIST'99 speaker recognition evaluation campaign. ELISA is a consortium grouping researchers of several laboratories sharing software modules, resources and exper

Decision-making Impairments in Women wit
✍ Unna N. Danner; Carolijn Ouwehand; Noor L. van Haastert; Hellen Hornsveld; Denis 📂 Article 📅 2011 🏛 John Wiley and Sons 🌐 English ⚖ 132 KB

Abstract. Objective: The purpose of the current study was to examine decision making in female patients with binge eating disorder (BED) in comparison with obese and normal weight women. Method: In the study, 20 patients with BED, 21 obese women without BED and 34 healthy women participate