Decision Combination in Speech Metadata Extraction
Lin, Xiaofan
HPL-2003-12R1
Keyword(s): speech recognition; metadata extraction; decision combination; multi-layer perceptron; gender classification
Abstract: Speech metadata extraction can both improve speech recognition and enable novel Interactive Voice Response applications. Unlike previous research, which concentrates on frame-level signal processing and pattern classification, this paper systematically studies the behavior of decision combination at the utterance level. We analyze the asymptotic characteristics and the factors affecting frame-level classification. In addition, we introduce new methods to combine frame-level decisions more accurately and efficiently, including phoneme/power-based weighting and smart sampling. Experimental results in gender classification are presented.
Notes: Copyright IEEE. To be published in and presented at the 37th Asilomar Conference on Signals, Systems and Computers, 9-12 November 2003, Pacific Grove, CA
5 Pages