Approaches to Reduce the Effects of OOV Queries on Indexed Spoken Audio
Logan, Beth; Moreno, Pedro; Van Thong, JM
HPL-2003-46
Keyword(s): spoken document retrieval; speech indexing; out-of-vocabulary words; OOV words
Abstract: We present several novel approaches to the out-of-vocabulary (OOV) query problem for spoken audio: indexing based on syllable-like units called particles, and query expansion according to acoustic confusability for a word index. We also examine linear and OOV-based combination of indexing schemes. We experiment on 75 hours of broadcast news, comparing our approaches to a word index, a phoneme index, and a phoneme index queried with phoneme sequences. Our results show that our approaches are superior to both a word index and a phoneme index for OOV words, and perform comparably to the phoneme-sequence scheme. The particle system performs worse than the acoustic query expansion scheme. The best system uses word queries for in-vocabulary words and a linear combination of the phoneme-sequence scheme and acoustic query expansion for OOV words. This system improved the average precision from 0.35 for a word index to 0.40.
Notes: Portions of this work were based on papers published in the Human Language Technology Conference, 24-27 March 2002, San Diego, CA, and in the International Conference on Spoken Language Processing, September 2002, Denver, Colorado.
17 Pages