http://dx.doi.org/10.7776/ASK.2009.28.7.674

Retrieval of Player Event in Golf Videos Using Spoken Content Analysis  

Kim, Hyoung-Gook (Kwangwoon University)
Abstract
This paper proposes a method for retrieving player events in golf videos by combining two functions: detection of player names in the speech information and detection of sound events in the audio information. The system consists of an indexing module and a retrieval module. At indexing time, audio segmentation and noise reduction are applied to the audio stream demultiplexed from the golf videos. The noise-reduced speech is then fed into a speech recognizer, which outputs spoken descriptors. The player names and sound events are indexed by these spoken descriptors. At search time, a text query is converted into phoneme sequences. The lists for each query term are retrieved through a description matcher to identify full and partial phrase hits. For player name retrieval, this paper compares the results of the word-based, phoneme-based, and hybrid approaches.
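The search-time steps described in the abstract (query to phoneme sequences, then matching for full and partial phrase hits) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the pronunciation lexicon `LEXICON`, the ARPAbet-style phoneme symbols, the function names, and the exact-substring matching rule are all assumptions made for illustration. Here a "full" hit means every query term was found in a segment's phoneme stream and a "partial" hit means only some terms were found; term order and adjacency are not checked.

```python
from typing import Dict, List, Tuple

# Hypothetical pronunciation lexicon (assumption, not taken from the paper).
LEXICON: Dict[str, List[str]] = {
    "tiger": ["T", "AY", "G", "ER"],
    "woods": ["W", "UH", "D", "Z"],
    "phil": ["F", "IH", "L"],
}


def query_to_phonemes(query: str) -> List[List[str]]:
    """Convert a text query into one phoneme sequence per query term.

    Raises KeyError for words missing from the illustrative lexicon.
    """
    return [LEXICON[word] for word in query.lower().split()]


def contains(stream: List[str], pattern: List[str]) -> bool:
    """Return True if `pattern` occurs as a contiguous run in `stream`."""
    n = len(pattern)
    return any(stream[i:i + n] == pattern for i in range(len(stream) - n + 1))


def match_query(query: str, segments: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Label each indexed segment as a 'full' or 'partial' hit for the query.

    `segments` maps a segment id to the phoneme stream that a recognizer
    is assumed to have produced for that segment at indexing time.
    """
    terms = query_to_phonemes(query)
    if not terms:
        return []
    hits: List[Tuple[str, str]] = []
    for segment_id, stream in segments.items():
        found = sum(contains(stream, term) for term in terms)
        if found == len(terms):
            hits.append((segment_id, "full"))
        elif found > 0:
            hits.append((segment_id, "partial"))
    return hits
```

A real phoneme-based matcher would also need to tolerate recognition errors, for example through n-gram or confusion-weighted matching. Exact substring search is used here only to keep the sketch short.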
Keywords
Phoneme-based retrieval; Word-based retrieval; Hybrid retrieval; Speech recognition; Player event retrieval; Player name retrieval; Spoken content analysis