A Speech-Query-Based Digital Photo Retrieval Method

A Query-by-Speech Scheme for Photo Albuming

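  • 김태성 (Speech Recognition Technology Laboratory, Information and Communications University) ;
  • 서영주 (Speech Recognition Technology Laboratory, Information and Communications University) ;
  • 이용주 (Division of Electrical, Electronic and Information Engineering, Wonkwang University) ;
  • 김회린 (Speech Recognition Technology Laboratory, Information and Communications University)
  • Published: 2006.03.01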

Abstract

In this paper, we introduce two methods for retrieving photos that are annotated with speech documents. We compare the pattern of a spoken query with those of the speech documents recorded on digital cameras, measure their similarity, and retrieve the photos whose speech documents have high similarity scores. The first approach uses a phoneme recognition scheme as a pre-processor for pattern matching; the second applies vector quantization (VQ) and dynamic time warping (DTW) to match the spoken query against the documents directly in the signal domain. Experimental results show that the performance of the first approach depends heavily on the accuracy of phoneme recognition, although its processing time is short. The second method provides a large improvement in performance. Its processing time is longer than that of the first method because of DTW, but it can be reduced by using approximation methods.
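The second approach can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each utterance is already available as a sequence of fixed-length feature vectors and that a VQ codebook has been trained beforehand. The names `quantize`, `dtw_distance`, and `rank_photos`, the Euclidean local distance, and the length normalization are all hypothetical choices made for illustration.

```python
import math


def quantize(frames, codebook):
    """Replace each frame with its nearest codeword (hypothetical VQ step)."""
    return [min(codebook, key=lambda c: math.dist(f, c)) for f in frames]


def dtw_distance(query, doc):
    """Length-normalized DTW distance between two sequences of vectors."""
    n, m = len(query), len(doc)
    # cost[i][j]: minimal accumulated distance aligning query[:i] with doc[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(query[i - 1], doc[j - 1])  # assumed local distance
            cost[i][j] = d + min(cost[i - 1][j],      # stretch the document frame
                                 cost[i][j - 1],      # stretch the query frame
                                 cost[i - 1][j - 1])  # advance both sequences
    return cost[n][m] / (n + m)


def rank_photos(query, documents):
    """Return photo ids ordered from most to least similar to the query."""
    scores = {pid: dtw_distance(query, feats) for pid, feats in documents.items()}
    return sorted(scores, key=scores.get)
```

In this sketch a lower distance means a higher similarity, so the first ids returned by `rank_photos` correspond to the photos that would be retrieved. The quadratic cost of filling the table is what makes DTW slower than the phoneme-based matching.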

Keywords