Emotion-based Video Scene Retrieval using Interactive Genetic Algorithm

  • Hun-Woo Yoo (Institute of Cognitive Science, Yonsei University)
  • Sung-Bae Cho (Division of Computer and Industrial Engineering, Yonsei University)
  • Published: 2004.12.01

Abstract

This paper proposes an emotion-based, scene-level video retrieval method. First, abrupt and gradual shot boundaries are detected in scene video clips, each containing a specific storyline. Five features, 'average color histogram', 'average brightness', 'average edge histogram', 'average shot duration', and 'gradual shot change rate', are then extracted from each video, and a mapping between these features and the vague emotional space a user has in mind is realized through an interactive genetic algorithm (IGA). In the proposed retrieval algorithm, the user selects, from an initial population of videos, those that convey the emotion being sought; the feature vectors extracted from the selected videos are regarded as chromosomes, and a genetic crossover is applied to them. The newly generated chromosomes are then compared, by a similarity function, against the feature vectors indexing the database videos, and the most similar videos are retrieved and presented as the next generation. Iterating this procedure over several generations yields a population of videos that convey the emotion the user has in mind. To show the effectiveness of the proposed method, videos conveying the emotions 'action', 'excitement', 'suspense', 'quietness', 'relaxation', and 'happiness' were retrieved from 300 commercial video clips, with an average user satisfaction of 70%.
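The retrieval loop described above (the user selects videos matching the target emotion, their feature vectors are treated as chromosomes, crossover is applied, and the database videos most similar to the offspring form the next generation) can be summarized in code. The sketch below is a minimal illustration in Python, not the authors' implementation: it assumes each video is already indexed by a 5-dimensional feature vector (scalar summaries of the five features listed above), uses single-point crossover, and substitutes Euclidean distance for the unspecified similarity function. All function names and parameters are hypothetical.

```python
import random
import numpy as np

def crossover(parents, n_offspring):
    """Single-point crossover over the feature vectors of user-selected videos."""
    offspring = []
    for _ in range(n_offspring):
        if len(parents) >= 2:
            a, b = random.sample(parents, 2)
        else:
            a = b = parents[0]          # only one selection: reuse it
        point = random.randint(1, len(a) - 1)
        offspring.append(np.concatenate([a[:point], b[point:]]))
    return offspring

def retrieve_similar(chromosome, database, k=1):
    """Indices of the k database videos closest to the chromosome.
    Euclidean distance is an assumption; the paper only states 'a similarity function'."""
    dists = np.linalg.norm(database - chromosome, axis=1)
    return np.argsort(dists)[:k]

def iga_retrieval(database, user_selects, generations=5, population_size=12):
    """database: (n_videos, 5) array of per-video features.
    user_selects: callback taking a list of candidate video indices and returning
    the subset the user feels conveys the target emotion (the interactive step)."""
    rng = np.random.default_rng(0)
    population = list(rng.choice(len(database), size=population_size, replace=False))
    for _ in range(generations):
        chosen = user_selects(population)
        if not chosen:
            break
        parents = [database[i] for i in chosen]
        children = crossover(parents, population_size)
        # Next generation: for each offspring chromosome, the most similar
        # database video (duplicates are not handled in this sketch).
        population = [int(retrieve_similar(c, database, k=1)[0]) for c in children]
    return population

if __name__ == "__main__":
    db = np.random.rand(300, 5)                                      # 300 indexed videos (toy data)
    pick_bright = lambda idxs: [i for i in idxs if db[i][1] > 0.7]   # stand-in for the user
    print(iga_retrieval(db, pick_bright))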

Keywords

References

  1. M. Flickner et al., 'Query by image and video content: The QBIC system,' IEEE Computer, vol. 28, no. 9, pp. 23-32, 1995 https://doi.org/10.1109/2.410146
  2. A. Pentland, R.W. Picard, and S. Sclaroff, 'Photobook: Content-Based Manipulation of Image Databases,' International Journal of Computer Vision, vol. 18, no. 3, pp. 233-254, 1996 https://doi.org/10.1007/BF00123143
  3. J.R. Bach, C. Fuller, A. Gupta, A. Hampapur, B. Horowitz, R. Humphrey, R.C. Jain, and C. Shu, 'The Virage Image Search Engine: An Open Framework for Image Management,' In Proc. SPIE Vol. 2670: Storage and Retrieval for Images and Video Databases IV, pp. 76-86, 1996 https://doi.org/10.1117/12.234785
  4. J.R. Smith and S.-F. Chang, 'VisualSEEk: A Fully Automated Content-Based Image Query System,' in Proc. ACM Multimedia, pp. 87-98, 1996 https://doi.org/10.1145/244130.244151
  5. W.Y. Ma and B.S. Manjunath, 'Netra: A Toolbox for Navigating Large Image Databases,' Multimedia Systems, vol. 7, no. 3, pp. 184-198, 1999 https://doi.org/10.1007/s005300050121
  6. C. Carson, S. Belongie, H. Greenspan, and J. Malik, 'Blobworld: Image Segmentation Using Expectation-Maximization and Its Application to Image Querying,' IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 24, no. 8, pp. 1026-1038, 2002 https://doi.org/10.1109/TPAMI.2002.1023800
  7. H.-W. Yoo, D.-S. Jang, S.-H. Jung, J.-H. Park, and K.-S. Song, 'Visual Information Retrieval System via Content-Based Approach,' Pattern Recognition, vol. 35, no. 3, pp. 749-769, 2002 https://doi.org/10.1016/S0031-3203(01)00072-3
  8. H.-W. Yoo, S.-H. Jung, D.-S. Jang, and Y.-K. Na, 'Extraction of Major Object Features Using VQ Clustering for Content-Based Image Retrieval,' Pattern Recognition, vol. 35, no. 5, pp. 1115-1126, 2002 https://doi.org/10.1016/S0031-3203(01)00105-4
  9. B. T. Truong, C. Dorai, and S. Venkatesh, 'New Enhancements to Cut, Fade, and Dissolve Detection Processes in Video Segmentation,' in Proc. ACM Int. Conf. on Multimedia, pp.219-227, 2000 https://doi.org/10.1145/354384.354481
  10. U. Gargi, R. Kasturi, and S.H. Strayer, 'Performance Characterization of Video-Shot-Change Detection Methods,' IEEE Trans. on Circuits and Systems for Video Technology, vol. 10, no. 1, pp. 1-13, 2000 https://doi.org/10.1109/76.825852
  11. T. P. Minka and R. W. Picard, 'Interactive Learning Using a Society of Models,' Pattern Recognition, vol. 30, no.3, pp. 565-581, 1997 https://doi.org/10.1016/S0031-3203(96)00113-6
  12. A. Vailaya, A.K. Jain, and H.-J. Zhang, 'On Image Classification: City Images vs. Landscapes,' Pattern Recognition, vol. 31, no. 12, pp. 1921-1936, 1998 https://doi.org/10.1016/S0031-3203(98)00079-X
  13. A. Vailaya, M.A.T. Figueiredo, A.K. Jain, and H.-J. Zhang, 'Image Classification for Content-Based Indexing,' IEEE Trans. on Image Processing, vol. 10, no. 1, pp. 117-130, 2001 https://doi.org/10.1109/83.892448
  14. Y. Rui, T.S. Huang, M. Ortega, and S. Mehrotra, 'Relevance Feedback: A Power Tool in Interactive Content-Based Image Retrieval,' IEEE Trans. on Circuits and Systems for Video Technology, vol. 8, no. 5, pp. 644-655, 1998 https://doi.org/10.1109/76.718510
  15. I.J. Cox, M.L. Miller, T.P. Minka, T.V. Papathomas, and P.N. Yianilos, 'The Bayesian Image Retrieval System, PicHunter: Theory, Implementation, and Psychophysical Experiments,' IEEE Trans. on Image Processing, vol. 9, no. 1, pp. 20-37, 2000 https://doi.org/10.1109/83.817596
  16. T. Soen, T. Shimada, and M. Akita, 'Objective Evaluation of Color Design,' Color Research and Application, vol. 12, no. 4, pp. 184-194, 1987
  17. S.-B. Cho, 'Towards Creative Evolutionary Systems with Interactive Genetic Algorithm,' Applied Intelligence, vol. 16, no. 2, pp. 129-138, 2002 https://doi.org/10.1023/A:1013614519179
  18. H. Takagi, T. Noda, and S.-B. Cho, 'Psychological Space to Hold Impression among Media in Common for Media Database Retrieval System,' in Proc. IEEE Int. Conf. on Systems, Man, and Cybernetics, pp. 263-268, 1999
  19. J.-S. Um, K.-B. Eum, and J.-W. Lee, 'A Study of the Emotional Evaluation Models of Color Patterns Based on the Adaptive Fuzzy System and the Neural Network,' Color Research and Application, vol. 27, no. 3, pp. 208-216, 2002 https://doi.org/10.1002/col.10052
  20. C. Colombo, A. Del Bimbo, and P. Pala, 'Semantics in Visual Information Retrieval,' IEEE Multimedia, vol. 6, no. 3, pp. 38-53, 1999 https://doi.org/10.1109/93.790610
  21. C. Colombo, A. Del Bimbo, and P. Pala, 'Retrieval of Commercials by Semantic Content: The Semiotic Perspective,' Multimedia Tools and Applications, vol. 13, no. 1, pp. 93-118, 2001 https://doi.org/10.1023/A:1009681324605
  22. J. Itten, Art of Color (Kunst der Farbe), Otto Maier Verlag, Ravensburg, Germany, 1961 (in German)
  23. H.-W. Yoo and D.-S. Jang, 'Automated Video Segmentation Using Computer Vision Technique,' International Journal of Information Technology and Decision Making, vol. 2, no. 4, 2003 (To appear)
  24. D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, 1989
  25. J. A. Biles, 'GenJam: A Genetic Algorithm for Generating Jazz Solos,' in Proc. Int. Computer Music Conf., pp. 131-137, 1994
  26. C. Caldwell and V. S. Johnston, 'Tracking a Criminal Suspect through Face-Space with a Genetic Algorithm,' in Proc. Int. Conf. Genetic Algorithm, pp. 416-421, 1991
  27. W. Banzhaf, 'Interactive Evolution,' Handbook of Evolutionary Computation, 1997
  28. J.-Y. Lee and S.-B. Cho, 'Interactive Genetic Algorithm for Content-Based Image Retrieval,' in Proc. Asia Fuzzy Systems Symposium, pp. 479-484, 1998
  29. H. Takagi, 'Interactive Evolutionary Computation: Fusion of the Capabilities of EC Optimization and Human Evaluation,' Proc. of the IEEE, vol. 89, no. 9, pp. 1275-1296, 2001 https://doi.org/10.1109/5.949485