Content-Based Image Retrieval using RBF Neural Network

RBF 신경망을 이용한 내용 기반 영상 검색

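  • 이형구 (Researcher, Image Information Processing Research Team, Electronics and Telecommunications Research Institute)
  • 유석인 (School of Computer Science and Engineering, Seoul National University)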
  • Published : 2002.04.01

Abstract

In content-based image retrieval (CBIR), most conventional approaches assume a linear relationship between different features and require users to assign appropriate weights to each feature themselves. However, this assumed linear relationship is too restrictive to represent high-level concepts and the intricacies of human perception accurately. This paper proposes a neural network-based image retrieval (NNIR) model, developed from a human-computer interaction approach to CBIR using a radial basis function network (RBFN). By using the RBFN, the approach determines the nonlinear relationship between features, and it lets the user select an initial query image and incrementally search for the target images via relevance feedback, which supports more accurate similarity comparison between images. Recall and precision were measured in an experiment on a database of 1,015 images in 145 classes. The results show that the recall and precision of the proposed approach were 93.45% and 80.61%, respectively, which is superior to existing approaches such as the linear combination approach, the rank-based method, and the backpropagation algorithm-based method.
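To make the idea of learning a nonlinear relationship between features from relevance feedback concrete, here is a minimal sketch. It is not the NNIR architecture from the paper, whose details the abstract does not give. It assumes each image is described by two hypothetical normalized features, that every image the user has marked becomes one Gaussian RBF center, and that the output weights are fitted by ridge-regularized least squares. The names `gaussian_design`, `fit_output_weights`, `rank_database`, and `WIDTH`, and all data values, are illustrative. The toy feedback is XOR-like, so no single weighted sum of the two features can score both relevant corners above both irrelevant ones.

```python
import numpy as np

def gaussian_design(X, centers, width):
    """Gaussian basis activations: one row per image, one column per center."""
    # Squared Euclidean distance between every image and every center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_output_weights(feedback_x, feedback_y, width, ridge=1e-6):
    """Fit output-layer weights by ridge-regularized least squares.

    Assumption: each feedback image acts as one RBF center.
    """
    phi = gaussian_design(feedback_x, feedback_x, width)
    a = phi.T @ phi + ridge * np.eye(phi.shape[1])
    return np.linalg.solve(a, phi.T @ feedback_y)

def rank_database(db, feedback_x, weights, width):
    """Score every database image and return indices sorted best-first."""
    scores = gaussian_design(db, feedback_x, width) @ weights
    return np.argsort(-scores), scores

# Hypothetical feedback: 1.0 = marked relevant, 0.0 = marked irrelevant.
feedback_x = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
feedback_y = np.array([1.0, 1.0, 0.0, 0.0])

# Hypothetical database of five images, two features each.
db = np.array([[0.10, 0.10], [0.90, 0.05], [0.95, 0.90],
               [0.05, 0.95], [0.50, 0.50]])

WIDTH = 0.5  # assumed Gaussian width, chosen by hand for this toy data
weights = fit_output_weights(feedback_x, feedback_y, WIDTH)
order, scores = rank_database(db, feedback_x, weights, WIDTH)
print(order)
print(np.round(scores, 3))
```

In an interactive session, the images the user marks in each round would be appended to `feedback_x` and `feedback_y` and the weights refitted before the next ranking.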

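In content-based image retrieval, most existing methods assume a linear relationship between different features and have the user set the weight of each feature directly. However, under the assumption that the relationship between features is linear, high-level concepts and the subjectivity of human perception cannot be adequately represented. This paper proposes a neural network-based image retrieval model. It is built on a content-based image retrieval technique using an RBFN together with a human-computer interaction approach. The RBFN can extract the nonlinear relationship between features, and because the user first selects a query image and then gradually works toward the target images through relevance feedback, images can be compared more accurately. In the experiment, recall and precision were computed using a database of 1,015 images divided into 145 classes. The experimental results show that the recall and precision of the proposed method are 93.45% and 80.61%, respectively, giving better retrieval performance than the existing linear combination method, the rank-based method, and the backpropagation algorithm-based method.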
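The recall and precision figures above presumably follow the standard retrieval definitions: precision is the fraction of retrieved images that are relevant, and recall is the fraction of relevant images that are retrieved. The sketch below computes both for one query. The function name `precision_recall` and the image IDs are hypothetical, and the paper's exact evaluation protocol (for example, how many images are retrieved per query) is not stated in the abstract.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: image IDs returned by the system.
    relevant:  image IDs that belong to the query's class.
    """
    retrieved = list(retrieved)
    relevant = set(relevant)
    hits = sum(1 for image_id in retrieved if image_id in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical query: 5 images retrieved, 6 images in the relevant class.
retrieved_ids = [3, 8, 1, 42, 5]
relevant_ids = {1, 3, 5, 8, 9, 11}
precision, recall = precision_recall(retrieved_ids, relevant_ids)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Averaging these per-query values over all queries would give database-level figures of the kind reported in the abstract.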
