SIFT Image Feature Extraction based on Deep Learning

  • Lee, Jae-Eun (Department of Electronic Materials Engineering, Kwangwoon University) ;
  • Moon, Won-Jun (Department of Electronic Materials Engineering, Kwangwoon University) ;
  • Seo, Young-Ho (Department of Electronic Materials Engineering, Kwangwoon University) ;
  • Kim, Dong-Wook (Department of Electronic Materials Engineering, Kwangwoon University)
  • Received : 2019.01.08
  • Accepted : 2019.03.08
  • Published : 2019.03.30

Abstract

In this paper, we propose a deep neural network that extracts SIFT feature points by determining whether the center pixel of a cropped image patch is a SIFT feature point. The dataset for this network is constructed by cropping the DIV2K dataset into 33×33 patches and, unlike SIFT, which operates on grayscale images, uses RGB images. The ground truth consists of RobHess SIFT features extracted with the octave (scale) set to 0, sigma to 1.6, and the number of intervals to 3. Based on VGG-16, we construct increasingly deep networks with 13, 23, and 33 convolution layers, and experiment with different methods of increasing the image scale. The results obtained with the sigmoid function as the activation function of the output layer are compared with those obtained with the softmax function. Experimental results show that the proposed network not only achieves an extraction accuracy of more than 99% but also exhibits high extraction repeatability for distorted images.
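The patch-labeling scheme described in the abstract (a binary decision on whether the center pixel of each 33×33 patch is a SIFT keypoint) can be sketched as follows. This is a minimal illustration, not the paper's implementation: `make_patch_dataset` and its arguments are hypothetical names, and the keypoint coordinates are assumed to come from a RobHess-style SIFT extractor run with octave 0, sigma 1.6, and 3 intervals, as the paper describes.

```python
import numpy as np

PATCH = 33          # patch size used in the paper
HALF = PATCH // 2   # offset from patch edge to its center pixel (16)

def make_patch_dataset(image, keypoints):
    """Crop 33x33 RGB patches from `image` (H x W x 3) and label each
    patch 1 if its center pixel is a SIFT keypoint, else 0.

    `keypoints` is a set of (row, col) coordinates, assumed to come
    from a RobHess-style SIFT extractor (octave=0, sigma=1.6,
    intervals=3). Only pixels far enough from the border to admit a
    full 33x33 patch are used as centers.
    """
    h, w, _ = image.shape
    patches, labels = [], []
    for r in range(HALF, h - HALF):
        for c in range(HALF, w - HALF):
            patches.append(image[r - HALF:r + HALF + 1,
                                 c - HALF:c + HALF + 1])
            labels.append(1 if (r, c) in keypoints else 0)
    return np.stack(patches), np.array(labels)
```

A network along the lines of the paper would then consume these patches as a binary classifier, with either a single sigmoid output or a two-way softmax output, which is the comparison the abstract reports.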

Fig. 1. The process to form a SIFT DoG pyramid

Fig. 2. SIFT extrema extraction process

Fig. 3. SIFT features extracted with the octave set to 0

Fig. 4. Example images with varying brightness levels; (a) +25, (b) +50, (c) +75, (d) +100

Fig. 5. Example images with varying blur levels; (a) original, (b) radius 0.5, (c) radius 1.0, (d) radius 1.5, (e) radius 2.0, (f) radius 2.5

Fig. 6. Results of the feature-point repeatability measure for the distorted images; (a) change in brightness, (b) change in blur

Table 1. Configurations of the proposed DNNs

Table 2. Experimental results for the proposed DNN

References

  1. C. Harris, M. Stephens, "A combined corner and edge detector," Proceedings of the Alvey Vision Conference, pp.147-151, 1988.
  2. K. Mikolajczyk, C. Schmid, "Indexing based on scale invariant interest points," ICCV, Vol.1, pp. 525-531, 2001.
3. J. Shi, C. Tomasi, "Good features to track," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.593-600, 1994.
  4. D. G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," International Journal of Computer Vision, Vol.60, No.2, pp.91-110, 2004. https://doi.org/10.1023/B:VISI.0000029664.99615.94
  5. H. Bay, T. Tuytelaars, and L. Van Gool, "Surf: Speeded up robust features," In European Conference on Computer Vision, Vol.1, No.2, May 2006.
  6. E. Rosten, T. Drummond, "Machine learning for high-speed corner detection," Proc. 9th European Conference on Computer Vision (ECCV'06), May 2006.
  7. E. Mair, G. Hager, D. Burschka, M. Suppa, and G. Hirzinger, "Adaptive and generic corner detection based on the accelerated segment test," Computer Vision-ECCV 2010, Vol.2, No.2, pp.183-196, 2010.
8. W.-J. Moon, Y.-H. Seo, and D.-W. Kim, "Parameter Analysis for Time Reduction in Extracting SIFT Keypoints in the Aspect of Image Stitching," Journal of Broadcast Engineering, Vol.23, No.4, pp.559-573, July 2018. https://doi.org/10.5909/JBE.2018.23.4.559
  9. E. Rublee, V. Rabaud, K. Konolige, and G. Bradski, "ORB: an efficient alternative to SIFT or SURF," In Proc. of the IEEE Intl. Conf. on Computer Vision (ICCV), Vol.13, 2011.
  10. E. Agustsson, R. Timofte, "NTIRE 2017 Challenge on Single Image Super-Resolution: Dataset and Study," In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2017.
  11. R. Hess, "An Open-Source SIFT Library," ACM Multimedia, pp.1493-1496, 2010.
  12. K. Simonyan, A. Zisserman, "Very deep convolutional networks for large-scale image recognition," In Proc. International Conference on Learning Representations (ICLR), 2015.
  13. K. Mikolajczyk, C. Schmid, "Scale and affine invariant interest point detectors," IJCV, Vol.1, No.60, pp.63-86, 2004. https://doi.org/10.1023/B:VISI.0000027790.02288.f2