http://dx.doi.org/10.3837/tiis.2021.01.011

Self-Supervised Rigid Registration for Small Images

Ma, Ruoxin (School of Electronics and Information Engineering, Tongji University)
Zhao, Shengjie (School of Electronics and Information Engineering, Tongji University)
Cheng, Samuel (School of Electrical and Computer Engineering, University of Oklahoma)

Publication Information

KSII Transactions on Internet and Information Systems (TIIS) / v.15, no.1, 2021, pp. 180-194

Abstract

For small image registration, feature-based approaches are likely to fail because feature detectors cannot find enough feature points in low-resolution images. The classic FFT approach is accurate, but registration can be relatively slow, taking several seconds per image pair. To achieve real-time and high-precision rigid registration for small images, we apply deep neural networks for supervised rigid transformation prediction, which directly predicts the transformation parameters. We train deep registration models with rigidly transformed CIFAR-10 and STL-10 images, and evaluate their generalization ability on transformed CIFAR-10 images, STL-10 images, and randomly generated images. Experimental results show that the proposed deep registration models achieve accuracy comparable to the classic FFT approach for small CIFAR-10 images (32×32), and our LSTM registration model takes less than 1 ms to register one pair of images. For moderate-size STL-10 images (96×96), FFT significantly outperforms the deep registration models in accuracy but is also considerably slower. Our results suggest that deep registration models have competitive advantages over conventional approaches, at least for small images.
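
As a rough illustration of two ideas in the abstract, the sketch below builds a training-style image pair whose label is known because the transformation is applied synthetically, and then recovers that label with an FFT-based baseline. It is a minimal sketch under simplifying assumptions, not the authors' implementation: it handles integer translation only (no rotation), uses a circular shift via `np.roll`, and the function names `make_pair` and `phase_correlation` are hypothetical.

```python
import numpy as np


def make_pair(image, max_shift, rng):
    """Return (moved, (dy, dx)): a circularly shifted copy plus its known shift.

    The shift is sampled at random, so the label comes for free
    (assumption: translation only, integer pixels, wrap-around borders).
    """
    dy, dx = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    moved = np.roll(image, shift=(dy, dx), axis=(0, 1))
    return moved, (dy, dx)


def phase_correlation(fixed, moved):
    """Estimate the integer (dy, dx) such that moved == roll(fixed, (dy, dx))."""
    f_fixed = np.fft.fft2(fixed)
    f_moved = np.fft.fft2(moved)
    # Normalized cross-power spectrum: keeps only the phase difference.
    cross = f_moved * np.conj(f_fixed)
    cross /= np.maximum(np.abs(cross), 1e-12)
    # Its inverse transform peaks at the displacement (modulo the image size).
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Map peak indices in the upper half of each axis to negative shifts.
    dy, dx = (int(p) if p <= s // 2 else int(p) - s
              for p, s in zip(peak, corr.shape))
    return dy, dx


rng = np.random.default_rng(0)
image = rng.random((32, 32))            # stand-in for a 32x32 grayscale image
moved, true_shift = make_pair(image, max_shift=8, rng=rng)
estimated_shift = phase_correlation(image, moved)
```

In this setup `estimated_shift` should match `true_shift`, because a circular shift changes only the phase of the spectrum. A learned registration model would instead be trained to regress the sampled parameters directly from the image pair.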

Keywords

Rigid Registration; Self-Supervised Learning; Small Image; LSTM; Homography Estimation
