http://dx.doi.org/10.7746/jkros.2019.14.2.139

Synthesizing Image and Automated Annotation Tool for CNN based Under Water Object Detection  

Jeon, MyungHwan (KAIST)
Lee, Yeongjun (Korea Research Institute of Ships and Ocean Engineering (KRISO))
Shin, Young-Sik (Dept. of Civil and Environmental Engineering, KAIST)
Jang, Hyesu (Dept. of Civil and Environmental Engineering, KAIST)
Yeu, Taekyeong (Korea Research Institute of Ships and Ocean Engineering (KRISO))
Kim, Ayoung (Dept. of Civil and Environmental Engineering, KAIST)
Publication Information
The Journal of Korea Robotics Society, vol. 14, no. 2, 2019, pp. 139-149
Abstract
In this paper, we present an automated annotation tool and a synthetic dataset built from 3D CAD models for deep-learning-based object detection. Training data for deep-learning methods requires class, segmentation, bounding-box, contour, and pose annotations for each object. We propose an automated annotation tool together with a synthetic image generation pipeline. The resulting synthetic dataset reflects occlusion between objects and is applicable to both underwater and in-air environments. To validate the synthetic dataset, we use Mask R-CNN, a state-of-the-art deep-learning object detection model. For the experiments, we build an environment that reflects actual underwater conditions. We show that an object detection model trained on our dataset yields significantly accurate results and is robust in the underwater environment. Lastly, we verify that our synthetic dataset is suitable for training deep-learning models for underwater environments.
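The abstract does not include code; as a minimal sketch of how an automated annotation tool can derive labels from a rendered object mask (the function name and COCO-style output fields below are illustrative, not from the paper), one step might look like:

```python
import numpy as np

def mask_to_annotation(mask, class_id):
    """Derive bounding-box and area annotations from a binary object mask,
    as an automated tool might after rendering a 3D CAD model onto a
    background image.

    mask:     2-D boolean array, True marks object pixels.
    class_id: integer class label of the rendered model.
    Returns a COCO-style dict, or None if the object is fully occluded.
    """
    ys, xs = np.nonzero(mask)          # pixel coordinates of the object
    if len(xs) == 0:
        return None                    # object fully occluded in this render
    x0, y0 = xs.min(), ys.min()
    x1, y1 = xs.max(), ys.max()
    return {
        "category_id": class_id,
        "bbox": [int(x0), int(y0),     # [x, y, w, h] convention
                 int(x1 - x0 + 1), int(y1 - y0 + 1)],
        "area": int(mask.sum()),       # pixel count, useful for filtering
    }

# Example: a synthetic 10x10 render with a 3x4 object patch
mask = np.zeros((10, 10), dtype=bool)
mask[2:5, 3:7] = True
ann = mask_to_annotation(mask, class_id=1)
```

Because the mask comes from rendering rather than human labeling, every annotation type (segmentation, bounding box, contour, pose) can be produced for free per image; occlusion is handled by rasterizing objects in depth order before extracting each mask.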
Keywords
Deep Learning; Data Annotation; Object Detection; 3D CAD Model;
Citations & Related Records
Times Cited By KSCI: 2
1 R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation," 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 2014, DOI: 10.1109/CVPR.2014.81.
2 J. R. R. Uijlings, K. E. A. van de Sande, T. Gevers, and A. W. M. Smeulders, "Selective Search for Object Recognition," International Journal of Computer Vision, vol. 104, no. 2, pp. 154-171, 2013.   DOI
3 J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, DOI: 10.1109/CVPR.2016.91.
4 K. He, G. Gkioxari, P. Dollar, and R. Girshick, "Mask R-CNN," 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, pp. 2980-2988, 2017.
5 R. Girshick, "Fast R-CNN," 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, pp. 1440-1448, 2015.
6 X. Peng, B. Sun, K. Ali, and K. Saenko, "Exploring Invariances in Deep Convolutional Neural Networks using Synthetic Images," arXiv:1805.12177v2, 2018.
7 H. Su, C. R. Qi, Y. Lim, and L. J. Guibas, "Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views," 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, pp. 2686-2694, 2015.
8 Trimble Inc, 3D Warehouse, [Online], https://3dwarehouse.sketchup.com, Accessed: March 19, 2019.
9 M. Johnson-Roberson, C. Barto, R. Mehta, S. N. Sridhar, K. Rosaen, and R. Vasudevan, "Driving in the Matrix: Can virtual worlds replace human-generated annotations for real world tasks?," 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, Singapore, 2017, DOI: 10.1109/ICRA.2017.7989092.
10 H. Hattori, N. Lee, V. N. Boddeti, F. Beainy, K. M. Kitani, and T. Kanade, "Synthesizing a Scene-Specific Pedestrian Detector and Pose Estimator for Static Video Surveillance," International Journal of Computer Vision, vol. 126, no. 9, pp. 1027-1044, Sept., 2018.   DOI
11 P. P. Busto and J. Gall, "Viewpoint refinement and estimation with adapted synthetic data," Computer Vision and Image Understanding, vol. 169, pp. 75-89, Apr., 2018.   DOI
12 Y. Wang, X. Tan, Y. Yang, X. Liu, E. Ding, F. Zhou, and L. S. Davis, "3D Pose Estimation for Fine-Grained Object Categories," European Conference on Computer Vision, pp. 619-632, 2018.
13 Stichting Blender Foundation, Blender, [Online], http://www.blender.org, Accessed: March 19, 2019.
14 J. Xiao, J. Hays, K. A. Ehinger, A. Oliva, and A. Torralba, "SUN database: Large-scale scene recognition from abbey to zoo," 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, pp. 3485-3492, 2010.
15 M. Prats, J. Perez, J. J. Fernandez, and P. J. Sanz, "An open source tool for simulation and supervision of underwater intervention missions," 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vilamoura, Portugal, pp. 2577-2582, 2012.
16 Y. Cho and A. Kim, "Channel invariant online visibility enhancement for visual SLAM in a turbid environment," Journal of Field Robotics, vol. 35, no. 7, pp. 1080-1100, 2018.   DOI
17 Y.-S. Shin, Y.-J. Lee, H.-T. Choi, and A. Kim, "Bundle Adjustment and 3D Reconstruction Method for Underwater Sonar Image," Journal of Korea Robotics Society, vol. 11, no. 2, pp. 051-059, Jun., 2016.   DOI
18 SeaDrone Inc, SeaDrone, [Online], https://seadronepro.com, Accessed: March 19, 2019.
19 Y. Lee, J. Choi, and H.-T. Choi, "Underwater Robot Localization by Probability-based Object Recognition Framework Using Sonar Image," Journal of Korea Robotics Society, vol. 9, no. 4, pp. 232-241, Nov., 2014.   DOI