Efficient Object Recognition by Masking Semantic Pixel Difference Region of Vision Snapshot for Lightweight Embedded Systems

(Korean title: An Efficient Image Object Recognition Technique Using Semantic Pixel Segmentation Masking on Lightweight Embedded Systems)

  • Yun, Heuijee (School of Electronic Engineering, Kyungpook National University)
  • Park, Daejin (School of Electronic Engineering, Kyungpook National University)
  • Received : 2022.03.15
  • Accepted : 2022.05.14
  • Published : 2022.06.30

Abstract

AI-based image processing has been widely studied in many fields. However, the lighter the embedded board, the harder it is to run image processing algorithms on it, because they require a large amount of computation. In this paper, we propose a deep-learning-based method for object recognition on lightweight embedded boards. A deep neural network that performs semantic segmentation with a relatively small amount of computation first determines the area of interest. After masking that area, a more accurate deep learning algorithm performs object detection. Combining the efficient neural network (ENet) with You Only Look Once (YOLO) in this way improves detection accuracy and moves toward real-time object recognition on lightweight embedded boards. This research is expected to be useful for autonomous driving applications, which need to be much lighter and cheaper than existing object recognition approaches.

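Advances in camera-based image processing and the accompanying artificial intelligence technology have driven progress in many fields. However, the lighter the board, the harder it is to implement image processing algorithms that require heavy computation. This paper proposes a method that uses deep learning for object recognition on lightweight embedded boards. A deep learning algorithm that performs segmentation with a relatively small amount of computation determines the ROI (Region of Interest). After the region is masked, a more accurate deep learning algorithm performs object detection. Input images were processed in Python with OpenCV, and ENet and YOLO (You Only Look Once) were used to process the images. Running this algorithm halves the average error, which enables accurate object detection, and allows object recognition to run in real time on lightweight embedded boards. This research is expected to be applicable to low-cost, lightweight applications in autonomous driving and IoT.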
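The masking step that sits between segmentation and detection can be illustrated with a minimal sketch. It is not the authors' implementation: it uses NumPy only, assumes the segmentation network has already produced a per-pixel class map, and the function names `mask_region_of_interest` and `roi_bounding_box`, the class id `1`, and the toy frame are all hypothetical.

```python
import numpy as np


def mask_region_of_interest(image, class_map, keep_classes):
    """Zero out every pixel whose segmentation class is not in keep_classes.

    image:        H x W x 3 uint8 array (e.g. a frame read with OpenCV).
    class_map:    H x W integer array of per-pixel class ids, such as the
                  argmax of a segmentation network's output (assumed input).
    keep_classes: iterable of class ids that define the region of interest.
    """
    roi = np.isin(class_map, list(keep_classes))  # boolean H x W mask
    masked = image.copy()                         # leave the input untouched
    masked[~roi] = 0                              # black out non-ROI pixels
    return masked, roi


def roi_bounding_box(roi):
    """Return (x_min, y_min, x_max, y_max) of the True pixels, or None."""
    ys, xs = np.nonzero(roi)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


# Toy 4 x 6 frame: hypothetical class 1 fills a 2 x 3 block of pixels.
frame = np.full((4, 6, 3), 200, dtype=np.uint8)
classes = np.zeros((4, 6), dtype=np.int64)
classes[1:3, 2:5] = 1

masked_frame, roi_mask = mask_region_of_interest(frame, classes, keep_classes={1})
box = roi_bounding_box(roi_mask)
```

In a full pipeline, `masked_frame` (or the crop given by `box`) would then be passed to the object detector, so that the heavier network only sees the pixels the lighter segmentation stage selected.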

Acknowledgement

This study was supported by the BK21 FOUR project funded by the Ministry of Education, Korea (4199990113966, 10%), and the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1A6A1A03025109, 10%). This work was partly supported by an Institute of Information and communications Technology Planning and Evaluation (IITP) grant funded by the Korean government (MSIT) (No. 2021-0-00944, Metamorphic approach of unstructured validation/verification for analyzing binary code, 40%) and (No. 2022-0-00816, OpenAPI-based hw/sw platform for edge devices and cloud server, integrated with the on-demand code streaming engine powered by AI, 20%) and (No. 2022-0-01170, PIM Semiconductor Design Research Center, 20%).
