Character Detection and Recognition of Steel Materials in Construction Drawings using YOLOv4-based Small Object Detection Techniques

  • Sim, Ji-Woo (Department of Electronics Engineering, Tech University of KOREA) ;
  • Woo, Hee-Jo (Department of Electronics Engineering, Tech University of KOREA) ;
  • Kim, Yoonhwan (WooSung Steel Inc.) ;
  • Kim, Eung-Tae (Department of Electronics Engineering, Tech University of KOREA)
  • Received : 2022.04.07
  • Reviewed : 2022.05.10
  • Published : 2022.05.30

Abstract

Deep learning-based object detection and recognition have advanced rapidly in recent years, and their range of industrial and everyday applications keeps widening. Adoption in the construction sector, however, remains limited. Material quantities are still taken off from construction drawings by hand, which is slow and error-prone and can lead to transactions based on incorrect volume estimates. A fast and accurate automatic drawing recognition system is needed to solve this problem. We therefore propose an AI-based automatic drawing recognition and quantity estimation system that detects and recognizes steel materials in construction drawings. To improve small-object detection, we build on the fast YOLOv4 detector and apply a copy-based data augmentation technique together with a spatial attention module. Text recognition is then run on each detected steel material region, and the quantities of steel materials are tallied from the recognized characters. In experiments, the proposed method improved accuracy and precision by 1.8% and 16%, respectively, over the conventional YOLOv4. It achieved a Precision of 0.938, a Recall of 1, and an AP0.5 of 99.4%; AP0.5:0.95 is given as 67% in the English abstract and as 68.8% in the Korean abstract. For character recognition, training on a dataset built with the fonts used in construction drawings raised accuracy to 99.9%, compared with 75.6% with the existing dataset. The average processing time per image was 0.013 seconds for detection, 0.65 seconds for character recognition, and 0.16 seconds for quantity estimation, for a reported total of 0.84 seconds.
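The abstract names a copy-based data augmentation for small objects but does not spell out the procedure. As a rough illustration only, the sketch below pastes copies of each labeled region at random locations that do not overlap any existing label, which is one common way to increase the number of small-object instances per training image. The function name `copy_paste_small_objects`, its parameters, and the box format are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def copy_paste_small_objects(image, boxes, copies=2, max_tries=50, rng=None):
    """Paste copies of each labeled box at random non-overlapping positions.

    Hypothetical helper for illustration; not the paper's implementation.
    image: H x W or H x W x C array.
    boxes: list of (x1, y1, x2, y2) pixel boxes with exclusive x2, y2.
    Returns the augmented image and the extended list of boxes.
    """
    if rng is None:
        rng = np.random.default_rng()
    out = image.copy()
    all_boxes = list(boxes)
    height, width = image.shape[:2]

    def overlaps(a, b):
        # Axis-aligned boxes overlap unless one lies entirely beside the other.
        return not (a[2] <= b[0] or b[2] <= a[0] or a[3] <= b[1] or b[3] <= a[1])

    for x1, y1, x2, y2 in boxes:
        patch = image[y1:y2, x1:x2]  # pixels of the original labeled region
        bw, bh = x2 - x1, y2 - y1
        for _ in range(copies):
            # Retry a bounded number of times, then give up on this copy.
            for _ in range(max_tries):
                nx = int(rng.integers(0, width - bw + 1))
                ny = int(rng.integers(0, height - bh + 1))
                candidate = (nx, ny, nx + bw, ny + bh)
                if not any(overlaps(candidate, b) for b in all_boxes):
                    out[ny:ny + bh, nx:nx + bw] = patch
                    all_boxes.append(candidate)
                    break
    return out, all_boxes


# Toy usage: a blank white "drawing" with one small dark label.
drawing = np.full((64, 64), 255, dtype=np.uint8)
drawing[10:14, 10:22] = 0
augmented, labels = copy_paste_small_objects(
    drawing, [(10, 10, 22, 14)], copies=3, rng=np.random.default_rng(0)
)
```

Rejecting overlapping placements keeps every pasted instance fully visible and keeps the labels consistent with the pixels. A real pipeline would likely also restrict pasting to empty areas of the drawing, which this sketch does not attempt.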

Acknowledgement

This work was supported by the Technology Development Program (S3025098) funded by the Ministry of SMEs and Startups (MSS, Korea).
