Recognition of Bill Form using Feature Pyramid Network

  • Kim, Dae-Jin (Institute for Image & Cultural Contents, Dongguk University) ;
  • Hwang, Chi-Gon (Dept. of Computer Engineering, IIT, Kwangwoon University) ;
  • Yoon, Chang-Pyo (Dept. of Computer & Mobile Convergence, GyeongGi University of Science and Technology)
  • Received : 2021.01.29
  • Reviewed : 2021.03.08
  • Published : 2021.04.30

Abstract

In the era of the Fourth Industrial Revolution, technological change is being applied across many fields, and bill handling is likewise moving toward automation, digitization, and data management. Bills circulating in society come in more than tens of thousands of forms, so bill recognition is essential for automating, digitizing, and managing them. OCR (Optical Character Recognition) is currently used to manage these varied bills, and OCR achieves a higher recognition rate when the form of the bill is recognized first. In this paper, the logo, which can serve as an index for distinguishing bill forms, is detected as an object. Because the logo is small relative to the whole bill, FPN (Feature Pyramid Network), a deep learning technique, is used for small object detection. As a result, the proposed algorithm reduced resource waste and increased OCR recognition accuracy.
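The reason an FPN helps with small logos can be illustrated with its top-down merge step: a coarse, semantically strong feature map is upsampled by a factor of 2 and added to a finer lateral map, so the high-resolution level also carries strong features. The sketch below is a simplified illustration under stated assumptions and is not the authors' implementation: feature maps are single-channel 2-D lists, the 1x1 lateral convolution is reduced to a scalar `lateral_weight`, the 3x3 smoothing convolution is omitted, and all function names are hypothetical.

```python
# Simplified, single-channel sketch of the FPN top-down merge.
# Assumption: feature maps are plain 2-D lists; a real FPN uses multi-channel
# tensors, learned 1x1 lateral convolutions, and a 3x3 smoothing convolution.


def upsample_nearest_2x(fmap):
    """Double height and width by repeating each value (nearest neighbor)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def merge_top_down(coarse, lateral, lateral_weight=1.0):
    """Add the upsampled coarse map to the (scaled) finer lateral map.

    lateral_weight stands in for the 1x1 lateral convolution, which reduces
    to a scalar multiply when there is only one channel.
    """
    up = upsample_nearest_2x(coarse)
    if len(up) != len(lateral) or len(up[0]) != len(lateral[0]):
        raise ValueError("lateral map must be exactly 2x the coarse map")
    return [
        [u + lateral_weight * l for u, l in zip(up_row, lat_row)]
        for up_row, lat_row in zip(up, lateral)
    ]


def build_pyramid(bottom_up):
    """Merge maps ordered fine -> coarse, each half the size of the previous.

    Returns the merged maps in the same fine -> coarse order.
    """
    merged = [bottom_up[-1]]  # the coarsest level is used as-is
    for lateral in reversed(bottom_up[:-1]):
        merged.append(merge_top_down(merged[-1], lateral))
    return merged[::-1]


if __name__ == "__main__":
    c3 = [[0.0] * 4 for _ in range(4)]  # fine map with weak semantics
    c3[1][2] = 1.0                      # a small local response
    c4 = [[0.0, 5.0], [0.0, 0.0]]       # coarse map with strong semantics
    p3, p4 = build_pyramid([c3, c4])
    for row in p3:
        print(row)
```

In this toy input, the fine-level cell that holds the small local response also receives the strong value from the coarse level, which is the effect that makes the finest pyramid level useful for detecting objects that occupy only a few pixels.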
