YOLO, EAST : Comparison of Scene Text Detection Performance, Using a Neural Network Model

Park, Chan Yong; Lim, Young Min; Jeong, Seung Dae; Cho, Young Heuk; Lee, Byeong Chul; Lee, Gyu Hyun; Kim, Jin Wook

doi:10.3745/KTSDE.2022.11.3.115

KIPS Transactions on Software and Data Engineering (정보처리학회논문지:소프트웨어 및 데이터공학)

Volume 11, Issue 3 / Pages 115-124 / 2022 / 2287-5905 (pISSN) / 2734-0503 (eISSN)

Korea Information Processing Society (한국정보처리학회)


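Original Korean title: YOLO, EAST: 신경망 모델을 이용한 문자열 위치 검출 성능 비교 (YOLO, EAST: Comparison of Text String Location Detection Performance Using Neural Network Models)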

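Park, Chan Yong (TUAT Co., Ltd.);
Lim, Young Min (TUAT Co., Ltd.);
Jeong, Seung Dae (TUAT Co., Ltd.);
Cho, Young Heuk (TUAT Co., Ltd.);
Lee, Byeong Chul (Gyeongsangbuk-do Economic Promotion Agency, Jobs and Industry Office);
Lee, Gyu Hyun (Kyungpook National University, School of Computer Science and Engineering);
Kim, Jin Wook (Kyungpook National University, School of Computer Science and Engineering)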

Received : 2021.06.29
Accepted : 2021.08.31
Published : 2022.03.31


Abstract

In this paper, the YOLO and EAST models are tested to analyze their performance in detecting text areas in real-world and ordinary text images. Earlier YOLO models, including YOLOv3, are known to underperform at detecting text areas in images, but the recently released YOLOv4 and YOLOv5 achieve promising performance in detecting text areas in a variety of images. The experimental results suggest that both YOLOv4 and YOLOv5 can be expected to be widely used for text detection in the field of scene text recognition.

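In this paper, we apply the YOLO and EAST neural networks, which have recently been widely used in various fields, to the problem of detecting text strings in images, and we compare and analyze their performance. YOLO networks are generally known to perform poorly at detecting text regions in images. Our experiments confirm that YOLOv3 is relatively weak at text string detection, but that the recently released YOLOv4 and YOLOv5 perform very well at detecting Korean and English text strings in various kinds of images. We therefore expect these YOLO-based text string detection methods to be widely used in the field of text recognition in the future.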
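The abstract does not state which metric was used to compare the detectors. As a minimal sketch only, the Python below shows one common way such a comparison can be scored: predicted boxes are matched to ground-truth boxes by intersection over union (IoU), and precision, recall, and F1 are derived from the matches. The axis-aligned box format, the 0.5 threshold, and the names `iou` and `evaluate` are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max), axis-aligned.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def evaluate(preds: List[Box], truths: List[Box],
             threshold: float = 0.5) -> Tuple[float, float, float]:
    """Greedy one-to-one matching; returns (precision, recall, F1)."""
    unmatched = list(truths)  # ground-truth boxes not yet claimed
    true_positives = 0
    for p in preds:
        best_idx, best_iou = -1, 0.0
        for i, t in enumerate(unmatched):
            value = iou(p, t)
            if value > best_iou:
                best_idx, best_iou = i, value
        if best_idx >= 0 and best_iou >= threshold:
            true_positives += 1
            unmatched.pop(best_idx)  # each truth box matches at most once
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(truths) if truths else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [(0, 0, 10, 10), (50, 50, 60, 60)]
    print(evaluate(preds, truths))
```

Running the same function over the outputs of two detectors on the same annotated images would give comparable scores under these assumptions. Rotated or quadrilateral text boxes, which EAST can produce, would need a polygon IoU instead of this axis-aligned version.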


Acknowledgement

This work was supported by the Startup Growth Technology Development Program of the Ministry of SMEs and Startups from 2019 to 2021 [S2833775].

