http://dx.doi.org/10.3745/KTSDE.2022.11.3.115

YOLO, EAST: Comparison of Scene Text Detection Performance, Using a Neural Network Model

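Park, Chan Yong (TUAT Co., Ltd.)
Lim, Young Min (TUAT Co., Ltd.)
Jeong, Seung Dae (TUAT Co., Ltd.)
Cho, Young Heuk (TUAT Co., Ltd.)
Lee, Byeong Chul (Gyeongsangbuk-do Economic Promotion Agency, Jobs and Industry Office)
Lee, Gyu Hyun (Kyungpook National University, School of Computer Science and Engineering)
Kim, Jin Wook (Kyungpook National University, School of Computer Science and Engineering)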
Publication Information
KIPS Transactions on Software and Data Engineering, v.11, no.3, 2022, pp. 115-124
Abstract
In this paper, YOLO and EAST models are tested to analyze their performance in detecting text areas in real-world and normal text images. Earlier YOLO models, including YOLOv3, have been known to underperform in detecting text areas in given images, but the recently released YOLOv4 and YOLOv5 achieved promising performance in detecting text areas in a variety of images. Experimental results show that both the YOLOv4 and YOLOv5 models can be expected to be widely used for text detection in the field of scene text recognition in the future.
Keywords
Scene Text Detection; YOLO; EAST; Neural Network