Korean Text Image Super-Resolution for Improving Text Recognition Accuracy

텍스트 인식률 개선을 위한 한글 텍스트 이미지 초해상화

  • Junhyeong Kwon (Department of ECE, INMC, Seoul National University)
  • Nam Ik Cho (Department of ECE, INMC, Seoul National University)
  • 권준형 (서울대학교 전기.정보공학부 뉴미디어통신공동연구소)
  • 조남익 (서울대학교 전기.정보공학부 뉴미디어통신공동연구소)
  • Received : 2023.01.16
  • Accepted : 2023.03.13
  • Published : 2023.03.30

Abstract

Finding text in general scene images and recognizing its content is an important task that can serve as a basis for robot vision, visual assistance, and similar applications. In low-resolution text images, however, degradations such as noise and blur are more noticeable, which severely reduces text recognition accuracy. In this paper, we propose a new Korean text image super-resolution method built on a Transformer-based model, a class of models that generally outperforms convolutional neural networks. Our experiments show that text recognition accuracy on Korean text images improves when the proposed super-resolution method is applied. We also present a new Korean text image dataset for training our model, which contains a large number of paired high-resolution and low-resolution (HR-LR) Korean text images.

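Detecting text regions in outdoor scene images captured by a camera and recognizing their content is a very important technology that can serve as a basis for robot vision, visual assistance, and similar applications. However, when a text image is low-resolution, degradations such as noise and blur in the image become more pronounced, so text recognition performance drops. In this paper, we improve text recognition accuracy by applying image super-resolution to low-resolution Korean text in general scene images. We perform Korean text image super-resolution with a Transformer-based model and confirm that text recognition performance improves when the proposed super-resolution method is applied to an HR-LR Korean text image dataset that we constructed ourselves.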
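The comparison described in the abstract, recognition accuracy with and without super-resolution, can be illustrated with a minimal sketch. The abstract does not state the exact metric, so this sketch assumes exact-match (word-level) accuracy; the function name `word_accuracy` and all strings below are hypothetical and serve only as an illustration, not as the authors' evaluation code or results.

```python
import unicodedata


def word_accuracy(predictions, ground_truths):
    """Fraction of images whose predicted string equals the label exactly.

    Assumption: exact-match accuracy; the paper's metric may differ.
    """
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have equal length")
    if not ground_truths:
        return 0.0

    def norm(s):
        # NFC composes Hangul jamo into precomposed syllables, so visually
        # identical strings in different Unicode forms compare equal.
        return unicodedata.normalize("NFC", s).strip()

    correct = sum(norm(p) == norm(g) for p, g in zip(predictions, ground_truths))
    return correct / len(ground_truths)


# Made-up labels and recognizer outputs, for illustration only.
labels = ["서울역", "약국", "출구", "주차장"]
preds_lr = ["서울억", "약국", "줄구", "주차장"]  # recognizer run on LR inputs
preds_sr = ["서울역", "약국", "출구", "주차창"]  # recognizer run on super-resolved inputs

acc_lr = word_accuracy(preds_lr, labels)
acc_sr = word_accuracy(preds_sr, labels)
print(f"LR accuracy: {acc_lr:.2f}, SR accuracy: {acc_sr:.2f}")
```

Normalizing to NFC before comparison is a design choice specific to Korean text: the same syllable can be encoded either as one precomposed code point or as a sequence of conjoining jamo, and a naive string comparison would count such a pair as a recognition error.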

Acknowledgement

This work was supported by the BK21 FOUR program of the Education and Research Program for Future ICT Pioneers, Seoul National University, in 2023, and by LG AI Research.
