Search | Korea Science

Xinhua Lu;Haihai Wei;Li Ma;Qingji Xue;Yonghui Fu
- Journal of Information Processing Systems
- /
- v.19 no.4
- /
- pp.427-438
- /
- 2023
Plenty of works have indicated that single image super-resolution (SISR) models relying on synthetic datasets are difficult to be applied to real scene text image super-resolution (STISR) for its more complex degradation. The up-to-date dataset for realistic STISR is called TextZoom, while the current methods trained on this dataset have not considered the effect of multi-scale features of text images. In this paper, a multi-scale and attention fusion model for realistic STISR is proposed. The multi-scale learning mechanism is introduced to acquire sophisticated feature representations of text images; The spatial and channel attentions are introduced to capture the local information and inter-channel interaction information of text images; At last, this paper designs a multi-scale residual attention module by skillfully fusing multi-scale learning and attention mechanisms. The experiments on TextZoom demonstrate that the model proposed increases scene text recognition's (ASTER) average recognition accuracy by 1.2% compared to text super-resolution network.
https://doi.org/10.3745/JIPS.02.0199 인용 PDF

Junhyeong Kwon;Nam Ik Cho
- Journal of Broadcast Engineering
- /
- v.28 no.2
- /
- pp.178-184
- /
- 2023
Finding texts in general scene images and recognizing their contents is a very important task that can be used as a basis for robot vision, visual assistance, and so on. However, for the low-resolution text images, the degradations, such as noise or blur included in text images, are more noticeable, which leads to severe performance degradation of text recognition accuracy. In this paper, we propose a new Korean text image super-resolution based on a Transformer-based model, which generally shows higher performance than convolutional neural networks. In the experiments, we show that text recognition accuracy for Korean text images can be improved when our proposed text image super-resolution method is used. We also propose a new Korean text image dataset for training our model, which contains massive HR-LR Korean text image pairs.
https://doi.org/10.5909/JBE.2023.28.2.178 인용 PDF