Acknowledgement
This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) under the Artificial Intelligence Convergence Innovation Human Resources Development (IITP-2023-RS-2023-00256629) grant funded by the Korea government (MSIT), and the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2023-00219107).
References
- Chen H., Xu Z., Gu Z., Li Y., Meng C., Zhu H., Wang W., "DiffUTE: Universal text editing diffusion model," Advances in Neural Information Processing Systems, vol. 36, 2024.
- Qu Y., Tan Q., Xie H., Xu J., Wang Y., Zhang Y., "Exploring stroke-level modifications for scene text editing," Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 2, pp. 2119-2127, 2023.
- Lee J., Kim Y., Kim S., Yim M., Shin S., Lee G., Park S., "RewriteNet: Reliable scene text editing with implicit decomposition of text contents and styles," arXiv preprint, arXiv:2107.11041, 2021.
- Dang Q.V., Lee G.S., "Scene text segmentation via multitask cascade transformer with paired data synthesis," IEEE Access, 2023.
- Dang Q.V., Lee G.S., "Scene text segmentation by paired data synthesis," Proceedings of the 2023 IEEE International Conference on Image Processing (ICIP), pp. 545-549, 2023.
- Kingma D.P., Welling M., "Auto-encoding variational bayes," arXiv preprint, arXiv:1312.6114, 2013.
- Isola P., Zhu J.Y., Zhou T., Efros A.A., "Image-to-image translation with conditional adversarial networks," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1125-1134, 2017.
- Wu L., Zhang C., Liu J., Han J., Liu J., Ding E., Bai X., "Editing text in the wild," Proceedings of the 27th ACM International Conference on Multimedia, pp. 1500-1508, 2019.
- Ji J., Zhang G., Wang Z., Hou B., Zhang Z., Price B., Chang S., "Improving diffusion models for scene text editing with dual encoders," arXiv preprint, arXiv:2304.05568, 2023.
- Fang S., Xie H., Wang Y., Mao Z., Zhang Y., "Read like humans: Autonomous, bidirectional and iteratively refining scene text recognition," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7098-7107, 2021.
- Chen Z., Lin W., Huang J., Pu S., "TextDiffuser: Diffusion models for scene text editing," arXiv preprint, arXiv:2304.02328, 2024.
- Karatzas D., Shafait F., Uchida S., Iwamura M., Bigorda L., Mestre S.R., Mas J., Mota D.F., Almazan J., de las Heras L.P., "ICDAR 2013 Robust reading competition," Proceedings of the 12th International Conference on Document Analysis and Recognition (ICDAR), pp. 1484-1493, 2013.
- Ch'ng C.K., Chan C.S., "Total-Text: A comprehensive dataset for scene text detection and recognition," Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 935-942, 2017.
- Xu Y., Wang X., Li X., Lv Z., Zhang Y., "Rethinking text segmentation: A novel dataset and method," Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 2563-2572, 2021.