SEL-RefineMask: A Seal Segmentation and Recognition Neural Network with SEL-FPN

  • Dun, Ze-dong (School of Mechanical, Electrical and Information Engineering, Shandong University) ;
  • Chen, Jian-yu (School of Mechanical, Electrical and Information Engineering, Shandong University) ;
  • Qu, Mei-xia (School of Mechanical, Electrical and Information Engineering, Shandong University) ;
  • Jiang, Bin (School of Mechanical, Electrical and Information Engineering, Shandong University)
  • Received : 2021.12.16
  • Accepted : 2022.03.15
  • Published : 2022.06.30

Abstract

Extracting historical and cultural information from seals in ancient books is of great significance. However, ancient Chinese seal samples are scarce and their carving styles are diverse, so traditional grayscale-based digital image processing methods struggle to achieve good segmentation and recognition performance. Deep learning algorithms have recently been proposed to address this problem, but such networks are difficult to train owing to the lack of datasets. To solve these problems, we propose SEL-RefineMask, which combines a selector of feature pyramid network (SEL-FPN) with RefineMask to segment and recognize seals. The SEL-FPN intelligently selects the specific FPN layer that matches the scale of the input, which reduces the number of anchor boxes. We performed experiments with several instance segmentation networks as baselines; their top-1 segmentation result of 64.93% is 5.73% higher than that of humans, while the top-1 result of SEL-RefineMask reaches 67.96%, surpassing the baselines. After segmentation, a vision transformer recognizes the segmented output with an accuracy of 91%. Furthermore, we established a dataset of seals in ancient Chinese books (SACB) for segmentation and a small seal font (SSF) dataset for recognition, both of which are publicly available online.
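To illustrate the layer-selection idea described above, the following is a minimal PyTorch-style sketch, not the authors' code: a small classifier scores the FPN levels and forwards only the chosen one, so downstream anchors are generated at a single scale. All names here (SELFPN, the pooling/scoring head, the use of the coarsest level as the selection signal) are assumptions made for illustration and may differ from the paper's actual design.

```python
# Hypothetical sketch of an FPN layer selector (not the published SEL-FPN code).
import torch
import torch.nn as nn


class SELFPN(nn.Module):
    """Selects a single FPN level (e.g., P2..P5) via a small global classifier."""

    def __init__(self, in_channels: int = 256, num_levels: int = 4):
        super().__init__()
        # Tiny head that scores each candidate pyramid level.
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.scorer = nn.Linear(in_channels, num_levels)

    def forward(self, fpn_feats: list) -> tuple:
        # fpn_feats: list of [B, C, Hi, Wi] maps from the FPN, fine to coarse.
        # Score levels from the coarsest map's global descriptor (an assumption;
        # the paper may use a different selection signal).
        desc = self.pool(fpn_feats[-1]).flatten(1)   # [B, C]
        logits = self.scorer(desc)                   # [B, num_levels]
        level = int(logits.mean(dim=0).argmax())     # one level for the batch
        return fpn_feats[level], level


# Usage sketch: feed only the selected level to the RPN / RefineMask heads,
# so anchor boxes are placed on one scale instead of every pyramid level.
if __name__ == "__main__":
    feats = [torch.randn(2, 256, s, s) for s in (64, 32, 16, 8)]
    selected_feat, idx = SELFPN()(feats)
    print(selected_feat.shape, "selected level:", idx)
```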

Acknowledgement

This study was supported by the Shandong Provincial Natural Science Foundation (ZR2020MA064). We thank digital platforms such as Jangseogak of the Academy of Korean Studies and the Kyujanggak Institute for Korean Studies for providing the original seal data.
