[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.3745/JIPS.02.0174

SEL-RefineMask: A Seal Segmentation and Recognition Neural Network with SEL-FPN

Dun, Ze-dong (School of Mechanical, Electrical and Information Engineering, Shandong University)
Chen, Jian-yu (School of Mechanical, Electrical and Information Engineering, Shandong University)
Qu, Mei-xia (School of Mechanical, Electrical and Information Engineering, Shandong University)
Jiang, Bin (School of Mechanical, Electrical and Information Engineering, Shandong University)

Publication Information

Journal of Information Processing Systems / v.18, no.3, 2022, pp. 411-427

Abstract

Extracting historical and cultural information from seals in ancient books is of great significance. However, ancient Chinese seal samples are scarce and carving methods are diverse, so traditional digital image processing methods based on greyscale have difficulty achieving good segmentation and recognition performance. Some deep learning algorithms have recently been proposed to address this problem, but current neural networks are difficult to train owing to the lack of datasets. To solve these problems, we propose SEL-RefineMask, which combines a selector of feature pyramid network (SEL-FPN) with RefineMask to segment and recognize seals. We designed the SEL-FPN to intelligently select a specific layer of the FPN, whose layers represent different scales, and to reduce the number of anchor frames. We ran experiments with several instance segmentation networks as baselines; their top-1 segmentation result of 64.93% is 5.73% higher than that of humans. The top-1 result of the SEL-RefineMask network reached 67.96%, surpassing the baseline results. After segmentation, a vision transformer was used to recognize the segmentation output, and the accuracy reached 91%. Furthermore, we established two datasets, seals in ancient Chinese books (SACB) for segmentation and small seal font (SSF) for recognition, both of which are publicly available on the website.
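The link between selecting a single FPN layer and reducing the number of anchor frames can be illustrated with simple arithmetic. The sketch below is not the authors' SEL-FPN; it only counts anchors under assumed settings that the abstract does not state: a five-level pyramid with strides 4 to 64, three anchors per feature-map cell, and an 800 x 1344 input. The function names are hypothetical.

```python
import math


def anchors_per_level(height, width, strides=(4, 8, 16, 32, 64), anchors_per_cell=3):
    """Count the anchors generated on each pyramid level for one image.

    The strides and anchors_per_cell values are assumed defaults,
    not values taken from the paper.
    """
    counts = {}
    for stride in strides:
        # Each level is a grid of ceil(H / stride) x ceil(W / stride) cells.
        cells = math.ceil(height / stride) * math.ceil(width / stride)
        counts[stride] = cells * anchors_per_cell
    return counts


def anchors_for_selection(counts, selected_strides):
    """Total anchors when only the selected pyramid levels are used."""
    return sum(counts[s] for s in selected_strides)


counts = anchors_per_level(800, 1344)
total = anchors_for_selection(counts, counts.keys())  # all five levels
single = anchors_for_selection(counts, [16])          # one selected level only
print(f"all levels: {total}, stride-16 only: {single}")
```

Under these assumptions, restricting proposals to one level leaves only a small fraction of the anchors that the full pyramid would generate. How SEL-FPN actually chooses the level is described in the paper itself, not here.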

Keywords

Character Recognition; Feature Extraction; FPN; RefineMask; Seal Character Segmentation

Citations & Related Records

Times Cited By KSCI: 2
