http://dx.doi.org/10.9708/jksci.2021.26.09.037

Tongue Segmentation Using the Receptive Field Diversification of U-net

Li, Yu-Jie (School of Intelligent Manufacturing, Weifang University of Science and Technology, Dept. of Computer and Software Engineering, Wonkwang University)
Jung, Sung-Tae (Dept. of Computer and Software Engineering, Wonkwang University)

Publication Information

Journal of the Korea Society of Computer and Information / v.26, no.9, 2021, pp. 37-47

Abstract

In this paper, we propose a new deep learning model for tongue segmentation that improves on the accuracy of existing models by diversifying the receptive field of U-net. Methods such as parallel convolution, dilated convolution, and constant channel increase were used to diversify the receptive field. The proposed model was evaluated in a tongue region segmentation experiment on two test datasets: in TestSet1 the test images are similar to the training images, whereas in TestSet2 they are not. Experimental results show that segmentation performance improved as the receptive field was diversified. The mIoU of the proposed method was 98.14% on TestSet1 and 91.90% on TestSet2, both higher than the results of existing models such as U-net, DeepTongue, and TongueNet.
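Two quantities named in the abstract can be made concrete with a small sketch. The first function gives the standard per-axis extent of a dilated convolution kernel, which shows why parallel branches with different dilation rates see different receptive fields. The second computes a mean intersection-over-union (mIoU) between a predicted mask and a ground-truth mask. The function names, the dilation rates (1, 2, 4), the toy masks, and the two-class labeling (0 = background, 1 = tongue) are illustrative assumptions. The abstract does not state the paper's actual branch configuration or its exact mIoU averaging convention.

```python
def dilated_extent(kernel_size, dilation):
    """Span in pixels covered along one axis by a single dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def mean_iou(pred, target, num_classes=2):
    """Mean IoU over classes for two equal-sized 2D lists of integer labels.

    Assumed labeling: 0 = background, 1 = tongue.
    Classes that appear in neither mask are skipped.
    """
    ious = []
    for c in range(num_classes):
        inter = 0
        union = 0
        for p_row, t_row in zip(pred, target):
            for p, t in zip(p_row, t_row):
                in_pred = p == c
                in_target = t == c
                inter += int(in_pred and in_target)
                union += int(in_pred or in_target)
        if union > 0:
            ious.append(inter / union)
    if not ious:
        raise ValueError("masks contain none of the requested classes")
    return sum(ious) / len(ious)


# Hypothetical parallel 3x3 branches with different dilation rates.
extents = [dilated_extent(3, d) for d in (1, 2, 4)]
print(extents)  # [3, 5, 9]

# Toy 4x4 masks: the prediction misses one tongue pixel.
target = [[0, 0, 0, 0],
          [0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]
pred = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0]]
print(mean_iou(pred, target))
```

In the toy example the tongue class scores 3/4 and the background class 12/13, so a single missed pixel in a small region lowers the mean noticeably. This is one reason boundary accuracy matters for a metric such as mIoU.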

Keywords

Tongue; Segmentation; Deep Learning; U-net; Receptive Field
