http://dx.doi.org/10.9708/jksci.2021.26.09.037

Tongue Segmentation Using the Receptive Field Diversification of U-net  

Li, Yu-Jie (School of Intelligent Manufacturing, Weifang University of Science and Technology, Dept. of Computer and Software Engineering, Wonkwang University)
Jung, Sung-Tae (Dept. of Computer and Software Engineering, Wonkwang University)
Abstract
In this paper, we propose a new deep learning model for tongue segmentation that improves on the accuracy of existing models by diversifying the receptive field in U-net. Three methods were used to diversify the receptive field: parallel convolution, dilated convolution, and constant channel increase. A tongue region segmentation experiment was performed with the proposed model on two test datasets. In TestSet1 the test images are similar to the training images, whereas in TestSet2 they are not. Experimental results show that segmentation performance improved as the receptive field was diversified. The mIoU of the proposed method was 98.14% for TestSet1 and 91.90% for TestSet2, higher than the results of existing models such as U-net, DeepTongue, and TongueNet.
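
To make two ideas from the abstract concrete, the following is a minimal sketch and not the authors' implementation. It shows how dilation widens the receptive field of a stack of stride-1 convolutions, so that parallel branches with different dilation rates see different context sizes, and how an mIoU score can be computed from predicted and ground-truth labels. The branch layout, the dilation rates (1, 2, 4), the helper names, and the choice to average IoU over the background and tongue classes are all assumptions made for illustration; the abstract does not specify them.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered along one axis by a single dilated convolution kernel."""
    return dilation * (kernel_size - 1) + 1


def receptive_field(layers):
    """Receptive field of a stack of stride-1 convolutions.

    `layers` is a list of (kernel_size, dilation) pairs, applied in order.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += effective_kernel_size(kernel_size, dilation) - 1
    return rf


def mean_iou(pred, target, num_classes=2):
    """Mean intersection-over-union across classes for flat label sequences."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes that appear in neither mask
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical parallel branches: two 3x3 convolutions each, differing only
# in dilation rate (these rates are an assumption, not taken from the paper).
branches = {
    "d=1": [(3, 1), (3, 1)],
    "d=2": [(3, 2), (3, 2)],
    "d=4": [(3, 4), (3, 4)],
}
rf_by_branch = {name: receptive_field(layers) for name, layers in branches.items()}
print(rf_by_branch)  # {'d=1': 5, 'd=2': 9, 'd=4': 17}

# Toy flattened masks: 0 = background, 1 = tongue.
pred = [0, 0, 1, 1, 1, 0]
target = [0, 1, 1, 1, 0, 0]
print(mean_iou(pred, target))  # 0.5
```

With the same 3x3 kernels, the three hypothetical branches cover 5, 9, and 17 pixels per axis, which is the sense in which combining them diversifies the receptive field without enlarging the kernels themselves.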
Keywords
Tongue; Segmentation; Deep Learning; U-net; Receptive Field