http://dx.doi.org/10.3745/JIPS.04.0061

Variations of AlexNet and GoogLeNet to Improve Korean Character Recognition Performance  

Lee, Sang-Geol (Human Resources Development Group for Women Maker in Integrated Engineering, Dongguk University)
Sung, Yunsick (Dept. of Multimedia Engineering, Dongguk University)
Kim, Yeon-Gyu (Dept. of Computer Science Engineering, Pusan National University)
Cha, Eui-Young (Dept. of Computer Science Engineering, Pusan National University)
Publication Information
Journal of Information Processing Systems / v.14, no.1, 2018, pp. 205-217
Abstract
Deep learning with convolutional neural networks (CNNs) is being studied in many areas of image recognition, and these studies report excellent performance. In this paper, we compare the performance of two CNN architectures, KCR-AlexNet and KCR-GoogLeNet. The experimental data are taken from PHD08, a large-scale Korean character database that contains 2,187 samples for each of 2,350 Korean character classes, for a total of 5,139,450 samples. After the final training iteration, KCR-AlexNet achieved a top-1 test accuracy of over 98% and KCR-GoogLeNet achieved a top-1 test accuracy of over 99%. To compare classification success rates with commercial optical character recognition (OCR) programs and to ensure the objectivity of the experiment, we built an additional Korean character dataset using fonts that are not in PHD08. On this dataset, the commercial OCR programs showed classification success rates of 66.95% to 83.16%, whereas KCR-AlexNet and KCR-GoogLeNet showed higher average classification success rates of 90.12% and 89.14%, respectively. In terms of time, KCR-AlexNet trained faster than KCR-GoogLeNet on PHD08, whereas KCR-GoogLeNet classified faster.
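The dataset size and the top-1 metric quoted in the abstract can be illustrated with a minimal Python sketch. The two constants come from the abstract; the function names `total_samples` and `top1_accuracy`, and the toy scores and labels, are assumptions made here for illustration and are not taken from the paper or from PHD08.

```python
# Figures quoted in the abstract for PHD08.
SAMPLES_PER_CLASS = 2_187
NUM_CLASSES = 2_350


def total_samples(samples_per_class: int, num_classes: int) -> int:
    """Total dataset size when every class has the same number of samples."""
    return samples_per_class * num_classes


def top1_accuracy(scores, labels) -> float:
    """Fraction of samples whose highest-scoring class equals the true label.

    `scores` is a list of per-class score lists, one per sample (hypothetical
    classifier outputs); `labels` is the list of true class indices.
    """
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equal in length")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score is the top-1 prediction.
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == label)
    return correct / len(labels)


if __name__ == "__main__":
    print(total_samples(SAMPLES_PER_CLASS, NUM_CLASSES))  # 5139450

    # Toy data, invented for this example: 4 samples, 3 classes.
    toy_scores = [
        [0.1, 0.7, 0.2],
        [0.8, 0.1, 0.1],
        [0.3, 0.3, 0.4],
        [0.2, 0.5, 0.3],
    ]
    toy_labels = [1, 0, 1, 1]
    print(top1_accuracy(toy_scores, toy_labels))  # 0.75
```

The product 2,187 × 2,350 reproduces the 5,139,450 samples stated in the abstract. In the toy data, three of the four predictions match their labels, which gives a top-1 accuracy of 0.75.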
Keywords
Classification; CNN; Deep Learning; Korean Character Recognition