http://dx.doi.org/10.5573/IEIESPC.2017.6.1.053

Convolutional Neural Networks for Character-level Classification  

Ko, Dae-Gun (Imaging Lab, Samsung S-Printing Solution Co., LTD.)
Song, Su-Han (Imaging Lab, Samsung S-Printing Solution Co., LTD.)
Kang, Ki-Min (Imaging Lab, Samsung S-Printing Solution Co., LTD.)
Han, Seong-Wook (Imaging Lab, Samsung S-Printing Solution Co., LTD.)
Publication Information
IEIE Transactions on Smart Processing and Computing / v.6, no.1, 2017, pp. 53-59
Abstract
Optical character recognition (OCR) automatically recognizes text in an image, and it remains a challenging problem in computer vision. A successful OCR solution has important device applications, such as text-to-speech conversion and automatic document classification. In this work, we analyze the character recognition performance of three current state-of-the-art deep-learning structures: AlexNet, LeNet, and SPNet. For this purpose, we built our own dataset containing digits and upper- and lower-case characters. We run experiments in the presence of salt-and-pepper noise or Gaussian noise and compare performance in terms of recognition error. Five-fold cross-validation results indicate that the SPNet structure (our approach) outperforms AlexNet and LeNet in recognition error.
Keywords
OCR; Deep-Learning; Convolutional neural networks; MNIST
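
The evaluation protocol named in the abstract (salt-and-pepper or Gaussian noise, five-fold cross-validation, recognition error) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the 8-bit grayscale list-of-rows image format, and the noise parameters `density` and `sigma` are all assumptions made for illustration, since the abstract does not state the actual noise levels or image sizes.

```python
import random


def add_salt_and_pepper(image, density, rng):
    """Return a copy of `image` in which each pixel is replaced by
    0 (pepper) or 255 (salt) with probability `density` (hypothetical value)."""
    noisy = []
    for row in image:
        new_row = []
        for pixel in row:
            if rng.random() < density:
                new_row.append(rng.choice((0, 255)))
            else:
                new_row.append(pixel)
        noisy.append(new_row)
    return noisy


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma`
    (hypothetical value) and clip the result to the 8-bit range [0, 255]."""
    return [
        [min(255, max(0, round(pixel + rng.gauss(0.0, sigma)))) for pixel in row]
        for row in image
    ]


def five_fold_splits(n_samples, rng, k=5):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.
    Each sample appears in exactly one test fold."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def recognition_error(predicted, actual):
    """Fraction of samples whose predicted label differs from the true label."""
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return wrong / len(actual)
```

In such a setup, a classifier would be trained on the `train` indices of each split and scored with `recognition_error` on the matching `test` indices, and the five per-fold errors would then be averaged. Whether the paper applies noise to training images, test images, or both is not stated in the abstract.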