[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.7472/jksii.2017.18.6.25

A Study on the Characteristics of a series of Autoencoder for Recognizing Numbers used in CAPTCHA

Jeon, Jae-seung (Graduate School of Information Security, Korea University)
Moon, Jong-sub (Graduate School of Information Security, Korea University)

Publication Information

Journal of Internet Computing and Services / v.18, no.6, 2017, pp. 25-34

Abstract

An autoencoder is a type of deep learning model whose output layer is trained to reproduce its input layer; constraints on the hidden layer let it extract and restore the characteristics of the input vector effectively. In this paper, we propose autoencoder-based methods that remove the natural background image, which acts as noise in a CAPTCHA, and recover only the numeral image. We apply various autoencoder models to regions of a CAPTCHA image in which a single digit is mixed with a natural background. The suitability of the reconstructed image is verified by feeding the output of the autoencoder into a softmax function. We also compare the proposed methods with another method and show that the proposed methods perform better.
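The pipeline described in the abstract (denoise a digit region with an autoencoder, then score the reconstruction with softmax) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 3x3 toy glyphs, the uniform random "background", the 9-6-9 network size, the learning rate, and the helper names `add_background`, `classify`, and `demo`. In particular, the classifier is a stand-in that applies softmax to negative squared distances from clean templates, rather than a trained softmax regression.

```python
import math
import random

# Hypothetical 3x3 "digit" glyphs (1 = stroke pixel); not the paper's data.
GLYPHS = {
    "1": [0, 1, 0, 0, 1, 0, 0, 1, 0],
    "7": [1, 1, 1, 0, 0, 1, 0, 0, 1],
    "0": [1, 1, 1, 1, 0, 1, 1, 1, 1],
}
LABELS = list(GLYPHS)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def add_background(clean, rng, level=0.6):
    """Overlay a random 'background' on the non-stroke pixels only."""
    return [p if p == 1 else rng.uniform(0.0, level) for p in clean]

class DenoisingAutoencoder:
    """One hidden layer, sigmoid units, squared-error loss, plain SGD."""

    def __init__(self, n_in, n_hidden, rng):
        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
        self.W1, self.b1 = init(n_hidden, n_in), [0.0] * n_hidden
        self.W2, self.b2 = init(n_in, n_hidden), [0.0] * n_in

    @staticmethod
    def _layer(W, b, v):
        return [sigmoid(sum(w * x for w, x in zip(row, v)) + bias)
                for row, bias in zip(W, b)]

    def forward(self, x):
        h = self._layer(self.W1, self.b1, x)
        return h, self._layer(self.W2, self.b2, h)

    def train_step(self, noisy, clean, lr):
        # Denoising criterion: encode the noisy input, reconstruct the clean target.
        h, y = self.forward(noisy)
        d_out = [(yi - ci) * yi * (1.0 - yi) for yi, ci in zip(y, clean)]
        d_hid = [hj * (1.0 - hj) * sum(self.W2[i][j] * d_out[i] for i in range(len(y)))
                 for j, hj in enumerate(h)]
        for i, d in enumerate(d_out):
            self.b2[i] -= lr * d
            for j, hj in enumerate(h):
                self.W2[i][j] -= lr * d * hj
        for j, d in enumerate(d_hid):
            self.b1[j] -= lr * d
            for k, xk in enumerate(noisy):
                self.W1[j][k] -= lr * d * xk

def classify(recon):
    # Stand-in scorer (assumption): negative squared distance to each clean glyph.
    scores = [-sum((r - t) ** 2 for r, t in zip(recon, GLYPHS[k])) for k in LABELS]
    probs = softmax(scores)
    return LABELS[probs.index(max(probs))], probs

def mean_mse(model, samples):
    total = 0.0
    for noisy, clean, _ in samples:
        _, y = model.forward(noisy)
        total += sum((a - b) ** 2 for a, b in zip(y, clean)) / len(clean)
    return total / len(samples)

def demo(seed=0, steps=5000, lr=0.5):
    rng = random.Random(seed)
    model = DenoisingAutoencoder(9, 6, rng)
    test = [(add_background(GLYPHS[k], rng), GLYPHS[k], k)
            for k in LABELS for _ in range(10)]
    before = mean_mse(model, test)
    for _ in range(steps):
        clean = GLYPHS[rng.choice(LABELS)]
        model.train_step(add_background(clean, rng), clean, lr)
    after = mean_mse(model, test)
    hits = sum(classify(model.forward(x)[1])[0] == k for x, _, k in test)
    return before, after, hits / len(test)

if __name__ == "__main__":
    before, after, accuracy = demo()
    print(f"reconstruction MSE before={before:.4f} after={after:.4f} accuracy={accuracy:.2f}")
```

Calling `demo()` returns the mean reconstruction error on held-out noisy samples before and after training, plus the fraction of reconstructions that the stand-in softmax scorer assigns to the correct glyph. A stacked variant would train several such layers one after another, which this sketch does not attempt.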

Keywords

CAPTCHA; Autoencoder; Stacked autoencoder; Denoising; SOFTMAX; Deep learning;

Citations & Related Records

Times Cited By KSCI: 2

References

1	P. Vincent, H. Larochelle, Y. Bengio, and P.A. Manzagol, "Extracting and composing robust features with denoising autoencoders," in Proc. of 25th Int. Conf. Mach. Learn. - ICML '08, pp. 1096-1103, 2008. http://machinelearning.org/archive/icml2008/papers/592.pdf
2	A. Ng, "CS229 Lecture notes," CS229 Lecture notes, pp. 1-30, 2000.
3	P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.A. Manzagol, "Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion," J. Mach. Learn. Res., Vol. 11, pp. 3371-3408, 2010. http://www.jmlr.org/papers/v11/vincent10a.html
4	G. E. Hinton and R. R. Salakhutdinov, "Reducing the Dimensionality of Data with Neural Networks," Science, Vol. 313, no. 5786, pp. 504-507, 2006. https://doi.org/10.1126/science.1127647
5	A. Ng, "Sparse autoencoder," CS294A Lect. notes, 2011, pp. 1-19.
6	G. E. Hinton, S. Osindero, and Y. W. Teh, "A fast learning algorithm for deep belief nets," Neural Comput., Vol. 18, no. 7, pp. 1527-1554, 2006. https://www.cs.toronto.edu/~hinton/absps/fastnc.pdf
7	http://ufldl.stanford.edu/tutorial/supervised/SoftmaxRegression/
8	https://nid.naver.com/login/image/captcha/nhncaptchav4.gif?key=
9	J. Canny, "A Computational Approach to Edge Detection," IEEE Trans. Pattern Anal. Mach. Intell., Vol. PAMI-8, no. 6, pp. 679-698, 1986. https://doi.org/10.1109/TPAMI.1986.4767851
10	A. Geron, "Hands-On Machine Learning with Scikit-Learn and TensorFlow," 2017.
11	T. Amaral, L. M. Silva, L. A. Alexandre, C. Kandaswamy, J. M. Santos, and J. M. De Sa, "Using different cost functions to train stacked auto-encoders," Artificial Intelligence (MICAI), 2013 12th Mexican International Conference on, pp. 114-120, 2013. https://doi.org/10.1109/MICAI.2013.20
12	E. Bursztein, J. Aigrain, A. Moscicki, and J. C. Mitchell, "The End is Nigh: Generic Solving of Text-based CAPTCHAs," Usenix Woot, 2014. https://www.usenix.org/node/185129
13	https://www.google.com/recaptcha/intro/invisible.html
14	J. Kim, S. Kim, and H. J. Kim, "Breaking character and natural image based CAPTCHA using feature classification," Journal of The Korea Institute of Information Security & Cryptology, Vol. 25, no. 5, pp. 1011-1019, 2015. http://dx.doi.org/10.13089/JKIISC.2015.25.5.1011
15	B. M. Powell, E. Kalsy, G. Goswami, M. Vatsa, R. Singh, and A. Noore, "Attack-Resistant aiCAPTCHA using a Negative Selection Artificial Immune System," Security and Privacy Workshops (SPW), IEEE, pp. 1-6, 2017. https://doi.org/10.1109/SPW.2017.22
16	K. Chellapilla, K. Larson, P. Simard, and M. Czerwinski, "Computers beat humans at single character recognition in reading based human interaction proofs (HIPs)," in Proc. of Second Conf. Email Anti-Spam, 2005. https://www.microsoft.com/en-us/research/wp-content/uploads/2005/01/CEAS2005Final.doc
17	E. Bursztein, M. Martin, and J. C. Mitchell, "Text-based CAPTCHA strengths and weaknesses," in Proc. of 18th ACM Conf. Comput. Commun. Secur., ISBN: 978-1-4503-0948-6, pp. 125-138, 2011. https://doi.org/10.1145/2046707.2046724
18	C. Cruz-Perez, O. Starostenko, F. Uceda-Ponga, V. Alarcon-Aquino, and L. Reyes-Cabrera, "Breaking reCAPTCHAs with unpredictable collapse: Heuristic character segmentation and recognition," Pattern Recognition, Vol. 7329, pp. 155-165, 2012. https://link.springer.com/chapter/10.1007/978-3-642-31149-9_16
19	K. Kim, D. Shin, K. Lee, and D. Nyang, "CAPTCHA Analysis using Convolution Filtering," Journal of The Korea Institute of Information Security & Cryptology, Vol. 24, no. 6, pp. 1129-1138, 2014. http://dx.doi.org/10.13089/JKIISC.2014.24.6.1129
20	J. Xie, L. Xu, and E. Chen, "Image Denoising and Inpainting with Deep Neural Networks," NIPS, pp. 1-9, 2012. https://papers.nips.cc/paper/4686-image-denoising-and-inpainting-with-deep-neural-networks
21	Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle, "Greedy Layer-Wise Training of Deep Networks," Adv. Neural Inf. Process. Syst., Vol. 19, no. 1, pp. 153-160, 2007.

Korean title: CAPTCHA에 사용되는 숫자데이터를 자동으로 판독하기 위한 Autoencoder 모델들의 특성 연구