A Study on the Characteristics of a series of Autoencoder for Recognizing Numbers used in CAPTCHA

Jeon, Jae-seung; Moon, Jong-sub

doi:10.7472/jksii.2017.18.6.25

Journal of Internet Computing and Services

Volume 18, Issue 6 / Pages 25-34 / 2017 / 1598-0170 (pISSN) / 2287-1136 (eISSN)

Korean Society for Internet Information

Jeon, Jae-seung (Graduate School of Information Security, Korea University);
Moon, Jong-sub (Graduate School of Information Security, Korea University)

Received: 2017.09.01
Reviewed: 2017.10.02
Published: 2017.12.31

Abstract

An autoencoder is a type of deep learning model whose input and output layers are the same; by imposing constraints on the hidden layer, it effectively extracts and reconstructs the features of an input vector. In this paper, we apply a series of autoencoder models to regions of CAPTCHA images in which a single digit is mixed with a natural background, and present methods that remove the natural background, which acts as noise, and reconstruct only the digit image. The suitability of the reconstructed images is verified by feeding the autoencoder output into a layer that uses the softmax function as its activation function. We also compare the proposed methods with other methods for automatically obtaining CAPTCHA information and show that the proposed methods perform better.
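The idea summarized in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation; it is a minimal, pure-Python denoising autoencoder under several stated assumptions: the 3x3 "digit" templates, the `add_background` clutter model, the layer sizes, and the learning rate are all hypothetical choices made for illustration. The network receives a digit with random background clutter as input and is trained to output the clean digit. For the verification step, the sketch applies softmax to similarity scores against fixed templates, which is a simplified stand-in for a trained softmax layer.

```python
import math
import random

# Hypothetical 3x3 stand-ins for digit images (1 = stroke, 0 = background).
CLEAN = [
    [1, 1, 1, 1, 0, 1, 1, 1, 1],  # ring shape, stand-in for "0"
    [0, 1, 0, 0, 1, 0, 0, 1, 0],  # vertical bar, stand-in for "1"
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def add_background(clean, rng):
    """Assumed noise model: background pixels receive random clutter."""
    return [1.0 if p == 1 else rng.uniform(0.0, 0.6) for p in clean]

class TinyDenoisingAutoencoder:
    """One sigmoid hidden layer, sigmoid outputs, squared-error loss."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                   for _ in range(n_in)]
        self.b2 = [0.0] * n_in

    def encode(self, x):
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def decode(self, h):
        return [sigmoid(sum(w * hi for w, hi in zip(row, h)) + b)
                for row, b in zip(self.w2, self.b2)]

    def reconstruct(self, x):
        return self.decode(self.encode(x))

    def train_step(self, noisy, clean, lr):
        """One gradient step: the input is noisy, the target is clean."""
        h = self.encode(noisy)
        y = self.decode(h)
        # Output deltas for squared error through sigmoid outputs.
        d_out = [(yi - ti) * yi * (1.0 - yi) for yi, ti in zip(y, clean)]
        # Hidden deltas, back-propagated through w2 before it is updated.
        d_hid = [hj * (1.0 - hj)
                 * sum(d_out[i] * self.w2[i][j] for i in range(len(y)))
                 for j, hj in enumerate(h)]
        for i, di in enumerate(d_out):
            for j, hj in enumerate(h):
                self.w2[i][j] -= lr * di * hj
            self.b2[i] -= lr * di
        for j, dj in enumerate(d_hid):
            for k, xk in enumerate(noisy):
                self.w1[j][k] -= lr * dj * xk
            self.b1[j] -= lr * dj

def mean_loss(model, pairs):
    """Average squared-error loss of reconstructions against clean targets."""
    total = 0.0
    for noisy, clean in pairs:
        recon = model.reconstruct(noisy)
        total += 0.5 * sum((r - t) ** 2 for r, t in zip(recon, clean))
    return total / len(pairs)

def classify(model, noisy):
    """Softmax over negative squared distances to each clean template."""
    recon = model.reconstruct(noisy)
    scores = [-sum((r - t) ** 2 for r, t in zip(recon, template))
              for template in CLEAN]
    return softmax(scores)

rng = random.Random(1)
model = TinyDenoisingAutoencoder(n_in=9, n_hidden=4)
held_out = [(add_background(c, rng), c) for c in CLEAN for _ in range(5)]
loss_before = mean_loss(model, held_out)
for _ in range(2000):
    clean = rng.choice(CLEAN)
    model.train_step(add_background(clean, rng), clean, lr=0.5)
loss_after = mean_loss(model, held_out)
print("mean loss before/after training:", loss_before, loss_after)
print("class probabilities:", classify(model, add_background(CLEAN[0], rng)))
```

The key design point is that the training target is the clean digit rather than the noisy input, so the hidden layer is pushed to keep stroke information and discard background clutter. A real experiment would use actual CAPTCHA crops and a trained softmax classifier in place of the template comparison.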

