http://dx.doi.org/10.13089/JKIISC.2015.25.5.1011

Breaking character and natural image based CAPTCHA using feature classification  

Kim, Jaehwan (Center for Information Security Technologies (CIST), Korea University)
Kim, Suah (Center for Information Security Technologies (CIST), Korea University)
Kim, Hyoung Joong (Center for Information Security Technologies (CIST), Korea University)
Abstract
CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a test used in computing to determine whether a user is a human or a computer. Most web sites use character-based CAPTCHAs consisting of digits and letters. With recent advances in OCR technology, simple character-based CAPTCHAs are broken quite easily. As a countermeasure, many web sites add noise to make recognition harder. In this paper, we analyze the most recent CAPTCHA scheme, which adds natural images behind the text to obfuscate the characters. We propose an efficient method that uses a support vector machine (SVM) to separate the characters from the background image and a convolutional neural network (CNN) to recognize each character. In our experiment, 368 out of 1,000 CAPTCHAs (36.8%) were correctly identified, which demonstrates that the current CAPTCHA is not safe.
Keywords
CAPTCHA; Breaking CAPTCHA; SVM; CNN; HSV color space
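
The abstract names an SVM for separating character pixels from the natural-image background, and the keywords mention the HSV color space. The following is a minimal sketch of that separation step only, not the authors' implementation. It assumes per-pixel HSV values as features, a linear SVM trained with Pegasos-style stochastic sub-gradient descent, and synthetic colors in which characters are darker than the background. All function names and the data generator are hypothetical, and the CNN recognition stage is not shown.

```python
import colorsys
import random


def hsv_features(rgb):
    """Map an (r, g, b) pixel with 0-255 channels to [h, s, v, 1.0].

    The trailing 1.0 is a bias input, so the SVM needs no separate offset.
    """
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return [h, s, v, 1.0]


def train_linear_svm(samples, labels, lam=0.01, epochs=100, seed=0):
    """Fit a linear SVM (hinge loss + L2 penalty) with Pegasos-style SGD."""
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    order = list(range(len(samples)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            x, y = samples[i], labels[i]
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            # Shrink the weights (gradient of the L2 penalty).
            w = [(1.0 - eta * lam) * wj for wj in w]
            if margin < 1.0:
                # Sample violates the margin: step along the hinge sub-gradient.
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
    return w


def predict(w, x):
    """Return +1 (character pixel) or -1 (background pixel)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0.0 else -1


def synthetic_pixels(n, seed):
    """Assumed toy data: dark character pixels (+1), brighter background (-1)."""
    rng = random.Random(seed)
    pixels, labels = [], []
    for _ in range(n):
        pixels.append(tuple(rng.randint(0, 50) for _ in range(3)))
        labels.append(1)
        pixels.append(tuple(rng.randint(128, 255) for _ in range(3)))
        labels.append(-1)
    return pixels, labels


train_px, train_y = synthetic_pixels(100, seed=1)
test_px, test_y = synthetic_pixels(50, seed=2)

w = train_linear_svm([hsv_features(p) for p in train_px], train_y)
predictions = [predict(w, hsv_features(p)) for p in test_px]
accuracy = sum(p == y for p, y in zip(predictions, test_y)) / len(test_y)
print(f"held-out pixel accuracy: {accuracy:.2f}")
```

In a full pipeline along the lines the abstract describes, the pixels labeled +1 would form a character mask that is cut into individual glyphs and passed to the CNN. Real CAPTCHA backgrounds are far less cleanly separable than this toy data, so richer features or a nonlinear kernel may be needed in practice.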