Android malicious code Classification using Deep Belief Network

  • Shiqi, Luo (School of Software, Xinjiang University) ;
  • Shengwei, Tian (School of Software, Xinjiang University) ;
  • Long, Yu (Network Center, Xinjiang University) ;
  • Jiong, Yu (School of Software, Xinjiang University) ;
  • Hua, Sun (School of Software, Xinjiang University)
  • Received : 2017.05.06
  • Accepted : 2017.09.15
  • Published : 2018.01.31

Abstract

This paper presents a novel model for classifying Android malicious code, evaluated on the Drebin dataset. The number of malicious mobile applications targeting Android-based smartphones has increased rapidly. In this paper, a Restricted Boltzmann Machine and a Deep Belief Network are used to classify Android malware into families. A texture-fingerprint-based approach is proposed to extract the features of malware content: each malware sample has a unique "image texture" in the spatial relations of its features. The method uses texture-image information extracted from malicious or benign code, which is mapped to an uncompressed gray-scale image. By extracting the implicit features of API calls from a large number of training samples, we obtain the original dynamic activity feature sets. To improve classification accuracy, the method combines the implicit features of the texture image and the API calls in malicious code to train the model with Restricted Boltzmann Machines and Back Propagation. In an evaluation with different malware and benign samples, the experimental results suggest that the method is usable: using a Deep Belief Network to classify Android malware by texture images and API calls, it detects more than 94% of the malware with few false alarms, which is clearly higher than shallow machine learning algorithms.
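The abstract does not spell out how code is mapped to a gray-scale texture image or how the two feature types are joined. The following is a minimal sketch of one common way to do it, not the authors' implementation: each byte becomes one pixel intensity (0-255), rows have a fixed width, and the flattened pixels are concatenated with a binary API-call vector. The function names, the row width, zero padding of the last row, and the binary encoding of API calls are all assumptions made for illustration.

```python
from typing import List, Sequence


def bytes_to_gray_rows(data: bytes, width: int = 256) -> List[List[int]]:
    """Map raw bytes to rows of gray-scale pixels, one byte per pixel (0-255).

    Assumption: the last row is zero-padded to the full width.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    padding = (-len(data)) % width  # bytes needed to complete the last row
    padded = data + bytes(padding)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def combine_features(rows: Sequence[Sequence[int]],
                     api_flags: Sequence[int]) -> List[float]:
    """Concatenate pixels scaled to [0, 1] with a binary API-call vector.

    Assumption: api_flags[i] is 1 if the i-th tracked API call is present.
    """
    texture = [pixel / 255.0 for row in rows for pixel in row]
    return texture + [float(flag) for flag in api_flags]


# Hypothetical input: ten bytes of code and three tracked API calls.
rows = bytes_to_gray_rows(bytes(range(10)), width=4)
features = combine_features(rows, [1, 0, 1])
```

A vector built this way could serve as the visible-layer input of a Restricted Boltzmann Machine. In practice the image would usually be resized to a fixed shape first so that every sample yields a vector of the same length.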
