http://dx.doi.org/10.13089/JKIISC.2020.30.4.657

Machine Learning Based Automated Source, Sink Categorization for Hybrid Approach of Privacy Leak Detection  

Shim, Hyunseok (Soongsil University)
Jung, Souhwan (Soongsil University)
Abstract
The Android framework lets an app take full advantage of personal information once a single permission has been granted, and it does not determine whether the data being leaked is actual personal information. To address these problems, we propose a tool that combines static and dynamic analysis. The tool analyzes the Sources and Sinks used by the target app and tells users which personal information the app uses. To achieve this, we extract Sources and Sinks through the Control Flow Graph and determine that the app leaks the user's private information when a Source-to-Sink flow exists. We also use the sensitive permission information provided by Google to obtain information from the sensitive APIs corresponding to each Source and Sink. Finally, our dynamic analysis tool runs the app and hooks the data passing through each sensitive API. From the hooked data, we determine whether the user's personal information is leaked through the app and report the result to the user. In this process, an automated Source/Sink classification model is applied to collect up-to-date Source/Sink information; with it, we categorized the latest release of Android (9.0) with 88.5% accuracy. We evaluated our tool on 2,802 APKs and found 850 APKs that leak personal information.
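The Source-to-Sink check described in the abstract can be illustrated with a minimal sketch. It is not the authors' tool: it assumes the control flow has already been reduced to a plain adjacency map, it approximates a "flow" as simple graph reachability, and the graph contents and the helper name `find_leaks` are hypothetical.

```python
from collections import deque


def find_leaks(flow_graph, sources, sinks):
    """Return (source, sink) pairs where the sink is reachable from the source.

    flow_graph maps a node name to the list of nodes it passes control to.
    Reachability is a simplification of real taint analysis.
    """
    leaks = []
    for src in sorted(sources):
        # Breadth-first search from this source.
        seen = {src}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for nxt in flow_graph.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        leaks.extend((src, snk) for snk in sorted(sinks)
                     if snk in seen and snk != src)
    return leaks


# Hypothetical toy graph: node names are illustrative only.
graph = {
    "getDeviceId": ["buildPayload"],
    "buildPayload": ["sendTextMessage"],
    "getLastKnownLocation": ["formatForDisplay"],
}
SOURCES = {"getDeviceId", "getLastKnownLocation"}
SINKS = {"sendTextMessage"}

leaks = find_leaks(graph, SOURCES, SINKS)
print(leaks)
```

In this toy graph only the device identifier reaches a Sink, so only that pair is reported; the location value is read but never passed to a Sink.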
Keywords
Android application; Flow analysis; Malware detection; Privacy leak;