Reliability Analysis of Privacy Policies Using Android Static Analysis

  • 정윤교 (Department of Computer Science, Korea Air Force Academy)
  • Received : 2022.08.01
  • Accepted : 2022.10.11
  • Published : 2023.01.31

Abstract

Mobile apps frequently request permission to access sensitive data for user convenience. However, sensitive personal data has often been leaked during app use without the user's consent. To address this problem, the Google Play Store requires developers to disclose in a privacy policy how their app handles user data. Users, however, cannot be certain that the privacy policy describes all of the app's behavior, yet they have no choice but to rely on it to learn how the app uses their data. This study designs a system that checks the reliability of privacy policies by analyzing both the policy texts and the mobile apps. First, the system extracts and analyzes the privacy policy text to determine which personal data the policy discloses that the app may collect. Next, it uses Android static analysis to determine which data the app can access, and the two results are compared to assess the reliability of the privacy policy. For the experiment, we collected the APK files and metadata of about 13K Android apps registered in the Google Play Store and preprocessed them according to four conditions. The comparison between privacy policies and app behavior shows that many apps can access more personal data than their privacy policies disclose.

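Mobile apps frequently request permission to access personal data for the user's convenience. As a result, however, personal data has often been leaked without authorization while mobile apps are in use. To address this problem, apps registered in the Google Play Store are required to state in a privacy policy how the app uses the user's personal data. However, it is difficult to verify whether the collection and processing of personal data performed by an app are accurately disclosed in its privacy policy, and users must rely on the privacy policy to learn which personal data the app can access. This study presents a system that checks the reliability of privacy policies by analyzing both privacy policies and mobile apps. First, the text of the privacy policy is extracted and analyzed to determine which personal data the policy discloses that the app may use. Then, Android static analysis is used to identify the categories of personal data the app can access, and the two results are compared to analyze whether the privacy policy can be trusted. For the experiment, we collected the package files and metadata of about 13,000 Android apps registered in the Google Play Store and preprocessed them according to four conditions to select the apps that could be analyzed. We then performed text analysis and mobile app analysis on the selected apps and, by comparing the results, demonstrate that mobile apps can access more personal data than their privacy policies disclose.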
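
The comparison step described in the abstract can be illustrated with a minimal Python sketch. It is not the system from the paper. It assumes that the data an app can access is approximated by its declared Android permissions and that policy disclosure is detected by simple keyword matching. The mapping tables, category names, and function names below are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical mapping from Android permissions to personal-data categories
# (an assumption for illustration; the paper's actual categories may differ).
PERMISSION_CATEGORIES = {
    "android.permission.ACCESS_FINE_LOCATION": "location",
    "android.permission.ACCESS_COARSE_LOCATION": "location",
    "android.permission.READ_CONTACTS": "contacts",
    "android.permission.CAMERA": "camera",
    "android.permission.RECORD_AUDIO": "microphone",
}

# Hypothetical keyword patterns suggesting that a policy discloses a category.
POLICY_KEYWORDS = {
    "location": r"\b(location|gps|geolocation)\b",
    "contacts": r"\b(contacts?|address book)\b",
    "camera": r"\b(camera|photos?)\b",
    "microphone": r"\b(microphone|audio recordings?)\b",
}


def disclosed_categories(policy_text):
    """Return the data categories whose keywords appear in the policy text."""
    text = policy_text.lower()
    return {cat for cat, pattern in POLICY_KEYWORDS.items() if re.search(pattern, text)}


def accessible_categories(permissions):
    """Return the data categories reachable through the given permissions."""
    return {PERMISSION_CATEGORIES[p] for p in permissions if p in PERMISSION_CATEGORIES}


def undisclosed_categories(policy_text, permissions):
    """Categories the app can access but the policy does not mention."""
    return accessible_categories(permissions) - disclosed_categories(policy_text)


policy = "We collect your location to show nearby stores."
permissions = [
    "android.permission.ACCESS_FINE_LOCATION",
    "android.permission.READ_CONTACTS",
    "android.permission.INTERNET",
]
print(sorted(undisclosed_categories(policy, permissions)))  # ['contacts']
```

In this example the policy mentions location but not contacts, so the set difference flags `contacts` as accessible but undisclosed. In practice the permission list would presumably come from a static-analysis tool such as Androguard rather than a hand-written list, and keyword matching is a crude stand-in for real text analysis: a phrase like "contact us" would produce a false disclosure match.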
