An Analysis of Trends in Privacy-Preserving Classification Methods

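  • 김평 (ITM Major, Seoul National University of Science and Technology) ;
  • 문수빈 (Department of SW Analysis and Design, Seoul National University of Science and Technology) ;
  • 조은지 (Department of SW Analysis and Design, Seoul National University of Science and Technology) ;
  • 이윤호 (ITM Major, Seoul National University of Science and Technology)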
  • Published : 2017.06.30

Abstract

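Classification algorithms in machine learning are used in a wide range of application services, including medical diagnosis, genetic information analysis, spam detection, face recognition, and credit scoring. In these services, the classification algorithm is often trained on data that contains sensitive user information, and the classification results themselves are often tied to user privacy. Privacy problems can therefore arise when the owner of the training data, the user of the application service, and the service provider belong to different security domains. This paper introduces privacy-preserving classification protocols (PPCPs), which make it possible to provide classification services while addressing these problems. Specifically, we analyze the privacy requirements of a PPCP and introduce the cryptographic primitives that existing work uses to protect privacy. Finally, we present and analyze existing studies on privacy-preserving classification protocols designed using these cryptographic primitives.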
