http://dx.doi.org/10.3745/JIPS.03.0160

Survey on Data Deduplication in Cloud Storage Environments  

Kim, Won-Bin (Dept. of Software Convergence, Soonchunhyang University)
Lee, Im-Yeong (Dept. of Software Convergence, Soonchunhyang University)
Publication Information
Journal of Information Processing Systems / v.17, no.3, 2021, pp. 658-673
Abstract
Data deduplication improves storage efficiency when storing and managing large amounts of data. It reduces storage requirements by detecting when data that is already stored is about to be added again and omitting the redundant upload. When applied to cloud storage environments, data deduplication must preserve data confidentiality and integrity, which calls for security measures such as encryption. However, deduplication depends on identical source data remaining recognizable as identical, so common encryption techniques, which transform the data, generally cannot be applied together with it. Various studies have been conducted to solve this problem. This paper describes the basic environment for data deduplication and analyzes and compares several technologies proposed to address the security threats.
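The conflict between deduplication and encryption described above can be illustrated with a minimal sketch. It is not taken from the paper: `DedupStore`, `toy_encrypt`, and `demo` are hypothetical names, and the XOR keystream cipher is a deliberately insecure stand-in that uses only the Python standard library. The sketch assumes the store detects duplicates by comparing SHA-256 digests of whatever bytes it receives. It also shows one commonly discussed workaround, deriving the key from the content itself (often called convergent encryption), under the same assumptions.

```python
import hashlib
import os


class DedupStore:
    """Toy content-addressed store that keeps one copy per distinct blob."""

    def __init__(self):
        self.blobs = {}  # SHA-256 hex digest -> stored bytes

    def upload(self, data: bytes) -> bool:
        """Store data unless an identical blob exists. Returns True if stored."""
        tag = hashlib.sha256(data).hexdigest()
        if tag in self.blobs:
            return False  # duplicate detected, upload omitted
        self.blobs[tag] = data
        return True


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """Illustrative XOR stream cipher (NOT secure, for demonstration only).

    The keystream is SHA-256(key || counter) for counter = 0, 1, 2, ...
    Applying the function twice with the same key restores the input.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def demo():
    data = b"quarterly report" * 4

    # 1. Plaintext uploads: the second copy is recognized and skipped.
    plain_store = DedupStore()
    plain = (plain_store.upload(data), plain_store.upload(data))

    # 2. Each upload encrypted under a fresh random key: the ciphertexts
    #    differ, so the store cannot tell that the plaintexts are identical.
    random_store = DedupStore()
    rand = tuple(
        random_store.upload(toy_encrypt(os.urandom(32), data)) for _ in range(2)
    )

    # 3. Key derived from the content: identical plaintexts produce
    #    identical ciphertexts, so the duplicate is detected again.
    conv_store = DedupStore()
    content_key = hashlib.sha256(data).digest()
    conv = tuple(conv_store.upload(toy_encrypt(content_key, data)) for _ in range(2))

    return plain, rand, conv


if __name__ == "__main__":
    print(demo())
```

Each pair returned by `demo()` reports whether the first and the second upload were actually stored. Content-derived keys restore deduplication but also reveal when two users hold the same data, which is one reason such schemes need the kind of security analysis the paper surveys.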
Keywords
Data Deduplication; Cloud Storage; Encryption; Security