http://dx.doi.org/10.3837/tiis.2021.04.014

Image Deduplication Based on Hashing and Clustering in Cloud Storage  

Chen, Lu (Engineering Research Center of Post Big Data Technology and Application of Jiangsu Province, Nanjing University of Posts and Telecommunications)
Xiang, Feng (National Engineering Laboratory for Logistics Information Technology)
Sun, Zhixin (Engineering Research Center of Post Big Data Technology and Application of Jiangsu Province, Nanjing University of Posts and Telecommunications)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS) / v.15, no.4, 2021, pp. 1448-1463
Abstract
As cloud storage has grown, it has accumulated a large amount of redundant data, especially multimedia data such as images and videos. Data deduplication is a data reduction technique that significantly reduces storage requirements and improves bandwidth efficiency. To protect their data, users typically encrypt it before uploading. However, encryption conflicts with deduplication: identical data encrypted under different users' keys yields different ciphertexts, so the server cannot tell that the uploads are duplicates. Moreover, existing deduplication methods for regular files cannot be applied to images, because duplicate images must be detected by their visual content rather than by exact byte equality. In this paper, we propose a secure image deduplication scheme based on hashing and clustering, which incorporates a novel perceptual hash algorithm based on the Local Binary Pattern (LBP). In this scheme, the hash value of an image serves as its fingerprint for deduplication, and the image itself is transmitted in encrypted form. Images are clustered to reduce the time complexity of deduplication. The proposed scheme ensures the security of images and improves deduplication accuracy. A comparison with other image deduplication schemes shows that our scheme performs somewhat better.
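The idea of using an LBP-based perceptual hash as a fingerprint, and clustering fingerprints so that a new image is compared with only part of the stored set, can be illustrated with a minimal Python sketch. This is not the algorithm from the paper, whose details are not given in the abstract. The histogram-thresholding hash, the simple leader clustering, the `Deduplicator` class, and both thresholds are assumptions chosen for illustration only, and encryption of the image itself is left out.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale pixels, row-major, values 0-255

# Offsets of the 8 neighbours, clockwise from the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_codes(img: Image) -> List[int]:
    """Return the 8-bit Local Binary Pattern code of every interior pixel."""
    codes = []
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            center = img[r][c]
            code = 0
            for dr, dc in NEIGHBOURS:
                # A neighbour contributes a 1 bit when it is >= the centre.
                code = (code << 1) | int(img[r + dr][c + dc] >= center)
            codes.append(code)
    return codes


def lbp_hash(img: Image) -> int:
    """Illustrative 256-bit fingerprint (an assumption, not the paper's hash):
    bit i is set when LBP code i occurs more often than the average code."""
    hist = [0] * 256
    for code in lbp_codes(img):
        hist[code] += 1
    mean = sum(hist) / 256
    bits = 0
    for count in hist:
        bits = (bits << 1) | int(count > mean)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


class Deduplicator:
    """Hypothetical fingerprint index using simple leader clustering."""

    def __init__(self, dup_threshold: int = 8, cluster_radius: int = 64):
        self.dup_threshold = dup_threshold    # max distance to count as duplicate
        self.cluster_radius = cluster_radius  # max distance to a cluster leader
        self.clusters: List[Tuple[int, List[int]]] = []

    def add(self, fingerprint: int) -> bool:
        """Return True if stored as new, False if a near-duplicate was found.

        Only the first cluster whose leader is within the radius is searched,
        so the check is approximate: a duplicate in another cluster is missed.
        """
        for leader, members in self.clusters:
            if hamming(fingerprint, leader) <= self.cluster_radius:
                if any(hamming(fingerprint, m) <= self.dup_threshold for m in members):
                    return False
                members.append(fingerprint)
                return True
        self.clusters.append((fingerprint, [fingerprint]))
        return True
```

In use, a client would compute `lbp_hash(img)` and pass the result to `Deduplicator.add`, uploading the encrypted image only when `add` returns `True`. Because LBP codes depend only on whether each neighbour is at least as bright as the centre pixel, a uniform brightness shift that does not clip leaves the fingerprint unchanged, which is the kind of robustness a cryptographic file hash lacks.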
Keywords
Cloud Storage; Clustering; Image Deduplication; Perceptual Hash; Feature Extraction;