• Title/Summary/Keyword: Data Deduplication Technology (데이터 중복제거 기술)

Privacy Preserving Source Based Deduplication Method (프라이버시 보존형 소스기반 중복제거 방법)

  • Nam, Seung-Soo; Seo, Chang-Ho
    • Journal of Digital Convergence / v.14 no.2 / pp.175-181 / 2016
  • Cloud storage servers cannot detect duplicates among conventionally encrypted data. To solve this problem, convergent encryption has been proposed. Recently, various client-side deduplication technologies have also been proposed, but these proposals still do not solve the security problem. In this paper, we propose a secure source-based deduplication technique that encrypts data to ensure the confidentiality of sensitive data and applies a proofs-of-ownership protocol to control access to the data, protecting it from a curious cloud server and malicious users.
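
For readers unfamiliar with convergent encryption, the sketch below illustrates the core idea this line of work builds on: the key is derived from the content itself, so identical plaintexts always encrypt to identical ciphertexts that a server can deduplicate without seeing the key. The SHA-256 counter-mode keystream is a toy stand-in for a real cipher, and all names are illustrative, not from the paper.

```python
import hashlib

def _keystream(key: bytes, length: int) -> bytes:
    # Toy deterministic keystream: SHA-256 in counter mode.
    # Illustrative only, NOT a cryptographically vetted cipher.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def convergent_encrypt(plaintext: bytes):
    key = hashlib.sha256(plaintext).digest()       # key = H(plaintext)
    ct = bytes(p ^ k for p, k in zip(plaintext, _keystream(key, len(plaintext))))
    tag = hashlib.sha256(ct).hexdigest()           # index the server dedups on
    return key, ct, tag

def convergent_decrypt(key: bytes, ct: bytes) -> bytes:
    return bytes(c ^ k for c, k in zip(ct, _keystream(key, len(ct))))

# Identical plaintexts yield identical ciphertexts, so the server can detect
# duplicates without ever learning the key or the plaintext.
k1, c1, t1 = convergent_encrypt(b"same file contents")
k2, c2, t2 = convergent_encrypt(b"same file contents")
assert c1 == c2 and t1 == t2
assert convergent_decrypt(k1, c1) == b"same file contents"
```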

Privacy Preserving Source Based Deduplication In Cloud Storage (클라우드 스토리지 상에서의 프라이버시 보존형 소스기반 중복데이터 제거기술)

  • Park, Cheolhee; Hong, Dowon; Seo, Changho; Chang, Ku-Young
    • Journal of the Korea Institute of Information Security & Cryptology / v.25 no.1 / pp.123-132 / 2015
  • In cloud storage, processing duplicated data, namely deduplication, is a necessary technology for saving storage space. Users who store sensitive data in remote storage want their data to be encrypted, but cloud storage servers cannot detect duplicates among conventionally encrypted data. To solve this problem, convergent encryption has been proposed, but it is inherently weak against brute-force attacks. Meanwhile, to save bandwidth as well as storage space, client-side deduplication has been applied, and various client-side deduplication technologies have recently been proposed. However, these proposals still do not solve the security problem. In this paper, we propose a secure source-based deduplication technique that encrypts data to ensure the confidentiality of sensitive data and applies a proofs-of-ownership protocol to control access to the data, protecting it from a curious cloud server and malicious users.
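
The brute-force weakness mentioned above can be made concrete with a small sketch: because a convergent ciphertext (and hence its dedup tag) is a deterministic function of the plaintext alone, anyone who can enumerate likely plaintexts can confirm a guess. The tag construction and names here are illustrative assumptions, not the paper's exact scheme.

```python
import hashlib

def convergent_tag(plaintext: bytes) -> bytes:
    # In convergent encryption the key is H(plaintext), so the ciphertext and
    # its dedup tag depend on the plaintext alone; this hash is a stand-in.
    key = hashlib.sha256(plaintext).digest()
    return hashlib.sha256(key + plaintext).digest()

def dictionary_attack(observed_tag: bytes, candidates):
    for guess in candidates:
        if convergent_tag(guess) == observed_tag:   # same guess -> same tag
            return guess
    return None

# A low-entropy secret (e.g., a form letter with one varying field)
# is recoverable by exhaustive guessing.
CANDIDATES = [b"salary: 1000", b"salary: 2000", b"salary: 3000"]
observed = convergent_tag(b"salary: 2000")
assert dictionary_attack(observed, CANDIDATES) == b"salary: 2000"
```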

Study of Efficient Algorithm for Deduplication of Complex Structure (복잡한 구조의 데이터 중복제거를 위한 효율적인 알고리즘 연구)

  • Lee, Hyeopgeon; Kim, Young-Woon; Kim, Ki-Young
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.14 no.1 / pp.29-36 / 2021
  • The amount of data generated has been growing exponentially, and the complexity of data has been increasing owing to the advancement of information technology (IT). Big data analysts and engineers have therefore been actively researching ways to minimize analysis targets for faster processing and analysis of big data. Hadoop, which is widely used as a big data platform, provides various processing and analysis functions, including the minimization of analysis targets through Hive, a subproject of Hadoop. However, Hive uses a vast amount of memory for data deduplication because it is implemented without considering the complexity of the data. This paper therefore proposes an efficient algorithm for deduplicating data with complex structures. The performance evaluation results demonstrate that the proposed algorithm reduces memory usage and deduplication time by approximately 79% and 0.677%, respectively, compared to Hive. In the future, performance evaluation with a large number of data nodes will be required for realistic verification of the proposed algorithm.
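
One way to keep deduplication memory low for records of complex, nested structure, in the spirit of (though not identical to) the proposed algorithm, is to reduce each record to a canonical serialization and keep only fixed-size digests in the seen-set, as in this sketch:

```python
import hashlib
import json

def record_fingerprint(record) -> bytes:
    # Canonical JSON (sorted keys, no whitespace) makes structurally equal
    # records serialize identically, so their digests match.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).digest()

def dedup(records):
    seen = set()                       # holds 16-byte digests, not whole records
    for rec in records:
        fp = record_fingerprint(rec)
        if fp not in seen:
            seen.add(fp)
            yield rec

rows = [
    {"id": 1, "tags": ["a", "b"], "meta": {"src": "web"}},
    {"meta": {"src": "web"}, "id": 1, "tags": ["a", "b"]},  # same content, different key order
    {"id": 2, "tags": [], "meta": {"src": "app"}},
]
assert len(list(dedup(rows))) == 2
```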

Chunk Placement Scheme on Distributed File System Using Deduplication File System (중복제거 파일 시스템을 적용한 분산 파일 시스템에서의 청크 배치 기법)

  • Kim, Keonwoo; Kim, Jeehong; Eom, Young Ik
    • Proceedings of the Korea Information Processing Society Conference / 2013.05a / pp.68-70 / 2013
  • Distributed file system technology is used in cloud storage systems to store and manage large amounts of data effectively. However, as data grows, storage expansion costs increase even with a distributed file system. In this paper, to reduce the storage expansion cost of distributed file systems, we propose a chunk placement scheme for a distributed file system combined with a deduplication file system. By applying lessfs, a deduplication file system, to MooseFS, an open-source distributed file system, the available storage space can be increased, which in turn reduces storage expansion costs. In addition, identical chunks are placed on the same chunk server, which increases deduplication opportunities. We evaluate the deduplication amount and the performance of the proposed system through experiments.
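
A minimal sketch of the placement idea, with assumed details: deriving the target chunk server from a hash of the chunk's content guarantees that identical chunks always land on the same server, where a local deduplication file system such as lessfs can collapse them into one physical copy. The server names here are hypothetical.

```python
import hashlib

CHUNK_SERVERS = ["cs-0", "cs-1", "cs-2", "cs-3"]   # hypothetical server names

def place_chunk(chunk: bytes) -> str:
    # Content-addressed placement: the target server is a pure function of
    # the chunk bytes, so duplicates are co-located and can be deduplicated.
    digest = hashlib.sha1(chunk).digest()
    index = int.from_bytes(digest[:8], "big") % len(CHUNK_SERVERS)
    return CHUNK_SERVERS[index]

# Identical chunks are always routed to the same chunk server.
assert place_chunk(b"block A") == place_chunk(b"block A")
```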

Design and Implementation of SANique Smart Vault Backup System for Massive Data Services (대용량 데이터 서비스를 위한 SANique Smart Vault 백업 시스템의 설계 및 구현)

  • Lee, Kyu Woong
    • The Journal of Korean Association of Computer Education / v.17 no.2 / pp.97-106 / 2014
  • There is growing interest in data storage and backup systems as data-intensive services and the associated user data increase. Backup performance overhead in massive storage systems is a critical issue because traditional incremental backup strategies cause a time-consuming bottleneck in the SAN environment. The SANique Smart Vault system is a high-performance backup solution with data deduplication technology that meets these requirements. In this paper, we describe the architecture of the SANique Smart Vault system and illustrate an efficient delta incremental backup method based on journaling files. We also present the record-level data deduplication method of the proposed backup system. The proposed forever-incremental backup and data deduplication algorithms are analyzed and compared against other commercial backup solutions through performance evaluation.
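
The following sketch illustrates, under assumed details rather than SANique's actual implementation, how a journal-driven forever-incremental backup with record-level deduplication can fit together: only paths listed in a change journal since the last run are read, and each record is stored once, keyed by its fingerprint.

```python
import hashlib

repository = {}   # fingerprint -> record bytes: each unique record stored once
manifest = []     # this backup run = ordered list of record fingerprints

def read_records(path):
    # Stand-in for reading fixed-size records out of a changed file.
    yield f"{path}:record-0".encode()

def backup(journal_paths):
    # Only paths the change journal marked since the last backup are read,
    # so a full tree scan is never needed ("forever incremental").
    for path in journal_paths:
        for record in read_records(path):
            fp = hashlib.sha256(record).hexdigest()
            if fp not in repository:          # record-level deduplication
                repository[fp] = record
            manifest.append(fp)

backup(["/data/a.log", "/data/b.log"])
backup(["/data/a.log"])                       # unchanged record costs no new space
assert len(repository) == 2
```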

Secure and Efficient Client-side Deduplication for Cloud Storage (안전하고 효율적인 클라이언트 사이드 중복 제거 기술)

  • Park, Kyungsu; Eom, Ji Eun; Park, Jeongsu; Lee, Dong Hoon
    • Journal of the Korea Institute of Information Security & Cryptology / v.25 no.1 / pp.83-94 / 2015
  • Deduplication, a technique that eliminates redundant data by storing only a single copy of each piece of data, lets clients and the cloud server manage stored data efficiently. However, since the data is stored on an untrusted public cloud server, both invasion of data privacy and data loss can occur. In recent years many secure deduplication schemes have been proposed, but security problems that can cause serious damage, as well as inefficiency, still remain. In this paper, we propose a secure and efficient client-side deduplication scheme with a key server, based on Bellare et al.'s scheme and a challenge-response method. Furthermore, we point out potential risks of client-side deduplication and show that our scheme is secure against various attacks and uploads large data efficiently.
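
A minimal sketch of a challenge-response proof of ownership, as a simplified stand-in for the paper's protocol: before granting access to an already stored file, the server challenges the uploader with a fresh nonce and random block positions, which only a party holding the whole file can answer. Block size, challenge size, and all names are assumptions.

```python
import hashlib
import os
import secrets

BLOCK = 4096  # assumed block size

def blocks(data: bytes):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def make_challenge(n_blocks: int, k: int = 3):
    # Fresh nonce plus k random block positions; a replayed answer is useless.
    return secrets.token_bytes(16), [secrets.randbelow(n_blocks) for _ in range(k)]

def respond(data: bytes, nonce: bytes, positions):
    bs = blocks(data)
    h = hashlib.sha256(nonce)
    for p in positions:
        h.update(bs[p])        # requires the actual block contents
    return h.digest()

server_copy = os.urandom(5 * BLOCK)            # file already stored at the server
nonce, positions = make_challenge(len(blocks(server_copy)))
expected = respond(server_copy, nonce, positions)
# A client that really owns the file produces the matching response;
# knowing only the file's hash is not enough.
assert respond(server_copy, nonce, positions) == expected
```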

Design and Implementation of Multiple Filter Distributed Deduplication System Applying Cuckoo Filter Similarity (쿠쿠 필터 유사도를 적용한 다중 필터 분산 중복 제거 시스템 설계 및 구현)

  • Kim, Yeong-A; Kim, Gea-Hee; Kim, Hyun-Ju; Kim, Chang-Geun
    • Journal of Convergence for Information Technology / v.10 no.10 / pp.1-8 / 2020
  • As data generated from the business activities of enterprises has become key to business success in recent years, the need for techniques to store, manage, and retrieve such data has emerged. Existing big data platform systems must ingest large amounts of unstructured data generated in real time without delay, and must manage storage space efficiently by deduplicating across different storages when redundant data occurs. In this paper, considering the characteristics of big data, we propose a multi-layer distributed data deduplication system that uses the similarity of the Cuckoo filter technique. Similarity between virtual machines is captured with Cuckoo hashing, individual storage nodes improve performance through deduplication efficiency, and a multi-layer Cuckoo filter is applied to reduce processing time. Experimental results show that the proposed method shortens processing time by 8.9% and increases the deduplication rate by 10.3%.
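
For reference, the sketch below shows a plain single-layer cuckoo filter, the building block of the proposed multi-layer design: each item keeps only a short fingerprint in one of two candidate buckets (partial-key cuckoo hashing), so the membership checks used to detect duplicates are fast and memory-cheap. Parameters and hash choices are illustrative.

```python
import hashlib
import random

class CuckooFilter:
    # n_buckets must be a power of two so the XOR bucket relation below
    # stays in range and is its own inverse.
    def __init__(self, n_buckets=1024, bucket_size=4, max_kicks=500):
        self.n = n_buckets
        self.size = bucket_size
        self.kicks = max_kicks
        self.buckets = [[] for _ in range(n_buckets)]

    def _fingerprint(self, item: bytes) -> bytes:
        return hashlib.sha1(item).digest()[:2]          # 16-bit fingerprint

    def _bucket(self, data: bytes) -> int:
        return int.from_bytes(hashlib.md5(data).digest()[:4], "big") % self.n

    def _indices(self, item: bytes):
        fp = self._fingerprint(item)
        i1 = self._bucket(item)
        i2 = i1 ^ self._bucket(fp)                      # partial-key cuckoo hashing
        return fp, i1, i2

    def insert(self, item: bytes) -> bool:
        fp, i1, i2 = self._indices(item)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.size:
                self.buckets[i].append(fp)
                return True
        i = random.choice((i1, i2))                     # both full: start evicting
        for _ in range(self.kicks):
            j = random.randrange(len(self.buckets[i]))
            fp, self.buckets[i][j] = self.buckets[i][j], fp
            i = i ^ self._bucket(fp)                    # evicted fp's other bucket
            if len(self.buckets[i]) < self.size:
                self.buckets[i].append(fp)
                return True
        return False                                    # filter considered full

    def contains(self, item: bytes) -> bool:
        # May rarely false-positive, never false-negative.
        fp, i1, i2 = self._indices(item)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

f = CuckooFilter()
for chunk in (b"chunk-1", b"chunk-2", b"chunk-1"):
    if not f.contains(chunk):          # unseen chunk: remember and store it
        f.insert(chunk)
assert f.contains(b"chunk-1") and f.contains(b"chunk-2")
```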

Eliminating Redundant Data for Storage Efficiency on Distributed File Systems (저장 공간의 효율성을 위한 분산 파일 시스템의 중복 데이터 제거 기법)

  • Kim, Jung Hoon; Lim, ByoungHong; Eom, Young Ik
    • Proceedings of the Korea Information Processing Society Conference / 2009.11a / pp.111-112 / 2009
  • In cloud computing, one of the recent key topics in IT, the choice of distributed file system is very important for managing large amounts of data. HDFS, an open-source distributed file system, has recently been widely used owing to its efficient data storage and retrieval. HDFS guarantees reliability by storing data in three replicas, but this replication degrades storage efficiency. In this paper, we therefore propose a redundant-data elimination scheme based on MD5 hashing. Through simulation, we confirmed that the scheme improves storage space efficiency.
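
A minimal sketch of the MD5-based idea, with assumed details: each block is stored once under its MD5 fingerprint, and duplicates only increase a reference count, so redundant copies consume no extra space. The paper applies this inside HDFS's replica handling; the store below is a plain in-memory stand-in.

```python
import hashlib

class DedupStore:
    def __init__(self):
        self.blocks = {}     # MD5 digest -> block bytes (stored once)
        self.refs = {}       # MD5 digest -> reference count

    def put(self, block: bytes) -> str:
        key = hashlib.md5(block).hexdigest()
        if key not in self.blocks:
            self.blocks[key] = block          # first copy: store physically
        self.refs[key] = self.refs.get(key, 0) + 1
        return key                            # logical handle for the caller

    def get(self, key: str) -> bytes:
        return self.blocks[key]

store = DedupStore()
k1 = store.put(b"replica payload")
k2 = store.put(b"replica payload")            # duplicate: no extra space used
assert k1 == k2 and len(store.blocks) == 1 and store.refs[k1] == 2
```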

Hybrid Data Deduplication Method for reducing wear-level of SSD (SSD의 마모도 감소를 위한 복합적 데이터 중복 제거 기법)

  • Lee, Seung-Kyu; Yang, Yu-Seok; Kim, Deok-Hwan
    • Proceedings of the Korean Information Science Society Conference / 2011.06a / pp.543-546 / 2011
  • Unlike the commonly used HDD, an SSD is a device that stores data in semiconductor memory with no mechanical parts. Flash-based SSDs offer excellent read performance but cannot overwrite data in place; that is, wear occurs, which affects the SSD's lifetime. Nevertheless, because of their superior performance over HDDs, they are widely used in laptops and in systems that handle important data. In this paper, we propose a hybrid data deduplication technique that combines only the advantages of existing deduplication techniques when SSDs are used as server storage, and we verify that this technique is considerably more efficient in terms of SSD wear.
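
The wear argument can be illustrated with a small sketch, under assumed details rather than the paper's hybrid scheme: pages whose content is already on flash are remapped instead of rewritten, so the number of physical page programs, a common proxy for wear, drops sharply for redundant workloads.

```python
import hashlib

class DedupWriteCache:
    def __init__(self):
        self.fingerprints = {}    # page content hash -> physical page number
        self.physical_writes = 0  # wear proxy: flash page programs issued

    def write_page(self, logical_page: int, data: bytes) -> int:
        fp = hashlib.sha1(data).digest()
        if fp in self.fingerprints:
            return self.fingerprints[fp]      # duplicate: remap, no flash write
        self.physical_writes += 1             # unique content: program a page
        self.fingerprints[fp] = self.physical_writes
        return self.fingerprints[fp]

ssd = DedupWriteCache()
for lp in range(100):
    ssd.write_page(lp, b"same OS image page")  # 100 logical writes...
assert ssd.physical_writes == 1                # ...one physical page program
```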