• Title/Summary/Keyword: Distributed Storage

Data Access Frequency based Data Replication Method using Erasure Codes in Cloud Storage System (클라우드 스토리지 시스템에서 데이터 접근빈도와 Erasure Codes를 이용한 데이터 복제 기법)

  • Kim, Ju-Kyeong;Kim, Deok-Hwan
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.51 no.2
    • /
    • pp.85-91
    • /
    • 2014
  • Cloud storage systems use a distributed file system for storing and managing data. A traditional distributed file system triplicates data in order to recover from data loss caused by disk failure. However, this replication policy increases storage consumption and causes extra I/O operations during the replication process. In this paper, we propose a data replication method using erasure codes in a cloud storage system to improve storage space efficiency and I/O performance. In particular, the proposed method reduces the number of data replicas according to data access frequency, while using erasure codes to maintain the same data recovery capability. Experimental results show that the proposed method improves storage efficiency by 40%, read throughput by 11%, and write throughput by 10% compared to HDFS.
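
To make the trade-off concrete, the following is a minimal, hypothetical sketch (not the paper's implementation): it compares the raw-storage overhead of 3-way replication with an RS(k, m)-style erasure code and picks a redundancy scheme from an assumed access-frequency threshold; the parameters k, m, and hot_threshold are illustrative assumptions.

```python
# Sketch only: storage overhead of triplication vs. an RS(k, m) erasure code,
# plus a hypothetical access-frequency policy for choosing between them.

def storage_overhead(scheme: str, k: int = 6, m: int = 3) -> float:
    """Ratio of raw bytes stored to logical bytes."""
    if scheme == "triplication":
        return 3.0                      # three full copies
    if scheme == "erasure":
        return (k + m) / k              # k data fragments + m parity fragments
    raise ValueError(scheme)

def choose_scheme(accesses_per_day: float, hot_threshold: float = 100.0) -> str:
    """Hypothetical policy: keep hot data fully replicated for fast reads,
    demote cold data to erasure coding to save space."""
    return "triplication" if accesses_per_day >= hot_threshold else "erasure"

if __name__ == "__main__":
    for freq in (500.0, 10.0):
        scheme = choose_scheme(freq)
        print(f"{freq:6.1f} accesses/day -> {scheme:12s} "
              f"raw overhead x{storage_overhead(scheme):.2f}")
```

With k = 6 and m = 3, erasure-coded data occupies 1.5x its logical size instead of 3x while any three lost fragments per stripe remain recoverable, which is the space/recovery trade-off the abstract exploits.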

RDP: A storage-tier-aware Robust Data Placement strategy for Hadoop in a Cloud-based Heterogeneous Environment

  • Muhammad Faseeh Qureshi, Nawab;Shin, Dong Ryeol
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.10 no.9
    • /
    • pp.4063-4086
    • /
    • 2016
  • Cloud computing is a robust technology that helps resolve many parallel and distributed computing issues in the modern Big Data environment. Hadoop is an ecosystem that processes large data sets in a distributed computing environment, and HDFS is the Hadoop file system that distributes data blocks to cluster nodes. Data block placement has become a bottleneck for overall performance in a Hadoop cluster. The current placement policy assumes that all DataNodes have equal computing capacity to process data blocks, including the same storage media and the same processing performance. As a result, Hadoop cluster performance suffers from unbalanced workloads, inefficient storage-tier usage, network traffic congestion, and HDFS integrity issues. This paper proposes a storage-tier-aware Robust Data Placement (RDP) scheme, which systematically resolves unbalanced workloads, reduces network congestion to an optimal state, utilizes the storage tier effectively, and minimizes HDFS integrity issues. The experimental results show that the proposed approach reduced the unbalanced workload issue by 72%. Moreover, it resolved the storage-tier compatibility problem by 81% by predicting storage for block jobs, and improved overall data block placement by 78% through pre-calculated computing-capacity allocations and execution of map files over the respective NameNode and DataNodes.
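
As a rough illustration of storage-tier-aware placement (an assumption-laden sketch, not the RDP algorithm), the snippet below ranks DataNodes by a pre-computed capacity score that mixes storage-tier speed, free space, and current load, then places a block's replicas on the highest-scoring nodes; the tier weights and node figures are made up.

```python
# Hypothetical tier-aware block placement: score each DataNode and place
# replicas on the best-scoring nodes (weights and numbers are illustrative).

from dataclasses import dataclass

TIER_WEIGHT = {"ssd": 3.0, "sas": 2.0, "sata": 1.0}   # assumed relative speeds

@dataclass
class DataNode:
    name: str
    tier: str          # "ssd", "sas", or "sata"
    free_gb: float
    load: float        # 0.0 (idle) .. 1.0 (saturated)

    def capacity_score(self) -> float:
        # Faster tier, more free space, and lower load all raise the score.
        return TIER_WEIGHT[self.tier] * self.free_gb * (1.0 - self.load)

def place_block(nodes: list, replicas: int = 3) -> list:
    ranked = sorted(nodes, key=lambda n: n.capacity_score(), reverse=True)
    return [n.name for n in ranked[:replicas]]

if __name__ == "__main__":
    cluster = [
        DataNode("dn1", "ssd", 200, 0.7),
        DataNode("dn2", "sata", 800, 0.2),
        DataNode("dn3", "sas", 500, 0.4),
        DataNode("dn4", "sata", 300, 0.9),
    ]
    print(place_block(cluster))        # -> ['dn2', 'dn3', 'dn1']
```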

Scalable Blockchain Storage Model Based on DHT and IPFS

  • Chen, Lu;Zhang, Xin;Sun, Zhixin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.7
    • /
    • pp.2286-2304
    • /
    • 2022
  • Blockchain is a distributed ledger that combines technologies such as cryptography, consensus mechanisms, peer-to-peer transmission, and timestamping. The rapid development of blockchain has attracted attention from all walks of life, but storage scalability issues have hindered its application. In this paper, a scalable blockchain storage model based on the Distributed Hash Table (DHT) and the InterPlanetary File System (IPFS) is proposed. The paper introduces the current research status of scalable blockchain storage models as well as the basic principles of DHT and IPFS, and explains the model construction and workflow in detail, including the DHT network construction mechanism, the block heat identification mechanism, the new node initialization mechanism, and the block data read and write mechanism. Experimental results show that the model reduces the storage burden of individual nodes while allowing the blockchain network to accommodate more local blocks at the same block height.
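
The off-chain storage split can be pictured with the hedged sketch below (illustrative only, not the paper's model): each node keeps a compact block header containing an IPFS-style content identifier, the full block body lives in off-chain storage, and a DHT-like rule (smallest XOR distance between a hashed node ID and the block identifier) decides which nodes also hold the body locally. All names and the CID stand-in are assumptions.

```python
# Sketch: on-chain header with content ID, off-chain body, DHT-style placement.

import hashlib
import json

def content_id(body: bytes) -> str:
    """Stand-in for an IPFS CID: a plain SHA-256 hex digest."""
    return hashlib.sha256(body).hexdigest()

def make_header(height: int, prev_hash: str, body: bytes) -> dict:
    # Every node stores only this small header; the body goes to IPFS-like storage.
    return {"height": height, "prev": prev_hash, "cid": content_id(body)}

def responsible_nodes(cid: str, node_ids: list, k: int = 2) -> list:
    """DHT-style placement: the k nodes whose hashed IDs are XOR-closest to the CID."""
    target = int(cid, 16)
    def distance(node_id: str) -> int:
        return int(hashlib.sha256(node_id.encode()).hexdigest(), 16) ^ target
    return sorted(node_ids, key=distance)[:k]

if __name__ == "__main__":
    body = json.dumps({"txs": ["a->b:5", "b->c:2"]}).encode()
    header = make_header(height=101, prev_hash="00ab" + "0" * 60, body=body)
    keepers = responsible_nodes(header["cid"], ["node1", "node2", "node3", "node4"])
    print(header["cid"][:16], keepers)   # only `keepers` store the full block body
```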

Software Defined Storage Method for Data Sharing and Maintenance in a Distributed Storage Environment (분산 저장환경의 데이터공유 및 관리를 위한 소프트웨어 정의 저장 방법)

  • Cha, ByungRae;Park, Sun;Kim, JongWon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2014.05a
    • /
    • pp.644-645
    • /
    • 2014
  • This paper proposes a software defined storage method that converges network virtualization techniques with RAID in a distributed storage environment. The proposed method designs software-based storage that allows flexible control and maintenance of storage resources. In addition, the method overcomes the limits of physical storage capacity and cuts the cost of data recovery.

Flood Runoff Analysis by a Storage Function Model (저류함수법에 의한 홍수유출해석)

  • 남궁달;김규성
    • Magazine of the Korean Society of Agricultural Engineers
    • /
    • v.38 no.2
    • /
    • pp.75-86
    • /
    • 1996
  • Formulas for estimating the constants of the storage function model, including $K$ and $T_L$, for runoff analysis, together with a distributed storage function model, are discussed in this study. First, the relations between the parameters of the storage function model and the kinematic runoff model are examined theoretically, and then the optimum constants of the storage function model are obtained by the Standardized Davidson-Fletcher-Powell (SDFP) method. Through this analysis, theoretical formulas were obtained as $K = 0.63\,\alpha K_s B^{0.6}$ and $T_L = 0.11\,\alpha K_s B^{0.6}/r_e^{0.4}$, which are difficult to use in practice because the definition of the shape factors is unclear. From a practical point of view, empirical formulas were derived as $K = 15.6\,B_m^{0.3}$ and $T_L = 2.1\,B_m^{0.36}/r_e^{0.4}$ for the watersheds studied. The proposed formulas are verified against several recorded floods at a few points in the watersheds. It is also found that the distributed storage function model can be applied to flood runoff analysis using the new formulas given above.
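
For orientation, one common textbook form of the lumped storage function runoff model, in which the constants $K$ and $T_L$ above appear, is sketched below; the authors' exact formulation, exponent, and unit conventions may differ.

```latex
% Common form of the storage function runoff model (sketch; details may
% differ from the paper).  r_e: effective rainfall (mm/h), q_l: runoff
% height (mm/h), S: storage, A: basin area (km^2), Q: outflow (m^3/s),
% K: storage constant, T_L: lag time.
\begin{align}
  \frac{dS}{dt} &= r_e(t) - q_\ell(t), \\            % continuity of storage
  S &= K\, q_\ell^{\,p}, \qquad p \approx 0.6, \\     % nonlinear storage-discharge relation
  Q(t) &= \frac{A}{3.6}\, q_\ell(t - T_L)             % outflow delayed by the lag time T_L
\end{align}
```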

Standard Status on ITU-T Distributed Ledger Technology (ITU-T에서 분산원장기술 표준화 동향)

  • Kwon, D.S.;Park, J.D.
    • Electronics and Telecommunications Trends
    • /
    • v.35 no.2
    • /
    • pp.50-68
    • /
    • 2020
  • Distributed Ledger Technology (DLT) refers to the processes and related technologies that enable participants to safely propose, validate, and record state changes (usually updates) to ledgers synchronized across network nodes. DLTs are becoming increasingly important as data management requirements evolve, so it is necessary to understand the current state of related standards (such as distributed storage and access technologies) in order to address future requirements. This paper reviews ITU-T FG-DLT standardization activities, including standardization trends, use cases, reference architectures, platform evaluation criteria, and future prospects.

Design and Implementation of iATA-based RAID5 Distributed Storage Servers (iATA 기반의 RAID5 분산 스토리지 서버의 설계 및 구현)

  • Ong, Ivy;Lim, Hyo-Taek
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.14 no.2
    • /
    • pp.305-311
    • /
    • 2010
  • iATA (Internet Advanced Technology Attachment) is a block-level protocol developed to transfer ATA commands over a TCP/IP network, as an alternative network storage solution to address the insufficient storage of mobile devices. This paper applies the RAID5 distributed storage server concept to iATA: several machines with relatively inexpensive disk drives are combined into a server array that works as a single virtual storage device, increasing reliability and the speed of operations. If one machine fails, the server array does not stop immediately but continues to function in a degraded mode, and the lost information can be recovered by applying the boolean exclusive OR (XOR) function to the bit information on the remaining machines. We performed I/O measurements, and the benchmark results indicate that the additional fault-tolerance feature does not delay read/write operations for file sizes in the range of 4 KB to 2 MB, while the higher data-integrity objective is achieved.
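
The degraded-mode recovery the abstract mentions boils down to XOR parity; the short sketch below (an illustration, not the iATA server code) stripes a byte string across data chunks plus one parity chunk and rebuilds any single lost chunk from the survivors.

```python
# RAID-5-style XOR parity: build a stripe, lose one chunk, rebuild it.

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def make_stripe(data: bytes, n_data: int) -> list:
    """Split data into n_data equal chunks (zero-padded) and append an XOR parity chunk."""
    size = -(-len(data) // n_data)                    # ceiling division
    chunks = [data[i * size:(i + 1) * size].ljust(size, b"\0") for i in range(n_data)]
    parity = chunks[0]
    for c in chunks[1:]:
        parity = xor_bytes(parity, c)
    return chunks + [parity]

def rebuild(stripe: list, lost: int) -> bytes:
    """Recover the chunk at index `lost` by XOR-ing all surviving chunks."""
    survivors = [c for i, c in enumerate(stripe) if i != lost]
    out = survivors[0]
    for c in survivors[1:]:
        out = xor_bytes(out, c)
    return out

if __name__ == "__main__":
    stripe = make_stripe(b"distributed storage over iATA", n_data=3)
    lost_chunk, stripe[1] = stripe[1], None           # simulate one failed machine
    assert rebuild(stripe, lost=1) == lost_chunk
    print("degraded-mode rebuild OK")
```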

Distributed data deduplication technique using similarity based clustering and multi-layer bloom filter (SDS 환경의 유사도 기반 클러스터링 및 다중 계층 블룸필터를 활용한 분산 중복제거 기법)

  • Yoon, Dabin;Kim, Deok-Hwan
    • The Journal of Korean Institute of Next Generation Computing
    • /
    • v.14 no.5
    • /
    • pp.60-70
    • /
    • 2018
  • Software defined storage (SDS) is being deployed in cloud environments to allow multiple users to virtualize physical servers, but a solution for optimizing space efficiency with limited physical resources is needed. In conventional data deduplication systems, it is difficult to deduplicate redundant data uploaded to distributed storages. In this paper, we propose a distributed deduplication method using similarity-based clustering and a multi-layer bloom filter. Rabin hashing is applied to determine the degree of similarity between virtual machine servers and to cluster similar virtual machines, which improves deduplication efficiency compared with deduplicating each storage node individually. In addition, a multi-layer bloom filter is incorporated into the deduplication process to shorten processing time by reducing the number of false positives. Experimental results show that the proposed method improves the deduplication ratio by 9% compared to a deduplication method using IP-address-based clusters, with no difference in processing time.
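
To make the filtering step concrete, here is a hedged sketch (not the paper's implementation) of a small Bloom filter stacked in two layers, a cluster-level filter consulted before a node-level one, so that most previously unseen chunk fingerprints are ruled out cheaply before any exact lookup; the sizes, hash counts, and the SHA-256 stand-in for a Rabin fingerprint are assumptions.

```python
# Two-layer Bloom-filter check for a deduplication pipeline (illustrative).

import hashlib

class BloomFilter:
    def __init__(self, m_bits: int = 1 << 16, k_hashes: int = 4):
        self.m, self.k = m_bits, k_hashes
        self.bits = bytearray(m_bits // 8)

    def _positions(self, item: bytes):
        # Derive k bit positions from salted SHA-256 digests of the item.
        for i in range(self.k):
            h = hashlib.sha256(i.to_bytes(2, "big") + item).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item: bytes) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

def is_probably_duplicate(fp: bytes, cluster_bf: BloomFilter, node_bf: BloomFilter) -> bool:
    # Layer 1 (cluster-wide) then layer 2 (per node): a miss at either layer
    # means the chunk is definitely new and can be stored without an exact lookup.
    return fp in cluster_bf and fp in node_bf

if __name__ == "__main__":
    cluster_bf, node_bf = BloomFilter(), BloomFilter()
    fp = hashlib.sha256(b"chunk-bytes").digest()      # stand-in for a Rabin fingerprint
    print(is_probably_duplicate(fp, cluster_bf, node_bf))   # False -> store the chunk
    cluster_bf.add(fp); node_bf.add(fp)
    print(is_probably_duplicate(fp, cluster_bf, node_bf))   # True -> verify and deduplicate
```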

Optimal Sizing of Distributed Power Generation System based on Renewable Energy Considering Battery Charging Method (배터리 충전방식을 고려한 신재생에너지 기반 분산발전시스템의 용량선정)

  • Kim, Hye Rim;Kim, Tong Seop
    • Plant Journal
    • /
    • v.17 no.3
    • /
    • pp.34-36
    • /
    • 2021
  • Interest in renewable-energy-based distributed power generation systems is increasing with the recognition of the need to move beyond existing centralized power generation and to address energy conversion and environmental problems. In this study, the optimal capacity was selected by simulating a distributed power generation system based on PV and WT that uses lead-acid batteries as the energy storage system. CHP was adopted as the existing power source, and the optimal capacity of the system was derived through MOGA according to the operating mode (full load/part load) of the existing power source. In addition, it was confirmed that battery life differs with the battery charging method even at the same battery capacity. Therefore, for economical and stable power supply and demand, the capacity of the distributed generation system should be selected with the battery charging method taken into account.