• Title/Summary/Keyword: Gluster File System

Performance Enhancement and Evaluation of Distributed File System for Cloud (클라우드 분산 파일 시스템 성능 개선 및 평가)

  • Lee, Jong Hyuk
    • KIPS Transactions on Computer and Communication Systems / v.7 no.11 / pp.275-280 / 2018
  • Loading large data sets and processing them at high speed in downstream applications in a cloud environment requires choosing a suitable distributed file system. In this paper, we propose a write performance improvement method based on GlusterFS and evaluate the performance of MapRFS, CephFS, and GlusterFS, three existing distributed file systems, in a cloud environment. The proposed method improves response time by changing the synchronization level used by synchronous replication from disk to memory. Experimental results show that the distributed file system with the proposed method applied outperforms the other distributed file systems for sequential write, random write, and random read.
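
The difference between disk-level and memory-level synchronization can be illustrated with a minimal sketch. This is not the paper's GlusterFS implementation: the function `replicated_write`, the `sync_level` parameter, and the local temporary files standing in for replicas are all hypothetical. It only shows where the acknowledgement point moves when `fsync` is skipped.

```python
import os
import tempfile


def replicated_write(replica_paths, payload, sync_level):
    """Write payload to every replica and count acknowledgements.

    sync_level="disk":   a replica acknowledges only after fsync(), i.e. once
                         the OS reports the data as written to storage.
    sync_level="memory": a replica acknowledges as soon as the data is in the
                         OS page cache (no fsync), trading durability for latency.
    """
    if sync_level not in ("disk", "memory"):
        raise ValueError("sync_level must be 'disk' or 'memory'")
    acks = 0
    for path in replica_paths:
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()  # move data from Python's buffer to the OS page cache
            if sync_level == "disk":
                os.fsync(f.fileno())  # block until the OS flushes to the device
        acks += 1
    return acks


def demo():
    """Write one 4 KB payload to two stand-in replicas at both sync levels."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        replicas = [os.path.join(tmp, f"replica{i}.bin") for i in range(2)]
        for level in ("disk", "memory"):
            results[level] = replicated_write(replicas, b"x" * 4096, level)
    return results
```

Both levels still wait for every replica, so replication stays synchronous; only the durability guarantee behind each acknowledgement changes, which is why a crash before the data reaches disk becomes a risk at the memory level.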

Performance Optimization in GlusterFS on SSDs (SSD 환경 아래에서 GlusterFS 성능 최적화)

  • Kim, Deoksang;Eom, Hyeonsang;Yeom, Heonyoung
    • KIISE Transactions on Computing Practices / v.22 no.2 / pp.95-100 / 2016
  • In the current era of big data and cloud computing, the amount of data in use is increasing, and various systems are being developed to process this data rapidly. A distributed file system is often used to store the data, and GlusterFS is one of the popular distributed file systems. As computer technology has advanced, NAND flash SSDs (Solid State Drives), which are high-performance storage devices, have become cheaper. For this reason, datacenter operators are attempting to use SSDs in their systems, including under GlusterFS. However, because GlusterFS is designed for HDDs (Hard Disk Drives), performance degrades due to structural problems when SSDs are used instead. The problems include the use of the I/O-cache, Read-ahead, and Write-behind translators. By removing these features, which do not suit SSDs and their strength in random I/O, we achieved performance improvements of up to 255% for 4KB random reads and up to 50% for 64KB random reads.
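
The abstract does not say how the authors removed the translators. One common way to switch them off on a running volume is through `gluster volume set` with the `performance.*` options. The sketch below only builds those command lines and does not run them. The volume name `testvol` and the helper name are hypothetical, and the option keys should be checked against the installed GlusterFS version.

```python
# Translators named in the abstract, mapped to GlusterFS volume option keys.
TRANSLATOR_OPTIONS = {
    "io-cache": "performance.io-cache",
    "read-ahead": "performance.read-ahead",
    "write-behind": "performance.write-behind",
}


def disable_translator_commands(volume):
    """Return one `gluster volume set` argument list per translator to disable."""
    return [
        ["gluster", "volume", "set", volume, option_key, "off"]
        for option_key in TRANSLATOR_OPTIONS.values()
    ]


def render(commands):
    """Join each argument list into a shell-style line for review before use."""
    return [" ".join(cmd) for cmd in commands]


# Hypothetical volume name, for illustration only.
for line in render(disable_translator_commands("testvol")):
    print(line)
```

Keeping the commands as argument lists makes it straightforward to pass them to `subprocess.run` later, after reviewing them on a test volume.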

Implementation and Performance Measuring of Erasure Coding of Distributed File System (분산 파일시스템의 소거 코딩 구현 및 성능 비교)

  • Kim, Cheiyol;Kim, Youngchul;Kim, Dongoh;Kim, Hongyeon;Kim, Youngkyun;Seo, Daewha
    • The Journal of Korean Institute of Communications and Information Sciences / v.41 no.11 / pp.1515-1527 / 2016
  • With the growth of big data, machine learning, and cloud computing, storage that can hold large amounts of unstructured data is becoming increasingly important. Distributed file systems built on commodity hardware, such as MAHA-FS, GlusterFS, and the Ceph file system, have therefore received a lot of attention for their scale-out and low-cost properties. For data fault tolerance, most of these file systems initially used replication. But as storage sizes grow to tens or hundreds of petabytes, the low space efficiency of replication has come to be seen as a problem. This paper applies an erasure coding fault tolerance policy to MAHA-FS for high space efficiency and introduces the VDelta technique to solve the data consistency problem. We compare the performance of two file systems, MAHA-FS and GlusterFS, which have different I/O processing architectures: the former is server-centric and the latter is client-centric. We found that the erasure coding performance of MAHA-FS is better than that of GlusterFS.
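
The space-efficiency argument can be made concrete with a short calculation. The configurations below (3-way replication, 4 data plus 2 parity fragments, 100 PB of usable data) are illustrative assumptions and are not taken from the paper.

```python
def replication_efficiency(copies):
    """Fraction of raw capacity holding unique data under n-way replication."""
    return 1 / copies


def erasure_coding_efficiency(data_fragments, parity_fragments):
    """Fraction of raw capacity holding unique data under k+m erasure coding."""
    return data_fragments / (data_fragments + parity_fragments)


def raw_capacity_needed(usable, efficiency):
    """Raw capacity required to store `usable` units of data."""
    return usable / efficiency


# Assumed example: store 100 PB of user data.
usable_pb = 100
rep = replication_efficiency(3)          # 3 full copies
ec = erasure_coding_efficiency(4, 2)     # 4 data + 2 parity fragments
print(f"3-way replication: {rep:.1%} efficient, "
      f"{raw_capacity_needed(usable_pb, rep):.0f} PB raw")
print(f"4+2 erasure coding: {ec:.1%} efficient, "
      f"{raw_capacity_needed(usable_pb, ec):.0f} PB raw")
```

Under these assumptions both layouts survive the loss of any two fragments or copies, but erasure coding needs half the raw capacity, at the price of encoding work on writes and reconstruction work on degraded reads.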

A Study on the Test Results and Implementation of Correlated Result Saving System using the Gluster File System (Gluster 파일시스템을 이용한 상관자료 수집 시스템 구축 및 시험고찰)

  • Yeom, Jae-Hwan;Oh, Se-Jin;Roh, Duk-Gyoo;Jung, Dong-Kyu;Hwang, Ju-Yeon;Oh, Chungsik;Kim, Hyo-Ryoung
    • Journal of the Institute of Convergence Signal Processing / v.17 no.2 / pp.53-60 / 2016
  • In this paper, we present the implementation and test results of a new correlated result storage method that achieves the full performance of the Daejeon hardware correlator. Observations at 8 Gbps, the maximum observational standard of the KVN (Korean VLBI Network), have recently been performed, and correlation processing of these observations with the Daejeon hardware correlator is also required. A new correlation result storage system has therefore become necessary. The maximum correlation result output speed of the Daejeon hardware correlator is 1.4 GB/sec per 25.6 ms integration time. The conventional correlation result storage system cannot cope with this maximum output speed, and the output speed is limited to 1/4 of it. That is, among the four input ports of the Daejeon hardware correlator, three inputs are limited to correspond to the observation rate of 1 Gbps. The new storage system uses the Gluster file system, one of many recent technologies used in storage systems. In tests at the correlator's maximum output rate of 1.4 GB/sec, the system handled 350 MB/sec on each of the four optical outputs, for 1.4 GB/sec in total.
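
The throughput figures in the abstract are mutually consistent, as a quick check shows. The sketch assumes decimal units (1 GB = 1000 MB), which the abstract does not state.

```python
def aggregate_rate_mb_per_s(per_output_mb_per_s, outputs):
    """Total throughput when every output sustains the same rate."""
    return per_output_mb_per_s * outputs


MB_PER_GB = 1000  # assumption: decimal units

total_mb = aggregate_rate_mb_per_s(350, 4)   # four optical outputs
total_gb = total_mb / MB_PER_GB              # matches the 1.4 GB/sec maximum
legacy_mb = total_mb / 4                     # conventional system: limited to 1/4

print(f"new system: {total_mb} MB/s = {total_gb} GB/s")
print(f"conventional system: {legacy_mb:.0f} MB/s")
```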

Design of GlusterFS Based Big Data Distributed Processing System in Smart Factory (스마트 팩토리 환경에서의 GlusterFS 기반 빅데이터 분산 처리 시스템 설계)

  • Lee, Hyeop-Geon;Kim, Young-Woon;Kim, Ki-Young;Choi, Jong-Seok
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.11 no.1 / pp.70-75 / 2018
  • A smart factory is an intelligent factory that can enhance productivity, quality, customer satisfaction, and more by applying information and communications technology to the entire production process, including design and development, manufacturing, and distribution and logistics. The precise amount of data generated in a smart factory varies with the factory's size and the state of its facilities. Regardless, it would be difficult to apply traditional production management systems to a smart factory environment, because it generates vast amounts of data. For this reason, the need has risen for a distributed big-data processing system that can process a large amount of data. This article therefore designs a Gluster File System (GlusterFS)-based distributed big-data processing system that can be used in a smart factory environment. Compared to existing distributed processing systems, the proposed system reduces the system load and the risk of data loss through the distribution and management of network traffic.

Monitoring Design for Distributed File System GlusterFS (GlusterFS 분산 파일 시스템 모니터링 설계)

  • Lee, Jeong-Hyun
    • Proceedings of the Korea Information Processing Society Conference / 2015.10a / pp.174-177 / 2015
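  • With the recent explosion of business data from social, mobile, and IoT sources, Big Data platforms and distributed storage technologies are being used to store and process it. Recently proposed distributed storage systems apply cloud-based technology and a scale-out architecture so that they can cope with data growth. When the number of nodes in a distributed storage system grows to several hundred or more, manual operations management becomes impossible, and automated operations management and monitoring methods are needed. In this paper, we design monitoring for GlusterFS distributed storage so that system status, including the network, servers, disks, and storage services, can be monitored segment by segment. This makes it possible to monitor both the entire distributed storage infrastructure and the storage service level.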

Performance Evaluation of Open Source Based Distributed File System for Cloud Storage (클라우드 스토리지를 위한 오픈 소스 기반 분산 파일 시스템의 성능 평가)

  • Lee, Seho;Kim, Ji-Hong;Eom, Yong Ik
    • Proceedings of the Korea Information Processing Society Conference / 2012.04a / pp.185-187 / 2012
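  • Cloud computing technology has recently been changing existing server and desktop computing environments rapidly and is emerging as a core field of next-generation Internet services. Among cloud computing technologies, cloud storage services, which offer low cost, reliability, scalability, integrity, and security, are drawing particular attention. This paper therefore examines distributed file systems, the underlying technology of cloud storage, and builds systems and measures performance using open source distributed file systems such as MooseFS, XtreemFS, GlusterFS, and Ceph. In the results, GlusterFS showed the best performance on Postmark and XtreemFS showed the best performance on MD5SUM.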

Torus Network Based Distributed Storage System for Massive Multimedia Contents (토러스 연결망 기반의 대용량 멀티미디어용 분산 스토리지 시스템)

  • Kim, Cheiyol;Kim, Dongoh;Kim, Hongyeon;Kim, Youngkyun;Seo, Daewha
    • Journal of Korea Multimedia Society / v.19 no.8 / pp.1487-1497 / 2016
  • The explosive growth of digital multimedia services increases the need for highly scalable, low-cost storage. This paper proposes a new storage architecture that combines a torus network, which needs no network switch, with erasure coding, for high scalability and efficient disk utilization. The proposed model has to compensate for the long network latency and network processing overhead of the torus network. Through a prototype implementation, the proposed storage model was compared with two of the most popular distributed file systems, GlusterFS and Ceph. The prototype outperformed the erasure coding policies of both file systems and, in most cases, even their replication policies.
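
The latency concern follows from how a torus routes traffic: without a switch, a message is forwarded node to node, so its cost grows with the hop count. The sketch below computes the minimum hop count between two nodes. The 2D 4x4 layout in the example is an assumption, since the abstract does not give the topology's dimensions.

```python
def torus_hops(src, dst, dims):
    """Minimum number of hops between two nodes of a torus.

    src, dst: node coordinates, one integer per dimension.
    dims:     number of nodes along each dimension.
    Each dimension wraps around, so the distance along it is the shorter
    of the direct path and the path through the wrap-around link.
    """
    if not (len(src) == len(dst) == len(dims)):
        raise ValueError("src, dst, and dims must have the same length")
    hops = 0
    for a, b, n in zip(src, dst, dims):
        direct = abs(a - b)
        hops += min(direct, n - direct)
    return hops


# Assumed 4x4 torus: the wrap-around links make opposite corners neighbours
# of neighbours, while the farthest node sits at the centre offset (2, 2).
print(torus_hops((0, 0), (3, 3), (4, 4)))
print(torus_hops((0, 0), (2, 2), (4, 4)))
```

Every intermediate node on such a path spends CPU and link time forwarding, which is the network processing overhead the abstract says the design must offset.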

An Analysis and Comparison of Open Source Based Distributed File System for Cloud Environment (클라우드 환경의 오픈소스 기반 분산 파일 시스템 분석 및 비교)

  • Kim, Keonwoo;Kim, Jeehong;Eom, Young Ik
    • Proceedings of the Korea Information Processing Society Conference / 2012.04a / pp.182-184 / 2012
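  • As cloud computing has attracted attention and advanced, many leading IT companies are investing heavily in developing cloud computing technology. In such cloud computing environments, most data is stored on servers. For this reason, file systems used in cloud environments store more data than conventional file systems. Cloud environments therefore use distributed file system technology to handle large amounts of data. In addition, because a distributed file system stores data distributed across multiple storage servers on the network, it must satisfy requirements such as performance, fault tolerance, and security as well as data management. This paper examines distributed file systems such as XtreemFS, Ceph, GlusterFS, and MooseFS from a functional perspective, and compares and evaluates each of them against the functional evaluation criteria proposed in this paper.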

Comparative Analysis on Cloud and On-Premises Environments for High-Resolution Agricultural Climate Data Processing (고해상도 농업 기후 자료 처리를 위한 클라우드와 온프레미스 비교 분석)

  • Park, Joo Hyeon;Ahn, Mun Il;Kang, Wee Soo;Shim, Kyo-Moon;Park, Eun Woo
    • Korean Journal of Agricultural and Forest Meteorology / v.21 no.4 / pp.347-357 / 2019
  • The usefulness of systems for processing and analyzing GIS-based agricultural climate data is affected by the reliability and availability of computing infrastructures such as cloud, on-premises, and hybrid. Cloud technology has grown in popularity. However, various reference cases accumulated over years of operational experience point out important features that make on-premises technology compatible with cloud technology. Both cloud and on-premises technologies have advantages and disadvantages in terms of operational time and cost, reliability, and security, depending on the application. In this study, we describe the characteristics of four general computing platforms (cloud, on-premises with hardware-level virtualization, on-premises with operating system-level virtualization, and hybrid environments) and compare their advantages and disadvantages when a huge amount of GIS-based agricultural climate data is stored and processed to provide public services of agro-meteorological and climate information at high spatial and temporal resolutions. We found that migrating high-resolution agricultural climate data to a public cloud would not be reasonable because of the high cost of storing a large amount of data that may be of no use in the future. We therefore recommend hybrid systems that combine the two environments: on-premises for data storage and backup systems, which incur a major cost, and cloud for data analysis, processing, and presentation, which need operational flexibility.