• Title/Summary/Keyword: Cluster file system

Design and Implementation of CAN Cluster File System (캔 클러스터 파일 시스템의 설계 및 구현)

  • 황인철;임동혁;김호진;맹승렬;조정완
    • Proceedings of the Korean Information Science Society Conference / 2004.10a / pp.28-30 / 2004
  • Recently, as the performance of networks and PCs has improved, much research has been devoted to cluster systems that obtain high performance by connecting inexpensive PCs with a fast network. As one branch of this research, studies are being conducted on efficiently organizing the file system, which accesses disks that are slow relative to each node's CPU and memory. Existing cluster file systems have often reused file systems from previously studied distributed systems as-is. Such distributed systems share some similarities with cluster systems but also differ in other respects. To give cluster system users high-performance data I/O and efficient support, research is needed on cluster file systems that fully exploit the characteristics of cluster systems. This paper describes the design and implementation of the CAN cluster file system, which exploits those characteristics. The CAN cluster file system uses single disk I/O for its storage system, taking advantage of cluster properties, and provides high-bandwidth data I/O by implementing a cooperative cache on top of it. Its performance is compared with and analyzed against PVFS, an existing file system, by running test programs.
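
Below is a minimal Python sketch of the cooperative-cache idea this abstract describes: a read checks the local cache first, then the caches of peer nodes over the cluster network, and only then falls back to disk. The `CooperativeCache` class and its block interface are illustrative assumptions, not the paper's implementation.

```python
from collections import OrderedDict

class CooperativeCache:
    """Cooperative block cache sketch: local LRU first, then peer caches, then disk."""

    def __init__(self, capacity, disk):
        self.local = OrderedDict()  # block_id -> bytes, kept in LRU order
        self.capacity = capacity
        self.peers = []             # other nodes' caches, wired up below
        self.disk = disk            # shared block store (the slow path)

    def read(self, block_id):
        if block_id in self.local:          # 1. local hit: cheapest path
            self.local.move_to_end(block_id)
            return self.local[block_id]
        for peer in self.peers:             # 2. peer hit: a fast cluster network
            if block_id in peer.local:      #    still beats a disk access
                data = peer.local[block_id]
                self._insert(block_id, data)
                return data
        data = self.disk[block_id]          # 3. miss everywhere: disk I/O
        self._insert(block_id, data)
        return data

    def _insert(self, block_id, data):
        self.local[block_id] = data
        if len(self.local) > self.capacity:
            self.local.popitem(last=False)  # evict least recently used block

disk = {i: b"block%d" % i for i in range(16)}
nodes = [CooperativeCache(capacity=4, disk=disk) for _ in range(3)]
for node in nodes:
    node.peers = [p for p in nodes if p is not node]

nodes[0].read(7)          # node 0 pays the disk cost
print(nodes[1].read(7))   # node 1 is served from node 0's cache
```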

A Content-based Load Balancing Algorithm for Cluster File System (클러스터 파일 시스템의 내용 기반 부하 분산 알고리즘)

  • 장준호;박성용
    • Proceedings of the Korean Information Science Society Conference / 2004.10a / pp.526-528 / 2004
  • Because access to metadata concentrates on particular directories and each metadata operation has a different computational cost, load imbalance and overload arise among the metadata servers of a cluster file system. Therefore, for the performance of the metadata service, a key factor in overall cluster file system performance, a sound load-balancing scheme that can cope with overload on the metadata servers is essential. For asymmetric metadata servers that partition the metadata space and manage only their assigned regions, this paper presents a content-based load-balancing algorithm that analyzes the content of each client request to determine the responsible metadata server and, depending on the type of operation, performs a simple lookup, metadata replication, or metadata logging.
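
The dispatch step this abstract describes can be sketched as follows: partition the namespace among asymmetric metadata servers, inspect the content of each request (its path) to find the owner, and choose simple lookup, replication, or logging by operation type. The operation classes and the hashing scheme below are assumptions for illustration, not the paper's cost model.

```python
import hashlib

# Hypothetical operation classes; the paper's actual classification is not shown.
LIGHT_OPS = {"stat", "lookup"}                # simple retrieval
REPLICATED_OPS = {"open", "readdir"}          # hot reads: serve from replicas
LOGGED_OPS = {"create", "unlink", "rename"}   # updates: log before applying

class MetadataDispatcher:
    """Content-based dispatch over asymmetric metadata servers.

    Each server owns a partition of the namespace; the dispatcher inspects
    the request content (the path) to find the owning server.
    """

    def __init__(self, servers):
        self.servers = servers

    def owner(self, path):
        # Partition by top-level directory, so each server manages only
        # its assigned region of the metadata space.
        top = path.strip("/").split("/", 1)[0]
        h = int(hashlib.md5(top.encode()).hexdigest(), 16)
        return self.servers[h % len(self.servers)]

    def dispatch(self, op, path):
        server = self.owner(path)
        if op in LIGHT_OPS:
            return server, "simple-lookup"
        if op in REPLICATED_OPS:
            return server, "serve-from-replica"  # replication absorbs hot spots
        if op in LOGGED_OPS:
            return server, "log-then-apply"      # logging bounds update cost
        raise ValueError("unknown operation: " + op)

d = MetadataDispatcher(["mds0", "mds1", "mds2"])
print(d.dispatch("create", "/home/alice/paper.tex"))
```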

A Classification Mechanism for Content-Based P2P File Manager (컨텐츠 기반 P2P 파일 관리를 위한 분류 기법)

  • Min, Su-Hong;Cho, Dong-Sub
    • Proceedings of the KIEE Conference / 2004.05a / pp.62-64 / 2004
  • P2P systems have grown dramatically in recent years. Many P2P systems have been developed and now face technical challenges, chief among them how to locate desired resources efficiently. In this paper, we integrate the existing pure P2P and hybrid P2P models: we keep the super-peer role of the hybrid model while using the pure P2P model for resource search. To improve on existing search mechanisms, we present a content-based classification mechanism. The proposed system has the following features. First, it forwards a query only to the best peer, using an RI. Second, it is self-organizing: a peer can reconfigure the set of peers it communicates with directly, based on the best peer. Third, peers can cluster with one another through content-based classification.
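
A rough Python sketch of forwarding a query only to the best peer via a per-neighbor content summary (the RI mentioned above) follows; the keyword-counter index is an assumed stand-in for the paper's routing index, not its actual structure.

```python
from collections import Counter

class Peer:
    """Peer with a routing index (RI): a content summary per neighbor."""

    def __init__(self, name, keywords):
        self.name = name
        self.content = Counter(keywords)  # local content classification
        self.neighbors = []

    def routing_index(self):
        # Summary that a neighbor uses to score this peer for a query.
        return self.content

    def best_neighbor(self, query):
        # Forward only to the neighbor whose RI best matches the query,
        # instead of flooding every neighbor as pure Gnutella-style P2P does.
        def score(p):
            return sum(p.routing_index()[w] for w in query)
        candidates = [p for p in self.neighbors if score(p) > 0]
        return max(candidates, key=score, default=None)

a = Peer("a", ["mhd", "plasma"])
b = Peer("b", ["cluster", "filesystem", "cache"])
c = Peer("c", ["p2p", "classification"])
a.neighbors = [b, c]
hit = a.best_neighbor(["filesystem", "cache"])
print(hit.name)  # -> "b"
```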

Crystal: A Cryptographic File System Based on a Cluster Environment (Crystal : 클러스터 기반의 암호화 파일 시스템)

  • 황보준형;서대화
    • Proceedings of the Korean Information Science Society Conference / 2001.10a / pp.802-804 / 2001
  • With advances in hardware and the spread of the Internet, the need for information security has grown. Cryptographic file systems have been proposed for the secure storage of files that require confidentiality. A cryptographic file system offers transparency to users and thus convenience of use. Moreover, whereas existing encryption systems run in user space and lose performance to frequent context switches, a cryptographic file system performs its encryption services at the kernel level, avoiding that degradation. However, the encryption service itself is a heavy overhead, so performance is still much lower than that of an ordinary file system. This paper therefore improves performance by distributing the load of the cryptographic file system over a cluster-based file system, and at the same time increases security by storing the encrypted files in a distributed manner. The proposed cryptographic file system shows that as the system is scaled out, its performance improves proportionally.
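
The core idea, encrypting once and spreading the ciphertext over cluster nodes, can be sketched as below. The AES-GCM cipher (from the Python `cryptography` package) and the striping layout are assumptions for illustration; the paper does not fix either here.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_and_stripe(data, key, nodes):
    """Encrypt, then spread ciphertext stripes across cluster nodes.

    Distribution both balances the encryption/storage load and ensures
    that no single node holds the whole ciphertext.
    """
    nonce = os.urandom(12)
    ciphertext = AESGCM(key).encrypt(nonce, data, None)
    stripe = -(-len(ciphertext) // len(nodes))  # ceiling division
    placement = {}
    for i, node in enumerate(nodes):
        placement[node] = ciphertext[i * stripe:(i + 1) * stripe]
    return nonce, placement

def gather_and_decrypt(nonce, placement, key, nodes):
    # Reassemble the stripes in node order, then decrypt and verify.
    ciphertext = b"".join(placement[node] for node in nodes)
    return AESGCM(key).decrypt(nonce, ciphertext, None)

key = AESGCM.generate_key(bit_length=128)
nodes = ["node0", "node1", "node2", "node3"]
nonce, placement = encrypt_and_stripe(b"confidential report", key, nodes)
print(gather_and_decrypt(nonce, placement, key, nodes))
```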

Design of MAHA Supercomputing System for Human Genome Analysis (대용량 유전체 분석을 위한 고성능 컴퓨팅 시스템 MAHA)

  • Kim, Young Woo;Kim, Hong-Yeon;Bae, Seungjo;Kim, Hag-Young;Woo, Young-Choon;Park, Soo-Jun;Choi, Wan
    • KIPS Transactions on Software and Data Engineering / v.2 no.2 / pp.81-90 / 2013
  • During the past decade, many changes have been attempted in the computing area, and new technologies continue to be developed. The brick wall in computing, especially the power wall, has shifted the computing paradigm from hardware, including processors and system architecture, to programming environments and application usage. The high performance computing (HPC) area in particular has experienced sweeping changes and is now considered a key to national competitiveness. In the late 2000s, many leading countries rushed to develop Exascale supercomputing systems, and as a result, systems of tens of PetaFLOPS are prevalent now. Korea's ICT is well developed and the country is considered one of the leading nations in the world, but not in supercomputing. In this paper, we describe the architectural design of the MAHA supercomputing system, which aims at a 300 TeraFLOPS system for bio-informatics applications such as human genome analysis and protein-protein docking. The MAHA supercomputing system consists of four major parts: computing hardware, file system, system software, and bio-applications. It is designed to utilize heterogeneous computing accelerators (co-processors such as GPGPUs and MICs) to obtain better performance/$, performance/area, and performance/power. To provide high-speed data movement and large capacity, the MAHA file system is designed with an asymmetric cluster architecture and consists of a metadata server, data servers, and a client file system on top of SSD and MAID storage servers. The MAHA system software is designed to be user-friendly and easy to use, based on integrated system management components such as bio-workflow management, integrated cluster management, and heterogeneous resource management. The MAHA supercomputing system was first installed in December 2011, with a theoretical performance of 50 TeraFLOPS and a measured performance of 30.3 TeraFLOPS on 32 computing nodes. The system will be upgraded to 100 TeraFLOPS in January 2013.
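
The asymmetric file system design described above can be illustrated with a minimal sketch: the client makes one small metadata request to locate stripes, then pulls the bulk data directly from the data servers, so metadata and bulk traffic take separate paths. All class and method names below are hypothetical.

```python
class MetadataServer:
    """Maps a path to the data servers holding its stripes."""

    def __init__(self, layout):
        self.layout = layout  # path -> list of (data_server, block_id)

    def locate(self, path):
        return self.layout[path]

class DataServer:
    def __init__(self, blocks):
        self.blocks = blocks  # block_id -> bytes (SSD/MAID-backed in MAHA)

    def read(self, block_id):
        return self.blocks[block_id]

def client_read(mds, path):
    # One small metadata RPC, then parallelizable bulk reads from data servers.
    return b"".join(ds.read(bid) for ds, bid in mds.locate(path))

ds0 = DataServer({0: b"ACGT", 1: b"TTGA"})
ds1 = DataServer({0: b"CCAT"})
mds = MetadataServer({"/genome/sample.fa": [(ds0, 0), (ds1, 0), (ds0, 1)]})
print(client_read(mds, "/genome/sample.fa"))  # b"ACGTCCATTTGA"
```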

Automatic real-time system of the global 3-D MHD model: Description and initial tests

  • Park, Geun-Seok;Choi, Seong-Hwan;Cho, Il-Hyun;Baek, Ji-Hye;Park, Kyung-Sun;Cho, Kyung-Suk;Choe, Gwang-Son
    • Bulletin of the Korean Space Science Society / 2009.10a / pp.26.2-26.2 / 2009
  • The Solar and Space Weather Research Group (SOS) at the Korea Astronomy and Space Science Institute (KASI) has been constructing the Space Weather Prediction Center since 2007. As part of the project, we are developing an automatic real-time system for global 3-D magnetohydrodynamics (MHD) simulation. The MHD model of the earth's magnetosphere is based on T. Ogino's modified leap-frog scheme and is parallelized using the message passing interface (MPI). Our work focuses on automating the 3-D MHD simulation and the visualization of its results. We use a PC cluster for computation and the virtual reality modeling language (VRML) file format to visualize the MHD simulation. The system can show the variation of the earth's magnetosphere driven by the solar wind in quasi-real time. For data assimilation, we use four parameters from ACE data: solar wind density, pressure, and velocity, and the z component of the interplanetary magnetic field (IMF). In this paper, we perform some initial tests and produce an animation. The automatic real-time system will be a valuable tool for understanding the configuration of the solar-terrestrial environment in space weather research.
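
A hedged sketch of such an automation loop, polling the four ACE parameters and invoking an MPI-parallel solver, might look like the following. The solver and converter commands, the cadence, and the fixed sample values are placeholders, not the actual system.

```python
import subprocess
import time

ACE_FIELDS = ("density", "pressure", "speed", "bz")  # solar wind inputs

def fetch_ace():
    """Placeholder for reading the latest ACE solar wind measurements.

    A real system would poll the ACE data service; fixed values stand in here.
    """
    return {"density": 4.1, "pressure": 1.3, "speed": 420.0, "bz": -2.5}

def run_cycle():
    params = fetch_ace()
    args = ["%s=%s" % (k, params[k]) for k in ACE_FIELDS]
    # Hypothetical commands: an MPI-parallel MHD solver and a VRML converter.
    subprocess.run(["mpirun", "-np", "16", "./mhd_solver"] + args, check=True)
    subprocess.run(["./to_vrml", "output.dat", "magnetosphere.wrl"], check=True)

while True:          # quasi-real time: one cycle per incoming ACE interval
    run_cycle()
    time.sleep(600)  # assumed 10-minute cadence
```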

An LDPC Code Replication Scheme Suitable for Cloud Computing (클라우드 컴퓨팅에 적합한 LDPC 부호 복제 기법)

  • Kim, Se-Hoe;Lee, Won-Joo;Jeon, Chang-Ho
    • Journal of the Institute of Electronics Engineers of Korea CI / v.49 no.2 / pp.134-142 / 2012
  • This paper analyzes an LDPC code replication method suitable for cloud computing. First, we determine the number of blocks suitable for cloud computing by analyzing file availability and storage overhead. We also determine the type of LDPC code appropriate for cloud computing by comparing the performance of three types of LDPC codes. Finally, we present a random graph generation method and a way to compare the performance of each generated LDPC code under iterative decoding. Through simulation, we confirmed that the best graphs are left-regular or nearly left-regular, and that their total number of edges is at or near the minimum.
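
A small sketch of the graph side of this comparison, generating random bipartite graphs and checking left-regularity and total edge count, is given below; the iterative decoding step itself is omitted, and the parameters are illustrative only.

```python
import random

def random_bipartite_graph(n_left, n_right, degrees):
    """Random bipartite graph: left node i gets degrees[i] distinct right neighbors."""
    return [random.sample(range(n_right), d) for d in degrees]

def left_regularity(graph):
    """Return (is_left_regular, total_edges) for a candidate LDPC graph."""
    degs = [len(neighbors) for neighbors in graph]
    return len(set(degs)) == 1, sum(degs)

# Compare a left-regular candidate with an irregular one, as the paper's
# simulations compare candidate graphs under iterative decoding.
regular = random_bipartite_graph(8, 4, [3] * 8)
irregular = random_bipartite_graph(8, 4, [2, 3, 4, 2, 3, 4, 2, 3])
print(left_regularity(regular))    # (True, 24)
print(left_regularity(irregular))  # (False, 23)
```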

A Hadoop-based Multimedia Transcoding System for Processing Social Media in the PaaS Platform of SMCCSE

  • Kim, Myoungjin;Han, Seungho;Cui, Yun;Lee, Hanku;Jeong, Changsung
    • KSII Transactions on Internet and Information Systems (TIIS) / v.6 no.11 / pp.2827-2848 / 2012
  • Previously, we described a social media cloud computing service environment (SMCCSE). This SMCCSE supports the development of social networking services (SNSs) that include audio, image, and video formats. A social media cloud computing PaaS platform, a core component in SMCCSE, processes large amounts of social media in a parallel and distributed manner to support a reliable SNS. Here, we propose a Hadoop-based multimedia system for image and video transcoding, necessary functions of our PaaS platform. Our system consists of two modules: an image transcoding module and a video transcoding module. We design and implement the system using a MapReduce framework running on the Hadoop Distributed File System (HDFS) and the media processing libraries Xuggler and JAI. In this way, our system greatly reduces the encoding time for transcoding large amounts of image and video files into specific formats depending on user-requested options (such as resolution, bit rate, and frame rate). To evaluate system performance, we measure the total image and video transcoding time for image and video data sets, respectively, under various experimental conditions. In addition, we compare the video transcoding performance of our cloud-based approach with that of the traditional frame-level parallel processing-based approach. Based on experiments performed on a 28-node cluster, the proposed Hadoop-based multimedia transcoding system delivers excellent speed and quality.
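
As a rough illustration of the map side of such a transcoding job, here is a Hadoop Streaming-style mapper sketch in Python. The paper's system uses Java MapReduce with Xuggler and JAI, so the ffmpeg call, the options, and the path handling below are substitutions for illustration only.

```python
#!/usr/bin/env python
"""Hadoop Streaming-style mapper sketch: one input line = one video to transcode."""
import subprocess
import sys

RESOLUTION = "1280x720"   # user-requested options, in the paper's sense
BITRATE = "1500k"

def transcode(src, dst):
    # Each map task transcodes one file; Hadoop supplies the parallelism
    # by scheduling many such tasks across the cluster.
    subprocess.run(
        ["ffmpeg", "-y", "-i", src, "-s", RESOLUTION, "-b:v", BITRATE, dst],
        check=True,
    )

for line in sys.stdin:
    src = line.strip()
    if not src:
        continue
    dst = src.rsplit(".", 1)[0] + "_720p.mp4"
    transcode(src, dst)
    print("%s\t%s" % (src, dst))   # emit key/value: source -> transcoded path
```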

Development of Big-data Management Platform Considering Docker Based Real Time Data Connecting and Processing Environments (도커 기반의 실시간 데이터 연계 및 처리 환경을 고려한 빅데이터 관리 플랫폼 개발)

  • Kim, Dong Gil;Park, Yong-Soon;Chung, Tae-Yun
    • IEMEK Journal of Embedded Systems and Applications / v.16 no.4 / pp.153-161 / 2021
  • Real-time access is required to handle continuous, unstructured data, and management should remain flexible under dynamic conditions. A platform can be built to collect, store, and process data on a single local server or across multiple servers. The former, centralized method is easy to control but creates an overload problem because all processing proceeds in one unit; the latter, distributed method performs parallel processing, so it responds quickly and scales system capacity easily, but its design is complex. This paper provides data collection and processing on one platform, built in the latter, distributed manner, to derive significant insights from the various data held by an enterprise or agency; results are intuitively available on dashboards, and Spark is used to improve distributed processing performance. All services are distributed and managed with Docker. The data used in this study was collected entirely through Kafka; for a 4.4-gigabyte file, data processing in Spark cluster mode took 2 minutes 15 seconds, about 3 minutes 19 seconds faster than local mode.
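
The Kafka-to-Spark path could look roughly like the following PySpark sketch (batch variant, requiring the Spark Kafka connector); the broker address, topic name, and options are hypothetical, and a streaming job would use `spark.readStream` with the same Kafka source options.

```python
from pyspark.sql import SparkSession

# Broker address and topic name are hypothetical stand-ins.
spark = (SparkSession.builder
         .appName("kafka-batch-ingest")
         .getOrCreate())

# Batch read of everything currently in the topic; a streaming job would
# use spark.readStream with the same options instead.
df = (spark.read
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "sensor-events")
      .option("startingOffsets", "earliest")
      .load())

# Kafka rows arrive as binary key/value pairs; cast before processing.
events = df.selectExpr("CAST(value AS STRING) AS value")
print(events.count())   # the count runs distributed on the cluster executors
spark.stop()
```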

A Distributed VOD Server Based on Virtual Interface Architecture and Interval Cache (버추얼 인터페이스 아키텍처 및 인터벌 캐쉬에 기반한 분산 VOD 서버)

  • Oh, Soo-Cheol;Chung, Sang-Hwa
    • Journal of KIISE: Computer Systems and Theory / v.33 no.10 / pp.734-745 / 2006
  • This paper presents a PC cluster-based distributed VOD server that minimizes the load on the interconnection network by adopting the VIA communication protocol and the interval cache algorithm. Video data is distributed across the disks of the distributed VOD server; each server node receives the data through the interconnection network and sends it to clients. The load on the interconnection network grows with the large amount of video data transferred. We developed a distributed VOD file system based on VIA to minimize the cost of using the interconnection network when accessing remote disks. VIA is a user-level communication protocol that removes the overhead of TCP/IP. This paper also improved the performance of the interconnection network by expanding the maximum transfer size of VIA. In addition, the interval cache reduces traffic on the interconnection network by caching, in main memory, the video data transferred from the disks of remote server nodes. Experiments with this distributed VOD server on a four-node PC cluster showed a maximum performance improvement of 21.3% compared with a distributed VOD server without VIA and the interval cache.
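
The interval cache idea can be sketched briefly: when two clients play the same video a short interval apart, the blocks read by the leading stream are held in memory just long enough for the trailing stream to consume them, so the trailing stream avoids remote-disk traffic entirely. The class below is an illustrative model, not the paper's implementation.

```python
from collections import deque

class IntervalCache:
    """Sketch of interval caching for a VOD server."""

    def __init__(self, interval_blocks):
        self.window = deque()            # (block_id, data), newest last
        self.capacity = interval_blocks  # interval length between the streams

    def leading_read(self, block_id, remote_disk):
        data = remote_disk[block_id]     # leading stream pays the disk cost
        self.window.append((block_id, data))
        if len(self.window) > self.capacity:
            self.window.popleft()        # block has aged past the interval
        return data

    def trailing_read(self, block_id, remote_disk):
        for bid, data in self.window:
            if bid == block_id:
                return data              # served from memory: no network/disk I/O
        return remote_disk[block_id]     # fell outside the cached interval

disk = {i: b"frame-%d" % i for i in range(100)}
cache = IntervalCache(interval_blocks=5)
for i in range(10):
    cache.leading_read(i, disk)          # client A runs ahead of client B
print(cache.trailing_read(7, disk))      # client B hits the interval cache
```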