• Title/Summary/Keyword: MAHA Supercomputer

Design of MAHA Supercomputing System for Human Genome Analysis (대용량 유전체 분석을 위한 고성능 컴퓨팅 시스템 MAHA)

  • Kim, Young Woo;Kim, Hong-Yeon;Bae, Seungjo;Kim, Hag-Young;Woo, Young-Choon;Park, Soo-Jun;Choi, Wan
    • KIPS Transactions on Software and Data Engineering, v.2 no.2, pp.81-90, 2013
  • During the past decade, many new technologies have been attempted in the computing area, and their development continues. The brick walls in computing, especially the power wall, have changed the computing paradigm, from computing hardware (including processor and system architecture) to the programming environment and application usage. The high performance computing (HPC) area in particular has experienced dramatic changes, and it is now considered a key to national competitiveness. In the late 2000s, many leading countries rushed to develop Exascale supercomputing systems, and as a result systems of tens of PetaFLOPS are now prevalent. In Korea, ICT is well developed and Korea is considered one of the leading countries in the world, but not in the supercomputing area. In this paper, we describe the architecture design of the MAHA supercomputing system, which aims to deliver a 300 TeraFLOPS system for bioinformatics applications such as human genome analysis and protein-protein docking. The MAHA supercomputing system consists of four major parts: computing hardware, file system, system software, and bio-applications. It is designed to use heterogeneous computing accelerators (co-processors such as GPGPUs and MICs) to improve performance/$, performance/area, and performance/power. To provide high-speed data movement and large capacity, the MAHA file system is designed with an asymmetric cluster architecture and consists of a metadata server, data servers, and a client file system on top of SSD and MAID storage servers. The MAHA system software is designed for user-friendliness and ease of use, based on integrated system management components such as Bio Workflow management, Integrated Cluster management, and Heterogeneous Resource management. The MAHA supercomputing system was first installed in Dec. 2011. Its theoretical performance was 50 TeraFLOPS, with a measured performance of 30.3 TeraFLOPS on 32 computing nodes. The MAHA system will be upgraded to 100 TeraFLOPS in Jan. 2013.
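
The performance figures quoted in the abstract can be related with simple arithmetic. The sketch below uses only the three numbers stated above (50 TeraFLOPS theoretical, 30.3 TeraFLOPS measured, 32 nodes); the function names are illustrative, and it assumes the measured and theoretical figures refer to the same 32-node configuration, which the abstract implies but does not state explicitly.

```python
# Figures taken from the abstract; everything derived below is plain arithmetic.
THEORETICAL_TFLOPS = 50.0  # theoretical performance of the Dec. 2011 installation
MEASURED_TFLOPS = 30.3     # measured performance reported in the abstract
NODES = 32                 # number of computing nodes


def efficiency(measured: float, peak: float) -> float:
    """Return measured performance as a fraction of theoretical performance."""
    return measured / peak


def per_node(total: float, nodes: int) -> float:
    """Return the average share of a system-wide figure per computing node."""
    return total / nodes


if __name__ == "__main__":
    eff = efficiency(MEASURED_TFLOPS, THEORETICAL_TFLOPS)
    print(f"measured / theoretical: {eff:.1%}")
    print(f"theoretical per node:   {per_node(THEORETICAL_TFLOPS, NODES):.4f} TFLOPS")
    print(f"measured per node:      {per_node(MEASURED_TFLOPS, NODES):.4f} TFLOPS")
```

Under that assumption the measured figure is about 60.6% of the theoretical one, or roughly 1.56 TeraFLOPS theoretical per node.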

Workflow-based Bio Data Analysis System for HPC (HPC 환경을 위한 워크플로우 기반의 바이오 데이터 분석 시스템)

  • Ahn, Shinyoung;Kim, ByoungSeob;Choi, Hyun-Hwa;Jeon, Seunghyub;Bae, Seungjo;Choi, Wan
    • KIPS Transactions on Software and Data Engineering, v.2 no.2, pp.97-106, 2013
  • Since the Human Genome Project finished, the cost of human genome analysis has decreased very rapidly, which has led to a sharp increase in the amount of human genome data to be analyzed. As the need for fast analysis of very large bio data such as the human genome grows, non-IT researchers such as biologists should be able to run many kinds of bio applications, each with different characteristics, quickly and effectively in an HPC environment. To this end, a biologist needs to be able to define a sequence of bio applications as a workflow easily, because bio applications generally have to be combined and executed in a particular order. The bio workflow should then be executed as distributed and parallel computation by allocating computing resources efficiently on an HPC cluster system. In this way, better performance and faster response times can be expected for the analysis of very large bio data. This paper proposes a workflow-based data analysis system specialized for bio applications. Using this system, non-IT scientists and researchers can easily analyze very large bio data in an HPC environment.
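
The idea of defining bio applications as an ordered workflow and running independent steps in parallel can be illustrated with a small dependency graph. This is a minimal sketch, not the system proposed in the paper: the step names (`align_sample_a`, `merge`, and so on) are hypothetical, and the standard-library `graphlib.TopologicalSorter` (Python 3.9+) stands in for whatever scheduler the authors actually use.

```python
from graphlib import TopologicalSorter


def parallel_stages(workflow):
    """Group workflow steps into stages.

    `workflow` maps each step to the set of steps it depends on. Every step
    in a returned stage has all of its dependencies satisfied by earlier
    stages, so the steps within one stage could be dispatched in parallel.
    """
    sorter = TopologicalSorter(workflow)
    sorter.prepare()  # raises graphlib.CycleError if the workflow has a cycle
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        stages.append(ready)
        sorter.done(*ready)  # mark the whole stage as finished
    return stages


# Hypothetical workflow: two alignments feed a merge, then variant calling.
example_workflow = {
    "align_sample_a": set(),
    "align_sample_b": set(),
    "merge": {"align_sample_a", "align_sample_b"},
    "call_variants": {"merge"},
}

if __name__ == "__main__":
    for number, stage in enumerate(parallel_stages(example_workflow), start=1):
        print(f"stage {number}: {', '.join(stage)}")
```

Here the two alignment steps land in the same stage and could be allocated to different nodes, while `merge` and `call_variants` must wait for their inputs. A real scheduler would also account for resource requirements per step, which this sketch ignores.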