• Title/Summary/Keyword: 매니코어 시스템

Search Result 35, Processing Time 0.024 seconds

Performance evaluation of collective I/O on an SMP supercomputer (SMP 슈퍼컴퓨터에서의 집합 IO 성능)

  • Cha, Kwangho;Kim, Sungho;Lee, Sik
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2010.11a
    • /
    • pp.1732-1734
    • /
    • 2010
  • 멀티 코어 또는 매니 코어 기반의 HPC 시스템 보급이 늘어나면서 HPC 어플리케이션이 사용하는 프로세스의 수 또한 증가하고 있다. 이런 경우, 기존의 IO 방식이 아닌 병렬 IO 의 사용을 고려하여야 하는데 그 중 특히 집합 IO 는 중요한 역할을 수행한다. 본 연구에서는 IBM Power 595 기반 슈퍼 컴퓨터에서 집합 IO 특성을 알아 본다.

Efficient Task Distribution for Pig Monitoring Applications Using OpenCL (OpenCL을 이용한 돈사 감시 응용의 효율적인 태스크 분배)

  • Kim, Jinseong;Choi, Younchang;Kim, Jaehak;Chung, Yeonwoo;Chung, Yongwha;Park, Daihee;Kim, Hakjae
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.6 no.10
    • /
    • pp.407-414
    • /
    • 2017
  • Pig monitoring applications consisting of many tasks can take advantage of inherent data parallelism and enable parallel processing using performance accelerators. In this paper, we propose a task distribution method for pig monitoring applications into a heterogenous computing platform consisting of a multicore-CPU and a manycore-GPU. That is, a parallel program written in OpenCL is developed, and then the most suitable processor is determined based on the measured execution time of each task. The proposed method is simple but very effective, and can be applied to parallelize other applications consisting of many tasks on a heterogeneous computing platform consisting of a CPU and a GPU. Experimental results show that the performance of the proposed task distribution method on three different heterogeneous computing platforms can improve the performance of the typical GPU-only method where every tasks are executed on a deviceGPU by a factor of 1.5, 8.7 and 2.7, respectively.

Trends in Unikernel and Its Application to Manycore Systems (유니커널의 동향과 매니코어 시스템에 적용)

  • Cha, S.J.;Jeon, S.H.;Ramneek, Ramneek;Kim, J.M.;Jeong, Y.J.;Jung, S.I.
    • Electronics and Telecommunications Trends
    • /
    • v.33 no.6
    • /
    • pp.129-138
    • /
    • 2018
  • As recent applications are requiring more CPUs for their performance, manycore systems have evolved. Since existing operating systems do not provide performance scalability in manycore systems, Azalea, a multi-kernel based system, has been developed for supporting performance scalability. Unikernel is a new operating system technology starting with the concept of a library OS. Applying unikernel to Azalea enables an improvement in performance. In this paper, we first analyze the current technology trends of unikernel, and then discuss the applications and effects of unikernel to Azalea. Azalea-unikernel was built in a single image consisting of libOS, runtime libraries, and an application, and executed with the desired number of cores and memory size in bare-metal. In particular, it supports source and binary compatibility such that existing linux binaries can be rebuilt and executed in Azalea-unikernel, and already built binaries can be run immediately without modification with a better performance. It not only achieves a performance enhancement, it is also a more secure OS for manycore systems.

Design of MAHA Supercomputing System for Human Genome Analysis (대용량 유전체 분석을 위한 고성능 컴퓨팅 시스템 MAHA)

  • Kim, Young Woo;Kim, Hong-Yeon;Bae, Seungjo;Kim, Hag-Young;Woo, Young-Choon;Park, Soo-Jun;Choi, Wan
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.2 no.2
    • /
    • pp.81-90
    • /
    • 2013
  • During the past decade, many changes and attempts have been tried and are continued developing new technologies in the computing area. The brick wall in computing area, especially power wall, changes computing paradigm from computing hardwares including processor and system architecture to programming environment and application usage. The high performance computing (HPC) area, especially, has been experienced catastrophic changes, and it is now considered as a key to the national competitiveness. In the late 2000's, many leading countries rushed to develop Exascale supercomputing systems, and as a results tens of PetaFLOPS system are prevalent now. In Korea, ICT is well developed and Korea is considered as a one of leading countries in the world, but not for supercomputing area. In this paper, we describe architecture design of MAHA supercomputing system which is aimed to develop 300 TeraFLOPS system for bio-informatics applications like human genome analysis and protein-protein docking. MAHA supercomputing system is consists of four major parts - computing hardware, file system, system software and bio-applications. MAHA supercomputing system is designed to utilize heterogeneous computing accelerators (co-processors like GPGPUs and MICs) to get more performance/$, performance/area, and performance/power. To provide high speed data movement and large capacity, MAHA file system is designed to have asymmetric cluster architecture, and consists of metadata server, data server, and client file system on top of SSD and MAID storage servers. MAHA system softwares are designed to provide user-friendliness and easy-to-use based on integrated system management component - like Bio Workflow management, Integrated Cluster management and Heterogeneous Resource management. MAHA supercomputing system was first installed in Dec., 2011. The theoretical performance of MAHA system was 50 TeraFLOPS and measured performance of 30.3 TeraFLOPS with 32 computing nodes. MAHA system will be upgraded to have 100 TeraFLOPS performance at Jan., 2013.

Improving Haskell GC-Tuning Time Using Divide-and-Conquer (분할 정복법을 이용한 Haskell GC 조정 시간 개선)

  • An, Hyungjun;Kim, Hwamok;Liu, Xiao;Kim, Yeoneo;Byun, Sugwoo;Woo, Gyun
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.6 no.9
    • /
    • pp.377-384
    • /
    • 2017
  • The performance improvement of a single core processor has reached its limit since the circuit density cannot be increased any longer due to overheating. Therefore, the multicore and manycore architectures have emerged as viable approaches and parallel programming becomes more important. Haskell, a purely functional language, is getting popular in this situation since it naturally supports parallel programming owing to its beneficial features including the implicit parallelism in evaluating expressions and the monadic tools supporting parallel constructs. However, the performance of Haskell parallel programs is strongly influenced by the performance of the run-time system including the garbage collector. Though a memory profiling tool namely GC-tune has been suggested, we need a more systematic way to use this tool. Since GC-tune finds the optimal memory size by executing the target program with all the different possible GC options, the GC-tuning time takes too long. This paper suggests a basic divide-and-conquer method to reduce the number of GC-tune executions by reducing the search area by one-quarter for every searching step. Applying this method to two parallel programs, a maximally independent set and a K-means programs, the memory tuning time is reduced by 7.78 times with accuracy 98% on average.