• Title/Summary/Keyword: manycore

Trends of Operating Systems for Manycore (Manycore 운영체제 동향)

  • Jeong, J.H.;Koh, K.W.;Cha, S.J.;Kim, K.H.;Kim, J.M.;Jung, S.J.
    • Electronics and Telecommunications Trends / v.29 no.5 / pp.176-185 / 2014
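  • Recent processors use advances in circuit integration to increase the number of cores rather than to raise clock speed. Four- and eight-core processors are now in wide use, and server-class processors with 15 and 18 cores have been released. Within the next few years, manycore systems are expected to pass 128 cores and reach hundreds or even thousands of cores. By contrast, the operating system, the software that manages the processor, is still optimized for a small number of cores. This paper examines the problems that current operating systems face on manycore systems and introduces several operating systems that research laboratories around the world have proposed to address them, thereby surveying how operating systems are changing in response to manycore systems.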

Research Status and Plan for Manycore Operating System (매니코어 운영체제 연구현황 및 계획)

  • Jung, Sungin;Kim, Taesoo;Min, Changwoo;Park, Sungyong;Byun, Sugwoo;Seo, Euiseong;Woo, Gyun;Lee, Kyoungwoo;Lee, Jaewook;Rim, Sung-Soo;Im, Eun-Jin;Jo, Heeseung;Jin, Hyun-Wook
    • Electronics and Telecommunications Trends / v.32 no.6 / pp.83-95 / 2017
  • Manycore hardware has recently evolved more quickly than expected. However, the operating system, the software that manages computer resources, is still optimized for multicore systems. To address this issue, we started a research project called 'Research on High Performance and Scalable Manycore Operating Systems' in 2014. This article briefly examines the technology trends of manycore hardware and operating systems, and introduces the research areas and outcomes of the first stage of the project (2014-2017). The core technologies that improve the performance scalability of manycore systems are publicly available, and anyone can use the source code or apply the ideas behind the core techniques to other research activities. The research plans for the second stage of the project (2018-2021) are also included.

Meshfree/GFEM in hardware-efficiency prospective

  • Tian, Rong
    • Interaction and multiscale mechanics / v.6 no.2 / pp.197-210 / 2013
  • A fundamental trend in processor architecture on the path to exaflops is rapidly increasing floating-point performance (so-called "free" flops) accompanied by much more slowly increasing memory and network bandwidth. To take full advantage of the "free" flops, a numerical algorithm for PDEs should perform more flops per byte, that is, increase its arithmetic intensity. A meshfree/GFEM approximation can be one such class of algorithm. It is shown, for a GFEM without extra dof, that this kind of approximation takes advantage of the high performance of manycore GPUs through its high accuracy of approximation; the "expensive" method is found to be, conversely, hardware-efficient on the emerging manycore architecture.
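
The flops-per-byte argument in this abstract can be illustrated with a small calculation. The sketch below is not taken from the paper: it uses the roofline bound as a stand-in model, and the machine figures (`PEAK`, `BW`) and both kernel workloads are hypothetical numbers chosen only to show why a higher arithmetic intensity lets a kernel use more of the available floating-point performance.

```python
def arithmetic_intensity(flops: float, bytes_moved: float) -> float:
    """Floating-point operations performed per byte of memory traffic."""
    if bytes_moved <= 0:
        raise ValueError("bytes_moved must be positive")
    return flops / bytes_moved


def attainable_gflops(intensity: float, peak_gflops: float, bandwidth_gbs: float) -> float:
    """Roofline bound: the lower of the compute ceiling and the bandwidth ceiling."""
    return min(peak_gflops, intensity * bandwidth_gbs)


# Hypothetical machine (assumption): 1000 GFLOP/s peak, 100 GB/s memory bandwidth.
PEAK, BW = 1000.0, 100.0

# Hypothetical kernels (assumption): both move 8 GB, but do different amounts of work.
low = arithmetic_intensity(flops=2.0e9, bytes_moved=8.0e9)     # 0.25 flop/byte
high = arithmetic_intensity(flops=160.0e9, bytes_moved=8.0e9)  # 20 flop/byte

print(attainable_gflops(low, PEAK, BW))   # bandwidth-bound
print(attainable_gflops(high, PEAK, BW))  # compute-bound
```

Under these assumed numbers the low-intensity kernel is limited by memory bandwidth, while the high-intensity kernel reaches the compute ceiling, which is the sense in which a method that costs more flops can still be the hardware-efficient choice.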

RPSim: A Generic Real-Time Performance Simulator for Manycore (RPSim: Manycore 를 위한 범용 실시간 성능 시뮬레이터)

  • Byung Kwan Jung;Sunwoo Lee;Jimin Kim;Minsoo Ryu
    • Proceedings of the Korea Information Processing Society Conference / 2008.11a / pp.924-927 / 2008
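  • Predicting task response times is regarded as the most important problem in real-time system development. In manycore environments, however, response times are very hard to predict, and satisfactory results have not been obtained. Methods that predict the worst-case response time by taking scheduling and synchronization policies into account have been proposed, but they assume quite restrictive task models, which makes them hard to apply in practice, and their predictions deviate considerably from the system's actual response times. Simulation, in contrast, yields relatively accurate response-time predictions by simulating the scheduling state of the system. This paper therefore proposes a generic yet highly effective simulation technique for manycore systems. The merit of the proposed technique is confirmed through experiments that measure the simulation time required as the system model changes.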

Trends in Unikernel and Its Application to Manycore Systems (유니커널의 동향과 매니코어 시스템에 적용)

  • Cha, S.J.;Jeon, S.H.;Ramneek, Ramneek;Kim, J.M.;Jeong, Y.J.;Jung, S.I.
    • Electronics and Telecommunications Trends / v.33 no.6 / pp.129-138 / 2018
  • As recent applications require more CPUs for performance, manycore systems have evolved. Since existing operating systems do not provide performance scalability on manycore systems, Azalea, a multi-kernel based system, has been developed to support performance scalability. Unikernel is a new operating system technology that starts from the concept of a library OS. Applying unikernel to Azalea enables an improvement in performance. In this paper, we first analyze the current technology trends of unikernel, and then discuss the application of unikernel to Azalea and its effects. Azalea-unikernel was built as a single image consisting of libOS, runtime libraries, and an application, and was executed with the desired number of cores and memory size on bare metal. In particular, it supports source and binary compatibility, such that existing Linux binaries can be rebuilt and executed in Azalea-unikernel, and already built binaries can be run immediately without modification and with better performance. It not only achieves a performance enhancement but is also a more secure OS for manycore systems.

Area-constrained NTC Manycore Architecture Design Methodology (면적 제약 조건을 고려한 NTC 매니코어 설계 방법론)

  • Chang, Jin Kyu;Han, Tae Hee
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.10a / pp.866-869 / 2015
  • With advances in semiconductor technology, the number of elements that can be integrated in a system-on-chip (SoC) increases exponentially, and thus voltage scaling is indispensable for enhancing energy efficiency. Near-threshold voltage computing (NTC) improves energy efficiency by an order of magnitude, and is therefore able to overcome the limitations of conventional super-threshold voltage computing (STC). Although an NTC-based low-performance manycore system can be used to maximize energy efficiency, it demands a larger number of cores to sustain performance, which results in a considerable increase in area. In this paper, we analyze the NTC manycore architecture considering the trade-offs between performance, power, and area. We then propose an algorithmic methodology that optimizes power consumption and area while satisfying the required performance by determining the constrained number of cores and the sizes of caches and clusters in an NTC environment. Experimental results show that the proposed NTC architecture can reduce power consumption by approximately 16.5% while maintaining the performance of an STC core under the area constraint.

Voltage and Frequency Tuning Methodology for Near-Threshold Manycore Computing using Critical Path Delay Variation

  • Li, Chang-Lin;Kim, Hyun Joong;Heo, Seo Weon;Han, Tae Hee
    • JSTS: Journal of Semiconductor Technology and Science / v.15 no.6 / pp.678-684 / 2015
  • Near-threshold computing (NTC) is now regarded as a promising candidate for innovative power reduction, which cannot be achieved with conventional super-threshold computing (STC). However, performance degradation and vulnerability to process variation in the NTC regime are the primary concerns. In this paper, we propose a voltage- and frequency-tuning methodology for mitigating the process-variation-induced problems in NTC-based manycore architectures. To implement the proposed methodology, we build up multiple-voltage multiple-frequency (MVMF) islands and apply a voltage-frequency tuning algorithm based on the critical-path monitoring technique to reduce the effects of process variation and maximize energy efficiency in the post-silicon stage. Experimental results show that the proposed methodology reduces overall power consumption by 8.2-20.0%, compared to existing methods in variation-sensitive NTC environments.

System-Call-Level Core Affinity for Improving Network Performance (네트워크 성능향상을 위한 시스템 호출 수준 코어 친화도)

  • Uhm, Junyong;Cho, Joong-Yeon;Jin, Hyun-Wook
    • KIISE Transactions on Computing Practices / v.23 no.1 / pp.80-84 / 2017
  • Existing operating systems experience scalability issues as the number of cores increases. On manycore systems, network I/O performance is limited mainly by cache consistency costs and locking overheads. Existing approaches to this issue include new microkernel-like operating systems and modifications of existing kernels; however, these solutions are not fully application-transparent. In this study, we propose a library that improves network performance by separating the system call context from the user context and by applying core affinity without any kernel or application modifications. Experimental results show that our implementation can improve the network throughput of Apache by up to 30%.
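
Core affinity, the mechanism this abstract builds on, can be shown in a few lines. The sketch below is not the authors' library and does not separate system call context from user context; it only pins the calling process to a single core using Python's standard `os.sched_setaffinity`, which is available on Linux and some other Unix platforms. The helper name `pin_to_core` is hypothetical.

```python
import os


def pin_to_core(core_id: int) -> set:
    """Restrict the calling process to one core and return the resulting CPU set.

    Hypothetical helper for illustration; it only applies core affinity.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("CPU affinity calls are not available on this platform")
    allowed = os.sched_getaffinity(0)  # pid 0 refers to the calling process
    if core_id not in allowed:
        raise ValueError(f"core {core_id} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(0, {core_id})
    return os.sched_getaffinity(0)


if __name__ == "__main__" and hasattr(os, "sched_getaffinity"):
    first = min(os.sched_getaffinity(0))  # pick a core the process may already use
    print(pin_to_core(first))
```

Keeping related work on one core tends to reduce cache line movement between cores, which is one of the costs the abstract names as a limiting factor.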

Using the On-Package Memory of Manycore Processor for Improving Performance of MPI Intra-Node Communication (MPI 노드 내 통신 성능 향상을 위한 매니코어 프로세서의 온-패키지 메모리 활용)

  • Cho, Joong-Yeon;Jin, Hyun-Wook;Nam, Dukyun
    • Journal of KIISE / v.44 no.2 / pp.124-131 / 2017
  • The emerging next-generation manycore processors for high-performance computing are equipped with a high-bandwidth on-package memory along with the traditional host memory. The Multi-Channel DRAM (MCDRAM), for example, is the on-package memory of the Intel Xeon Phi Knights Landing (KNL) processor, and theoretically provides four times the bandwidth of conventional DDR4 memory. In this paper, we suggest a mechanism that exploits MCDRAM to improve the performance of MPI intra-node communication. The experimental results show that MPI intra-node communication performance can be improved by up to 272% compared with the case where DDR4 is used. Moreover, we analyze not only the performance impact of different MCDRAM-utilization mechanisms, but also that of core affinity for processes.

Tuning the Performance of Haskell Parallel Programs Using GC-Tune (GC-Tune을 이용한 Haskell 병렬 프로그램의 성능 조정)

  • Kim, Hwamok;An, Hyungjun;Byun, Sugwoo;Woo, Gyun
    • KIISE Transactions on Computing Practices / v.23 no.8 / pp.459-465 / 2017
  • Although the performance of computer hardware is increasing due to the development of manycore technologies, software lacks a proportional increase in throughput. Functional languages can be a viable alternative for improving the performance of parallel programs, since such languages have inherent parallelism in evaluating pure expressions without side effects. Specifically, Haskell is notably popular for parallel programming because it provides easy-to-use parallel constructs based on monads. However, the scalability of parallel programs in Haskell tends to fluctuate as the number of cores increases, and the garbage collector is suspected to be the source of these fluctuations because it affects both the space and the time needed to execute the programs. This paper uses the tuning tool GC-Tune to improve the scalability of performance. Our experiment was conducted with a parallel plagiarism detection program, and the scalability improved. Specifically, the fluctuation range of the speedup was narrowed by 39% compared to the original execution of the program without any tuning.