• Title/Summary/Keyword: heterogeneous multicore

Reevaluating the overhead of data preparation for asymmetric multicore system on graphics processing

  • Pei, Songwen;Zhang, Junge;Jiang, Linhua;Kim, Myoung-Seo;Gaudiot, Jean-Luc
    • KSII Transactions on Internet and Information Systems (TIIS), v.10 no.7, pp.3231-3244, 2016
  • As processor design transitions from homogeneous to heterogeneous multicore processors, the traditional Amdahl's law cannot meet the new challenges posed by asymmetric multicore systems. To further investigate the factors that affect the Overhead of Data Preparation (ODP) for asymmetric multicore systems, we evaluate an asymmetric multicore system built from a CPU and a GPU by measuring the overheads of memory transfer, kernel computation, cache misses, and synchronization. This paper demonstrates that decreasing the overhead of data preparation is a promising approach to improving the overall performance of a heterogeneous system.
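
As a rough illustration of why data preparation matters, the sketch below adds a data-preparation term to the classic Amdahl speedup formula. The function name, the parameter names, and the simple additive form are assumptions made here for illustration; the abstract does not give the paper's actual model.

```python
def speedup_with_odp(parallel_fraction, n_cores, prep_fraction=0.0):
    """Amdahl-style speedup with an extra data-preparation term.

    Assumed model (not taken from the paper): single-core run time is
    normalized to 1, the parallel part scales perfectly over n_cores, and
    data preparation adds a fixed cost prep_fraction that does not shrink
    as cores are added.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if n_cores < 1 or prep_fraction < 0.0:
        raise ValueError("n_cores must be >= 1 and prep_fraction >= 0")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores + prep_fraction)


# Classic Amdahl (no preparation cost) versus a 10% preparation overhead.
ideal = speedup_with_odp(0.9, 8)
with_odp = speedup_with_odp(0.9, 8, prep_fraction=0.1)
print(f"ideal: {ideal:.2f}x, with ODP: {with_odp:.2f}x")
```

Under these assumed numbers, a preparation cost equal to 10% of the original run time cuts the achievable speedup on 8 cores by roughly a third, which is the kind of effect the abstract argues is worth reducing.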

Analysis on the Performance Impact of Partitioned LLC for Heterogeneous Multicore Processors (이종 멀티코어 프로세서에서 분할된 공유 LLC가 성능에 미치는 영향 분석)

  • Moon, Min Goo;Kim, Cheol Hong
    • The Journal of Korean Institute of Next Generation Computing, v.15 no.2, pp.39-49, 2019
  • CPU-GPU integrated heterogeneous multicore processors have recently come into wide use for improving the performance of computing systems. These processors integrate CPUs and GPUs on a single chip, where the CPUs and GPUs share the LLC (Last Level Cache). Sharing causes serious cache contention inside the processor, resulting in significant performance degradation. In this paper, we propose a partitioned LLC architecture to solve the cache contention problem in heterogeneous multicore processors. We analyze the performance impact of varying the LLC size assigned to the CPUs and to the GPUs. According to our simulation results, CPU performance improves by up to 21% as the LLC size assigned to the CPU grows. The GPU, however, shows a negligible performance difference when its assigned LLC size increases. In other words, the GPU is unlikely to lose much performance when its LLC size decreases. Because the performance lost by shrinking the GPU's LLC is much smaller than the performance gained by enlarging the CPU's LLC, applying a partitioned LLC to CPUs and GPUs is expected to improve the overall performance of heterogeneous multicore processors. In addition, a future memory management technique that maximizes the performance of each core could greatly improve the performance of heterogeneous multicore processors.

Hybrid parallel programming for Heterogeneous Multi-core performance optimization (헤테로지니어스 멀티코어 성능 최적화를 위한 하이브리드 병렬 프로그래밍)

  • Lim, Ju-Ho
    • Proceedings of the Korean Information Science Society Conference, 2012.06a, pp.7-9, 2012
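  • CPU designers long pursued performance by raising clock speeds on single-core designs; once that approach reached its limits, CPUs evolved into multicore designs with several cores on one chip. To improve performance further, CPUs are now combined with GPUs, which were originally built for 3D graphics computation. Combining a CPU with a GPU has shown much better performance than combining CPUs with each other, consumes less power, and is also more economical. In this paper, we combine and tune existing parallelization models to optimize performance on a CPU-GPU heterogeneous multicore. On the CPU, we experimented with combining existing parallel programming models: SIMD, the shared-memory parallel programming model, and the message-passing parallel programming model. On the GPU, we optimized CUDA. By optimizing and combining the CPU and GPU in this way, we propose a heterogeneous multicore performance optimization method for applications that demand high-performance computation.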

Heterogeneous multi-core simulator based on SMP for the efficient application development at the heterogenous multi-core environment (효과적인 이기종 다중코어 응용 개발을 위한 SMP기반 이기종 다중코어 시뮬레이터)

  • SaKong, June;Shin, Dongha
    • The Journal of the Institute of Internet, Broadcasting and Communication, v.18 no.3, pp.111-117, 2018
  • A heterogeneous multi-core environment, which integrates cores with different functions, is a powerful tool for embedded systems that have become more complex and diverse. Specialized applications require a one-chip solution that runs different operating systems on different cores. This heterogeneity, however, makes the development environment difficult to configure and makes software hard to develop and test. We show that a heterogeneous multi-core processing environment can be mapped onto a symmetric multi-core environment. We build a Linux-based RPMsg for data exchange between processes, similar to the heterogeneous multi-core RPMsg, and show experimentally that the proposed environment can reduce the number of steps in heterogeneous multi-core application development. With this simplification, we suggest a simulation method that eases development and debugging for the heterogeneous multicore environment by reducing its complex steps to simple ones.

An Efficient Load Balancing Technique in a Multicore Mobile System (멀티코어 모바일 시스템에서 효과적인 부하 균등화 기법)

  • Cho, Jungseok;Cho, Doosan
    • KIPS Transactions on Computer and Communication Systems, v.4 no.5, pp.153-160, 2015
  • The effectiveness of a multicore system depends on how efficiently its scheduler assigns tasks to cores. On a heterogeneous multicore platform, the execution time of an application depends on which core it runs on. In other words, effective task assignment is an important factor in a multicore system's performance. This work proposes a load scheduling technique that profiles the execution time of each task. The profiling results provide the basic information needed to predict which task-to-core mapping is likely to give the best performance. Using this information, the proposed technique achieves a performance gain of about 26%.
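
The idea of choosing a task-to-core mapping from profiled execution times can be sketched as a greedy scheduler. This is a minimal illustration, not the authors' technique: the function name, the "big"/"little" core labels, and the profile numbers are hypothetical, and a longest-task-first, earliest-finish heuristic is swapped in because the abstract does not describe the actual algorithm.

```python
def assign_tasks(profile):
    """Greedily map tasks to cores using profiled execution times.

    profile: {task: {core: profiled_time}} (hypothetical data layout).
    Returns (mapping, makespan), where makespan is the busiest core's load.
    """
    # Accumulated load per core, starting from idle.
    core_busy = {core: 0.0 for times in profile.values() for core in times}
    mapping = {}
    # Place the longest tasks first, ranked by their best-case time.
    order = sorted(profile, key=lambda t: min(profile[t].values()), reverse=True)
    for task in order:
        times = profile[task]
        # Pick the core on which this task would finish earliest.
        core = min(times, key=lambda c: core_busy[c] + times[c])
        core_busy[core] += times[core]
        mapping[task] = core
    return mapping, max(core_busy.values())


# Hypothetical profiling results, in arbitrary time units.
profile = {
    "a": {"big": 2.0, "little": 6.0},
    "b": {"big": 3.0, "little": 4.0},
    "c": {"big": 1.0, "little": 2.0},
}
mapping, makespan = assign_tasks(profile)
print(mapping, makespan)
```

The heuristic is not guaranteed to be optimal; it only shows how per-core profiling data can drive the mapping decision.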

Efficient Task Distribution for Pig Monitoring Applications Using OpenCL (OpenCL을 이용한 돈사 감시 응용의 효율적인 태스크 분배)

  • Kim, Jinseong;Choi, Younchang;Kim, Jaehak;Chung, Yeonwoo;Chung, Yongwha;Park, Daihee;Kim, Hakjae
    • KIPS Transactions on Computer and Communication Systems, v.6 no.10, pp.407-414, 2017
  • Pig monitoring applications, which consist of many tasks, can exploit their inherent data parallelism and be processed in parallel on performance accelerators. In this paper, we propose a method for distributing the tasks of pig monitoring applications across a heterogeneous computing platform consisting of a multicore CPU and a manycore GPU. Specifically, a parallel program is written in OpenCL, and the most suitable processor for each task is then determined from that task's measured execution time. The proposed method is simple but very effective, and it can be applied to parallelize other many-task applications on a heterogeneous computing platform consisting of a CPU and a GPU. Experimental results on three different heterogeneous computing platforms show that the proposed task distribution method outperforms the typical GPU-only method, in which every task is executed on the GPU device, by factors of 1.5, 8.7, and 2.7, respectively.
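
The selection step described above, picking the processor with the lowest measured execution time for each task, can be sketched as follows. The function name, task names, and timings are hypothetical, and the sketch assumes tasks run one after another so that total time is a plain sum; it does not use OpenCL and is not the authors' code.

```python
def distribute(measured):
    """Place each task on the device with the lowest measured time.

    measured: {task: {"cpu": seconds, "gpu": seconds}} (hypothetical layout).
    Returns (placement, speedup_over_gpu_only), assuming sequential tasks.
    """
    # For each task, choose the device whose measured time is smallest.
    placement = {task: min(times, key=times.get) for task, times in measured.items()}
    chosen_total = sum(measured[task][dev] for task, dev in placement.items())
    # Baseline: every task runs on the GPU.
    gpu_only_total = sum(times["gpu"] for times in measured.values())
    return placement, gpu_only_total / chosen_total


# Hypothetical measurements in seconds.
measured = {
    "decode": {"cpu": 2.0, "gpu": 5.0},
    "filter": {"cpu": 8.0, "gpu": 1.0},
}
placement, speedup = distribute(measured)
print(placement, f"{speedup:.1f}x over GPU-only")
```

With these made-up timings, moving the task that runs poorly on the GPU back to the CPU halves the total time, which mirrors the kind of gain over a GPU-only baseline that the abstract reports.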