• 제목/요약/키워드: Multicore processors

검색결과 49건 처리시간 0.028초

2차원 구조와 3차원 구조에 따른 멀티코어 프로세서의 온도 분석 (Thermal Pattern Comparison between 2D Multicore Processors and 3D Multicore Processors)

  • 최홍준;안진우;장형범;김종면;김철홍
    • 한국컴퓨터정보학회논문지
    • /
    • 제16권9호
    • /
    • pp.1-10
    • /
    • 2011
  • 동작 주파수의 증가는 싱글코어 프로세서의 성능을 크게 향상시키는 반면 전력 소모 증가와 높은 온도로 인한 신뢰성 저하 문제를 유발하고 있다. 최근에는 싱글코어 프로세서의 한계점을 극복하기 위한 대안으로 멀티코어 프로세서가 주로 사용되고 있다. 하지만, 멀티코어 프로세서를 2차원 구조로 설계하는 경우에는 내부 연결망에서의 전송 지연 현상으로 인해 프로세서의 성능 향상이 제약을 받고 있다. 내부 연결망에서의 전송 지연을 줄이기 위한 방안으로 멀티코어 프로세서를 3차원 구조로 설계하는 연구가 최근 큰 주목을 받고 있다. 2차원 구조 멀티코어 프로세서와 비교하여 3차원 구조 멀티코어 프로세서는 성능 향상과 전력 소모 감소의 장점을 지닌 반면, 높은 전력 밀도로 인해 발생된 발열 문제가 프로세서의 신뢰성을 위협하는 문제가 되고 있다. 3차원 멀티코어 프로세서에서 발생되는 발열 문제에 대한 상세한 분석이 제공된다면, 프로세서의 신뢰성을 확보하기 위한 연구 진행에 큰 도움이 될 것으로 기대된다. 그러므로 본 논문에서는 3차원 멀티코어 프로세서의 온도에 밀접하게 연관된 요소인 작업량, 방열판과의 거리, 그리고 적층되는 다이의 개수와 온도 사이의 관계를 자세히 살펴보고 높은 온도가 프로세서의 성능에 미치는 영향 또한 분석하고자 한다. 특히, 2차원 구조 멀티코어 프로세서와 3차원 구조 멀티코어 프로세서에서의 온도 문제를 함께 분석함으로써, 온도 측면에서 효율적인 프로세서 설계를 위한 가이드라인을 제시하고자 한다.

Scratchpad Memory Architectures and Allocation Algorithms for Hard Real-Time Multicore Processors

  • Liu, Yu;Zhang, Wei
    • Journal of Computing Science and Engineering
    • /
    • 제9권2호
    • /
    • pp.51-72
    • /
    • 2015
  • Time predictability is crucial in hard real-time and safety-critical systems. Cache memories, while useful for improving the average-case memory performance, are not time predictable, especially when they are shared in multicore processors. To achieve time predictability while minimizing the impact on performance, this paper explores several time-predictable scratch-pad memory (SPM) based architectures for multicore processors. To support these architectures, we propose the dynamic memory objects allocation based partition, the static allocation based partition, and the static allocation based priority L2 SPM strategy to retain the characteristic of time predictability while attempting to maximize the performance and energy efficiency. The SPM based multicore architectural design and the related allocation methods thus form a comprehensive solution to hard real-time multicore based computing. Our experimental results indicate the strengths and weaknesses of each proposed architecture and the allocation method, which offers interesting on-chip memory design options to enable multicore platforms for hard real-time systems.

Multicore Real-Time Scheduling to Reduce Inter-Thread Cache Interferences

  • Ding, Yiqiang;Zhang, Wei
    • Journal of Computing Science and Engineering
    • /
    • 제7권1호
    • /
    • pp.67-80
    • /
    • 2013
  • The worst-case execution time (WCET) of each real-time task in multicore processors with shared caches can be significantly affected by inter-thread cache interferences. The worst-case inter-thread cache interferences are dependent on how tasks are scheduled to run on different cores. Therefore, there is a circular dependence between real-time task scheduling, the worst-case inter-thread cache interferences, and WCET in multicore processors, which is not the case for single-core processors. To address this challenging problem, we present an offline real-time scheduling approach for multicore processors by considering the worst-case inter-thread interferences on shared L2 caches. Our scheduling approach uses a greedy heuristic to generate safe schedules while minimizing the worst-case inter-thread shared L2 cache interferences and WCET. The experimental results demonstrate that the proposed approach can reduce the utilization of the resulting schedule by about 12% on average compared to the cyclic multicore scheduling approaches in our theoretical model. Our evaluation indicates that the enhanced scheduling approach is more likely to generate feasible and safe schedules with stricter timing constraints in multicore real-time systems.

멀티 코어 프로세서를 위한 저전력 필터 캐쉬 설계 기법 (Low-power Filter Cache Design Technique for Multicore Processors)

  • 박영진;김종면;김철홍
    • 한국컴퓨터정보학회논문지
    • /
    • 제14권12호
    • /
    • pp.9-16
    • /
    • 2009
  • 최신의 멀티코어 프로세서를 설계할 때에는 성능과 함께 전력 효율성이 반드시 고려되어야 한다. 본 논문에서는 싱글 코어 프로세서의 명령어 캐쉬에서 소비되는 전력을 줄이기 위해 사용되는 대표적 기법중 하나인 필터 캐쉬 구조를 멀티 코어 프로세서에 적용하기 위한 새로운 방안을 제시하고자 한다. 명령어 캐쉬는 프로세서 전체에서 소비되는 전력의 상당 부분을 차지하고 있기 때문에, 변형 필터 캐쉬 구조를 이용한 저전력 명령어 캐쉬 설계는 멀티 코어 프로세서의 전력 소비를 줄이는데 있어서 중요한 역할을 담당할 수 있다. 제안하는 변형 필터 캐쉬 구조는 멀티코어 프로세서에서 필터 캐쉬에 대한 희생 캐쉬를 추가함으로써 1차 명령어 캐쉬에 대한 접근 횟수를 감소시키는 방법을 이용하여 명령어 캐쉬에서 소비되는 총전력을 줄일 수 있다. 제안하는 명령어 캐쉬 구조의 효율성을 분석하기 위한 모의 실험 도구로 SimpleScalar시뮬레이터와 CACTI를 사용한다. 모의실험 결과, 제안하는 기술은 멀티코어 프로세서의 명령어 캐쉬에서 소비되는 전력을 기존의 필터 캐쉬 구조와 비교하여 최대 3.4% 감소시킬 수 있음을 확인할 수 있다. 더욱이 제안하는 구조는 기존의 필터 캐쉬 구조에 비해 보다 우수한 성능을 보여준다.

비대칭적 멀티코어 프로세서의 성능 연구 (Performance Study of Asymmetric Multicore Processor Architectures)

  • 이종복
    • 한국인터넷방송통신학회논문지
    • /
    • 제14권3호
    • /
    • pp.163-169
    • /
    • 2014
  • 현재 범용 컴퓨터 시스템을 구축할 때 성능을 높이기 위하여 멀티코어 프로세서가 널리 이용되고 있으며, 멀티코어 프로세서의 구조는 크게 대칭적 구조와 비대칭적 구조로 나뉜다. 비대칭적 멀티코어 프로세서는 크고 복잡한 고성능의 코어와, 작고 간단한 저성능의 프로세서들로 구성되며, 대칭적 멀티코어 프로세서에 비하여 더욱 성능과 효율이 높은 것으로 알려져 있다. 본 논문에서는 다양한 구성을 갖는 비대칭적 쿼드코어 및 옥타코어 프로세서에 대하여 SPEC 2000 벤치마크를 통하여 모의실험을 수행하여 그 성능을 측정하고, 대칭적 쿼드코어 및 옥타코어 프로세서와 그 성능을 비교하였다.

Exploiting Static Non-Uniform Cache Architectures for Hard Real-Time Computing

  • Ding, Yiqiang;Zhang, Wei
    • Journal of Computing Science and Engineering
    • /
    • 제9권4호
    • /
    • pp.177-189
    • /
    • 2015
  • High-performance processors using Non-Uniform Cache Architecture (NUCA) are increasingly used to deal with the growing wire delays in multicore/manycore processors. Due to the convergence of high-performance computing with embedded computing, NUCA caches are expected to benefit high-end embedded systems as well. However, for real-time systems that use multicore processors with NUCA caches, it is crucial to bound worst-case execution time (WCET) accurately and safely. In this paper, we developed a WCET analysis approach by considering the effect of static NUCA caches on WCET. We compared the WCET in real-time applications with different topologies of static NUCA caches. Our experimental results demonstrated that the static NUCA cache could improve the worst-case performance of realtime applications using multicore processor compared to the cache with uniform access time.

Counter-Based Approaches for Efficient WCET Analysis of Multicore Processors with Shared Caches

  • Ding, Yiqiang;Zhang, Wei
    • Journal of Computing Science and Engineering
    • /
    • 제7권4호
    • /
    • pp.285-299
    • /
    • 2013
  • To enable hard real-time systems to take advantage of multicore processors, it is crucial to obtain the worst-case execution time (WCET) for programs running on multicore processors. However, this is challenging and complicated due to the inter-thread interferences from the shared resources in a multicore processor. Recent research used the combined cache conflict graph (CCCG) to model and compute the worst-case inter-thread interferences on a shared L2 cache in a multicore processor, which is called the CCCG-based approach in this paper. Although it can compute the WCET safely and accurately, its computational complexity is exponential and prohibitive for a large number of cores. In this paper, we propose three counter-based approaches to significantly reduce the complexity of the multicore WCET analysis, while achieving absolute safety with tightness close to the CCCG-based approach. The basic counter-based approach simply counts the worst-case number of cache line blocks mapped to a cache set of a shared L2 cache from all the concurrent threads, and compares it with the associativity of the cache set to compute the worst-case cache behavior. The enhanced counter-based approach uses techniques to enhance the accuracy of calculating the counters. The hybrid counter-based approach combines the enhanced counter-based approach and the CCCG-based approach to further improve the tightness of analysis without significantly increasing the complexity. Our experiments on a 4-core processor indicate that the enhanced counter-based approach overestimates the WCET by 14% on average compared to the CCCG-based approach, while its averaged running time is less than 1/380 that of the CCCG-based approach. The hybrid approach reduces the overestimation to only 2.65%, while its running time is less than 1/150 that of the CCCG-based approach on average.

On-Chip Debug Architecture for Multicore Processor

  • Park, Hyeong-Bae;Xu, Jing-Zhe;Kim, Kil-Hyun;Park, Ju-Sung
    • ETRI Journal
    • /
    • 제34권1호
    • /
    • pp.44-54
    • /
    • 2012
  • Because of the intrinsic lack of internal-system observability and controllability in highly integrated multicore processors, very restricted access is allowed for the debugging of erroneous chip behavior. Therefore, the building of an efficient debug function is an important consideration in the design of multicore processors. In this paper, we propose a flexible on-chip debug architecture that embeds a special logic supporting the debug functionality in the multicore processor. It is designed to support run-stop-type debug functions that can halt and control the execution of the multicore processor at breakpoint events and inspect the possible causes of any errors. The debug architecture consists of the following three functional components: the core debug support block, the multicore debug support block, and the debug interface and control block. By embedding this debug infrastructure, the embedded processor cores within the multicore processor can be debugged simultaneously as well as independently. The debug control is performed by employing a JTAG-based scanning operation. We apply this on-chip debug architecture to build a debugger for a prototype multicore processor and demonstrate the validity and scalability of our approach.

Multicore-Aware Code Co-Positioning to Reduce WCET on Dual-Core Processors with Shared Instruction Caches

  • Ding, Yiqiang;Zhang, Wei
    • Journal of Computing Science and Engineering
    • /
    • 제6권1호
    • /
    • pp.12-25
    • /
    • 2012
  • For real-time systems it is important to obtain the accurate worst-case execution time (WCET). Furthermore, how to improve the WCET of applications that run on multicore processors is both significant and challenging as the WCET can be largely affected by the possible inter-core interferences in shared resources such as the shared L2 cache. In order to solve this problem, we propose an innovative approach that adopts a code positioning method to reduce the inter-core L2 cache interferences between the different real-time threads that adaptively run in a multi-core processor by using different strategies. The worst-case-oriented strategy is designed to decrease the worst-case WCET among these threads to as low as possible. The other two strategies aim at reducing the WCET of each thread to almost equal percentage or amount. Our experiments indicate that the proposed multicore-aware code positioning approaches, not only improve the worst-case performance of the real-time threads but also make good tradeoffs between efficiency and fairness for threads that run on multicore platforms.

Comparing Separate and Statically-Partitioned Caches for Time-Predictable Multicore Processors

  • Wu, Lan;Ding, Yiqiang;Zhang, Wei
    • Journal of Computing Science and Engineering
    • /
    • 제8권1호
    • /
    • pp.25-33
    • /
    • 2014
  • In this paper, we quantitatively compare two different time-predictable multicore cache architectures, separate and statically-partitioned caches, through extensive simulation. Current research trends primarily focus on partitioned-cache architectures in order to achieve time predictability for hard real-time multicore based systems, and our experiments reveal that separate caches actually lead to much better performance and energy efficiency when compared to statically-partitioned caches, and both of them are adequate for timing analysis for real-time multicore applications.