• Title/Summary/Keyword: 3차원 멀티코어 프로세서

Search Result 8, Processing Time 0.025 seconds

Thermal Pattern Comparison between 2D Multicore Processors and 3D Multicore Processors (2차원 구조와 3차원 구조에 따른 멀티코어 프로세서의 온도 분석)

  • Choi, Hong-Jun;Ahn, Jin-Woo;Jang, Hyung-Beom;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.16 no.9
    • /
    • pp.1-10
    • /
    • 2011
  • Unfortunately, in current microprocessors, increasing the frequency causes increased power consumption and reduced reliability whereas it improves the performance. To overcome the power and thermal problems in the singlecore processors, multicore processors has been widely used. For 2D multicore processors, interconnection is regarded as one of the major constraints in performance and power efficiency. To reduce the performance degradation and the power consumption in 2D multicore processors, 3D integrated design technique has been studied by many researchers. Compared to 2D multicore processors, 3D multicore processors get the benefits of performance improvement and reduced power consumption by reducing the wire length significantly. However, 3D multicore processors have serious thermal problems due to high power density, resulting in reliability degradation. Detailed thermal analysis for multicore processors can be useful in designing thermal-aware processors. In this paper, we analyze the impact of workload distribution, distance to the heat sink, and number of stacked dies on the processor temperature. We also analyze the effects of the temperature on overall system performance. Especially, this paper presents the guideline for thermal-aware multicore processor design by analyzing the thermal problems in 2D multicore processors and 3D multicore processors.

Analysis of Performance, Energy-efficiency and Temperature for 3D Multi-core Processors according to Floorplan Methods (플로어플랜 기법에 따른 3차원 멀티코어 프로세서의 성능, 전력효율성, 온도 분석)

  • Choi, Hong-Jun;Son, Dong-Oh;Kim, Jong-Myon;Kim, Cheol-Hong
    • The KIPS Transactions:PartA
    • /
    • v.17A no.6
    • /
    • pp.265-274
    • /
    • 2010
  • As the process technology scales down and integration densities continue to increase, interconnection has become one of the most important factors in performance of recent multi-core processors. Recently, to reduce the delay due to interconnection, 3D architecture has been adopted in designing multi-core processors. In 3D multi-core processors, multiple cores are stacked vertically and each core on different layers are connected by direct vertical TSVs(through-silicon vias). Compared to 2D multi-core architecture, 3D multi-core architecture reduces wire length significantly, leading to decreased interconnection delay and lower power consumption. Despite the benefits mentioned above, 3D design technique cannot be practical without proper solutions for hotspots due to high temperature. In this paper, we propose three floorplan schemes for reducing the peak temperature in 3D multi-core processors. According to our simulation results, the proposed floorplan schemes are expected to mitigate the thermal problems of 3D multi-core processors efficiently, resulting in improved reliability. Moreover, processor performance improves by reducing the performance degradation due to DTM techniques. Power consumption also can be reduced by decreased temperature and reduced execution time.

Analysis on the Performance and Temperature of the 3D Quad-core Processor according to Cache Organization (캐쉬 구성에 따른 3차원 쿼드코어 프로세서의 성능 및 온도 분석)

  • Son, Dong-Oh;Ahn, Jin-Woo;Choi, Hong-Jun;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.17 no.6
    • /
    • pp.1-11
    • /
    • 2012
  • As the process technology scales down, multi-core processors cause serious problems such as increased interconnection delay, high power consumption and thermal problems. To solve the problems in 2D multi-core processors, researchers have focused on the 3D multi-core processor architecture. Compared to the 2D multi-core processor, the 3D multi-core processor decreases interconnection delay by reducing wire length significantly, since each core on different layers is connected using vertical through-silicon via(TSV). However, the power density in the 3D multi-core processor is increased dramatically compared to that in the 2D multi-core processor, because multiple cores are stacked vertically. Unfortunately, increased power density causes thermal problems, resulting in high cooling cost, negative impact on the reliability. Therefore, temperature should be considered together with performance in designing 3D multi-core processors. In this work, we analyze the temperature of the cache in quad-core processors varying cache organization. Then, we propose the low-temperature cache organization to overcome the thermal problems. Our evaluation shows that peak temperature of the instruction cache is lower than threshold. The peak temperature of the data cache is higher than threshold when the cache is composed of many ways. According to the results, our proposed cache organization not only efficiently reduces the peak temperature but also reduces the performance degradation for 3D quad-core processors.

Analysis on the Temperature of 3D Multi-core Processors according to Vertical Placement of Core and L2 Cache (코어와 L2 캐쉬의 수직적 배치 관계에 따른 3차원 멀티코어 프로세서의 온도 분석)

  • Son, Dong-Oh;Ahn, Jin-Woo;Park, Jae-Hyung;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.16 no.6
    • /
    • pp.1-10
    • /
    • 2011
  • In designing multi-core processors, interconnection delay is one of the major constraints in performance improvement. To solve this problem, the 3-dimensional integration technology has been adopted in designing multi-core processors. The 3D multi-core architecture can reduce the physical wire length by stacking cores vertically, leading to reduced interconnection delay and reduced power consumption. However, the power density of 3D multi-core architecture is increased significantly compared to the traditional 2D multi-core architecture, resulting in the increased temperature of the processor. In this paper, the floorplan methods which change the forms of vertical placement of the core and the level-2 cache are analyzed to solve the thermal problems in 3D multi-core processors. According to the experimental results, it is an effective way to reduce the temperature in the processor that the core and the level-2 cache are stacked adjacently. Compared to the floorplan where cores are stacked adjacently to each other, the floorplan where the core is stacked adjacently to the level-2 cache can reduce the temperature by 22% in the case of 4-layers, and by 13% in the case of 2-layers.

Analysis on the Performance and Temperature of 3D Multi-core Processors according to TLB Architecture (TLB 구조에 따른 3차원 멀티코어 프로세서의 성능, 온도 분석)

  • Son, Dong-Oh;Choi, Hong-Jun;Kim, Cheol-Hong
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2011.06b
    • /
    • pp.5-8
    • /
    • 2011
  • 3차원 멀티코어 프로세서는 기존의 멀티코어 프로세서에서 문제가 되던 연결망 지연시간과 전력문제를 해결할 수 있는 새로운 프로세서 설계기술이다. 하지만, 전력밀도의 증가로 인해 발생하는 열섬현상은 3차원 멀티코어 프로세서의 새로운 문제점으로 두드러지고 있다. 이러한 문제를 해결하기 위해서 동적 온도 관리 기법이 사용되지만, 동적 온도 관리 기법을 적용하면 시스템에 성능 저하가 발생하게 된다. 따라서 본 논문에서는 3차원 멀티코어 프로세서에서 문제가 되는 열섬현상을 해결하기 위해 고온의 유닛을 대상으로 동적 온도 관리 기법을 적용하고자 한다. 실험대상으로는 시스템 성능에 많은 영향을 미치고 높은 접근 때문에 고온이 발생하는 TLB 유닛을 사용하고자 한다. 특히, 시스템의 성능 저하를 줄이기 위해서 기존의 시스템보다 낮은 성능을 보이는 마이크로 TLB 구조를 적용해 보고자 한다. 성능이 낮은 구조의 경우 일반적으로 더 낮은 온도 분포를 보이며 동적 온도 관리 기법에 영향을 덜 받기 때문에 동적 온도 관리 기법만 적용한 구조보다 더 낮은 성능 저하를 보일 수 있다. 실험결과 동적 온도 관리 기법을 적용한 경우 기존의 시스템에 비해 23.4%의 성능 저하가 발생하고 마이크로 TLB 구조를 적용한 경우 27.1%의 성능 저하가 발생함을 알 수 있다.

Performance Study of Multicore Digital Signal Processor Architectures (멀티코어 디지털 신호처리 프로세서의 성능 연구)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.13 no.4
    • /
    • pp.171-177
    • /
    • 2013
  • Due to the demand for high speed 3D graphic rendering, video file format conversion, compression, encryption and decryption technologies, the importance of digital signal processor system is growing rapidly. In order to satisfy the real-time constraints, high performance digital signal processor is required. Therefore, as in general purpose computer systems, digital signal processor should be designed as multicore architecture as well. Using UTDSP benchmarks as input, the trace-driven simulation has been performed and analyzed for the 2 to 16-core digital signal processor architectures with the cores from simple RISC to in-order and out-of-order superscalar processors for the various window sizes, extensively.

Analysis on the Thermal Efficiency of Branch Prediction Techniques in 3D Multicore Processors (3차원 구조 멀티코어 프로세서의 분기 예측 기법에 관한 온도 효율성 분석)

  • Ahn, Jin-Woo;Choi, Hong-Jun;Kim, Jong-Myon;Kim, Cheol-Hong
    • The KIPS Transactions:PartA
    • /
    • v.19A no.2
    • /
    • pp.77-84
    • /
    • 2012
  • Speculative execution for improving instruction-level parallelism is widely used in high-performance processors. In the speculative execution technique, the most important factor is the accuracy of branch predictor. Unfortunately, complex branch predictors for improving the accuracy can cause serious thermal problems in 3D multicore processors. Thermal problems have negative impact on the processor performance. This paper analyzes two methods to solve the thermal problems in the branch predictor of 3D multi-core processors. First method is dynamic thermal management which turns off the execution of the branch predictor when the temperature of the branch predictor exceeds the threshold. Second method is thermal-aware branch predictor placement policy by considering each layer's temperature in 3D multi-core processors. According to our evaluation, the branch predictor placement policy shows that average temperature is $87.69^{\circ}C$, and average maximum temperature gradient is $11.17^{\circ}C$. And, dynamic thermal management shows that average temperature is $89.64^{\circ}C$ and average maximum temperature gradient is $17.62^{\circ}C$. Proposed branch predictor placement policy has superior thermal efficiency than the dynamic thermal management. In the perspective of performance, the proposed branch predictor placement policy degrades the performance by 3.61%, while the dynamic thermal management degrades the performance by 27.66%.

A Research about Open Source Distributed Computing System for Realtime CFD Modeling (SU2 with OpenCL and MPI) (실시간 CFD 모델링을 위한 오픈소스 분산 컴퓨팅 기술 연구)

  • Lee, Jun-Yeob;Oh, Jong-woo;Lee, DongHoon
    • Proceedings of the Korean Society for Agricultural Machinery Conference
    • /
    • 2017.04a
    • /
    • pp.171-171
    • /
    • 2017
  • 전산유체역학(CFD: Computational Fluid Dynamics)를 이용한 스마트팜 환경 내부의 정밀 제어 연구가 진행 중이다. 시계열 데이터의 난해한 동적 해석을 극복하기위해, 비선형 모델링 기법의 일종인 인공신경망을 이용하는 방안을 고려하였다. 선행 연구를 통하여 환경 데이터의 비선형 모델링을 위한 Tensorflow활용 방법이 하드웨어 가속 기능을 바탕으로 월등한 성능을 보임을 확인하였다. 그럼에도 오프라인 일괄(Offline batch)처리 방식의 한계가 있는 인공신경망 모델링 기법과 현장 보급이 불가능한 고성능 하드웨어 연산 장치에 대한 대안 마련이 필요하다고 판단되었다. CFD 해석을 위한 Solver로 SU2(http://su2.stanford.edu)를 이용하였다. 운영 체제 및 컴파일러는 1) Mac OS X Sierra 10.12.2 Apple LLVM version 8.0.0 (clang-800.0.38), 2) Windows 10 x64: Intel C++ Compiler version 16.0, update 2, 3) Linux (Ubuntu 16.04 x64): g++ 5.4.0, 4) Clustered Linux (Ubuntu 16.04 x32): MPICC 3.3.a2를 선정하였다. 4번째 개발환경인 병렬 시스템의 경우 하드웨어 가속는 OpenCL(https://www.khronos.org/opencl/) 엔진을 이용하고 저전력 ARM 프로세서의 일종인 옥타코어 Samsung Exynos5422 칩을 장착한 ODROID-XU4(Hardkernel, AnYang, Korea) SBC(Single Board Computer)를 32식 병렬 구성하였다. 분산 컴퓨팅을 위한 환경은 Gbit 로컬 네트워크 기반 NFS(Network File System)과 MPICH(http://www.mpich.org/)로 구성하였다. 공간 분해능을 계측 주기보다 작게 분할할 경우 발생하는 미지의 바운더리 정보를 정의하기 위하여 3차원 Kriging Spatial Interpolation Method를 실험적으로 적용하였다. 한편 병렬 시스템 구성이 불가능한 1,2,3번 환경의 경우 내부적으로 이미 존재하는 멀티코어를 활용하고자 OpenMP(http://www.openmp.org/) 라이브러리를 활용하였다. 64비트 병렬 8코어로 동작하는 1,2,3번 운영환경의 경우 32비트 병렬 128코어로 동작하는 환경에 비하여 근소하게 2배 내외로 연산 속도가 빨랐다. 실시간 CFD 수행을 위한 분산 컴퓨팅 기술이 프로세서의 속도 및 운영체제의 정보 분배 능력에 따라 결정된다고 판단할 수 있었다. 이를 검증하기 위하여 4번 개발환경에서 운영체제를 64비트로 개선하여 5번째 환경을 구성하여 검증하였다. 상반되는 결과로 64비트 72코어로 동작하는 분산 컴퓨팅 환경에서 단일 프로세서 기반 멀티 코어(1,2,3번) 환경보다 보다 2.5배 내외 연산속도 향상이 있었다. ARM 프로세서용 64비트 운영체제의 완성도가 낮은 시점에서 추후 성공적인 실시간 CFD 모델링을 위한 지속적인 검토가 필요하다.

  • PDF