• Title/Summary/Keyword: Many-core architecture

A Performance Study on Many-core Processor Architectures with SPEC Benchmark Programs (SPEC 벤치마크 프로그램에 대한 매니코어 프로세서의 성능 연구)

  • Lee, Jongbok
    • The Transactions of The Korean Institute of Electrical Engineers, v.62 no.2, pp.252-256, 2013
  • To overcome the complexity and performance limits of superscalar processors, multi-core architectures have recently become prevalent. Multi-core processors typically use 2 to 16 cores. In the near future, however, more than 32 cores are likely to be used, an arrangement called a many-core processor architecture. Using SPEC 2000 benchmarks as input, extensive trace-driven simulation was performed for many-core architectures with 32 to 1024 cores. With 1024 cores, the average performance reaches 15.7 IPC, but the rate of performance increase saturates.

New Thermal-Aware Voltage Island Formation for 3D Many-Core Processors

  • Hong, Hyejeong;Lim, Jaeil;Lim, Hyunyul;Kang, Sungho
    • ETRI Journal, v.37 no.1, pp.118-127, 2015
  • The power consumption of 3D many-core processors can be reduced, and their power delivery improved, by introducing voltage island (VI) design using on-chip voltage regulators. With the dramatic growth in the number of cores integrated in a processor, however, per-core VI design is infeasible. We propose a 3D many-core processor architecture that consists of multiple voltage clusters, each containing a set of cores that share an on-chip voltage regulator. Based on this architecture, the steady-state temperature is analyzed so that the thermal characteristics of each voltage cluster are known. In the voltage scaling and task scheduling stages, the thermal characteristics and the communication between cores are considered. Taking the thermal characteristics into account enables the proposed VI formation to reduce the total energy consumption, peak temperature, and temperature gradients in 3D many-core processors.

Design Space Exploration of Many-Core Architecture for Sound Synthesis of Guitar on Portable Device (휴대 장치용 기타 음 합성을 위한 매니코어 아키텍처의 디자인 공간 탐색)

  • Kang, Myeongsu;Kim, Jong-Myon
    • Proceedings of the Korean Society of Computer Information Conference, 2014.01a, pp.1-4, 2014
  • Although physical modeling synthesis is becoming increasingly effective at producing rich, natural, high-quality sound, its high computational complexity limits its use in portable devices. This constraint has motivated research into single-instruction multiple-data many-core architectures that support the tremendous amount of computation by exploiting the massive parallelism inherent in physical modeling synthesis. Since no general consensus has been reached on which grain sizes of many-core processors and memories provide the most efficient operation for sound synthesis, design space exploration is conducted for seven processing element (PE) configurations. To find an optimal PE configuration, each configuration is evaluated in terms of execution time, area efficiency, and energy efficiency. Experimental results show that all PE configurations satisfy the system requirements for implementation in portable devices.

Performance Evaluation and Analysis for Discrete Wavelet Transform on Many-Core Processors (매니코어 프로세서 상에서 이산 웨이블릿 변환을 위한 성능 평가 및 분석)

  • Park, Yong-Hun;Kim, Jong-Myon
    • IEMEK Journal of Embedded Systems and Applications, v.7 no.5, pp.277-284, 2012
  • To meet the demand for the discrete wavelet transform (DWT) on portable devices, this paper implements a 2-level DWT on a reference many-core processor architecture and determines the optimal many-core processor. To explore the optimal configuration, we evaluate the impact of the data-per-processing-element ratio, defined as the amount of data mapped directly to each processing element (PE), on system performance, energy efficiency, and area efficiency. Five PE configurations (PEs=16, 64, 256, 1,024, and 4,096) were implemented in 130 nm CMOS technology with a 720 MHz clock frequency. Experimental results indicate that maximum energy and area efficiencies are achieved at PEs=1,024. However, to implement a 2-level DWT on portable devices, the system area must be limited to 140 mm² and the power must not exceed 3 W. Under these restrictions, the most reasonable energy and area efficiencies are achieved at PEs=256.

Architecture Exploration of Optimal Many-Core Processors for a Vector-based Rasterization Algorithm (래스터화 알고리즘을 위한 최적의 매니코어 프로세서 구조 탐색)

  • Son, Dong-Koo;Kim, Cheol-Hong;Kim, Jong-Myon
    • IEMEK Journal of Embedded Systems and Applications, v.9 no.1, pp.17-24, 2014
  • In this paper, we implement a vector-based rasterization algorithm for 3D graphics on a SIMD (single instruction multiple data) many-core processor architecture and evaluate its performance. In addition, we evaluate the impact of the data-per-processing-element (DPE) ratio, defined as the amount of data directly mapped to each processing element (PE) within the many-core processor, in terms of performance, energy efficiency, and area efficiency. For the experiment, we use seven different PE configurations obtained by varying the DPE ratio (or, equivalently, the number of PEs), all implemented in the same 130 nm CMOS technology with a 500 MHz clock frequency. Experimental results indicate that the optimal PE configuration is achieved when the DPE ratio is in the range from 16,384 to 256 (that is, when the number of PEs is between 16 and 1,024), which meets the requirements of mobile devices in terms of performance and efficiency.
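
The relationship between the DPE ratio and the number of PEs in the abstract above can be illustrated with a small calculation. This is a minimal sketch, assuming the DPE ratio is simply the total number of data elements divided by the number of PEs. The total of 262,144 elements is inferred from the two endpoint pairs quoted in the abstract (16 PEs with a DPE ratio of 16,384, and 1,024 PEs with a DPE ratio of 256) and is not stated by the authors. The function name `dpe_ratio` and the intermediate PE counts 64 and 256 are hypothetical choices for illustration.

```python
def dpe_ratio(total_elements: int, num_pes: int) -> int:
    """Data elements mapped to each PE, assuming an even split."""
    if num_pes <= 0 or total_elements % num_pes != 0:
        raise ValueError("num_pes must be positive and divide total_elements evenly")
    return total_elements // num_pes


# Inferred, not stated in the abstract: 16 * 16,384 = 1,024 * 256 = 262,144.
TOTAL_ELEMENTS = 262_144

# PE counts spanning the range the abstract reports as optimal (16 to 1,024).
# The intermediate values 64 and 256 are illustrative assumptions.
ratios = {pes: dpe_ratio(TOTAL_ELEMENTS, pes) for pes in (16, 64, 256, 1024)}

for pes, ratio in ratios.items():
    print(f"PEs={pes:>5}  DPE ratio={ratio:>6}")
```

Under this reading, quadrupling the number of PEs cuts the DPE ratio by a factor of four, which is why the abstract can describe the same configurations either by DPE ratio or by PE count.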

A Study on the Topological characteristics of the Korean Traditional Architecture (한국 전통건축 공간에 나타난 위상기하학적 특성에 관한 연구)

  • Bae Kang-Won;Kim Moon-Duck
    • Korean Institute of Interior Design Journal, v.13 no.6, pp.74-81, 2004
  • Much evidence suggests that Korean traditional architecture has long reflected traditional Korean philosophy. If this is true, there is much more insight to be gained about the connection. It is important to begin with the idea that Korean culture stemmed from Confucianism, Buddhism, and Taoism. All three share similar ideas, and this study sets out to show that topology, an anti-Euclidean school of thought created at the end of the 19th century, shares many of the same core ideas as these three traditions. By extension, if Korean traditional culture is reflected in Korean traditional architecture, and topology shares many of the same core ideas, it seems that topology should be accepted into the mainstream of architectural design. This study aims to interpret the spatial structures and spatial compositions of Korean traditional architecture from a topological perspective.

Optimal Many-core Processor Architecture for Different Ultrasonic Image Resolutions (초음파 영상선호의 크기 변화에 따른 최적의 매니코어 프로세서 구조)

  • Kang, Seong-Mo;Kim, Jong-Myon
    • Journal of the Institute of Convergence Signal Processing, v.13 no.1, pp.50-55, 2012
  • This paper proposes an optimal many-core processor architecture that meets the low-power and high-performance requirements of hand-held ultrasonic devices across different ultrasonic image resolutions. To identify the optimal many-core architecture, seven different PE configurations are simulated for processing ultrasonic images and compared in terms of execution performance and energy consumption. Experimental results indicate that the highest energy efficiencies are achieved at PEs=1,024, 64, and 256 for ultrasonic images at $256 \times 256$, $320 \times 240$, and $800 \times 480$ resolutions, respectively. In addition, the maximum area efficiencies are obtained at PEs=256 (for the $256 \times 256$ and $800 \times 480$ image resolutions) and PEs=64 (for the $320 \times 240$ image resolution).
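
To make the resolution and PE pairings above more concrete, the sketch below computes how many pixels each PE would handle at the reported energy-optimal configurations. It assumes that pixels are distributed evenly across PEs, which the abstract does not state; the names `best_energy_pes` and `pixels_per_pe` are hypothetical.

```python
# (width, height) -> PE count reported in the abstract as most energy-efficient
best_energy_pes = {
    (256, 256): 1024,
    (320, 240): 64,
    (800, 480): 256,
}


def pixels_per_pe(width: int, height: int, num_pes: int) -> float:
    """Pixels handled by each PE, ASSUMING an even distribution."""
    return width * height / num_pes


per_pe = {
    res: pixels_per_pe(res[0], res[1], pes)
    for res, pes in best_energy_pes.items()
}

for (width, height), load in per_pe.items():
    print(f"{width}x{height}: {load:.0f} pixels per PE")
```

Under that assumption, the per-PE load at the energy optimum is not constant across resolutions, so the best PE count cannot be read off from the pixel count alone.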

40-TFLOPS artificial intelligence processor with function-safe programmable many-cores for ISO26262 ASIL-D

  • Han, Jinho;Choi, Minseok;Kwon, Youngsu
    • ETRI Journal, v.42 no.4, pp.468-479, 2020
  • The proposed AI processor architecture provides high throughput for accelerating neural networks and reduces the external memory bandwidth required to process them. To achieve high throughput, the proposed super thread core (STC) includes 128 × 128 nano cores operating at a clock frequency of 1.2 GHz. A function-safe architecture is proposed for fault-tolerant systems such as the electronics of autonomous cars. A general-purpose processor (GPP) core is integrated with the STC to control it and to process the AI algorithm. The GPP core has a self-recovering cache and a dynamic lockstep function. The function-safe design is shown to achieve ASIL D among the fault tolerance levels of the ISO26262 standard. The entire AI processor is fabricated in a 28-nm CMOS process as a prototype chip. Its peak computing performance is 40 TFLOPS at 1.2 GHz with a supply voltage of 1.1 V. The measured energy efficiency is 1.3 TOPS/W. A GPP for control with a function-safe design can reach ISO26262 ASIL-D with a single-point fault-tolerance rate of 99.64%.
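
The quoted peak figure can be cross-checked against the core count and clock frequency. This is a back-of-the-envelope sketch, assuming each nano core completes one multiply-accumulate per cycle, counted as 2 floating-point operations; that per-core rate is an assumption for illustration and is not given in the abstract. The function name `peak_tflops` is hypothetical.

```python
NANO_CORES = 128 * 128        # from the abstract: 128 x 128 nano cores
CLOCK_HZ = 1.2e9              # from the abstract: 1.2 GHz
FLOPS_PER_CORE_PER_CYCLE = 2  # ASSUMPTION: one multiply-accumulate per cycle


def peak_tflops(cores: int, clock_hz: float, flops_per_cycle: int) -> float:
    """Theoretical peak throughput in TFLOPS for the given parameters."""
    return cores * clock_hz * flops_per_cycle / 1e12


peak = peak_tflops(NANO_CORES, CLOCK_HZ, FLOPS_PER_CORE_PER_CYCLE)
print(f"Estimated peak: {peak:.1f} TFLOPS")
```

Under that assumption the estimate comes to roughly 39.3 TFLOPS, which is consistent with the rounded 40 TFLOPS stated in the abstract.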

Development of Reconfigurable and Evolvable Architecture for Intelligence Implement (시스템 재설정 및 진화를 위한 지능형 아키텍처 개발)

  • Na Jin Hee;Ahn Ho Seok;Park Myeong Su;Choi Jin Young
    • Proceedings of the Korean Institute of Intelligent Systems Conference, 2005.11a, pp.500-503, 2005
  • Much research on intelligent systems has been performed, and various intelligent algorithms have been developed that are effective under an assumed specific environment and purpose. In a real environment, however, the performance of these algorithms can degrade substantially. In this paper, we propose an Evolvable and Reconfigurable (ERI) Architecture based on an Intelligent Macro Core (IMC), so that various new algorithms can be added incrementally and a reconfigured intelligent system can be constructed easily. We apply the proposed ERI Architecture to a face detection and recognition system to show its usefulness.

Development of Reconfigurable and Evolvable Architecture for Intelligence Implement (시스템 재설정 및 진화를 위한 지능형 아키텍처 개발)

  • Na Jin Hee;Ahn Ho Seok;Park Myoung Soo;Choi Jin Young
    • Journal of the Korean Institute of Intelligent Systems, v.15 no.7, pp.823-827, 2005
  • Much research on intelligent systems has been performed, and various intelligent algorithms have been developed that are effective under an assumed specific environment and purpose. In a real environment, however, the performance of these algorithms can degrade substantially. In this paper, we propose an Evolvable and Reconfigurable (ERI) Architecture based on an Intelligent Macro Core (IMC), so that various new algorithms can be added incrementally and a reconfigured intelligent system can be constructed easily. We apply the proposed ERI Architecture to a face detection and recognition system to show its usefulness.