• 제목/요약/키워드: coarse-grained reconfigurable architecture (CGRA)

검색결과 9건 처리시간 0.021초

Efficient Fault-Recovery Technique for CGRA-based Multi-Core Architecture

  • Kim, Yoonjin;Sohn, Seungyeon
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • 제15권2호
    • /
    • pp.307-311
    • /
    • 2015
  • In this paper, we propose an efficient fault-recovery technique for CGRA (Coarse-Grained Reconfigurable Architecture) based multi-core architecture. The proposed technique is intra/inter-CGRA co-reconfiguration technique based on a ring-based sharing fabric (RSF) and it enables exploiting the inherent redundancy and reconfigurability of the multi-CGRA for fault-recovery. Experimental results show that the proposed approaches achieve up to 73% fault recoverability when compared with completely connected fabric (CCF).

Reconfigurable Multi-Array Architecture for Low-Power and High-Speed Embedded Systems

  • Kim, Yoon-Jin
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • 제11권3호
    • /
    • pp.207-220
    • /
    • 2011
  • Coarse-grained reconfigurable architecture (CGRA) based embedded systems aims to achieve high system performance with sufficient flexibility to map a variety of applications. However, the CGRA has been considered as prohibitive one due to its significant area/power overhead and performance bottleneck. In this work, I propose reconfigurable multi-array architecture to reduce power/area and enhance performance in configurable embedded systems. The CGRA-based embedded systems that consist of hierarchical configurable computing arrays with varying size and communication speed were examined for multimedia and other applications. Experimental results show that the proposed approach reduces on-chip area by 22%, execution time by up to 72% and reduces power consumption by up to 55% when compared with the conventional CGRA-based architectures.

Dynamic Redundancy-based Fault-Recovery Scheme for Reliable CGRA-based Multi-Core Architecture

  • Kim, Yoonjin;Sohn, Seungyeon
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • 제15권6호
    • /
    • pp.615-628
    • /
    • 2015
  • CGRA (Coarse-Grained Reconfigurable Architecture) based multi-core architecture can be considered as a suitable solution for the fault-tolerant computing. However, there have been a few research projects based on fault-tolerant CGRA without exploiting the strengths of CGRA as well as their works are limited to single CGRA. Therefore, in this paper, we propose two approaches to enable exploiting the inherent redundancy and reconfigurability of the multi-CGRA for fault-recovery. One is a resilient inter-CGRA fabric that is ring-based sharing fabric (RSF) with minimal interconnection overhead. Another is a novel intra/inter-CGRA reconfiguration technique on RSF for maximizing utilization of the resources when faults occur. Experimental results show that the proposed approaches achieve up to 94% faulty recoverability with reducing area/delay/power by up to 15%/28.6%/31% when compared with completely connected fabric (CCF).

Energy-Efficient and High Performance CGRA-based Multi-Core Architecture

  • Kim, Yoonjin;Kim, Heesun
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • 제14권3호
    • /
    • pp.284-299
    • /
    • 2014
  • Coarse-grained reconfigurable architecture (CGRA)-based multi-core architecture aims at achieving high performance by kernel level parallelism (KLP). However, the existing CGRA-based multi-core architectures suffer from much energy and performance bottleneck when trying to exploit the KLP because of poor resource utilization caused by insufficient flexibility. In this work, we propose a new ring-based sharing fabric (RSF) to boost their flexibility level for the efficient resource utilization focusing on the kernel-stream type of the KLP. In addition, based on the RSF, we introduce a novel inter-CGRA reconfiguration technique for the efficient pipelining of kernel-stream on CGRA-based multi-core architectures. Experimental results show that the proposed approaches improve performance by up to 50.62 times and reduce energy by up to 50.16% when compared with the conventional CGRA-based multi-core architectures.

1-D CGRA에서의 H.264/AVC 디블록킹 필터 구현 (Implementation of H.264/AVC Deblocking Filter on 1-D CGRA)

  • 송세현;김기철
    • 전기전자학회논문지
    • /
    • 제17권4호
    • /
    • pp.418-427
    • /
    • 2013
  • 본 논문에서는 H.264/AVC 비디오 코덱용 디블록킹 필터의 병렬 알고리즘을 제안한다. 디블록킹 필터는 BS(boundary strength)에 따라 다른 필터 연산을 수행하며, 각 필터 연산은 다양한 조건 연산을 필요로 한다. 또한 각 경계면의 연산 순서가 정해져 있기 때문에 병렬 처리가 쉽지 않다. 본 논문에서 제안하는 디블록킹 필터 알고리즘은 최근에 소개된 1-D CGRA (coarse grained reconfigurable architecture)인 PRAGRAM (pipelined reconfigurable arrays with assistant manager groups)에서 처리된다. 디블록킹 필터 연산은 PRAGRAM의 단방향 파이프라인 PE 배열 구조를 이용하여 각 필터 연산을 고속으로 수행하고, dynamic reconfiguration 및 conditional reconfiguration을 이용하여 필터 선택과 조건 연산을 효율적으로 처리한다. 디블록킹 필터의 병렬 알고리즘은 매크로블록 당 225 사이클을 소요한다. 이는 동작주파수 150 MHz에서 full HD급 영상을 처리할 수 있는 성능이다.

Scalable Application Mapping for SIMD Reconfigurable Architecture

  • Kim, Yongjoo;Lee, Jongeun;Lee, Jinyong;Paek, Yunheung
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • 제15권6호
    • /
    • pp.634-646
    • /
    • 2015
  • Coarse-Grained Reconfigurable Architecture (CGRA) is a very promising platform that provides fast turn-around-time as well as very high energy efficiency for multimedia applications. One of the problems with CGRAs, however, is application mapping, which currently does not scale well with geometrically increasing numbers of cores. To mitigate the scalability problem, this paper discusses how to use the SIMD (Single Instruction Multiple Data) paradigm for CGRAs. While the idea of SIMD is not new, SIMD can complicate the mapping problem by adding an additional dimension of iteration mapping to the already complex problem of operation and data mapping, which are all interdependent, and can thus significantly affect performance through memory bank conflicts. In this paper, based on a new architecture called SIMD reconfigurable architecture, which allows SIMD execution at multiple levels of granularity, we present how to minimize bank conflicts considering all three related sub-problems, for various RA organizations. We also present data tiling and evaluate a conflict-free scheduling algorithm as a way to eliminate bank conflicts for a certain class of mapping problem.

CGRA를 위한 전력이 고려된 어플리케이션 매핑에 관한 연구 (A Study on Power-aware Application Mapping for CGRA)

  • 윤종희;김용주;박상현;조두산;이종원;김경원;백윤흥
    • 한국정보처리학회:학술대회논문집
    • /
    • 한국정보처리학회 2009년도 춘계학술발표대회
    • /
    • pp.875-876
    • /
    • 2009
  • 최근에 응용프로그램의 복잡도가 증가함에 따라 이를 빠르게 처리하기 위하여 각종 멀티미디어 SoC에서 Coarse Grained Reconfigurable Architecture (CGRA)들이 사용되고 있다. CGRA가 제공하는 병렬성을 극대화하기 위한 많은 어플리케이션 매핑 알고리즘이 연구되어 왔으나 CGRA에서 소모되는 전력을 줄이기 위한 노력은 거의 없는 상태이다. 이러한 문제를 극복하기 위해 본 논문에서는 기존의 매핑 알고리즘을 기반으로 누설전력을 줄이기 위한 방법에 대해 다루고자 한다.

데이터를 고려한 저전력 소모 CGRA 매핑 알고리즘 (Low Power Mapping Algorithm Considering Data Transfer Time for CGRA)

  • 김용주;윤종희;조두산;백윤흥
    • 정보처리학회논문지A
    • /
    • 제19A권1호
    • /
    • pp.17-22
    • /
    • 2012
  • 모바일 시장 및 소형 전자기기 시장의 발달에 따라 고성능 프로세서에 대한 요구 또한 커지게 되었다. 재구성형 프로세서(CGRA)는 고성능과 저전력 소모를 동시에 만족시키는 프로세서로 ASIC의 고성능 저전력을 대체하면서도 하드웨어를 쉽게 재디자인 할 수 있도록 구성된 프로세서이다. 어플리케이션의 구조에 따라 CGRA의 전체수행시간이 프로세서 자체의 수행시간보다 데이터의 전송시간에 종속되는 경우가 있다. 이 논문에서는 데이터 전송시간에 따라 수행에 사용되는 자원을 최적화 함으로써 전력소모를 줄이는 매핑 알고리즘을 제안하였다. 제안된 알고리즘을 사용한 경우, 기존의 방식보다 최대 73%, 평균 56.4%의 전력소모를 줄일 수 있었다.

Efficient Use of On-chip Memory through Profile-Driven Array Reorganization

  • Cho, Doosan;Youn, Jonghee
    • 대한임베디드공학회논문지
    • /
    • 제6권6호
    • /
    • pp.345-359
    • /
    • 2011
  • In high performance embedded systems, the use of multiple on-chip memories is an essential architectural feature for exploiting inherent parallelism in multimedia applications. This feature allows multiple data accesses to be executed in parallel. However, it remains difficult to effectively exploit of multiple on-chip memories. The successful use of this architecture strongly depends on how to efficiently detect and exploit memory parallelism in target applications. In this paper, we propose a technique based on a linear array access descriptor [1], which is generated from profiled data, to detect and exploit memory parallelism. The proposed technique tackles an array reorganization problem to maximize memory parallelism in multimedia applications. We present preliminary experiments applying the proposed technique onto a representative coarse grained reconfigurable array processor (CGRA) with multimedia kernel codes. Our experimental results demonstrate that our technique optimizes data placement by putting independent data on separate storage. The results exhibit 9.8% higher performance on average compared to the existing method.