• Title/Summary/Keyword: VLIW Architecture

Search Result 19

PASC Processor Architecture for Enhanced Loop Execution (루프를 효과적으로 처리하는 PASC 프로세서 구조)

  • Ji, Seung-Hyeon;Park, No-Gwang;Jeon, Jung-Nam;Kim, Seok-Il
    • The Transactions of the Korea Information Processing Society / v.6 no.5 / pp.1225-1240 / 1999
  • This paper proposes the PASC (PArtitioned SCHeduler) processor architecture, which is equipped with multiple pairs of a functional unit and its own scheduler. Each scheduler in the PASC processor determines whether a unit instruction can be issued to its associated functional unit or must wait until the next cycle because of a resource collision or data dependency. In the PASC processor, only the functional unit affected by a resource collision or data dependency waits, by executing a NOP (No OPeration) instruction, while the other functional units execute their own instructions. We can therefore expect a code compaction effect on the PASC processor. In particular, the last instruction of a loop in one iteration and the first instruction of the loop in the next iteration can be scheduled simultaneously if the two instructions incur no resource collision or data dependency. Such instructions can be packed into the same very long instruction word and executed concurrently at run time. As a result, a loop takes fewer execution cycles than on a traditional VLIW or SVLIW processor architecture. Simulation results also show faster loop execution on the PASC processor architecture than on VLIW and SVLIW processor architectures.

Soft Error Detection & Correction for VLIW Architecture (VLIW 프로세서를 위한 소프트에러 검출 및 수정 기법)

  • Li, Yunrong;Lee, Jongwon;Heo, Ingoo;Kwon, Yongin;Lee, Kyoungwoo;Paek, Yunheung
    • Proceedings of the Korea Information Processing Society Conference / 2011.11a / pp.9-10 / 2011
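  • As design techniques for embedded systems, such as low supply power, chip-size reduction, and low noise margins, advance day by day, soft errors are increasing exponentially. This paper proposes a technique for detecting and correcting the soft errors that cause critical failures in VLIW architectures.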

EIS Processor Architecture for Enhanced Instruction Processing (빠른 명령어 처리가 가능한 EIS 프로세서 구조)

  • Ji, Seung-Hyeon;Jeon, Jung-Nam;Kim, Seok-Il
    • The Journal of Korean Institute of Communications and Information Sciences / v.25 no.12B / pp.1967-1978 / 2000
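  • This paper proposes the EIS processor architecture, which can independently schedule, at run time, each unit instruction that makes up a long instruction word. To execute unit instructions independently, the EIS processor architecture consists of multiple pairs of a functional unit and a scheduler. Every scheduler in the EIS processor architecture independently checks for data dependencies and resource conflicts and decides whether to execute a unit instruction or to delay its execution for the next pipeline cycle. In addition, object code for the EIS processor inserts dependency information into every unit instruction in order to synchronize the unit instructions. Because the EIS processor architecture can execute each unit instruction within a long instruction word independently, it can eliminate the execution delays incurred in conventional VLIW or SVLIW processor architectures. Simulation also demonstrated that the EIS processor architecture takes fewer execution cycles than the VLIW or SVLIW processor architectures. In particular, for programs with a high proportion of floating-point instructions, execution cycles on the EIS processor were markedly lower than on the other processor architectures.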

Fine-Grain Real-Time Code Scheduling for VLIW Architecture

  • Chung, Tai M.;Hwang, Dae J.
    • Journal of Electrical Engineering and Information Science / v.1 no.1 / pp.118-128 / 1996
  • In safety-critical hard real-time systems, a timing fault may yield catastrophic results. To eliminate timing faults from fast-response real-time control systems, code must be scheduled on the basis of high-precision timing analysis. Further, enhancing schedulability by using multiple processors is of widespread interest. However, although instruction-level parallel processing is quite effective at improving the schedulability of such systems, no real-time applications employ instruction-level parallel scheduling techniques, because most real-time scheduling models have not been designed for fine-grain execution. In this paper, we present a timing constraint model for specifying high-precision timing constraints, and a practical approach for constructing static schedules for a VLIW execution model. The new model and analysis can guarantee timing accuracy to within a single machine clock cycle.

Design of a Parallel Pipelined Processor Architecture (병렬 파이프라인 프로세서 아키덱처의 설계)

  • 이상정;김광준
    • Journal of the Korean Institute of Telematics and Electronics B / v.32B no.3 / pp.11-23 / 1995
  • This paper proposes a parallel pipelined processor model that acts as a small VLIW processor architecture, together with a scheduling algorithm for extracting instruction-level parallelism on this architecture. The proposed model has a dual-instruction mode in which up to four basic operations are executed in parallel. By combining these basic operations, varied instruction sets can be designed for different applications. The scheduling algorithm schedules basic operations for parallel execution and removes pipeline hazards by examining data dependency and resource conflict relations. To examine the operation and evaluate the performance, a C compiler and a simulator were developed. By simulating various test programs with the compiler and the simulator, the characteristics and performance of the proposed architecture were measured.

Energy-efficient Reconfigurable FEC Processor for Multi-standard Wireless Communication Systems

  • Li, Meng;Van der Perre, Liesbet;Van Thillo, Wim;Lee, Youngjoo
    • JSTS: Journal of Semiconductor Technology and Science / v.17 no.3 / pp.333-340 / 2017
  • In this paper, we describe HW/SW co-optimizations for reconfigurable application-specific instruction-set processors (ASIPs). Based on our previous very long instruction word (VLIW) ASIP, the proposed framework realizes various forward error-correction (FEC) algorithms for wireless communication systems. To enhance energy efficiency, we introduce several design methodologies, including high-radix algorithms, task-level out-of-order execution, and intensive resource allocation with loop-level rescheduling. A case study on radix-4 turbo decoding shows that the proposed techniques improve energy efficiency by 3.7 times compared to the previous architecture.

Design and Verification of PCI Controller in a Multimedia Processor (멀티미디어 프로세서의 PCI 컨트롤러 디자인 및 검증)

  • 이준희;남상준;김병운;임연호;권영수;경종민
    • Proceedings of the IEEK Conference / 1999.11a / pp.499-502 / 1999
  • This paper presents a PCI (Peripheral Component Interconnect) controller embedded in a multimedia processor called FLOVA (FLOating point VLIW Architecture), which targets 3D graphics applications. Fast I/O interfaces are essential for multimedia processors, which usually handle large amounts of multimedia data. FLOVA therefore adopts the PCI bus as its I/O interface because of its fast burst transactions. However, using PCI burst transactions raises several problems in implementation and verification. It is difficult to handle data transactions between two units that have different operating frequencies. FLOVA has a higher operating frequency, about 100MHz, than the PCI local bus, which lowers utilization of the FLOVA bus. In addition, traditional simulation is not sufficient to verify PCI functionality. In this paper, we propose buffering schemes to implement the PCI controller with wide bandwidth and high bus utilization. This paper also shows how to verify the PCI controller in a real PCI bus environment before fabrication.

A Software And Hardware Scheme For Reducing The Branch Penalty In Parallel Computers (병렬구조 컴퓨터에서 Branch penalty를 감소시키기 위한 소프트웨어와 하드웨어 방법)

  • 함찬숙;조종현;조영일
    • Journal of the Korean Institute of Telematics and Electronics B / v.30B no.11 / pp.11-16 / 1993
  • A VLIW architecture capable of testing multiple conditions in a cycle must support an efficient mechanism for multi-way branches. This paper proposes a mechanism to speed up the execution of multi-way branches and an efficient memory packing method for instructions that reduces wasted memory space. We also develop a new compiler technique that transforms program segments to which multi-way branches cannot be applied into ones to which they can. The benefits of the transformation are a reduced branch penalty and increased instruction-level parallelism.

Ultra-low-power DSP for Audio Signal Processing (오디오 신호 처리를 위한 초저전력 DSP 프로세서)

  • Kwon, Kiseok;Ahn, Minwook;Jo, Seokhwan;Lee, Yeonbok;Lee, Seungwon;Park, Young-Hwan;Kim, Sukjin;Kim, Do-Hyung;Kim, Jaehyun
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2014.06a / pp.157-159 / 2014
  • In this paper, we introduce SlimSRP, an ultra-low-power digital signal processor (DSP) solution for mobile audio and voice applications. So far, application processors (APs) have handled all the tasks in mobile devices. However, they suffer from short battery life when dealing with complex usage scenarios, such as an always-on voice trigger with continuous audio playback. Based on extensive analysis of audio and voice application characteristics, SlimSRP is designed to relieve the performance and power burden on APs. It employs a three-issue VLIW architecture, and its major low-power and high-performance techniques include: (1) an optimized register-file architecture well suited to constant generation, (2) a powerful instruction set that reduces the number of register file accesses, and (3) a unique instruction compression scheme that saves memory and reduces cache misses. An implementation of SlimSRP runs at up to 200MHz, and the logic occupies 95K NAND2 gates in the Samsung 28LPP process. The experimental results demonstrate that an MP3 decoder application with a 128kbps 44.1kHz input can run at 5.1MHz and that the logic consumes only 22uW/MHz.