• Title/Summary/Keyword: floating-point processor

Search Result 99, Processing Time 0.031 seconds

HARP의 부동소숫점 연산기 구조설계

  • Jo, Jeong-Yeon
    • ETRI Journal
    • /
    • v.10 no.3
    • /
    • pp.36-48
    • /
    • 1988
  • 본 논문에서는 부동소숫점연산 프로세서들의 최근 동향을 설명하면서 부동소숫점 연산기의 중요성을 강조하고, 한국전자통신연구소 프로세서구조연구실에서 개발하고 있는 HARP(High-performance Architecture for RISC type Processor)의 개발전략에 따른 부동소숫점 연산기(Floating-Point Unit : FPU)의 구조를 정의한다. 또한 HARP FPU의 설계구현을 마이크로 구조측면에서 설명한다. HARP의 CPU와 동일 칩상에 구현될 HARP FPU는 고유의 구조를 가지며 모든 부동소숫점 연산은 IEEE-754 표준을 따른다. HARP FPU는 고속의 부동소숫점 연산 유니트이며, HARP의 IPU(Integer Processing Unit)와는 독립적으로 동작되도록 설계되어서 HARP CPU의 전체적인 파이프라인 기능에 가능한 한 페날티를 주지 않도록 동작된다.

  • PDF

A Design of a 8-Thread Graphics Processor Unit with Variable-Length Instructions

  • Lee, Kwang-Yeob;Kwak, Jae-Chang
    • Journal of information and communication convergence engineering
    • /
    • v.6 no.3
    • /
    • pp.285-288
    • /
    • 2008
  • Most of multimedia processors for 2D/3D graphics acceleration use a lot of integer/floating point arithmetic units. We present a new architecture with an efficient ALU, built in a smaller chip size. It reduces instruction cycles significantly based on a foundation of multi-thread operation, variable length instruction words, dual phase operation, and phase instruction's coordination. We can decrease the number of instruction cycles up to 50%, and can achieve twice better performance.

Meshfree/GFEM in hardware-efficiency prospective

  • Tian, Rong
    • Interaction and multiscale mechanics
    • /
    • v.6 no.2
    • /
    • pp.197-210
    • /
    • 2013
  • A fundamental trend of processor architecture evolving towards exaflops is fast increasing floating point performance (so-called "free" flops) accompanied by much slowly increasing memory and network bandwidth. In order to fully enjoy the "free" flops, a numerical algorithm of PDEs should request more flops per byte or increase arithmetic intensity. A meshfree/GFEM approximation can be the class of the algorithm. It is shown in a GFEM without extra dof that the kind of approximation takes advantages of the high performance of manycore GPUs by a high accuracy of approximation; the "expensive" method is found to be reversely hardware-efficient on the emerging architecture of manycore.

An Implementation of Stabilizing Controller for 2-Axis Platform using Adaptive Fuzzy Control and DSP

  • Ryu, Gi-Seok;Kim, Jin-Kyu;Park, Jang-Ho;Kim, Dae-Young;Kim, Jong-Hwa
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 2001.10a
    • /
    • pp.71.3-71
    • /
    • 2001
  • Passive Stabilization method and active stabilization method are mainly used to comprise a control system of platform stabilizer. Passive Stabilization method has demerits because of size and weight except that control structure is simple while active stabilization method using sensors can reduce size and weight, it requires high sensor technique and control algorithm. In this paper, a stabilizing controller using adaptive fuzzy control technique and floating-point processor(DSP) is suggested.

  • PDF

A Hardware Implementation of Ogg Vorbis Audio Decoder with Embedded Processor

  • Kosaka, Atsushi;Yamaguchi, Satoshi;Okuhata, Hiroyuki;Onoye, Takao;Shirakawa, Isao
    • Proceedings of the IEEK Conference
    • /
    • 2002.07a
    • /
    • pp.94-97
    • /
    • 2002
  • A VLSI architecture of an Ogg Vorbis decoder is proposed : which is dedicated to portable audio appliances. Referring to the computational cost analysis of the decoding processes, the LSP (Line Spectrum Pair) process, which takes more than 50% of the total processing time, can be regarded as a bottleneck to achieve realtime processing by embedded Processors. Thus in our decoder a specific hardware architecture is devised for the LSP process so as to be integrated into a single chip together with an ARM7TDMI processor. In addition, in order to reduce the total hardware cost, instead of the floating point arithmetic, the fixed point arithmetic is adopted. The LSP module has been implemented with 9,740 gates by using a Virtual Silicon 0.l5$\mu\textrm{m}$ CMOS technology, which operates at 58.8MHz with the total CPU load reduced by 57%. It is also verified that the use of the fixed point arithmetic does not incur any significant sound distortion.

  • PDF

Design and Verification of PCI Controller in a Multimedia Processor (멀티미디어 프로세서의 PCI 컨트롤러 디자인 및 검증)

  • 이준희;남상준;김병운;임연호;권영수;경종민
    • Proceedings of the IEEK Conference
    • /
    • 1999.11a
    • /
    • pp.499-502
    • /
    • 1999
  • This paper presents a PCI (Peripheral Component Interconnect) controller embedded in a multimedia processor, called FLOVA (FLOating point VLIW Architecture), targeting for 3D graphics applications. Fast I/O interfaces are essential for multimedia processors which usually handle large amount of multimedia data. Therefore, in FLOVA, PCI bus is adopted for I/O interface due to fast burst transaction. However, there are several problems in implementation and verification to use burst transaction of PCI. It is difficult to handle data transaction between two units which have two different operating frequency. FLOVA has more higher operating frequency about 100MHz than that of PCI local bus and it makes lower utilization of FLOVA bus. Also, traditional simulation is not sufficient for verification of PCI functionality. In this paper, we propose buffering schemes to implement the PCI controller with wide bandwidth and high bus utilization. Also, this paper shows how to verify the PCI controller using real PCI bus environments before its fabrication.

  • PDF

A Study on Development of Digital Protective Relay Simulator using Digital Signal Processor (DSP를 이용한 디지털 보호 계전기의 시뮬레이터에 관한 연구)

  • Lee, J.J.;Jung, H.S.;Park, C.W.;Shin, M.C.;An, T.P.;Ko, I.S.
    • Proceedings of the KIEE Conference
    • /
    • 2001.07a
    • /
    • pp.237-239
    • /
    • 2001
  • This paper describes the digital relay simulator system using digital signal processor. The simulator system has two parts, one is software and the other is hardware part. The simulation software has variety calculation engines ; EMTP simulation data file conversion, user define simulation data generation, sequence data generation, data analysis engines. etc, these are designed upon GUI. And simulator software provides easy control interface for users, the simulator software performs on every MS Windows OS. The simulator hardware design uses 32bit floating point DSP(TMS320C32) architecture to achieve flexibility and high speed operation.

  • PDF

A Study on the Implementation of Hopfield Model using Array Processor (어레이 프로세서를 이용한 홉필드 모델의 구현에 관한 연구)

  • 홍봉화;이지영
    • Journal of the Korea Society of Computer and Information
    • /
    • v.4 no.4
    • /
    • pp.94-100
    • /
    • 1999
  • This paper concerns the implementation of a digital neural network which performs the high speed operation of Hopfield model's arithmetic operation. It is also designed to use a look-up table and produce floating point arithmetic of nonlinear function with high speed operation. The arithmetic processing of Hopfleld is able to describe the matrix-vector operation, which is adaptable to design the array processor because of its recursive and iterative operation .The proposed method is expected to be applied to the field of real neural networks because of the realization of the current VLSI techniques.

  • PDF

A Study on Cycle Based Simulator of a 32 bit floating point DSP (32비트 부동소수점 DSP의 Cycle Based Simulator에 관한 연구)

  • 우종식;양해용;안철홍;박주성
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.35C no.11
    • /
    • pp.31-38
    • /
    • 1998
  • This paper deals with CBS(Cycle Base Simulator) design of a 32 bit floating point DSP(Digital Signal Processor). The CBS has been developed for TMS320C30 compatible DSP and will be used to confirm the architecture, functions of sub-blocks, and control signals of the chip before the detailed logic design starts with VHDL. The outputs from CBS are used as important references at gate level design step because they give us control signals, output values of important blocks, values from internal buses and registers at each pipeline step, which are not available from the commercial simulator of DSP. In addition to core functions, it has various interfaces for efficient execution and convenient result display, CBS is verified through comparison with results from the commercial simulator for many application algorithms and its simulation speed is as fast as several tenth of that of logic simulation with VHDL. CBS in this work is for a specific DSP, but the concept may be applicable to other VLSI design.

  • PDF

Design of Transformation Engine for Mobile 3D Graphics (모바일 3차원 그래픽을 위한 기하변환 엔진 설계)

  • Kim, Dae-Kyoung;Lee, Jee-Myong;Lee, Chan-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.44 no.10
    • /
    • pp.49-54
    • /
    • 2007
  • As digital contents based on 3D graphics are increased, the requirement for low power 3D graphic hardware for mobile devices is increased. We design a transformation engine for mobile 3D graphic processor. We propose a simplified transformation engine for mobile 3D graphic processor. The area of the transformation engine is reduced by merging a mapping transformation unit into a projective transformation unit and by replacing a clipping unit with a selection unit. It consists of a viewing transformation unit a projective transformation unit a divide by w nit, and a selection unit. It can process 32 bit floating point format of the IEEE-754 standard or a reduced 24 bit floating point format. It has a pipelined architecture so that a vertex is processed every 4 cycles except for the initial latency. The RTL code is verified using an FPGA.