• Title/Summary/Keyword: Floating-Point Unit


Hardware Design of Arccosine Function for Mobile Vector Graphics Processor (모바일 벡터 그래픽 프로세서용 역코사인 함수의 하드웨어 설계)

  • Choi, Byeong-Yoon;Lee, Jong-Hyoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.4 / pp.727-736 / 2009
  • This paper presents the design of an arccosine ($\cos^{-1}$) arithmetic unit for a mobile graphics accelerator. Mobile vector graphics applications face tighter area, execution-time, power-dissipation, and accuracy constraints than desktop PC applications. The designed processor adopts a 2nd-order polynomial approximation scheme based on the IEEE floating-point data format to satisfy the speed and accuracy requirements, and reduces area through a hardware-sharing structure. The arccosine processor consists of 15,280 gates, and its estimated operating frequency is about 125 MHz in $0.35{\mu}m$ CMOS technology. Because the processor can execute the arccosine function within 7 clock cycles, it achieves an execution rate of about 17 MOPS (million arccosine operations per second) and is applicable to mobile OpenVG processors. Because of its flexible architecture, it can also be applied to other transcendental functions, such as exponential, trigonometric, and logarithmic functions, by replacing the ROM and making minor hardware modifications.
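
The table-driven 2nd-order approximation described in this abstract can be illustrated in software. The following is a minimal sketch, not the paper's design: the segment count, the uniform segmentation, and the way coefficients are fitted (a quadratic through both endpoints and the midpoint of each segment) are all assumptions, and the names `build_table`, `acos_approx`, and `throughput_mops` are hypothetical. The coefficient table plays the role of the ROM.

```python
import math

SEGMENTS = 64          # assumed table size; the abstract does not give the ROM size
H = 2.0 / SEGMENTS     # width of each uniform segment over [-1, 1]

def build_table():
    """Fit one quadratic per segment through its endpoints and midpoint."""
    table = []
    s = H / 2.0
    for k in range(SEGMENTS):
        x0 = -1.0 + k * H
        f0 = math.acos(x0)
        fm = math.acos(x0 + s)
        f1 = math.acos(min(1.0, x0 + H))
        # Coefficients of p(t) = c0 + c1*t + c2*t^2 with t = x - x0
        c2 = (f1 - 2.0 * fm + f0) / (2.0 * s * s)
        c1 = (fm - f0) / s - c2 * s
        table.append((f0, c1, c2))
    return table

TABLE = build_table()  # stands in for the coefficient ROM

def acos_approx(x):
    """Evaluate the piecewise quadratic with Horner's rule."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("x must lie in [-1, 1]")
    k = min(int((x + 1.0) / H), SEGMENTS - 1)
    t = x - (-1.0 + k * H)
    c0, c1, c2 = TABLE[k]
    return c0 + t * (c1 + t * c2)

def throughput_mops(clock_mhz, cycles_per_op):
    """Million operations per second for a unit that is not pipelined."""
    return clock_mhz / cycles_per_op
```

With the figures quoted in the abstract, `throughput_mops(125, 7)` gives a little under 18, consistent with the stated rate of about 17 MOPS. Note that accuracy of a uniform-segment scheme degrades near x = ±1, where the derivative of arccosine is unbounded; how the actual hardware handles that region is not stated in the abstract.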

Study on Real-Time Digital Filter Design as Function of Scanning Frequency of Focused Electron Beam (집속 전자 빔 장치에서 스캔 주파수에 따른 실시간 디지털 필터 설계에 관한 연구)

  • Kim, Seung-Jae;Oh, Se-Kyu;Yang, Kyung-Sun;Jung, Kwang-Oh;Kim, Dong-Hwan
    • Transactions of the Korean Society of Mechanical Engineers A / v.35 no.5 / pp.479-485 / 2011
  • To acquire images in a thermionic scanning electron-beam system, a scanning unit is needed to control the electron beam emitted from the tungsten filament source. When the electron beam is scanned over a solid surface, the signal-to-noise ratio depends on the scanning frequency. We used a digital filter to reduce noise by analyzing the frequency content of the secondary electron signal in real time. The noise and the true image signal were well separated. We designed the digital filter using DSP floating-point operations, and the noise elimination resulted in enhanced image quality in the high-resolution mode.
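
The abstract does not say which filter structure was used. Purely as an illustration of a floating-point noise filter whose cutoff could be chosen per scanning frequency, here is a sketch of a first-order (single-pole) IIR low-pass filter. The filter type, the function names `lowpass_alpha` and `lowpass`, and the parameter values in the usage note are assumptions, not details from the paper.

```python
import math

def lowpass_alpha(cutoff_hz, sample_rate_hz):
    """Smoothing factor of a single-pole low-pass filter (RC equivalent)."""
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    return dt / (rc + dt)

def lowpass(samples, cutoff_hz, sample_rate_hz):
    """Filter a sequence of floats; the state starts at the first sample."""
    alpha = lowpass_alpha(cutoff_hz, sample_rate_hz)
    out = []
    y = samples[0] if samples else 0.0
    for x in samples:
        y += alpha * (x - y)   # y[n] = y[n-1] + alpha * (x[n] - y[n-1])
        out.append(y)
    return out
```

For example, `lowpass(signal, cutoff_hz=100.0, sample_rate_hz=10000.0)` strongly attenuates sample-to-sample noise while passing slow changes; in a scanning system the cutoff would presumably be tied to the scan rate.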

Design and Simulation for Out-of-Order Execution Processor of a Fully Pipelined Scheme (완전한 파이프라인 방식의 비순차실행 프로세서의 설계 및 모의실행)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.5 / pp.143-149 / 2020
  • Currently, multi-core processors are mainly used as the central processing unit of a computer system, and a high-performance out-of-order processor is adopted as each core to maximize system performance. Early out-of-order execution processors based on the Tomasulo algorithm targeted floating-point instructions and took several cycles to execute because of complex structures such as the reorder buffer and reservation stations. However, for a processor to properly exploit out-of-order execution and increase instruction throughput, it must operate in a fully pipelined manner. In this paper, a fully pipelined out-of-order processor with speculative execution is designed in VHDL and verified with GHDL. In simulation, a program composed of ARM instructions executed successfully.

Design of a Virtual Machine based on the Lua interpreter for the On-Board Control Procedure Execution Environment (탑재운영절차서 실행환경을 위한 Lua 인터프리터 기반의 가상머신 설계)

  • Kang, Sooyeon;Koo, Cheolhea;Ju, Gwanghyeok;Park, Sihyeong;Kim, Hyungshin
    • Journal of Satellite, Information and Communications / v.9 no.4 / pp.127-133 / 2014
  • In this paper, we present the design, functions, and performance analysis of a virtual machine (VM) based on the Lua interpreter for the On-Board Control Procedure Execution Environment (OEE). The OEE is required in order to autonomously operate the lunar explorer mission planned by the Korea Aerospace Research Institute (KARI). The concept of the On-Board Control Procedure (OBCP) is already applied to deep space missions with long propagation delays and limited data transmission capacity, since it ensures the autonomy of the mission without ground intervention. The interpreter is the execution engine of the VM: it interprets high-level program code line by line and executes the VM instructions, so execution is much slower than that of natively compiled code. To overcome this, we design and implement the OEE using a register-based Lua interpreter as its execution engine. We present experimental results for a range of additional hardware configurations, such as the use of a cache and a floating-point unit. We expect these results to be used for the OBCP scheduling policy and for systems with a Lua interpreter.
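
To illustrate what "register-based" means for an interpreter's execution engine, the sketch below runs a tiny instruction set in which every instruction names its operand registers directly, rather than pushing and popping a stack. The opcode names loosely echo Lua's, but the instruction encoding, the `JLE` opcode, and the `run` function are invented for this example and are not Lua's actual bytecode format or the paper's VM.

```python
def run(program, constants, num_regs=8):
    """Execute (op, a, b, c) tuples against a flat register file."""
    regs = [0] * num_regs
    pc = 0
    while True:
        op, a, b, c = program[pc]
        pc += 1
        if op == "LOADK":        # R[a] = K[b]
            regs[a] = constants[b]
        elif op == "ADD":        # R[a] = R[b] + R[c]
            regs[a] = regs[b] + regs[c]
        elif op == "MUL":        # R[a] = R[b] * R[c]
            regs[a] = regs[b] * regs[c]
        elif op == "JLE":        # if R[a] <= R[b], jump to instruction c
            if regs[a] <= regs[b]:
                pc = c
        elif op == "RETURN":     # result is R[a]
            return regs[a]
        else:
            raise ValueError(f"unknown opcode {op!r}")

# Hypothetical program: sum the integers 1..10 using registers only.
SUM_1_TO_10 = [
    ("LOADK", 0, 0, 0),   # acc  = 0
    ("LOADK", 1, 1, 0),   # i    = 1
    ("LOADK", 2, 2, 0),   # n    = 10
    ("LOADK", 3, 1, 0),   # step = 1
    ("ADD", 0, 0, 1),     # acc += i
    ("ADD", 1, 1, 3),     # i   += step
    ("JLE", 1, 2, 4),     # loop while i <= n
    ("RETURN", 0, 0, 0),
]
CONSTANTS = [0, 1, 10]
```

Calling `run(SUM_1_TO_10, CONSTANTS)` executes the loop body with one `ADD` per arithmetic step; a stack-based design would need extra push and pop instructions for the same work, which is the usual argument for register-based dispatch.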

A Parallel Processing Technique for Large Spatial Data (대용량 공간 데이터를 위한 병렬 처리 기법)

  • Park, Seunghyun;Oh, Byoung-Woo
    • Spatial Information Research / v.23 no.2 / pp.1-9 / 2015
  • A graphics processing unit (GPU) contains many arithmetic logic units (ALUs). Because these ALUs can be exploited for parallel processing, the GPU provides efficient data processing. Spatial data require many geographic coordinates to represent the shapes of features on a map. The coordinates are usually stored as geodetic longitude and latitude. To display a map in a 2-dimensional Cartesian coordinate system, the geodetic longitude and latitude must be converted to the Universal Transverse Mercator (UTM) coordinate system. Both the coordinate conversion and the rendering of the converted coordinates to the screen involve complex floating-point computations. In this paper, we propose a parallel processing technique that performs the conversion and the rendering on the GPU to improve performance. Large spatial data are stored in files on disk. To process this large amount of spatial data efficiently, we propose a technique that merges the spatial data files into one large file and accesses it as a memory-mapped file. We implement the proposed technique and run an experiment with the 747,302,971 points of the TIGER/Line spatial data. In the experiment, the coordinate conversion with the GPU is 30.16 times faster than with the CPU-only method, and the rendering is 80.40 times faster than with the CPU.
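
The two ingredients named in this abstract, a memory-mapped point file and a longitude/latitude to UTM conversion, can be sketched on the CPU as follows. This is an illustration under stated assumptions, not the paper's GPU implementation: the record layout (pairs of little-endian doubles, longitude first), the WGS84 ellipsoid constants, the commonly used truncated series expansion for the transverse Mercator projection, and the names `lonlat_to_utm` and `convert_file` are all choices made for this example.

```python
import math
import mmap
import struct

A = 6378137.0             # WGS84 semi-major axis in metres (assumed datum)
E2 = 0.00669437999014     # first eccentricity squared
EP2 = E2 / (1.0 - E2)     # second eccentricity squared
K0 = 0.9996               # UTM scale factor on the central meridian
RECORD = struct.Struct("<dd")   # assumed layout: (longitude, latitude) in degrees

def lonlat_to_utm(lon_deg, lat_deg):
    """Return (zone, easting, northing) using a truncated series expansion."""
    zone = int((lon_deg + 180.0) // 6) % 60 + 1
    lon0 = math.radians((zone - 1) * 6 - 180 + 3)   # central meridian of the zone
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    sin_lat, cos_lat, tan_lat = math.sin(lat), math.cos(lat), math.tan(lat)
    n = A / math.sqrt(1.0 - E2 * sin_lat ** 2)
    t = tan_lat ** 2
    c = EP2 * cos_lat ** 2
    a = (lon - lon0) * cos_lat
    # Meridional arc length from the equator to this latitude
    m = A * ((1 - E2 / 4 - 3 * E2**2 / 64 - 5 * E2**3 / 256) * lat
             - (3 * E2 / 8 + 3 * E2**2 / 32 + 45 * E2**3 / 1024) * math.sin(2 * lat)
             + (15 * E2**2 / 256 + 45 * E2**3 / 1024) * math.sin(4 * lat)
             - (35 * E2**3 / 3072) * math.sin(6 * lat))
    easting = K0 * n * (a + (1 - t + c) * a**3 / 6
                        + (5 - 18 * t + t**2 + 72 * c - 58 * EP2) * a**5 / 120) + 500000.0
    northing = K0 * (m + n * tan_lat * (a**2 / 2
                        + (5 - t + 9 * c + 4 * c**2) * a**4 / 24
                        + (61 - 58 * t + t**2 + 600 * c - 330 * EP2) * a**6 / 720))
    if lat_deg < 0:
        northing += 10000000.0   # false northing for the southern hemisphere
    return zone, easting, northing

def convert_file(path):
    """Memory-map a non-empty binary point file and convert every record."""
    results = []
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for offset in range(0, len(mm), RECORD.size):
                lon, lat = RECORD.unpack_from(mm, offset)
                results.append(lonlat_to_utm(lon, lat))
    return results
```

Each point is converted independently of every other point, which is what makes the workload a natural fit for one GPU thread per coordinate; the memory map lets the operating system page the merged file in on demand instead of reading it through explicit buffered I/O.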

An Efficient Adaptive Loop Filter Design for HEVC Encoder (HEVC 부호화기를 위한 효율적인 적응적 루프 필터 설계)

  • Shin, Seung-yong;Park, Seung-yong;Ryoo, Kwang-ki
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.10a / pp.295-298 / 2014
  • In this paper, an efficient design of the HEVC Adaptive Loop Filter (ALF) for filter coefficient estimation is proposed. The ALF iteratively performs Cholesky decomposition of a $10{\times}10$ matrix to estimate the filter coefficients. The Cholesky decomposition in the ALF consists of square-root and division operations, which are difficult to implement in hardware because they require a large amount of computation and processing time due to floating-point operations on large values of up to 30 bits in an LCU ($64{\times}64$). The proposed hardware architecture implements the square-root operation of the Cholesky decomposition using a multiplexer, a subtractor, and a comparator. In addition, the architecture lowers the computation load through a pipeline built around the characteristic operation steps of the Cholesky decomposition. The hardware is implemented on a Xilinx Virtex-6 XC6VCX240T FPGA device using Xilinx ISE 14.3 and can support 40 frames per second of 4K Ultra HD ($4096{\times}2160$) video at a maximum operating frequency of 150 MHz.
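
The square-root and division steps that the abstract identifies as the costly part are easy to see in a plain software version of Cholesky decomposition. The sketch below is a floating-point reference written for illustration only, not the proposed hardware; the function names `cholesky` and `solve_spd` are hypothetical, and the way the factor is used to solve for coefficients (forward then backward substitution) is the textbook approach rather than something the abstract specifies.

```python
import math

def cholesky(a):
    """Return lower-triangular L with A = L * L^T for a symmetric positive-definite A."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)     # one square root per diagonal entry
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]    # one division per off-diagonal entry
    return l

def solve_spd(a, b):
    """Solve A x = b via L y = b (forward) and L^T x = y (backward)."""
    l = cholesky(a)
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x
```

For an n×n matrix this performs n square roots and n(n-1)/2 divisions in the factorization alone, so a $10{\times}10$ system needs 10 square roots and 45 divisions per decomposition, which is the cost a dedicated square-root datapath and pipelining aim to reduce.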
