• Title/Summary/Keyword: 그래픽 프로세서

Search Result 133, Processing Time 0.025 seconds

VHDL Design for Out-of-Order Superscalar Processor of A Fully Pipelined Scheme (완전한 파이프라인 방식의 비순차실행 수퍼스칼라 프로세서의 VHDL 설계)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.21 no.1
    • /
    • pp.99-105
    • /
    • 2021
  • Today, a superscalar processor is the basic unit or an essential component of a multi-core processor, SoCs, and GPUs. Hence, a high-performance out-of-order superscalar processor must be adopted for these systems to maximize its performance. The superscalar processor fetches, issues, executes, and writes back multiple instructions per cycle by utilizing reorder buffers and reservation stations to dynamically schedule instructions in a pipelined scheme. In this paper, a fully pipelined out-of-order superscalar processor with speculative execution is designed with VHDL and verified with GHDL. As a result of the simulation, the program composed of ARM instructions is successfully performed.

Analysis on Memory Characteristics of Graphics Processing Units for Designing Memory System of General-Purpose Computing on Graphics Processing Units (범용 그래픽 처리 장치의 메모리 설계를 위한 그래픽 처리 장치의 메모리 특성 분석)

  • Choi, Hongjun;Kim, Cheolhong
    • Smart Media Journal
    • /
    • v.3 no.1
    • /
    • pp.33-38
    • /
    • 2014
  • Even though the performance of microprocessor is improved continuously, the performance improvement of computing system becomes hard to increase, in order to some drawbacks including increased power consumption. To solve the problem, general-purpose computing on graphics processing units(GPGPUs), which execute general-purpose applications by using specialized parallel-processing device representing graphics processing units(GPUs), have been focused. However, the characteristics of applications related with graphics is substantially different from the characteristics of general-purpose applications. Therefore, GPUs cannot exploit the outstanding computational resources sufficiently due to various constraints, when they execute general-purpose applications. When designing GPUs for GPGPU, memory system is important to effectively exploit the GPUs since typically general-purpose applications requires more memory accesses than graphics applications. Especially, external memory access requiring long latency impose a big overhead on the performance of GPUs. Therefore, the GPU performance must be improved if hierarchical memory architecture which can reduce the number of external memory access is applied. For this reason, we will investigate the analysis of GPU performance according to hierarchical cache architectures in executing various benchmarks.

제2세대 웍스테이션 "RISC"시스템 6000

  • 김은현
    • Computational Structural Engineering
    • /
    • v.3 no.3
    • /
    • pp.62-65
    • /
    • 1990
  • RISC System/6000은 유닉스 시스템인 AIX를 오퍼레이팅 시스템으로 채택하였고, 기존의 RISC기술에 혁신적인 진보를 이룩하여 가격 대 성능비를 크게 높임과 동시에 시스템의 기능을 극도로 최적화 시킨 새로운 차원의 아이비엠의 고성능 시스템패밀리이다. 이 시스템은 새로운 RISC 시스템 구조인 POWER(Performance Optimization With Enhanced RISC) 개념과 제2세대 수퍼스칼라 기법 및 마이크로 채널 아키텍쳐로 설계되어 있다. 특히 하나의 사이클에서 4개 이상의 명령어를 병렬처리 하도록 설계된 수퍼스칼라 기능을 통하여 복잡한 그래픽 또는 이미지 처리 및 고도의 수치해석 기능이 뛰어나다. RISC시스템/6000은 과학기술계산업무나 멀티사용자의 일반 비즈니스용으로도 모두 뛰어난 범용 컴퓨터로 그래픽 프로세서의 선택과 함께 CAD/CAM이나 그래픽/애니메이션전용 시스템을 구성할 수 있으며, 최고 512 사용자에 이르는 멀티 사용자 시스템을 구성하여 사용할 수 있다. 이전의 유닉스 시스템에 있어서 큰 약점이었던 사용자 인터페이스와 멀티 사용자 및 테스킹이 크게 강화 되었으며, 기존의 IBM 시스템 및 타 기종과도 네트워크 구성이 용이하고 수백여종의 과학기술 적용업무를 이용할 수 있다.

  • PDF

Performance Analysis of Futurebus+ based Multiprocessor Systems with MESI Cache Coherence Protocol (MESI 캐쉬 코히어런스 프로토콜을 사용하는 Futurebus+ 기반 멀티프로세서 시스템의 성능 평가)

  • 고석범;강인곤;박성우;김영천
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.18 no.12
    • /
    • pp.1815-1827
    • /
    • 1993
  • In this paper, we evaluate the performance of a Futurebus based multiprocessor system with MESI cache coherence protocol for four bus transaction types. Graphical symbols and compiler of SLAM II are used in modeling and simulation. A steady-state probability of each state for MESI protocol is computed by a Markov chain. The probability of each state is used as an input value for a correct simulation. Processor utilization, memory utilization, bus utilization, and the waiting time for bus arbitration are measured in terms of the number of processors, the hit ratio of cache memory, the probability of internal operation, and bus bandwidth.

  • PDF

Design of a Variable-Length Instruction based on a OpenGL ES 2.0 API (OpenGL ES 2.0 API 기반 가변길이 명령어 설계)

  • Lee, Kwang-Yeob
    • Journal of IKEEE
    • /
    • v.12 no.2
    • /
    • pp.118-123
    • /
    • 2008
  • The Khronos group releases OpenGL ES 2.0 API specification bringing streamlined shader programming to graphics processor of embedded system. For this reason, the mobile devices have need of graphics processor for supporting a OpenGL ES 2.0 API. We need to extend instruction`s length to support OpenGLES 2.0 API, so it needs more memory size. In this paper, we propose a new instruction format that offers availability for use the instructions. This proposed instruction adopt a variable length method and unit instruction architecture. This proposed instruction architecture that support to OpenGLES 2.0 API has consist of 32bit unit instructions up to 4 which can be combined for embellishing each other. Therefore, it can execute flexible instruction combination and reduce waste of instruction fields.

  • PDF

A Study on Design Schemes of Extracting Control Signals for a CD-G System (디지틀 오디오용 그래픽 시스템의 실시간 제어신호 추출을 위한 설계방식 연구)

  • 이용석;정화자;김용득
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.17 no.10
    • /
    • pp.1063-1073
    • /
    • 1992
  • This paper deals with a method for extracting picture signals from CD graphics with a conventional CD player, schemes for designing circuits for the effective extraction of control signals, and the implementation of such circuits using commercially available logic components, thereby achieving cost-effectiveness. This paper also presents an implementation and evaluation of the CD-G system, which requires extracting picture signals, deinterleaving the extracted signals and analyzing control commands and displaying them on a screen. The CD-G system implemented using the extraction circuit presented herein has been observed to operate well in real time.

  • PDF

An Architecture of a high efficient ALU for 3D Graphics Shader Processor (3D 그래픽 쉐이더 프로세서를 위한 고효율 연산기 구조)

  • Kim, Woo-Young;Lee, Bo-Haeng;Lee, Kwang-Yeob;Park, Tae-ryung
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2009.05a
    • /
    • pp.229-232
    • /
    • 2009
  • In this paper, we propose a new programmable shader architecture based on an effective ALU operation. Today's mobile devices need the programmable shader processor for a three-dimensional(3D) graphics. The programmable shader processors require a lager ALU than a fixed pipeline ALU used previously. The proposed ALU architecture is able to execute two different arithmetic operations at the same time. Two instructions which need exclusive ALU operations are inserted into instruction decoders in parallel. Experimental results show the number of instruction cycles can be substantially reduced up to 40%.

  • PDF

Development of symbol generator software (심볼 생성기용 소프트웨어 개발)

  • Park,Deok-Bae;Lee,Jae-Eok
    • Journal of the Korean Society for Aeronautical & Space Sciences
    • /
    • v.31 no.9
    • /
    • pp.94-102
    • /
    • 2003
  • This paper describes the development and implementation for the SYMBOLGEN(SYMBOL GENerator) software. The SYMBOL-GEN software is for improving graphic processing speed and decreasing data communication load in ASC by genera ting and downloading off-line symbol file for HUD and MFD , which are the main display equipments in military aircraft. The SYMBOL-GEN is developed on PC using C++ language and MS Visual Studio 6.0 development tool. It is also designed to be modified and extended easily by introducing object-oriented software development technique.

Implementation of a 3D Graphics Simulator for GP-GPU (GP-GPU 개발을 위한 3차원 그래픽 시뮬레이터 구현)

  • Yeo, Dong-young;Kim, Woo-young;Jung, Hyung-Ki;Lee, Kwang-Yeob
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2009.10a
    • /
    • pp.337-340
    • /
    • 2009
  • Since a hardware accelerator for 3D graphics processing GPU(Graphics Processing Unit)'s performance has been improving constantly. This is the efficient way was introduced for complex graphics application, but it is rarely used to utilize 100% resources on GPU. GP-GPU(general-purpose GPU), including operations on the GPU and supporting common operations can be handled by the processor, is noted by depending on the distribution of resources that can be effectively controlled. In this paper, the simulator was implemented that supports virtual environment of GP-GPU and available for program design and debugging. Through this, the co-design development environment support simultaneous design fast and reliable verification that are available to build the interface of three-dimensional graphics display.

  • PDF

Hardware Design of Special-Purpose Arithmetic Unit for 3-Dimensional Graphics Processor (3차원 그래픽프로세서용 특수 목적 연산장치의 하드웨어 설계)

  • Choi, Byeong-Yoon
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2011.05a
    • /
    • pp.140-142
    • /
    • 2011
  • In this paper, special purpose arithmetic unit for mobile graphics accelerator is designed. The designed processor supports six operations, such as $1/{\chi}$, $\frac{1}{{\sqrt{x}}$, $log_2x$, $2^x$, $sin(x)$, $cos(x)$. The processor adopts 2nd-order polynomial minimax approximation scheme based on IEEE floating point data format to satisfy accuracy conditions and has 5-stage pipeline structure to meet high operational rates. The SFAU processor consists of 23,000 gates and its estimated operating frequency is about 400 Mhz at operating condition of 65nm CMOS technology. Because the processor can execute all operations with 5-stage pipeline scheme, it has about 400 MOPS(million operations per second) execution rate. Thus, it can be applicable to the 3D mobile graphics processors.

  • PDF