• Title/Summary/Keyword: GPU program

Search Result 53, Processing Time 0.021 seconds

Design and Implementation of GPU Based Time-Variant Volume Rendering Program and User-Friendly Transfer Function Editor (GPU 기반의 Time-Variant 볼륨 렌더링 프로그램과 사용자 친화적인 전이함수 에디터의 설계 및 구현)

  • Lee, Joong-Youn;Hur, Young-Ju;Koo, Gee-Bum
    • 한국HCI학회:학술대회논문집
    • /
    • 2007.02a
    • /
    • pp.1025-1030
    • /
    • 2007
  • 여러 학계와 산업계로부터 인체영상과 같은 정적인 볼륨 데이터뿐만 아니라, 유체 흐름과 같은 동적으로 움직이는 Time-Variant 볼륨 데이터에 대한 실시간 렌더링의 요구가 계속되고 있다. 일반적으로 Time-Variant 데이터는 그 크기가 정적 볼륨 데이터의 수배에서 수백 배에 이르러, 이를 실시간으로 가시화하는 데에 많은 어려움이 있어왔다. 한편, PC 그래픽스 하드웨어의 급격한 발전에 따라 슈퍼컴퓨터나 다수의 컴퓨터들을 이용한 병렬/분산 렌더링으로나 가능했던 Time-Variant 볼륨 데이터의 실시간 볼륨 렌더링을 한대의 일반 PC에서 수행하려는 시도가 계속되고 있다. GPU의 꼭지점 및 프래그먼트 쉐이더(vertex & fragment shader)는 수치 계산에 최적화된 벡터 연산과 사용자 프로그래밍 기능으로 빠른 볼륨 렌더링을 일반 PC에서도 가능하게 했다. 본 논문에서는 GPU를 이용해서 Time-Variant 볼륨 데이터를 빠르게 가시화하고, 이렇게 개발한 GPU 볼륨 렌더링 프로그램을 사용자가 사용하기 편리하도록 사용자 친화적인 유저 인터페이스를 설계하고 구현하였다. 특히, 시간에 따라 동적으로 변화해야 하는 전이함수를 최대한 편리하게 생성할 수 있도록 전이함수 에디터에 중점을 두었다.

  • PDF

A GPU scheduling framework for applications based on dataflow specification (데이터 플로우 기반 응용들을 위한 GPU 스케줄링 프레임워크)

  • Lee, Yongbin;Kim, Sungchan
    • Journal of Korea Multimedia Society
    • /
    • v.17 no.10
    • /
    • pp.1189-1197
    • /
    • 2014
  • Recently, general purpose graphic processing units(GPUs) are being widely used in mobile embedded systems such as smart phone and tablet PCs. Because of architectural limitations of mobile GPGPUs, only a single program is allowed to occupy a GPU at a time in a non-preemptive way. As a result, it is difficult to meet performance requirements of applications such as frame rate or response time if applications running on a GPU are not scheduled properly. To tackle this difficulty, we propose to specify applications using synchronous data flow model of computation such that applications are formed with edges and nodes. Then nodes of applications are scheduled onto a GPU unlike conventional scheduling an application as a whole. This approach allows applications to share a GPU at a finer granularity, node (or task)-level, providing several benefits such as eliminating need for manually partitioning applications and better GPU utilization. Furthermore, any scheduling policy can be applied in response to the characteristics of applications.

Calculation Effect of GPU Parallel Programing for Planar Multibody System Dynamics (평면 다물체 동역학 해석에서 GPU 병렬 프로그래밍의 계산효과)

  • Jun, C.W.;Sohn, J.H.
    • Journal of Power System Engineering
    • /
    • v.16 no.4
    • /
    • pp.12-16
    • /
    • 2012
  • In this paper, the equations of motions for planar multibody dynamics are established for considering the parallel programming based on GPU. Cartesian coordinates are used to formulate the equations of motion and implicit integration method called HHT-alpha is employed. Open chain multibody system is considered for computer simulation. CUDA toolkit is employed for establishing the GPU parallel programming. The exactness of the analysis is verified from the comparison with ADAMS. The results from parallel computing based on GPU are compared with the results from the sequential programming based on CPU in terms of calculation time. The multiple pendulum with bodies and joints is employed for the computer simulation. In the pendulum system that has 290 bodies, the parallel program indicates an improved efficiency of about 25.5 second(15.5% improvement). It is noted that the larger the size of system is, the time efficiency is better.

Efficient Task Distribution for Pig Monitoring Applications Using OpenCL (OpenCL을 이용한 돈사 감시 응용의 효율적인 태스크 분배)

  • Kim, Jinseong;Choi, Younchang;Kim, Jaehak;Chung, Yeonwoo;Chung, Yongwha;Park, Daihee;Kim, Hakjae
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.6 no.10
    • /
    • pp.407-414
    • /
    • 2017
  • Pig monitoring applications consisting of many tasks can take advantage of inherent data parallelism and enable parallel processing using performance accelerators. In this paper, we propose a task distribution method for pig monitoring applications into a heterogenous computing platform consisting of a multicore-CPU and a manycore-GPU. That is, a parallel program written in OpenCL is developed, and then the most suitable processor is determined based on the measured execution time of each task. The proposed method is simple but very effective, and can be applied to parallelize other applications consisting of many tasks on a heterogeneous computing platform consisting of a CPU and a GPU. Experimental results show that the performance of the proposed task distribution method on three different heterogeneous computing platforms can improve the performance of the typical GPU-only method where every tasks are executed on a deviceGPU by a factor of 1.5, 8.7 and 2.7, respectively.

Design and Implementation of High-Speed Software Cryptographic Modules Using GPU (GPU를 활용한 고속 소프트웨어 암호모듈 설계 및 구현)

  • Song, JinGyo;An, SangWoo;Seo, Seog Chung
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.30 no.6
    • /
    • pp.1279-1289
    • /
    • 2020
  • To securely protect users' sensitive information and national secrets, the importance of cryptographic modules has been emphasized. Currently, many companies and national organizations are actively using cryptographic modules. In Korea, To ensure the security of these cryptographic modules, the cryptographic module has been verified through the Korea Certificate Module Validation Program(KCMVP). Most of the domestic cryptographic modules are CPU-based software (S/W). However, CPU-based cryptographic modules are difficult to use in servers that need to process large amounts of data. In this paper, we propose an S/W cryptographic module that provides a high-speed operation using GPU. We describe the configuration and operation of the S/W cryptographic module using GPU and present the changes in the cryptographic module security requirements by using GPU. In addition, we present the performance improvement compared to the existing CPU S/W cryptographic module. The results of this paper can be used for cryptographic modules that provide cryptography in servers that manage IoT (Internet of Things) or provide cloud computing.

Power Modeling Approach for GPU Source Program

  • Li, Junke;Guo, Bing;Shen, Yan;Li, Deguang;Huang, Yanhui
    • Journal of Electrical Engineering and Technology
    • /
    • v.13 no.1
    • /
    • pp.181-191
    • /
    • 2018
  • Rapid development of information technology makes our environment become smarter and massive high performance computers are providing powerful computing for that. Graphics Processing Unit (GPU) as a typical high performance component is being widely used for both graphics and general-purpose applications. Although it can greatly improve computing power, it also delivers significant power consumption and need sufficient power supplies. To make high performance computing more sustainable, the important step is to measure it. Current power technologies for GPU have some drawbacks, such as they are not applicable for power estimation at the early stage. In this article, we present a novel power technology to correlate power consumption and the characteristics at the programmer perspective, and then to estimate power consumption of source program without prerunning. We conduct experiments on Nvidia's GT740 platform; the results show that our power model is more accurately than regression model and has an average error of 2.34% and the maximum error of 9.65%.

Benchmark Results of a Radio Spectrometer Based on Graphics Processing Unit

  • Kim, Jongsoo;Wagner, Jan
    • The Bulletin of The Korean Astronomical Society
    • /
    • v.40 no.2
    • /
    • pp.44.1-44.1
    • /
    • 2015
  • We set up a project to make spectrometers for single dish observations of the Korean VLBI Network (KVN), a new future multi-beam receiver of the ASTE (Atacama Submillimeter Telescope Experiment), and the total power (TP) antennas of the Atacama Large Millimeter/submillimeter Array (ALMA). Traditionally, spectrometers based on ASIC (Application-Specific Integrated circuit) and FPGA (Field-Programmable Gate Array) have been used in radio astronomy. It is, however, that a Graphics Processing Unit (GPU) technology is now viable for spectrometers due to the rapid improvement of its performance. A high-resolution spectrometer should have the following functions: poly-phase filter, data-bit conversion, fast Fourier transform, and complex multiplication. We wrote a program based on CUDA (Compute Unified Device Architecture) for a GPU spectrometer. We measured its performance using two GPU cards, Titan X and K40m, from NVIDIA. A non-optimized GPU code can process a data stream of around 2 GHz bandwidth, which is enough for the KVN spectrometer and promising for the ASTE and ALMA TP spectrometers.

  • PDF

A Reconfigurable Lighting Engine for Mobile GPU Shaders

  • Ahn, Jonghun;Choi, Seongrim;Nam, Byeong-Gyu
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.15 no.1
    • /
    • pp.145-149
    • /
    • 2015
  • A reconfigurable lighting engine for widely used lighting models is proposed for low-power GPU shaders. Conventionally, lighting operations that involve many complex arithmetic operations were calculated by the shader programs on the GPU, which led to a significant energy overhead. In this letter, we propose a lighting engine to improve the energy-efficiency by supporting the widely used advanced lighting models in hardware. It supports the Blinn-Phong, Oren-Nayar, and Cook-Torrance models, by exploiting the logarithmic arithmetic and optimizing the trigonometric function evaluations for the energy-efficiency. Experimental results demonstrate 12.7%, 42.5%, and 35.5% reductions in terms of power-delay product from the shader program implementations for each lighting model. Moreover, our work shows 10.1% higher energy-efficiency for the Blinn-Phong model compared to the prior art.

High-Performance Korean Morphological Analyzer Using the MapReduce Framework on the GPU

  • Cho, Shi-Won;Lee, Dong-Wook
    • Journal of Electrical Engineering and Technology
    • /
    • v.6 no.4
    • /
    • pp.573-579
    • /
    • 2011
  • To meet the scalability and performance requirements of data analyses, which often involve voluminous data, efficient parallel or concurrent algorithms and frameworks are essential. We present a high-performance Korean morphological analyzer which employs the MapReduce framework on the graphics processing unit (GPU). MapReduce is a programming framework introduced by Google to aid the development of web search applications on a large number of central processing units (CPUs). GPUs are designed as a special-purpose co-processor. Their programming interfaces are typically formulated for graphics applications. Compared to CPUs, GPUs have greater computation power and memory bandwidth; however, GPUs are more difficult to program because of the design of their architectures. The performance of the Korean morphological analyzer using the MapReduce framework on the GPU is evaluated in comparison with the CPU-based model. The proposed Korean Morphological analyzer shows promising scalable performance on distributed computing with the GPU.

Acceleration of 2D Image Based Flow Visualization using GPU (GPU를 이용한 2차원 영상 기반 유동 가시화 기법의 가속)

  • Lee, Joong-Youn
    • Proceedings of the Korea Contents Association Conference
    • /
    • 2007.11a
    • /
    • pp.543-546
    • /
    • 2007
  • Flow visualization is one of visualization techniques and it means a visual expression of vector data using 2D or 3D graphics. It aims for human to easily find and understand a special feature of the vector data. The Image Based Flow Visualization (IBFV) is one of the fastest technique in the dense integration based flow visualization techniques. In this paper, IBFV is accelerated and implemented using commodity GPU. Especially, mesh advection is accelerated at the vertex program.

  • PDF