• Title/Summary/Keyword: Graphics processing unit(GPU)

Search Result 153, Processing Time 0.19 seconds

Multiple GPU Scheduling for Improved Acquisition of Real-Time 360 VR Game Video (실시간 360 VR 스테레오 게임 영상 획득 성능 개선을 위한 다중 GPU 스케줄링에 관한 연구)

  • Lee, Junsuk;Paik, Joonki
    • Journal of Broadcast Engineering
    • /
    • v.24 no.6
    • /
    • pp.974-982
    • /
    • 2019
  • Real-time 360 VR (Virtual Reality) stereo image acquisition technique based on game engine was proposed. However, GPU (Graphics Processing Unit) resource is not fully utilized due to bottlenecks. In this paper, we propose an improved GPU scheduling technique to solve the bottleneck of the existing technique and measure the performance of the proposed technique using the sample games of the commercial game engine. As a result, proposed technique showed an improvement of performance up to 70% and usage of GPU resources more evenly compared existing technique.

Use of High-performance Graphics Processing Units for Power System Demand Forecasting

  • He, Ting;Meng, Ke;Dong, Zhao-Yang;Oh, Yong-Taek;Xu, Yan
    • Journal of Electrical Engineering and Technology
    • /
    • v.5 no.3
    • /
    • pp.363-370
    • /
    • 2010
  • Load forecasting has always been essential to the operation and planning of power systems in deregulated electricity markets. Various methods have been proposed for load forecasting, and the neural network is one of the most widely accepted and used techniques. However, to obtain more accurate results, more information is needed as input variables, resulting in huge computational costs in the learning process. In this paper, to reduce training time in multi-layer perceptron-based short-term load forecasting, a graphics processing unit (GPU)-based computing method is introduced. The proposed approach is tested using the Korea electricity market historical demand data set. Results show that GPU-based computing greatly reduces computational costs.

High-Speed Implementations of Block Ciphers on Graphics Processing Units Using CUDA Library (GPU용 연산 라이브러리 CUDA를 이용한 블록암호 고속 구현)

  • Yeom, Yong-Jin;Cho, Yong-Kuk
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.18 no.3
    • /
    • pp.23-32
    • /
    • 2008
  • The computing power of graphics processing units(GPU) has already surpassed that of CPU and the gap between their powers is getting wider. Thus, research on GPGPU which applies GPU to general purpose becomes popular and shows great success especially in the field of parallel data processing. Since the implementation of cryptographic algorithm using GPU was started by Cook et at. in 2005, improved results using graphic libraries such as OpenGL and DirectX have been published. In this paper, we present skills and results of implementing block ciphers using CUDA library announced by NVIDIA in 2007. Also, we discuss a general method converting source codes of block ciphers on CPU to those on GPU. On NVIDIA 8800GTX GPU, the resulting speeds of block cipher AES, ARIA, and DES are 4.5Gbps, 7.0Gbps, and 2.8Gbps, respectively which are faster than the those on CPU.

Memory Delay Comparison between 2D GPU and 3D GPU (2차원 구조 대비 3차원 구조 GPU의 메모리 접근 효율성 분석)

  • Jeon, Hyung-Gyu;Ahn, Jin-Woo;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.17 no.7
    • /
    • pp.1-11
    • /
    • 2012
  • As process technology scales down, the number of cores integrated into a processor increases dramatically, leading to significant performance improvement. Especially, the GPU(Graphics Processing Unit) containing many cores can provide high computational performance by maximizing the parallelism. In the GPU architecture, the access latency to the main memory becomes one of the major reasons restricting the performance improvement. In this work, we analyze the performance improvement of the 3D GPU architecture compared to the 2D GPU architecture quantitatively and investigate the potential problems of the 3D GPU architecture. In general, memory instructions account for 30% of total instructions, and global/local memory instructions constitutes 60% of total memory instructions. Therefore, the performance of the 3D GPU is expected to be improved significantly compared to the 2D GPU by reducing the delay of memory instructions. However, according to our experimental results, the 3D architecture improves the GPU performance only by 2% compared to the 2D architecture due to the memory bottleneck, since the performance reduction due to memory bottleneck in the 3D GPU architecture increases by 245% compared to the 2D architecture. This paper provides the guideline for suitable memory design by analyzing the efficiency of the memory architecture in 3D GPU architecture.

High-Performance Korean Morphological Analyzer Using the MapReduce Framework on the GPU

  • Cho, Shi-Won;Lee, Dong-Wook
    • Journal of Electrical Engineering and Technology
    • /
    • v.6 no.4
    • /
    • pp.573-579
    • /
    • 2011
  • To meet the scalability and performance requirements of data analyses, which often involve voluminous data, efficient parallel or concurrent algorithms and frameworks are essential. We present a high-performance Korean morphological analyzer which employs the MapReduce framework on the graphics processing unit (GPU). MapReduce is a programming framework introduced by Google to aid the development of web search applications on a large number of central processing units (CPUs). GPUs are designed as a special-purpose co-processor. Their programming interfaces are typically formulated for graphics applications. Compared to CPUs, GPUs have greater computation power and memory bandwidth; however, GPUs are more difficult to program because of the design of their architectures. The performance of the Korean morphological analyzer using the MapReduce framework on the GPU is evaluated in comparison with the CPU-based model. The proposed Korean Morphological analyzer shows promising scalable performance on distributed computing with the GPU.

Implememtation of Fast Rasterizer processing using GPGPU based on SIMT structure (SIMT 구조 기반 GPGPU를 이용한 고속 Rasterizer 구현)

  • Kim, Chiyong
    • Journal of IKEEE
    • /
    • v.21 no.3
    • /
    • pp.276-279
    • /
    • 2017
  • In this paper, SIMT structure based GPGPU (General Purpose Computing on Graphics Processing Units) is used for accelerating the Rasterizer which constitutes the screen of the display device in pixel unit. The GPU has a large number of ALUs, and the processing is very fast because of parallel processing. Therefore, in this paper, we implemented a rasterizer that generates a 3D graphics model using a CPU that performs operations sequentially and a GPU that performs operations in parallel. We confirmed that proposed rasterizer in this paper is 1.45 times better than rasterizer using Intel CPU when generating one frame.

Trends of Mobile GPU (모바일 GPU 동향)

  • Han, J.H.;Byun, J.G.;Eum, N.W.
    • Electronics and Telecommunications Trends
    • /
    • v.28 no.2
    • /
    • pp.50-57
    • /
    • 2013
  • 스마트폰 및 태블릿 PC에 들어가는 핵심 부품인 AP(Application Processor)는 모두 GPU(Graphics Processing Unit)를 내장하고 있다. 이는 칩 면적의 제약과 사용 가능한 전력의 한계로 데스크톱의 그래픽 카드에 탑재된 고성능 GPU와는 다른 설계 제약을 받는다. 본고에서는 고성능 GPU와 다른 설계 조건을 갖는 mobile GPU 기술에 대해서 알아보았고 대표적인 commercial mobile GPU인 Imagination, ARM, Qualcomm, NVidia사의 mobile GPU의 특징 및 성능에 대해서 알아보았다.

  • PDF

Fast Double Random Phase Encoding by Using Graphics Processing Unit (GPU 컴퓨팅에 의한 고속 Double Random Phase Encoding)

  • Saifullah, Saifullah;Moon, In-Kyu
    • Proceedings of the Korea Multimedia Society Conference
    • /
    • 2012.05a
    • /
    • pp.343-344
    • /
    • 2012
  • With the increase of sensitive data and their secure transmission and storage, the use of encryption techniques has become widespread. The performance of encoding majorly depends on the computational time, so a system with less computational time suits more appropriate as compared to its contrary part. Double Random Phase Encoding (DRPE) is an algorithm with many sub functions which consumes more time when executed serially; the computation time can be significantly reduced by implementing important functions in a parallel fashion on Graphics Processing Unit (GPU). Computing convolution using Fast Fourier transform in DRPE is the most important part of the algorithm and it is shown in the paper that by performing this portion in GPU reduced the execution time of the process by substantial amount and can be compared with MATALB for performance analysis. NVIDIA graphic card GeForce 310 is used with CUDA C as a programming language.

  • PDF

GPU-Based Optimization of Self-Organizing Map Feature Matching for Real-Time Stereo Vision

  • Sharma, Kajal;Saifullah, Saifullah;Moon, Inkyu
    • Journal of information and communication convergence engineering
    • /
    • v.12 no.2
    • /
    • pp.128-134
    • /
    • 2014
  • In this paper, we present a graphics processing unit (GPU)-based matching technique for the purpose of fast feature matching between different images. The scale invariant feature transform algorithm developed by Lowe for various feature matching applications, such as stereo vision and object recognition, is computationally intensive. To address this problem, we propose a matching technique optimized for GPUs to perform computations in less time. We optimize GPUs for fast computation of keypoints to make our system quick and efficient. The proposed method uses a self-organizing map feature matching technique to perform efficient matching between the different images. The experiments are performed on various image sets to examine the performance of the system under varying conditions, such as image rotation, scaling, and blurring. The experimental results show that the proposed algorithm outperforms the existing feature matching methods, resulting in fast feature matching due to the optimization of the GPU.

Efficient Computation of Isosurface Curvatures on GPUs Based on the de Boor Algorithm (드 부어 알고리즘을 이용한 GPU에서의 효율적인 등가면 곡률 계산)

  • Kim, Minho
    • Journal of the Korea Computer Graphics Society
    • /
    • v.23 no.3
    • /
    • pp.47-54
    • /
    • 2017
  • In this paper, we propose an improved curvature-based GPU (Graphics Processing Unit) isosurface ray-casting technique. Our method adopts the fast evaluation method proposed by Sigg et al. [1] to find the isosurface, but replaces the computation of the gradient and Hessian with the de Boor algorithm. In this way, we can reduce the number of additional texture fetches from 84 to 27 thus improving the performance by up to ${\approx}30%$, depending on the platforms.