• Title/Summary/Keyword: General purpose computing


GPGPU Task Management Technique to Mitigate Performance Degradation of Virtual Machines due to GPU Operation in Cloud Environments (클라우드 환경에서 GPU 연산으로 인한 가상머신의 성능 저하를 완화하는 GPGPU 작업 관리 기법)

  • Kang, Jihun;Gil, Joon-Min
    • KIPS Transactions on Computer and Communication Systems / v.9 no.9 / pp.189-196 / 2020
  • Recently, GPU cloud computing technology, which assigns GPU (Graphics Processing Unit) devices to virtual machines, has come into wide use in cloud environments. A GPU device assigned to a virtual machine can perform operations much faster than a CPU through massively parallel processing, which provides many benefits when operating high-performance computing services in a variety of fields in the cloud. However, the virtual machine scheduler, which is based on the CPU usage time of each virtual machine, does not take GPU usage time into account, so a GPU-intensive virtual machine can affect the performance of other virtual machines. In this paper, we test and analyze the performance degradation suffered by other virtual machines when one virtual machine performs a GPGPU (General-Purpose computing on Graphics Processing Units) task in a direct path-based GPU virtualization environment, which is commonly used when assigning GPUs to virtual machines in cloud environments. To solve this problem, we then propose a GPGPU task management method for virtual machines.
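The core observation is that the hypervisor scheduler accounts CPU time but never sees GPU time. A minimal sketch of what GPU-time accounting could look like is shown below: each kernel launch is timed with CUDA events and, once an assumed quota is exceeded, the next submission is delayed. The quota value, the kernel, and the throttling policy are illustrative assumptions, not the paper's actual technique.

```cuda
// Hypothetical GPU-time accounting loop for a GPGPU task manager (a sketch,
// not the paper's method): time every launch with CUDA events, accumulate the
// GPU time the CPU-based VM scheduler never sees, and back off past a quota.
#include <chrono>
#include <thread>
#include <cuda_runtime.h>

__global__ void gpgpu_task(float *d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] = d[i] * 2.0f + 1.0f;           // placeholder workload
}

int main() {
    const int n = 1 << 20;
    const float quotaMs = 50.0f;                    // assumed per-window GPU-time quota
    float usedMs = 0.0f;
    float *d;
    cudaMalloc(&d, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    for (int iter = 0; iter < 100; ++iter) {
        if (usedMs > quotaMs) {                     // over quota: yield the GPU
            std::this_thread::sleep_for(std::chrono::milliseconds(10));
            usedMs = 0.0f;                          // simplistic quota refresh
        }
        cudaEventRecord(start);
        gpgpu_task<<<(n + 255) / 256, 256>>>(d, n);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);     // GPU busy time of this launch
        usedMs += ms;
    }
    cudaFree(d);
    return 0;
}
```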

Spark Framework Based on Heterogeneous Pipeline Computing with OpenCL (OpenCL을 활용한 이기종 파이프라인 컴퓨팅 기반 Spark 프레임워크)

  • Kim, Daehee;Park, Neungsoo
    • The Transactions of The Korean Institute of Electrical Engineers / v.67 no.2 / pp.270-276 / 2018
  • Apache Spark is one of the high-performance in-memory computing frameworks for big-data processing. Recently, general-purpose computing on graphics processing units (GPGPU) has been adopted into the Apache Spark framework to improve its performance. Previous Spark-GPGPU frameworks focus on overcoming the implementation difficulties that result from the differences between the computation environments of GPGPU and the Spark framework. In this paper, we propose a Spark framework based on heterogeneous pipeline computing with OpenCL to further improve performance. The proposed framework overlaps the Java-to-native memory copies on the CPU with CPU-GPU communications (DMA) and GPU kernel computations to hide CPU idle time. Also, the CPU-GPU communication buffers are implemented as switching dual buffers, which reduce the mapped memory region and thereby decrease the memory-mapping overhead. Experimental results showed that the proposed Spark framework based on heterogeneous pipeline computing with OpenCL was up to 2.13 times faster than the previous Spark framework using OpenCL.
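The switching dual-buffer idea, copying the next chunk into one buffer while the previous chunk is still being processed, is a general overlap pattern. Below is a minimal double-buffering sketch written in CUDA rather than the paper's OpenCL-inside-Spark setup; the chunk size, kernel, and buffer count are assumptions for illustration only.

```cuda
// Double-buffering sketch: chunk k runs in one stream/buffer while chunk k+1
// is transferred through the other, hiding copy time behind kernel time.
#include <cuda_runtime.h>

__global__ void process(float *buf, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] *= 2.0f;                      // placeholder computation
}

int main() {
    const int chunk = 1 << 20, chunks = 8;
    float *h[2], *d[2];
    cudaStream_t s[2];
    for (int b = 0; b < 2; ++b) {
        cudaMallocHost(&h[b], chunk * sizeof(float));   // pinned memory for async DMA
        cudaMalloc(&d[b], chunk * sizeof(float));
        cudaStreamCreate(&s[b]);
    }

    for (int k = 0; k < chunks; ++k) {
        int b = k & 1;                              // switch between the two buffers
        // A real pipeline would cudaStreamSynchronize(s[b]) here before
        // refilling h[b] with the next chunk of input on the host.
        cudaMemcpyAsync(d[b], h[b], chunk * sizeof(float),
                        cudaMemcpyHostToDevice, s[b]);
        process<<<(chunk + 255) / 256, 256, 0, s[b]>>>(d[b], chunk);
        cudaMemcpyAsync(h[b], d[b], chunk * sizeof(float),
                        cudaMemcpyDeviceToHost, s[b]);
    }
    cudaDeviceSynchronize();

    for (int b = 0; b < 2; ++b) {
        cudaFreeHost(h[b]); cudaFree(d[b]); cudaStreamDestroy(s[b]);
    }
    return 0;
}
```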

Hologram Generation Acceleration Method Using GPGPU (GPGPU를 이용한 홀로그램 생성 가속화 방법)

  • Lee, Yoon-Hyuk;Kim, Dong-Wook;Seo, Young-Ho
    • Journal of Broadcast Engineering / v.22 no.6 / pp.800-807 / 2017
  • A large amount of computation is required to generate a hologram with a computer. To accelerate this computation, many methods based on parallel programming with GPGPU (General-Purpose computing on Graphics Processing Units) have been researched. In this paper, we propose a method that reduces the bottleneck caused by hologram-pixel-based parallel processing and makes use of shareable variables. We also describe how to optimize thread behavior using the Visual Profiler provided with NVIDIA's CUDA. The experimental results show that the proposed method reduces the calculation time by up to 40% compared with existing research.
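Pixel-parallel hologram generation typically sums the contribution of every object point at every hologram pixel, so the same object points are read by all threads. The hedged sketch below shows one common way to exploit shareable (shared-memory) variables for that reuse; the point layout, wave number, tile size, and the assumed 16x16 thread block are illustrative, not the paper's kernel.

```cuda
// CGH-style sketch: one thread per hologram pixel; a tile of object points is
// staged in shared memory so every thread in the block reuses it.
// Launch with a 16x16 thread block so the 256 threads fill the tile.
#include <cuda_runtime.h>

struct Point { float x, y, z, amp; };               // assumed object-point format
#define TILE 256

__global__ void hologram(float *fringe, int width, int height,
                         const Point *pts, int numPts, float k) {
    int px = blockIdx.x * blockDim.x + threadIdx.x;
    int py = blockIdx.y * blockDim.y + threadIdx.y;
    bool inside = (px < width && py < height);      // no early return: every thread
                                                    // must reach __syncthreads()
    __shared__ Point tile[TILE];
    int tid = threadIdx.y * blockDim.x + threadIdx.x;
    float acc = 0.0f;

    for (int base = 0; base < numPts; base += TILE) {
        if (tid < TILE && base + tid < numPts)
            tile[tid] = pts[base + tid];            // stage a tile of object points
        __syncthreads();

        int cnt = min(TILE, numPts - base);
        if (inside)
            for (int j = 0; j < cnt; ++j) {         // accumulate fringe contributions
                float dx = px - tile[j].x, dy = py - tile[j].y;
                float r = sqrtf(dx * dx + dy * dy + tile[j].z * tile[j].z);
                acc += tile[j].amp * cosf(k * r);
            }
        __syncthreads();
    }
    if (inside) fringe[py * width + px] = acc;
}
```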

Enhancement of H.264/AVC Encoding Speed and Reduction of CPU Load through Parallel Programming Based on CUDA (CUDA 기반의 병렬 프로그래밍을 통한 H.264/AVC 부호화 속도 향상 및 CPU 부하 경감)

  • Jang, Eun-Been;Ha, Yun-Su
    • Journal of Advanced Marine Engineering and Technology / v.34 no.6 / pp.858-863 / 2010
  • In order to enhance encoding speed in dynamic image encoding with H.264/AVC, it is very important to reduce the time for motion estimation, which takes a large portion of the total processing time. Using the graphics processing unit (GPU) as a coprocessor that assists the central processing unit (CPU) in computing massive amounts of data is one way to reduce the processing time. In this paper, we present an efficient block-level parallel algorithm for motion estimation (ME) on the Compute Unified Device Architecture (CUDA) platform, which was developed for general-purpose computation on GPUs. Experiments are carried out to verify the effectiveness of the proposed algorithm.
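Block-level parallel motion estimation usually maps candidate motion vectors to threads, each computing a SAD (sum of absolute differences) between the current macroblock and a shifted reference block. The sketch below illustrates that mapping under assumed sizes (16x16 macroblocks, a +/-8 full search, one thread block per macroblock); it is not the paper's algorithm, and the minimum SAD is selected in a separate pass.

```cuda
// One thread block per macroblock, one thread per candidate motion vector.
#include <cuda_runtime.h>

#define MB 16                 // macroblock size
#define RANGE 8               // +/- search range, so (2*RANGE+1)^2 candidates

__global__ void sad_search(const unsigned char *cur, const unsigned char *ref,
                           int width, int height, int *sads) {
    int mbx = blockIdx.x, mby = blockIdx.y;         // macroblock handled by this block
    int dx = (int)threadIdx.x - RANGE;              // candidate motion vector
    int dy = (int)threadIdx.y - RANGE;
    int ox = mbx * MB, oy = mby * MB;

    int sad = 0;
    for (int y = 0; y < MB; ++y)
        for (int x = 0; x < MB; ++x) {
            int rx = ox + x + dx, ry = oy + y + dy;
            rx = max(0, min(width - 1, rx));        // clamp at frame borders
            ry = max(0, min(height - 1, ry));
            sad += abs((int)cur[(oy + y) * width + ox + x]
                     - (int)ref[ry * width + rx]);
        }

    int nCand = (2 * RANGE + 1) * (2 * RANGE + 1);
    int cand  = threadIdx.y * (2 * RANGE + 1) + threadIdx.x;
    int mb    = mby * gridDim.x + mbx;
    sads[mb * nCand + cand] = sad;                  // best MV picked in a later pass
}
// Example launch: sad_search<<<dim3(width/MB, height/MB), dim3(17, 17)>>>(...)
```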

Implementation of Fast Rasterizer processing using GPGPU based on SIMT structure (SIMT 구조 기반 GPGPU를 이용한 고속 Rasterizer 구현)

  • Kim, Chiyong
    • Journal of IKEEE / v.21 no.3 / pp.276-279 / 2017
  • In this paper, a SIMT-structure-based GPGPU (General Purpose Computing on Graphics Processing Units) is used to accelerate the rasterizer, which composes the screen of a display device pixel by pixel. The GPU has a large number of ALUs, and its parallel processing makes the computation very fast. Therefore, we implemented a rasterizer that generates a 3D graphics model on both a CPU, which performs operations sequentially, and a GPU, which performs operations in parallel. We confirmed that the rasterizer proposed in this paper achieves 1.45 times better performance than the rasterizer using an Intel CPU when generating one frame.
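A per-pixel SIMT rasterizer maps one thread to each screen pixel and tests triangle coverage with edge functions. The hedged sketch below shows that mapping for a single flat-shaded triangle; the framebuffer format and the fixed color are illustrative assumptions, not the paper's implementation.

```cuda
// One thread per pixel: evaluate the three edge functions at the pixel center
// and write a flat color if the pixel is covered by the triangle.
#include <cuda_runtime.h>

__device__ float edge(float ax, float ay, float bx, float by, float px, float py) {
    return (px - ax) * (by - ay) - (py - ay) * (bx - ax);   // signed area term
}

__global__ void rasterize(unsigned int *fb, int width, int height,
                          float x0, float y0, float x1, float y1,
                          float x2, float y2, unsigned int color) {
    int px = blockIdx.x * blockDim.x + threadIdx.x;
    int py = blockIdx.y * blockDim.y + threadIdx.y;
    if (px >= width || py >= height) return;

    float cx = px + 0.5f, cy = py + 0.5f;           // sample at the pixel center
    float w0 = edge(x1, y1, x2, y2, cx, cy);
    float w1 = edge(x2, y2, x0, y0, cx, cy);
    float w2 = edge(x0, y0, x1, y1, cx, cy);

    // Inside when all edge functions share a sign (handles either winding).
    bool covered = (w0 >= 0 && w1 >= 0 && w2 >= 0) ||
                   (w0 <= 0 && w1 <= 0 && w2 <= 0);
    if (covered) fb[py * width + px] = color;
}
```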

Implementation of IQ/IDCT in H.264/AVC Decoder Using Mobile Multi-Core GPGPU (모바일 멀티 코어 GP-GPU를 이용한 H.264/AVC 디코더 구현)

  • Kim, Dong-Han;Lee, Kwang-Yeob;Jeong, Jun-Mo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2010.10a / pp.321-324 / 2010
  • There has been a lot of research on multi-core processors, with performance enhancements achieved through parallelization. Multi-core architectures have also emerged in the mobile environment, but there is still a limit to a mobile CPU's performance. GP-GPU (General-Purpose computing on Graphics Processing Units) can improve performance without adding other dedicated hardware. This paper presents the implementation of the Inverse Quantization, Inverse DCT, and Color Space Conversion modules of an H.264/AVC decoder using a multi-core GP-GPU for mobile environments. The proposed architecture improves performance by approximately 50% when all of these modules are used.
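Inverse quantization and the inverse DCT in H.264/AVC operate on independent 4x4 residual blocks, which makes a one-thread-per-block mapping natural on a GP-GPU. The sketch below illustrates that mapping with the standard H.264 4x4 integer inverse transform; the inverse quantization is reduced to a single scale factor here as a simplifying assumption, and this is not the paper's decoder code.

```cuda
// Each thread handles one 4x4 residual block: simplified IQ, then the H.264
// integer inverse transform (columns, rows, then rounding by (x + 32) >> 6).
#include <cuda_runtime.h>

__device__ void idct4x4(int d[16]) {
    for (int i = 0; i < 4; ++i) {                   // column pass
        int a = d[i], b = d[4 + i], c = d[8 + i], e = d[12 + i];
        int e0 = a + c, e1 = a - c, e2 = (b >> 1) - e, e3 = b + (e >> 1);
        d[i] = e0 + e3; d[4 + i] = e1 + e2; d[8 + i] = e1 - e2; d[12 + i] = e0 - e3;
    }
    for (int i = 0; i < 4; ++i) {                   // row pass with final rounding
        int a = d[4 * i], b = d[4 * i + 1], c = d[4 * i + 2], e = d[4 * i + 3];
        int e0 = a + c, e1 = a - c, e2 = (b >> 1) - e, e3 = b + (e >> 1);
        d[4 * i]     = (e0 + e3 + 32) >> 6; d[4 * i + 1] = (e1 + e2 + 32) >> 6;
        d[4 * i + 2] = (e1 - e2 + 32) >> 6; d[4 * i + 3] = (e0 - e3 + 32) >> 6;
    }
}

__global__ void iq_idct(const short *coeffs, int *residual, int numBlocks, int scale) {
    int blk = blockIdx.x * blockDim.x + threadIdx.x;
    if (blk >= numBlocks) return;

    int d[16];
    for (int i = 0; i < 16; ++i)
        d[i] = coeffs[blk * 16 + i] * scale;        // simplified inverse quantization
    idct4x4(d);
    for (int i = 0; i < 16; ++i)
        residual[blk * 16 + i] = d[i];
}
```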

Efficient Thread Allocation Method of Convolutional Neural Network based on GPGPU (GPGPU 기반 Convolutional Neural Network의 효율적인 스레드 할당 기법)

  • Kim, Mincheol;Lee, Kwangyeob
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.10 / pp.935-943 / 2017
  • CNN (Convolutional Neural Network), which is used for image classification and speech recognition among neural networks trained on labeled data, has continuously been developed into high-performance structures. Such networks are difficult to utilize in an embedded system with limited resources. Therefore, we use GPGPU (General-Purpose Computing on Graphics Processing Units), which exploits the GPU for general-purpose computation, together with pre-trained weights, but limitations still remain. Since a CNN performs simple and iterative operations, the computation speed varies greatly depending on how threads are allocated and utilized on a Single Instruction Multiple Thread (SIMT) based GPGPU. To address this, threads that would otherwise be left idle during the convolution and pooling operations are reassigned to the computations for the following feature maps and kernels, which increases the operation speed.
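The baseline thread allocation such a technique refines is the straightforward one-thread-per-output-element mapping for a convolution layer; when the output feature map is small, many threads of a block sit idle, which is the slack the paper reuses. The sketch below shows only that baseline mapping under assumed tensor layouts, not the proposed reallocation itself.

```cuda
// Baseline mapping: one thread per output element, grid z indexes the output
// feature map. Layouts are [C][H][W] for input and [OC][IC][K][K] for weights.
#include <cuda_runtime.h>

__global__ void conv2d(const float *in, const float *w, float *out,
                       int inC, int H, int W, int K, int outC) {
    int ox = blockIdx.x * blockDim.x + threadIdx.x;      // output column
    int oy = blockIdx.y * blockDim.y + threadIdx.y;      // output row
    int oc = blockIdx.z;                                 // output feature map
    int OH = H - K + 1, OW = W - K + 1;
    if (ox >= OW || oy >= OH || oc >= outC) return;

    float acc = 0.0f;
    for (int ic = 0; ic < inC; ++ic)                     // sum over input channels
        for (int ky = 0; ky < K; ++ky)
            for (int kx = 0; kx < K; ++kx)
                acc += in[(ic * H + oy + ky) * W + ox + kx] *
                       w[((oc * inC + ic) * K + ky) * K + kx];
    out[(oc * OH + oy) * OW + ox] = acc;
}
```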

Analysis of the Difference of End-User Computing Success Factors for each Business Area (End-User Computing 성공요인의 업종별 차이 분석)

  • 김성언;신영균
    • The Journal of Information Systems / v.6 no.2 / pp.121-145 / 1997
  • The purpose of this research is to find out how the success factors of End-User Computing (EUC) applicable to our corporations differ across business areas. There have been many studies of EUC success factors; however, because they have concentrated on a limited range of business areas or have been conducted without distinguishing the characteristics of different business areas, they are not sufficient to cover the situation of corporations in general. In this research, the range of companies investigated was widely extended to include electric and electronics, banking, construction, petroleum and chemistry, machinery and metal, and beverage and food companies. The EUC success factors that influence the measures used to evaluate EUC performance were found to differ by business area. The results of this research can be useful for a company seeking the EUC success factors for its own business area.

A PRICING METHOD OF HYBRID DLS WITH GPGPU

  • YOON, YEOCHANG;KIM, YONSIK;BAE, HYEONG-OHK
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.20 no.4 / pp.277-293 / 2016
  • We develop an efficient numerical method for pricing Derivative Linked Securities (DLS). The payoff structure of the hybrid DLS consists of a standard 2-Star step-down type ELS and a range accrual product that depends on the number of days in the coupon period on which the index stays within a pre-determined range. We assume a 2-dimensional Geometric Brownian Motion (GBM) as the model for the two equities and a no-arbitrage interest rate model (the one-factor Hull-White model) for the interest rate. In this study, we employ the Monte Carlo simulation method with Compute Unified Device Architecture (CUDA) parallel computing as the General Purpose computing on Graphics Processing Unit (GPGPU) technology for fast and efficient numerical valuation of the DLS. Compared with a single-CPU computation or an MPI implementation of the Monte Carlo method, the Monte Carlo simulation with CUDA parallel computing achieves higher performance.
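The GPU-friendly core of such a pricer is one-thread-per-path simulation. The hedged sketch below simulates correlated two-asset GBM paths with cuRAND; the Hull-White rate, the step-down/range-accrual payoff logic, and all parameter values are omitted or assumed for illustration, so this is a building block rather than the paper's pricing method.

```cuda
// One thread per Monte Carlo path: two correlated GBM equities, terminal
// prices written out for payoff evaluation on the host.
#include <curand_kernel.h>

__global__ void gbm_paths(float *lastS1, float *lastS2, int nPaths, int nSteps,
                          float s1_0, float s2_0, float r, float sig1, float sig2,
                          float rho, float dt, unsigned long long seed) {
    int p = blockIdx.x * blockDim.x + threadIdx.x;
    if (p >= nPaths) return;

    curandState st;
    curand_init(seed, p, 0, &st);                   // independent stream per path

    float s1 = s1_0, s2 = s2_0;
    for (int t = 0; t < nSteps; ++t) {
        float z1 = curand_normal(&st);
        float z2 = rho * z1 + sqrtf(1.0f - rho * rho) * curand_normal(&st);
        s1 *= expf((r - 0.5f * sig1 * sig1) * dt + sig1 * sqrtf(dt) * z1);
        s2 *= expf((r - 0.5f * sig2 * sig2) * dt + sig2 * sqrtf(dt) * z2);
    }
    lastS1[p] = s1;                                 // terminal prices; the DLS payoff
    lastS2[p] = s2;                                 // would be evaluated from these
}
```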

Analyzing delay of Kernel function owing to GPU memory input from multiple VMs in RPC-based GPU virtualization environments (RPC 기반 GPU 가상화 환경에서 다중 가상머신의 GPU 메모리 입력으로 인한 커널 함수의 지연 문제 분석)

  • Kang, Jihun;Kim, Soo Kyun
    • Proceedings of the Korean Society of Computer Information Conference / 2021.07a / pp.541-542 / 2021
  • In cloud computing environments, users are provided with virtual machines to which a GPU (Graphics Processing Unit) is assigned so that they can run high-performance applications. In an ordinary computing environment a single user uses the GPU exclusively, so problems caused by resource contention are relatively rare; however, in a cloud environment where multiple independent users share computing resources, resource contention causes them to affect one another's performance. In this paper, we analyze the kernel-function execution delay caused by contention on GPU memory input when multiple virtual machines perform GPGPU (General Purpose computing on Graphics Processing Units) tasks in an RPC (Remote Procedure Call) based GPU virtualization environment in which several virtual machines share a single GPU.
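This kind of analysis rests on observing that a kernel's wall-clock latency grows under other tenants' memory-transfer traffic even though its own work is unchanged. A minimal, assumed measurement sketch follows; the kernel and sizes are illustrative, and the RPC transport between VMs and the GPU backend is not shown.

```cuda
// Time a fixed kernel with CUDA events; under contention from other VMs'
// host-to-device traffic this observed latency increases.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void small_kernel(float *d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] += 1.0f;
}

int main() {
    const int n = 1 << 16;
    float *d;
    cudaMalloc(&d, n * sizeof(float));

    cudaEvent_t t0, t1;
    cudaEventCreate(&t0);
    cudaEventCreate(&t1);

    cudaEventRecord(t0);
    small_kernel<<<(n + 255) / 256, 256>>>(d, n);
    cudaEventRecord(t1);
    cudaEventSynchronize(t1);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, t0, t1);
    printf("observed kernel latency: %.3f ms\n", ms);   // grows under contention

    cudaFree(d);
    return 0;
}
```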
