• Title/Summary/Keyword: Graphics Processing Unit : GPU

A study on the Computational Efficiency Improvement for the Conjunction Screening Algorithm (접근물체 선별 알고리즘 계산 효율성 향상 연구)

  • Kim, Hyoung-Jin;Kim, Hae-Dong;Seong, Jae-Dong
    • Journal of the Korean Society for Aeronautical & Space Sciences
    • /
    • v.40 no.9
    • /
    • pp.818-826
    • /
    • 2012
  • In this paper, methods for improving the computational efficiency of the conjunction screening algorithm, which calculates the closest distance between a primary satellite and other space objects, are presented. The first method is to use a GPU (Graphics Processing Unit), which offers high computing power and quickly handles large amounts of data. The second method is to apply an apogee/perigee filter, which excludes non-threatening objects with a low probability of collision, i.e., those whose minimum distance from the primary satellite cannot fall below the screening threshold. The third method combines the first two. As a result, the computational efficiency improved by a factor of 34 with the GPU alone and by a factor of 3 with the filter alone, while combining the two methods improved it by a factor of 163.
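
A minimal Python sketch of an apogee/perigee pre-filter of the kind described above: an object stays on the candidate list only if its radial band [perigee, apogee] can approach the primary's band within a screening threshold. The function name, dictionary fields, and threshold value are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical apogee/perigee pre-filter: keep an object as a conjunction
# candidate only if its radial band [perigee, apogee] can come within a
# screening threshold of the primary's band. Units (km) are illustrative.

def apogee_perigee_filter(primary, candidates, threshold_km=50.0):
    """Return the candidates whose altitude band can overlap the primary's
    band to within threshold_km."""
    survivors = []
    for obj in candidates:
        # The bands cannot approach each other if the gap between them
        # exceeds the threshold.
        gap = max(primary["perigee_km"], obj["perigee_km"]) - \
              min(primary["apogee_km"], obj["apogee_km"])
        if gap <= threshold_km:
            survivors.append(obj)
    return survivors

if __name__ == "__main__":
    primary = {"name": "PRIMARY", "perigee_km": 6990.0, "apogee_km": 7010.0}
    objects = [
        {"name": "DEB-1", "perigee_km": 6980.0, "apogee_km": 7200.0},  # kept
        {"name": "DEB-2", "perigee_km": 7400.0, "apogee_km": 7600.0},  # excluded
    ]
    print([o["name"] for o in apogee_perigee_filter(primary, objects)])
```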

A design of GPU container co-execution framework measuring interference among applications (GPU 컨테이너 동시 실행에 따른 응용의 간섭 측정 프레임워크 설계)

  • Kim, Sejin;Kim, Yoonhee
    • KNOM Review
    • /
    • v.23 no.1
    • /
    • pp.43-50
    • /
    • 2020
  • As the General Purpose Graphics Processing Unit (GPGPU) has recently come to play an essential role in high-performance computing, several cloud service providers offer GPU services. Most container-based cluster orchestration platforms in a cloud environment allocate an integer number of GPUs to a job and do not allow a GPU node to be shared with other jobs. In this case, the resource utilization of a GPU node can be low if a job does not intensively use many GPU cores or a large amount of GPU memory. GPU virtualization brings opportunities to realize kernel concurrency and share resources. However, performance may vary depending on the characteristics of the applications running concurrently and the interference among them caused by resource contention on a node. This paper proposes a GPU container co-execution framework, based on the Kubernetes container orchestration platform, that creates and executes multiple servers in order to measure the interference that may occur when GPU resources are shared. Performance changes under different scheduling policies were investigated by executing several jobs on a GPU. The results show that optimal scheduling is not possible when only GPU memory and compute usage are considered. The interference caused by co-executing applications is measured using the framework.
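
A minimal sketch of the kind of interference measurement the framework above performs: time a GPU benchmark running alone, then again while a second workload shares the same GPU, and report the slowdown. The benchmark commands and script names are placeholders; the authors' framework launches containers through Kubernetes rather than local processes.

```python
# Illustrative interference measurement: time a benchmark alone, then while a
# co-located workload shares the GPU, and report the slowdown ratio.
import subprocess
import time

BENCHMARK = ["python", "bench_a.py"]      # hypothetical GPU job under test
CO_RUNNER = ["python", "bench_b.py"]      # hypothetical co-located GPU job

def timed_run(cmd):
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start

def measure_interference():
    solo = timed_run(BENCHMARK)           # baseline: benchmark alone
    partner = subprocess.Popen(CO_RUNNER) # start the interfering workload
    try:
        shared = timed_run(BENCHMARK)     # benchmark while sharing the GPU
    finally:
        partner.terminate()
        partner.wait()
    return {"solo_s": solo, "shared_s": shared, "slowdown": shared / solo}

if __name__ == "__main__":
    print(measure_interference())
```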

Implementation of Neural Networks using GPU (GPU를 이용한 신경망 구현)

  • Oh Kyoung-su;Jung Keechul
    • The KIPS Transactions:PartB
    • /
    • v.11B no.6
    • /
    • pp.735-742
    • /
    • 2004
  • We present a new use of commodity graphics hardware to run artificial neural networks faster, and we examine how the GPU improves the runtime of a neural-network-based image processing system. When multiple input sets are computed in parallel, the vector-matrix products become matrix-matrix multiplications, so the parallelism of the GPU can be fully exploited. The sigmoid operation and bias addition are also implemented with pixel shaders on the GPU. Our preliminary results show a speedup of about thirty times on an ATI RADEON 9800 XT board.
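
A small NumPy sketch of the batching idea described above: stacking many input vectors into a matrix turns per-sample vector-matrix products into a single matrix-matrix multiplication, which is what maps well onto GPU parallelism (the paper does this with pixel shaders; the layer sizes here are illustrative).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_inputs, n_hidden, batch = 64, 32, 128         # illustrative layer sizes

W = rng.standard_normal((n_inputs, n_hidden))   # weight matrix
b = rng.standard_normal(n_hidden)               # bias term

# One sample: a vector-matrix product.
x = rng.standard_normal(n_inputs)
y_single = sigmoid(x @ W + b)

# A batch of samples stacked into a matrix: one matrix-matrix product
# replaces `batch` separate vector-matrix products.
X = rng.standard_normal((batch, n_inputs))
Y_batch = sigmoid(X @ W + b)

print(y_single.shape, Y_batch.shape)            # (32,) (128, 32)
```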

Implementation of Real-time Interactive Ray Tracing on GPU (GPU 기반의 실시간 인터렉티브 광선추적법 구현)

  • Bae, Sung-Min;Hong, Hyun-Ki
    • Journal of Korea Game Society
    • /
    • v.7 no.3
    • /
    • pp.59-66
    • /
    • 2007
  • Ray tracing is one of the classical global illumination methods for generating photo-realistic images with lighting effects such as reflection and refraction. However, its computational load restricts its use in real-time applications. To overcome this limitation, much GPU (Graphics Processing Unit)-based ray tracing research has been presented. In this paper, we implement the ray tracing algorithm by J. Purcell and combine it with two techniques to improve rendering performance for interactive applications. First, the intersection points of primary rays are determined efficiently using rasterization on the graphics hardware. Second, we construct an acceleration structure over the 3D objects to further improve rendering performance. Few studies have analyzed in detail the performance gained from these considerations in ray tracing. We compare our rendering system with GPU-based environment mapping and implement a wireless remote rendering system, which is useful for interactive applications such as real-time composition, augmented reality, and virtual reality.
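
For reference, a tiny CPU-side sketch of a primary-ray intersection test of the sort the paper accelerates on the GPU, here a plain ray-sphere test in NumPy; the rasterization-based intersection and the acceleration structure described above are not reproduced.

```python
import numpy as np

def ray_sphere_hit(origin, direction, center, radius):
    """Return the distance to the nearest intersection of a ray with a
    sphere, or None if the ray misses. `direction` must be unit length."""
    oc = origin - center
    b = np.dot(oc, direction)
    c = np.dot(oc, oc) - radius * radius
    disc = b * b - c                      # discriminant of the quadratic
    if disc < 0.0:
        return None                       # no real root: the ray misses
    t = -b - np.sqrt(disc)                # nearest root along the ray
    return t if t > 0.0 else None

if __name__ == "__main__":
    origin = np.array([0.0, 0.0, 0.0])
    direction = np.array([0.0, 0.0, 1.0])            # already unit length
    print(ray_sphere_hit(origin, direction,
                         center=np.array([0.0, 0.0, 5.0]), radius=1.0))  # 4.0
```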

Parallel Implementation and Performance Evaluation of the SIFT Algorithm Using a Many-Core Processor (매니코어 프로세서를 이용한 SIFT 알고리즘 병렬구현 및 성능분석)

  • Kim, Jae-Young;Son, Dong-Koo;Kim, Jong-Myon;Jun, Heesung
    • Journal of the Korea Society of Computer and Information
    • /
    • v.18 no.9
    • /
    • pp.1-10
    • /
    • 2013
  • In this paper, we implement the SIFT (Scale-Invariant Feature Transform) algorithm for feature point extraction on a many-core processor and analyze the processor's performance, area efficiency, and system-level area efficiency. In addition, we demonstrate the potential of the many-core processor by comparing its performance with that of a high-performance CPU and a GPU (Graphics Processing Unit). Experimental results indicate that the accuracy of the SIFT algorithm on the many-core processor is the same as that of OpenCV, and that the many-core processor outperforms the CPU and GPU in execution time. Moreover, this paper proposes an optimal model of the SIFT algorithm on the many-core processor by analyzing energy efficiency and area efficiency for different octave sizes.
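
A minimal sketch of the kind of OpenCV reference measurement implied above: extract SIFT keypoints on the CPU and time the extraction. It assumes an OpenCV build that provides cv2.SIFT_create and a local test image, and it is not the authors' many-core implementation.

```python
import time
import cv2

# Load a grayscale test image; the path is a placeholder.
image = cv2.imread("test.png", cv2.IMREAD_GRAYSCALE)
if image is None:
    raise SystemExit("test.png not found")

sift = cv2.SIFT_create()                  # OpenCV reference implementation

start = time.perf_counter()
keypoints, descriptors = sift.detectAndCompute(image, None)
elapsed = time.perf_counter() - start

print(f"{len(keypoints)} keypoints in {elapsed * 1000:.1f} ms")
```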

AN EFFICIENT INCOMPRESSIBLE FREE SURFACE FLOW SIMULATION USING GPU (GPU를 이용한 효율적인 비압축성 자유표면유동 해석)

  • Hong, H.E.;Ahn, H.T.;Myung, H.J.
    • Journal of computational fluids engineering
    • /
    • v.17 no.2
    • /
    • pp.35-41
    • /
    • 2012
  • This paper presents an incompressible Navier-Stokes solution algorithm for 2D free-surface flow problems on a Cartesian mesh, implemented to run on Graphics Processing Units (GPUs). The solver utilizes a Cartesian-mesh variable arrangement and a finite volume discretization together with the Constrained Interpolation Profile-Conservative Semi-Lagrangian (CIP-CSL) scheme. Solving the incompressible Navier-Stokes equations for free-surface flow takes a considerable amount of computation time and memory even on modern multi-core CPU (Central Processing Unit) architectures, while recent developments in computer architecture have pushed GPU scientific computing performance beyond that of CPUs. This paper focuses on utilizing the GPU's high-performance computing capability and presents an efficient solution algorithm for free-surface flow simulation. The performance of the GPU implementation with double-precision accuracy is compared to that of the CPU code on a representative free-surface flow problem, namely the dam-break problem.
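
As an illustration of the semi-Lagrangian ingredient mentioned above, a minimal 1D semi-Lagrangian advection step in NumPy: each grid point is traced back along the velocity and the field is interpolated at the departure point. This is a generic sketch, not the paper's CIP-CSL GPU solver.

```python
import numpy as np

def semi_lagrangian_step(q, u, dt, dx):
    """Advect scalar field q one step on a periodic 1D grid by tracing each
    grid point back along velocity u and linearly interpolating q there."""
    n = q.size
    x = np.arange(n) * dx
    x_dep = (x - u * dt) % (n * dx)      # departure points (periodic domain)
    s = x_dep / dx
    i0 = np.floor(s).astype(int)
    w = s - i0                           # linear interpolation weight
    i0 %= n
    i1 = (i0 + 1) % n
    return (1.0 - w) * q[i0] + w * q[i1]

if __name__ == "__main__":
    n, dx, dt, u = 100, 0.01, 0.004, 1.0                     # illustrative setup
    q = np.exp(-((np.arange(n) * dx - 0.3) ** 2) / 0.002)    # Gaussian pulse
    for _ in range(50):
        q = semi_lagrangian_step(q, u, dt, dx)
    print(f"pulse peak now near x = {np.argmax(q) * dx:.2f}")  # drifted downstream
```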

Implementation and Performance Evaluation of a Video-Equipped Real-Time Fire Detection Method at Different Resolutions using a GPU (GPU를 이용한 다양한 해상도의 비디오기반 실시간 화재감지 방법 구현 및 성능평가)

  • Shon, Dong-Koo;Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.20 no.1
    • /
    • pp.1-10
    • /
    • 2015
  • In this paper, we propose an efficient parallel implementation of a widely used four-stage fire detection algorithm on a graphics processing unit (GPU) to improve its performance, and we analyze the performance of the parallel implementation. We use videos at seven different resolutions (QVGA, VGA, SVGA, XGA, SXGA+, UXGA, QXGA) as inputs to the four-stage algorithm and compare the performance of the GPU-based approach with that of the CPU implementation at each resolution. Experimental results using five fire videos at the seven resolutions indicate that the proposed GPU implementation outperforms the CPU implementation in execution time and takes 25.11 ms per frame for UXGA video, satisfying the real-time processing requirement (30 frames per second) of the fire detection algorithm.
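
A quick check of the real-time claim above: at 30 frames per second each frame must complete within 1/30 s, and the reported 25.11 ms per UXGA frame fits inside that budget.

```python
# Real-time budget check for the UXGA result reported above.
FPS_TARGET = 30.0
frame_budget_ms = 1000.0 / FPS_TARGET      # about 33.33 ms per frame at 30 fps
measured_ms = 25.11                        # reported GPU time per UXGA frame

print(f"budget {frame_budget_ms:.2f} ms, measured {measured_ms:.2f} ms, "
      f"real-time: {measured_ms <= frame_budget_ms}")
```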

A Study of solving the bottleneck between CPU and GPU (CPU와 GPU 간의 병목현상 해결에 관한 연구)

  • Lee, Jin-Ho;Cho, Han-Jin
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2020.07a
    • /
    • pp.3-4
    • /
    • 2020
  • In this paper, we compare and analyze communication methods as a way to mitigate the bottleneck that can occur between the CPU and GPU in a computing system. As a solution to this bottleneck, apart from the performance configuration of the two components themselves, we compare PCIe and NVLink as ways to improve the interconnect and explore how to maximize performance. By comparing performance when the connection is switched to NVLink, we show that the bottleneck can be relieved and performance substantially improved.
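
A minimal sketch of how the CPU-GPU transfer bottleneck discussed above can be observed: time repeated host-to-device copies with CuPy and report the effective bandwidth of whatever interconnect (PCIe or NVLink) the system provides. It assumes a CUDA-capable GPU and the cupy package; the buffer size and repeat count are illustrative.

```python
import time
import numpy as np
import cupy as cp

def host_to_device_bandwidth(size_mb=256, repeats=10):
    """Time host-to-device copies and return effective bandwidth in GB/s."""
    host = np.ones(size_mb * 1024 * 1024, dtype=np.uint8)
    cp.asarray(host)                        # warm-up copy
    cp.cuda.Device().synchronize()
    start = time.perf_counter()
    for _ in range(repeats):
        cp.asarray(host)                    # host -> device transfer
    cp.cuda.Device().synchronize()
    elapsed = time.perf_counter() - start
    return (size_mb / 1024) * repeats / elapsed

if __name__ == "__main__":
    print(f"host-to-device: {host_to_device_bandwidth():.1f} GB/s")
```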

GPU-Based Dynamic Remeshing to Simulate Cloth Tearing (옷감 찢기 시뮬레이션을 표현하는 GPU기반 동적 재메쉬)

  • Seong-Hyeok Moon;Jong-Hyun Kim
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2023.01a
    • /
    • pp.353-356
    • /
    • 2023
  • In this paper, we propose a GPU-based dynamic remeshing technique for tearing cloth. In simulations that fracture or tear a mesh, the dynamic remeshing step is essential for stable dynamics computation and is generally the most computationally expensive part. By proposing a new GPU-friendly dynamic remeshing algorithm, we demonstrate cloth tearing simulation in real time.
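
A small CPU-side sketch of the basic remeshing operation involved in tearing: splitting a chosen edge (for example one that exceeds a stretch threshold) by inserting a midpoint vertex and replacing the adjacent triangle with two. This is a generic illustration, not the GPU-friendly algorithm the paper proposes.

```python
import numpy as np

def split_edge(vertices, triangles, tri_idx, edge):
    """Split edge `edge` (a pair of local indices 0-2) of triangle `tri_idx`
    by inserting a midpoint vertex and replacing the triangle with two."""
    tri = triangles[tri_idx]
    a, b = tri[edge[0]], tri[edge[1]]
    c = [v for v in tri if v not in (a, b)][0]   # the opposite vertex
    mid = (vertices[a] + vertices[b]) / 2.0
    vertices = np.vstack([vertices, mid])
    m = len(vertices) - 1                        # index of the new vertex
    new_tris = triangles[:tri_idx] + triangles[tri_idx + 1:]
    new_tris += [(a, m, c), (m, b, c)]           # two triangles replace one
    return vertices, new_tris

if __name__ == "__main__":
    vertices = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
    triangles = [(0, 1, 2)]
    # Split the edge between vertices 0 and 1 (local indices 0 and 1).
    vertices, triangles = split_edge(vertices, triangles, 0, (0, 1))
    print(vertices)      # includes the new midpoint [0.5, 0.0]
    print(triangles)     # [(0, 3, 2), (3, 1, 2)]
```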

Analysis of Implementing Mobile Heterogeneous Computing for Image Sequence Processing

  • BAEK, Aram;LEE, Kangwoon;KIM, Jae-Gon;CHOI, Haechul
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.11 no.10
    • /
    • pp.4948-4967
    • /
    • 2017
  • On mobile devices, image sequences are widely used for multimedia applications such as computer vision, video enhancement, and augmented reality. However, real-time processing on mobile devices remains a challenge because of hardware constraints and the demand for higher-resolution images. Recently, heterogeneous computing methods that utilize both a central processing unit (CPU) and a graphics processing unit (GPU) have been researched to accelerate image sequence processing. This paper deals with various optimization techniques such as parallel processing on the CPU and GPU, distributed processing on the CPU, frame buffer objects, and double buffering for parallel and/or distributed tasks. Using these optimization techniques individually and in combination, several heterogeneous computing structures were implemented and their effectiveness was analyzed. The experimental results show that heterogeneous computing enables execution up to 3.5 times faster than CPU-only processing.
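
A minimal sketch of the double-buffering idea listed among the techniques above: a producer thread prepares the next frame into one of two queue slots while the consumer processes the other, so the two stages overlap. The stage functions and frame count are illustrative assumptions.

```python
import queue
import threading
import time

N_FRAMES = 8
buffers = queue.Queue(maxsize=2)           # two in-flight slots: double buffering

def produce():
    """Stage 1 (e.g. CPU preprocessing): fill a buffer for each frame."""
    for i in range(N_FRAMES):
        time.sleep(0.01)                   # pretend to prepare frame i
        buffers.put(f"frame-{i}")
    buffers.put(None)                      # sentinel: no more frames

def consume():
    """Stage 2 (e.g. GPU processing): drain buffers as they become ready."""
    while True:
        frame = buffers.get()
        if frame is None:
            break
        time.sleep(0.01)                   # pretend to process the frame
        print("processed", frame)

producer = threading.Thread(target=produce)
producer.start()
consume()                                  # the two stages overlap via the queue
producer.join()
```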