Search | Korea Science

Power Modeling Approach for GPU Source Program

Li, Junke;Guo, Bing;Shen, Yan;Li, Deguang;Huang, Yanhui
- Journal of Electrical Engineering and Technology
- /
- v.13 no.1
- /
- pp.181-191
- /
- 2018
Rapid development of information technology makes our environment become smarter and massive high performance computers are providing powerful computing for that. Graphics Processing Unit (GPU) as a typical high performance component is being widely used for both graphics and general-purpose applications. Although it can greatly improve computing power, it also delivers significant power consumption and need sufficient power supplies. To make high performance computing more sustainable, the important step is to measure it. Current power technologies for GPU have some drawbacks, such as they are not applicable for power estimation at the early stage. In this article, we present a novel power technology to correlate power consumption and the characteristics at the programmer perspective, and then to estimate power consumption of source program without prerunning. We conduct experiments on Nvidia's GT740 platform; the results show that our power model is more accurately than regression model and has an average error of 2.34% and the maximum error of 9.65%.
https://doi.org/10.5370/JEET.2018.13.1.181 인용 PDF KSCI HTML

IPC-based Dynamic SM management on GPGPU for Executing AES Algorithm

Son, Dong Oh;Choi, Hong Jun;Kim, Cheol Hong
- Journal of the Korea Society of Computer and Information
- /
- v.25 no.2
- /
- pp.11-19
- /
- 2020
Modern GPU can execute general purpose computation on the graphic processing unit, and provide high performance by exploiting many core on GPU. To run AES algorithm efficiently, parallel computational resources are required. However, computational resource of CPU architecture are not enough to cryptographic algorithm such as AES whereas GPU architecture has mass parallel computation resources. Therefore, this paper reduce the time to execute AES by employing parallel computational resource on GPGPU. Unfortunately, AES cannot utilize computational resource on GPGPU since it isn't suitable to GPGPU architecture. In this paper, IPC based dynamic SM management technique are proposed to efficiently execute AES on GPGPU. IPC based dynamic SM management can increase and decrease the number of active SMs by using IPC in run-time. According to simulation results, proposed technique improve the performance by increasing resource utilization compared to baseline GPGPU architecture. The results show that AES improve the performance by 41.2% on average.
https://doi.org/10.9708/jksci.2020.25.02.011 인용 PDF KSCI

Real-time Full-view 3D Human Reconstruction using Multiple RGB-D Cameras

Yoon, Bumsik;Choi, Kunwoo;Ra, Moonsu;Kim, Whoi-Yul
- IEIE Transactions on Smart Processing and Computing
- /
- v.4 no.4
- /
- pp.224-230
- /
- 2015
This manuscript presents a real-time solution for 3D human body reconstruction with multiple RGB-D cameras. The proposed system uses four consumer RGB/Depth (RGB-D) cameras, each located at approximately $90^{\circ}$ from the next camera around a freely moving human body. A single mesh is constructed from the captured point clouds by iteratively removing the estimated overlapping regions from the boundary. A cell-based mesh construction algorithm is developed, recovering the 3D shape from various conditions, considering the direction of the camera and the mesh boundary. The proposed algorithm also allows problematic holes and/or occluded regions to be recovered from another view. Finally, calibrated RGB data is merged with the constructed mesh so it can be viewed from an arbitrary direction. The proposed algorithm is implemented with general-purpose computation on graphics processing unit (GPGPU) for real-time processing owing to its suitability for parallel processing.
https://doi.org/10.5573/IEIESPC.2015.4.4.224 인용 PDF KSCI

Design of a SIMT architecture GP-GPU Using Tile based on Graphic Pipeline Structure (타일 기반 그래픽 파이프라인 구조를 사용한 SIMT 구조 GP-GPU 설계)

Kim, Do-Hyun;Kim, Chi-Yong
- Journal of IKEEE
- /
- v.20 no.1
- /
- pp.75-81
- /
- 2016
This paper proposes a design of the tile based on graphic pipeline to improve the graphic application performance in SIMT based GP-GPU. The proposed Tile based on graphics pipeline avoids unnecessary graphic processing operation, and processes the rasterization step in parallel. The massive data processing in parallel through SIMT architecture improve the computational performance, thereby improving the 3D graphic pipeline performance. The more vertex data of 3D model, the higher performance. The proposed structure was confirmed to improve processing performance of up to 3 times from about 1.18 times as compared to 'RAMP' and previous studies.
https://doi.org/10.7471/ikeee.2016.20.1.075 인용 PDF KSCI

Implementation and Performance Evaluation of the Faddev-Leverrier Algorithm using GPGPU (GPGPU를 이용한 파데브-레브리어 알고리즘 구현 및 성능 분석)

Park, Yong-Hun;Kim, Cheol-Hong;Kim, Jong-Myon
- IEMEK Journal of Embedded Systems and Applications
- /
- v.8 no.3
- /
- pp.171-178
- /
- 2013
In this paper, we implement the Faddev-Leverier algorithm using GPGPU (General-Purpose Graphics Processing Unit) to accelerate singular value decomposition. In addition, we compare the performance of the algorithm using CPU and CPU plus GPGPU for eleven ${\times}n$ matrix sizes in order to decompose singular values, where =4, 8, 16, 32, 64, 128, 256, 512, 1,024, 2,048, and 4,096. Experimental results indicate that CPU achieves better performance than CPU plus GPGPU for $n{\leq}64$ because of a large number of read and write operations between CPU and GPGPU. However, CPU plus GPGPU outperforms CPU exponentially in the execution time for $n{\geq}64$.
https://doi.org/10.14372/IEMEK.2013.8.3.171 인용 PDF KSCI

Accelerating the Retinex Algorithm with CUDA

Seo, Hyo-Seok;Kwon, Oh-Young
- Journal of information and communication convergence engineering
- /
- v.8 no.3
- /
- pp.323-327
- /
- 2010
Recently, the television market trend is change to HD television and the need of the study on HD image enhancement is increased rapidly. To enhancement of image quality, the retinex algorithm is commonly used. That's why we studied how to accelerate the retinex algorithm with CUDA on GPGPU (general purpose graphics processing unit). Calculating average part in retinex algorithm is similar to pyramidal calculation. We parallelize this recursive pyramidal average calculating for all layers, map the average data into the 2D plane and reduce the calculating time dramatically. Sequential C code takes 8948ms to get the average values for all layers in $1024{\times}1024$ image, but proposed method takes only only about 0.9ms for the same image. We are going to study about the real-time HD video rendering and image enhancement.
https://doi.org/10.6109/jicce.2010.8.3.323 인용 PDF KSCI

스마트폰에서의 영상처리를 위한 GPU 활용

Park, In-Gyu;Choe, Ho-Yeol
- Information and Communications Magazine
- /
- v.29 no.4
- /
- pp.46-51
- /
- 2012
본 기고에서는 최근 스마트폰에서 요구되는 다양한 멀티미디어 어플리케이션을 embedded GPU(Graphics Processing Unit)를 이용하여 고속 병렬처리하기 위한 GPGPU (General-Purpose Computing on GPU) 기술 및 영상처리 분야의 응용 사례를 소개한다. 일반적인 데스크탑 컴퓨팅 환경과 달리 제약사항이 많은 embedded 환경에서의 GPGPU 응용 기술은 아직 초기단계이다. 그러나 급격히 발전하는 embedded GPU IP와 OpenCL과 같은 API의 등장으로 embedded GPU를 이용한 고속 병렬처리 환경이 수 년 이내에 일반화 될 것이다. 본 기고에서는 그 가능성을 점검하기 위하여 embedded GPU에서의 영상처리를 위한 최신 하드웨어와 소프트웨어 환경의 발전 동향을 소개한다. 더불어 최신 스마트폰에서의 GPGPU기술을 사용한 영상처리 사례와 영상처리 알고리즘의 GPGPU 알고리즘 구현시 고려해야 할 주요 사항을 정리한다.
PDF KSCI

Accelerating Molecular Dynamics Simulation Using Graphics Processing Unit

Myung, Hun-Joo;Sakamaki, Ryuji;Oh, Kwang-Jin;Narumi, Tetsu;Yasuoka, Kenji;Lee, Sik
- Bulletin of the Korean Chemical Society
- /
- v.31 no.12
- /
- pp.3639-3643
- /
- 2010
We have developed CUDA-enabled version of a general purpose molecular dynamics simulation code for GPU. Implementation details including parallelization scheme and performance optimization are described. Here we have focused on the non-bonded force calculation because it is most time consuming part in molecular dynamics simulation. Timing results using CUDA-enabled and CPU versions were obtained and compared for a biomolecular system containing 23558 atoms. CUDA-enabled versions were found to be faster than CPU version. This suggests that GPU could be a useful hardware for molecular dynamics simulation.
https://doi.org/10.5012/bkcs.2010.31.12.3639 인용 PDF KSCI

The Pixel Shading on Multi Core GP-GPU with Dual Phase Architecture (듀얼 페이즈 구조의 멀티 코어 GP-GPU를 이용한 픽셀 셰이딩)

Kim, Jun-Seo;Park, Tae-Ryong;Lee, Kwang-Yeob
- Proceedings of the Korean Institute of Information and Commucation Sciences Conference
- /
- 2010.10a
- /
- pp.339-342
- /
- 2010
최근 프로세서가 클럭 향상의 한계에 부딪힘에 따라, 프로세서의 성능을 향상시키기 위해 멀티 코어 기반의 병렬처리를 이용한 방법들이 제안 되고 있다. 본 논문은 여러개의 연산기를 한 명령어 사이클에 동시에 사용할 수 있는 MIMD(Multiple Instruction, Multiple Data) 구조를 가지며, Scratch Counter를 이용해 멀티 코어와 멀티 스레드의 작업을 할당하는 구조의 GP-GPU(General Purpose - Graphics Processing Unit)를 활용해 멀티 코어, 멀티 스레드 환경에서의 효율적인 픽셀 셰이딩 방법을 설계 하였다. 선형 안개 픽셀 셰이딩의 경우 싱글코어에서 18.3 FPS이며 4개의 멀티코어 GP-GPU에서는 4배가 증가한 73.2 FPS 결과를 얻었다.
PDF

System Implementation for Generating High Quality Digital Holographic Video using Vertical Rig based on Depth+RGB Camera (Depth+RGB 카메라 기반의 수직 리그를 이용한 고화질 디지털 홀로그래픽 비디오 생성 시스템의 구)

Koo, Ja-Myung;Lee, Yoon-Hyuk;Seo, Young-Ho;Kim, Dong-Wook
- Journal of Broadcast Engineering
- /
- v.17 no.6
- /
- pp.964-975
- /
- 2012
Recently the attention on digital hologram that is regarded as to be the final goal of the 3-dimensional video technology has been increased. A digital hologram can be generated with a depth and a RGB image. We proposed a new system to capture RGB and depth images and to convert them to digital holograms. First a new cold mirror was designed and produced. It has the different transmittance ratio against various wave length and can provide the same view and focal point to the cameras. After correcting various distortions with the camera system, the different resolution between depth and RGB images was adjusted. The interested object was extracted by using the depth information. Finally a digital hologram was generated with the computer generated hologram (CGH) algorithm. All algorithms were implemented with C/C++/CUDA and integrated in LabView environment. A hologram was calculated in the general-purpose computing on graphics processing unit (GPGPU) for high-speed operation. We identified that the visual quality of the hologram produced by the proposed system is better than the previous one.
https://doi.org/10.5909/JBE.2012.17.6.964 인용 PDF KSCI

Search Result 48, Processing Time 0.023 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)