• Title/Summary/Keyword: GPU(Graphic Process Unit)

The Need of Cache Partitioning on Shared Cache of Integrated Graphics Processor between CPU and GPU (내장형 GPU 환경에서 CPU-GPU 간의 공유 캐시에서의 캐시 분할 방식의 필요성)

  • Sung, Hanul; Eom, Hyeonsang; Yeom, HeonYoung
    • KIISE Transactions on Computing Practices / v.20 no.9 / pp.507-512 / 2014
  • Recently, distributed computing has begun to use both the CPU (Central Processing Unit) and the GPU (Graphics Processing Unit) to improve performance and to cope with the dark-silicon problem, in which power limits prevent all of the transistors from being used at once. In an integrated graphics processor, the CPU and GPU share main memory and the last-level cache (LLC). However, there are no LLC access rules between the CPU and GPU, so when CPU and GPU processes run at the same time, the performance of both degrades because of contention on the LLC. This paper gives evidence for the need for cache partitioning and describes a cache partitioning design that uses page coloring to allocate part of the L3 cache exclusively to the GPU process, guaranteeing its performance.
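
A rough host-side sketch of the page-coloring idea the abstract refers to is given below: it computes which LLC "color" a physical page maps to, so that an allocator handing the GPU process only pages from a reserved color range would give it a private slice of the shared cache. The cache geometry, page size, and all names are assumptions for illustration, not the configuration studied in the paper.

```cuda
// Minimal host-side sketch of page coloring for LLC partitioning.
// The cache geometry (8 MiB, 16-way, 64 B lines) and helper names are
// illustrative assumptions, not the paper's configuration.
#include <cstdint>
#include <cstdio>

constexpr uint64_t kLineSize = 64;                              // bytes per cache line
constexpr uint64_t kNumSets  = (8ull << 20) / (kLineSize * 16); // 8 MiB, 16-way LLC
constexpr uint64_t kPageSize = 4096;                            // 4 KiB pages

// A page's "color" is the part of the LLC set index above the page-offset
// bits; pages of the same color compete for the same LLC sets.
uint64_t PageColor(uint64_t physAddr) {
    uint64_t setIndex    = (physAddr / kLineSize) % kNumSets;
    uint64_t setsPerPage = kPageSize / kLineSize;
    return setIndex / setsPerPage;
}

int main() {
    // An allocator that hands the GPU process only pages of a reserved color
    // range would carve out a private slice of the shared LLC for it.
    for (uint64_t pfn = 0; pfn < 4; ++pfn)
        printf("page %llu -> color %llu\n",
               (unsigned long long)pfn,
               (unsigned long long)PageColor(pfn * kPageSize));
    return 0;
}
```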

A Study On The Virtual Space Simulation Expression Using Graphic Process Unit (GPU를 활용한 공간 가상 시뮬레이션 표현에 관한 연구)

  • Kim, Jong-Hyun; Kim, Suk-Tae
    • Proceedings of the Korean Institute of Interior Design Conference / 2004.11a / pp.80-83 / 2004
  • It is impossible to verify a designed space in reality before it is completed, owing to the characteristics of building and interior space design. In designing spaces, therefore, designers must draw on their real-life experience. 3D games, to which the GPU and other advanced technologies are applied first, lead VRML in technology by about five years. Because such games are produced to reflect the real environment as it is, they can be regarded as the most complete tools for representing the physical environment. This means that, used effectively, 3D game engines employing the GPU can serve as a presentation tool for virtual spaces. This study examines the expression of virtual buildings through 3D game engines employing the GPU, not in VRML-based virtual spaces on the Web but in immersive virtual spaces.

Accelerating the Sweep3D for a Graphic Processor Unit

  • Gong, Chunye; Liu, Jie; Chen, Haitao; Xie, Jing; Gong, Zhenghu
    • Journal of Information Processing Systems / v.7 no.1 / pp.63-74 / 2011
  • As a powerful and flexible processor, the Graphics Processing Unit (GPU) offers great capability for solving many high-performance computing applications. Sweep3D, which deterministically simulates single-group, time-independent discrete-ordinates (Sn) neutron transport on a 3D Cartesian geometry, represents the key part of a real ASCI application. The wavefront process used for parallel computation in Sweep3D limits the number of concurrent threads on the GPU. In this paper, we present multi-dimensional optimization methods for Sweep3D that can be efficiently implemented on the fine-grained parallel architecture of the GPU. Our results show that the overall performance of Sweep3D on the CPU-GPU hybrid platform can be improved by up to 4.38 times compared to the CPU-based implementation.
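
The sketch below illustrates the wavefront (hyperplane) pattern that limits concurrency in Sweep3D: cells on the plane i + j + k = d depend only on earlier planes, so each plane is processed by one kernel launch while cells within a plane run in parallel. The grid size, the placeholder flux update, and the function names are assumptions for illustration, not the paper's optimized kernels.

```cuda
// Illustrative sketch of wavefront (hyperplane) parallelism: cells on the
// plane i + j + k = d depend only on earlier planes. Names and the flux
// update are assumptions, not Sweep3D's actual transport kernel.
#include <cuda_runtime.h>

__device__ void updateCell(float* flux, int i, int j, int k, int n) {
    // Placeholder update combining upwind neighbours (real Sweep3D solves
    // the discrete-ordinates balance equation here).
    float up = 0.0f;
    if (i > 0) up += flux[((i - 1) * n + j) * n + k];
    if (j > 0) up += flux[(i * n + (j - 1)) * n + k];
    if (k > 0) up += flux[(i * n + j) * n + (k - 1)];
    flux[(i * n + j) * n + k] = 0.25f * up + 1.0f;
}

__global__ void sweepPlane(float* flux, int n, int d) {
    // One thread per (i, j) pair on the current hyperplane; k is implied.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    int k = d - i - j;
    if (i < n && j < n && k >= 0 && k < n)
        updateCell(flux, i, j, k, n);
}

void sweep(float* d_flux, int n) {
    dim3 block(16, 16);
    dim3 grid((n + 15) / 16, (n + 15) / 16);
    // Planes must be launched in order; only cells within a plane run in
    // parallel, which is the concurrency limit the abstract mentions.
    for (int d = 0; d <= 3 * (n - 1); ++d)
        sweepPlane<<<grid, block>>>(d_flux, n, d);
    cudaDeviceSynchronize();
}
```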

The Design of Parallel Processing S/W Using CUDA for Realtime 3D Laser Ladar Imaging System (실시간 3차원 레이저 레이더 영상 생성을 위한 CUDA 기반 병렬처리 소프트웨어 설계)

  • Cho, Yong Il; Ha, Choong Lim; Yang, Ji Hyeon; Kim, Jae Hyup
    • Journal of the Korea Society of Computer and Information / v.18 no.1 / pp.1-10 / 2013
  • In this paper, we propose a CUDA (Compute Unified Device Architecture) based software design method with a CPU (Central Processing Unit) and GPU (Graphics Processing Unit) parallel structure to achieve real-time processing in a 3D laser radar (LADAR) imaging system. LADAR is a complex system that generates 3D images from laser ranging information and requires massive processing resources in each phase. Designing and implementing a parallel structure is therefore crucial to realizing real-time processing within limited system resources. By analyzing the processing algorithm of each phase and allocating the separable workload to the CUDA GPU, we meet the required real-time processing speed and confirm a 46% increase in processing speed.
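
As an illustration of the kind of separable, per-pixel work such a pipeline can hand to the GPU, the sketch below converts a 2D array of range samples into a 3D point cloud with one CUDA thread per detector pixel. The projection model, parameters, and names are assumptions, not the paper's actual processing phases.

```cuda
// Sketch of a per-pixel LADAR stage: range samples to 3D points, one thread
// per detector pixel. The angular model and names are illustrative assumptions.
#include <cuda_runtime.h>

struct Point3 { float x, y, z; };

__global__ void rangeToPointCloud(const float* range, Point3* cloud,
                                  int width, int height,
                                  float azStep, float elStep) {
    int u = blockIdx.x * blockDim.x + threadIdx.x;   // column (azimuth index)
    int v = blockIdx.y * blockDim.y + threadIdx.y;   // row (elevation index)
    if (u >= width || v >= height) return;

    int idx  = v * width + u;
    float r  = range[idx];
    float az = (u - width  * 0.5f) * azStep;         // per-pixel azimuth angle
    float el = (v - height * 0.5f) * elStep;         // per-pixel elevation angle

    // Spherical-to-Cartesian conversion; every pixel is independent, which is
    // what makes this stage "separable" work for the GPU.
    cloud[idx].x = r * cosf(el) * sinf(az);
    cloud[idx].y = r * sinf(el);
    cloud[idx].z = r * cosf(el) * cosf(az);
}
```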

An Analytical Model for Performance Prediction of AES on GPU Architecture (GPU 아키텍처의 AES 암호화 성능 예측 분석 모델)

  • Kim, Kyuwoon; Kim, Hyunwoo; Kim, Huijeong; Huh, Taeyoung; Jung, Sanghyuk; Song, Yong Ho
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.4 / pp.89-96 / 2013
  • The graphics processing unit (GPU) has been developed to process not only graphics data but also general system data, and it outperforms the CPU on 3D graphics algorithms and parallel programs. To execute an algorithm written for the CPU on the GPU, we have to understand GPU architectures and rewrite the program to account for the GPU's parallel processing capability and its different memory model. For these reasons, a model that predicts the performance of an algorithm on a GPU system is required: it can expose problems early in GPU application development and provide a performance evaluation standard for GPUs. In this paper, we applied the AES encryption algorithm to our performance model and achieved performance prediction with high accuracy under a heavy workload.
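
For illustration only, the toy sketch below estimates a kernel's execution time from a compute term and a memory term and takes the larger of the two, which is the general flavor of an analytical GPU performance model; the formula, parameter values, and names are assumptions and do not reproduce the model proposed in the paper.

```cuda
// Toy host-side analytical estimate of kernel time: compute work and memory
// work spread over the device, bounded by the slower resource. All numbers
// and names are assumptions for illustration, not the paper's model.
#include <cstdio>

struct GpuSpec {
    int    numSMs;            // streaming multiprocessors
    double clockGHz;          // core clock
    double memBandwidthGBs;   // global memory bandwidth
};

struct KernelSpec {
    long long threads;        // total threads (e.g., one per AES block)
    double cyclesPerThread;   // arithmetic cycles per thread (guessed)
    double bytesPerThread;    // global memory traffic per thread (guessed)
};

double predictSeconds(const GpuSpec& g, const KernelSpec& k) {
    double computeSec = k.threads * k.cyclesPerThread /
                        (g.numSMs * g.clockGHz * 1e9);
    double memorySec  = k.threads * k.bytesPerThread /
                        (g.memBandwidthGBs * 1e9);
    // Assume the slower of the two resources bounds the kernel.
    return computeSec > memorySec ? computeSec : memorySec;
}

int main() {
    GpuSpec gpu    = {14, 1.15, 144.0};        // illustrative Fermi-class figures
    KernelSpec aes = {1 << 20, 400.0, 32.0};   // 1M AES blocks, guessed costs
    printf("predicted kernel time: %.3f ms\n", predictSeconds(gpu, aes) * 1e3);
    return 0;
}
```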

Fast Double Random Phase Encoding by Using Graphics Processing Unit (GPU 컴퓨팅에 의한 고속 Double Random Phase Encoding)

  • Saifullah, Saifullah; Moon, In-Kyu
    • Proceedings of the Korea Multimedia Society Conference / 2012.05a / pp.343-344 / 2012
  • With the increase in sensitive data and the need for its secure transmission and storage, the use of encryption techniques has become widespread. Encoding performance depends largely on computation time, so a system with lower computation time is preferable. Double Random Phase Encoding (DRPE) is an algorithm with many sub-functions that consumes considerable time when executed serially; the computation time can be significantly reduced by implementing the important functions in parallel on a Graphics Processing Unit (GPU). Computing the convolution using the Fast Fourier Transform is the most important part of DRPE, and the paper shows that performing this portion on the GPU reduces the execution time of the process by a substantial amount, with MATLAB used as the reference for performance analysis. An NVIDIA GeForce 310 graphics card is used with CUDA C as the programming language.
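
A minimal sketch of the Fourier-plane stage of DRPE on the GPU is shown below: a forward 2D FFT with cuFFT, a pointwise multiplication by the random phase mask, and an inverse FFT. Mask generation, the first (spatial-domain) phase mask, and normalization are omitted, and the array names are assumptions for illustration.

```cuda
// Sketch of the FFT-based core of DRPE on the GPU using cuFFT: forward FFT,
// pointwise multiply by the Fourier-plane phase mask, inverse FFT.
#include <cufft.h>
#include <cuda_runtime.h>

__global__ void applyPhaseMask(cufftComplex* data, const cufftComplex* mask,
                               int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    // Complex multiply: data[i] *= mask[i] (mask entries are unit-magnitude
    // random phases).
    cufftComplex a = data[i], b = mask[i];
    data[i].x = a.x * b.x - a.y * b.y;
    data[i].y = a.x * b.y + a.y * b.x;
}

void drpeFourierStage(cufftComplex* d_image, const cufftComplex* d_mask,
                      int nx, int ny) {
    cufftHandle plan;
    cufftPlan2d(&plan, nx, ny, CUFFT_C2C);

    cufftExecC2C(plan, d_image, d_image, CUFFT_FORWARD);   // to Fourier plane

    int n = nx * ny;
    applyPhaseMask<<<(n + 255) / 256, 256>>>(d_image, d_mask, n);

    cufftExecC2C(plan, d_image, d_image, CUFFT_INVERSE);   // back to output plane
    cufftDestroy(plan);
    cudaDeviceSynchronize();
}
```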

Morphology Operations on CUDA To Remove Skull on MRI Images

  • Izmantoko, Yonny S.; Choi, Heung-Kook
    • Proceedings of the Korea Multimedia Society Conference / 2012.05a / pp.205-208 / 2012
  • Nowadays the GPU (Graphics Processing Unit) is used not only to display and render images but also for other kinds of computation. In this paper, we use the GPU to perform morphology operations that remove the skull from axial MRI images. Removing the skull is an important step in brain segmentation, because we want to work with the brain alone, without any skull around it. The results show that simple morphology operations for removing the skull were successfully applied to the MRI images, although many parts can still be improved to obtain better images.
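
As an example of the building blocks such a pipeline uses, the sketch below implements binary erosion with a 3x3 structuring element, one CUDA thread per pixel; combined with dilation and masking it forms the kind of morphology operations the abstract mentions. The image layout and names are assumptions for illustration.

```cuda
// Sketch of binary erosion (3x3 structuring element) on the GPU, one thread
// per pixel. Image layout and names are illustrative assumptions.
#include <cuda_runtime.h>

__global__ void erode3x3(const unsigned char* in, unsigned char* out,
                         int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    // A pixel survives erosion only if every neighbour under the 3x3
    // structuring element is foreground (non-zero).
    unsigned char keep = 1;
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx) {
            int nx = min(max(x + dx, 0), width - 1);
            int ny = min(max(y + dy, 0), height - 1);
            if (in[ny * width + nx] == 0) keep = 0;
        }
    out[y * width + x] = keep ? 255 : 0;
}
```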

GPU-Accelerated Password Cracking of PDF Files

  • Kim, Keon-Woo; Lee, Sang-Su; Hong, Do-Won; Ryou, Jae-Cheol
    • KSII Transactions on Internet and Information Systems (TIIS) / v.5 no.11 / pp.2235-2253 / 2011
  • Digital document files such as Adobe Acrobat or MS-Office files are encrypted by their own cipher algorithms with a user password. When this password is not known to a user or a forensic inspector, it must be recovered to open the encrypted file. Password cracking by brute-force search is a guaranteed but time-consuming way to discover the password. This paper presents a new method of speeding up password recovery on the Graphics Processing Unit (GPU) using the Compute Unified Device Architecture (CUDA). PDF files are chosen as the password cracking target, and the Adobe Acrobat password recovery algorithm is examined. Experimental results show that the proposed method gives high performance at low cost, with a cluster of GPU nodes significantly speeding up password recovery by exploiting many computing nodes. Password cracking performance increases linearly with the number of computing nodes and GPUs.
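
The sketch below shows the general shape of GPU brute-force password search: each thread derives one candidate from its global index and tests it. The check here is a placeholder comparison, not the Adobe Acrobat key-derivation examined in the paper, and the character set, password length, and names are assumptions for illustration.

```cuda
// Sketch of GPU brute-force search: one candidate password per thread.
// checkPassword() is a placeholder for the real key-derivation test.
#include <cuda_runtime.h>

__constant__ char kCharset[27] = "abcdefghijklmnopqrstuvwxyz";
constexpr int kCharsetSize = 26;
constexpr int kPassLen = 5;

__device__ bool checkPassword(const char* candidate, int len) {
    // Placeholder: compare against a hard-coded string instead of running
    // the PDF key derivation and decryption check.
    const char target[kPassLen] = {'c', 'r', 'a', 'c', 'k'};
    for (int i = 0; i < len; ++i)
        if (candidate[i] != target[i]) return false;
    return true;
}

__global__ void bruteForce(unsigned long long offset, int* found,
                           unsigned long long* foundIndex) {
    unsigned long long idx =
        offset + blockIdx.x * (unsigned long long)blockDim.x + threadIdx.x;

    // Decode idx as a base-26 number to get this thread's candidate password.
    char candidate[kPassLen];
    unsigned long long n = idx;
    for (int i = kPassLen - 1; i >= 0; --i) {
        candidate[i] = kCharset[n % kCharsetSize];
        n /= kCharsetSize;
    }

    if (checkPassword(candidate, kPassLen)) {
        *found = 1;
        *foundIndex = idx;   // host converts the index back to the password
    }
}
```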

Fast and Efficient Implementation of Neural Networks using CUDA and OpenMP (CUDA와 OpenMP를 이용한 빠르고 효율적인 신경망 구현)

  • Park, An-Jin; Jang, Hong-Hoon; Jung, Kee-Chul
    • Journal of KIISE: Software and Applications / v.36 no.4 / pp.253-260 / 2009
  • Many algorithms for computer vision and pattern recognition have recently been implemented on the GPU (Graphics Processing Unit) for faster computation. However, such implementations have two problems. First, the programmer has to master graphics shading languages, which require prior knowledge of computer graphics. Second, in jobs that need close cooperation between the CPU and GPU, which is common in image processing and pattern recognition unlike in graphics, the CPU should generate as much raw feature data for GPU processing as possible in order to utilize the GPU effectively. This paper proposes a faster and more efficient implementation of neural networks on both the GPU and a multi-core CPU. To solve the first problem, we use CUDA (Compute Unified Device Architecture), which can be programmed easily thanks to its simple C-like style, instead of a shading language. Moreover, OpenMP (Open Multi-Processing) is used to process multiple data concurrently on the multi-core CPU, which keeps the GPU memory effectively utilized. In the experiments, we implemented a neural-network-based text extraction system using the proposed architecture, and the computation was about 15 times faster than an implementation on the GPU alone without OpenMP.
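
A minimal sketch of the CPU/GPU division of labor described above is given below, assuming illustrative names and layer sizes: OpenMP threads on the multi-core CPU prepare feature vectors in parallel while a CUDA kernel evaluates one neural-network layer with one thread per output neuron.

```cuda
// Sketch of CUDA + OpenMP cooperation: OpenMP prepares features on the CPU,
// a CUDA kernel evaluates a layer on the GPU. All names and sizes are
// illustrative assumptions. Build sketch: nvcc -Xcompiler -fopenmp nn.cu
#include <cuda_runtime.h>
#include <omp.h>
#include <vector>

__global__ void forwardLayer(const float* weights, const float* input,
                             float* output, int inDim, int outDim) {
    int o = blockIdx.x * blockDim.x + threadIdx.x;   // one thread per output neuron
    if (o >= outDim) return;
    float sum = 0.0f;
    for (int i = 0; i < inDim; ++i)
        sum += weights[o * inDim + i] * input[i];
    output[o] = 1.0f / (1.0f + expf(-sum));          // sigmoid activation
}

void processBatch(const float* d_weights, int inDim, int outDim,
                  const std::vector<std::vector<float>>& rawWindows,
                  float* d_input, float* d_output) {
    // CPU side: OpenMP threads normalize feature vectors in parallel
    // (toy normalization; assumes each window holds inDim samples).
    std::vector<std::vector<float>> features(rawWindows.size());
    #pragma omp parallel for
    for (int i = 0; i < (int)rawWindows.size(); ++i) {
        features[i].resize(inDim);
        for (int j = 0; j < inDim; ++j)
            features[i][j] = rawWindows[i][j] / 255.0f;
    }

    // GPU side: evaluate the layer for each prepared feature vector.
    for (const auto& f : features) {
        cudaMemcpy(d_input, f.data(), inDim * sizeof(float),
                   cudaMemcpyHostToDevice);
        forwardLayer<<<(outDim + 255) / 256, 256>>>(d_weights, d_input,
                                                    d_output, inDim, outDim);
    }
    cudaDeviceSynchronize();
}
```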

Real-time Stereo Video Generation using Graphics Processing Unit (GPU를 이용한 실시간 양안식 영상 생성 방법)

  • Shin, In-Yong; Ho, Yo-Sung
    • Journal of Broadcast Engineering / v.16 no.4 / pp.596-601 / 2011
  • In this paper, we propose a fast depth-image-based rendering method that generates virtual view images in real time using a graphics processing unit (GPU) for a 3D broadcasting system. Before transmission, the input 2D-plus-depth video is encoded with the H.264 coding standard. At the receiver, we decode the received bitstream and generate a stereo video using the GPU, which can compute in parallel. We apply a simple and efficient hole-filling method to reduce decoder complexity and hole-filling errors. In addition, we design a vertically parallel structure for the forward mapping process to take advantage of the single-instruction multiple-thread structure of the GPU, and we utilize high-speed GPU memories to boost the computation speed. As a result, we can generate virtual view images 15 times faster than CPU-based processing.
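
The sketch below illustrates the forward mapping (3D warping) step of depth-image-based rendering with a trivial hole fill: each reference pixel is shifted horizontally by a disparity derived from its depth. One thread per row is used here for simplicity; the paper's vertically parallel structure and its hole-filling method are not reproduced, and all names are assumptions.

```cuda
// Sketch of DIBR forward mapping plus a trivial hole fill, one thread per
// image row. The disparity model and names are illustrative assumptions.
#include <cuda_runtime.h>

__global__ void forwardMapRows(const unsigned char* refColor,
                               const unsigned char* refDepth,
                               unsigned char* virtColor,
                               int width, int height, float maxDisparity) {
    int y = blockIdx.x * blockDim.x + threadIdx.x;
    if (y >= height) return;

    // Mark the whole output row as holes (0), then warp each reference pixel
    // into the virtual view.
    for (int x = 0; x < width; ++x)
        virtColor[y * width + x] = 0;

    for (int x = 0; x < width; ++x) {
        float d = refDepth[y * width + x] / 255.0f;   // normalized depth
        int disparity = (int)(d * maxDisparity);      // nearer => larger shift
        int xv = x - disparity;                       // position in virtual view
        if (xv >= 0 && xv < width)
            virtColor[y * width + xv] = refColor[y * width + x];
    }

    // Trivial hole filling: propagate the last valid pixel along the row
    // (a stand-in for the paper's more careful method).
    unsigned char last = 0;
    for (int x = 0; x < width; ++x) {
        if (virtColor[y * width + x] != 0) last = virtColor[y * width + x];
        else virtColor[y * width + x] = last;
    }
}
```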