• Title/Summary/Keyword: GPU model

CUDA-based Object Oriented Programming Techniques for Efficient Parallel Visualization of 3D Content (3차원 콘텐츠의 효율적인 병렬 시각화를 위한 CUDA 환경 기반 객체 지향 프로그래밍 기법)

  • Park, Tae-Jung
    • Journal of Digital Contents Society
    • /
    • v.13 no.2
    • /
    • pp.169-176
    • /
    • 2012
  • This paper presents a parallel object-oriented programming (OOP) platform for efficient visualization of three-dimensional content in CUDA environments. To this end, it discusses the features and limitations of implementing C++ object-oriented code with CUDA and proposes solutions. It also shows how to implement a 3D parallel visualization platform based on the MVC (Model/View/Controller) design pattern, and provides sample implementations for integral MLS (iMLS) and signed distance fields (SDFs) based on Marching Cubes and ray tracing. The proposed approach enables GPU parallel processing simply by implementing a few interfaces. Developers can therefore expect the usual benefits of OOP techniques, including abstraction and inheritance. Though I implemented only two specific samples in this paper, I expect my approach can be widely applied to general computer graphics problems.

Suitability of Counter-current Model for Biogas Separation Processes using Cellulose Acetate Hollow Fiber Membrane (셀룰로오스 아세테이트 중공사 분리막을 이용한 바이오가스 분리에 대한 향류 흐름 모델의 적용성)

  • Jung, Sang-Chul;Kwon, Ki-Wook;Jeon, Mi-Jin;Jeon, Yong-Woo
    • Journal of the Korea Organic Resources Recycling Association
    • /
    • v.28 no.4
    • /
    • pp.43-52
    • /
    • 2020
  • As membrane gas separation technology has grown, numerous researchers have developed models to describe the separation process. In this work, the counter-current model was compared thoroughly with experimental data. Experimentally, a cellulose acetate (CA) hollow fiber membrane module was prepared for the separation of biogas. The pure gas permeation properties of the membrane module for methane, nitrogen, oxygen, and carbon dioxide were measured. The permeances of CO2 and CH4 were 25.82 GPU and 0.65 GPU (gas permeation units), respectively, giving a high CO2/CH4 selectivity of 39.7. After the pure gas test, separation tests were carried out for three different simulated mixed gases, and the gas concentration of the permeate from the CA membrane module was measured at various stage cuts. A mathematical model was implemented in this study for the separation of biogas using a membrane module, and the finite difference method (FDM) was applied to calculate the membrane biogas separation behavior. The results showed that the experimental data agreed with the numerical simulation. Furthermore, the counter-current model can be considered a convenient model for the biogas separation process.
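
The selectivity quoted in the abstract can be checked from the two pure-gas permeances. The following is a minimal sketch, assuming the ideal selectivity is simply the ratio of the CO2 permeance to the CH4 permeance; the variable names are illustrative and do not come from the paper.

```python
# Pure-gas permeances reported in the abstract, in gas permeation units (GPU).
permeance_co2 = 25.82
permeance_ch4 = 0.65

# Assumed definition: ideal selectivity = fast-gas permeance / slow-gas permeance.
selectivity = permeance_co2 / permeance_ch4

print(f"CO2/CH4 ideal selectivity: {selectivity:.1f}")
```

Under that assumption the ratio rounds to the 39.7 reported above.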

Optimizing Skyline Query Processing Algorithms on CUDA Framework (CUDA 프레임워크 상에서 스카이라인 질의처리 알고리즘 최적화)

  • Min, Jun;Han, Hwan-Soo;Lee, Sang-Won
    • Journal of KIISE:Databases
    • /
    • v.37 no.5
    • /
    • pp.275-284
    • /
    • 2010
  • GPUs are multi-core stream processors that can process large volumes of data at high speed with a large memory bandwidth. Furthermore, GPUs are less expensive than multi-core CPUs. Recently, the use of GPUs in general-purpose computing has become widespread. The CUDA architecture from Nvidia is one of the efforts to help developers use GPUs in their application domains. In this paper, we propose techniques to parallelize a skyline algorithm that uses a simple nested loop structure. To employ the CUDA programming model, we apply our optimization techniques so that the skyline algorithm fits within the performance restrictions of the CUDA architecture. According to our experimental results, our optimization techniques improve the performance of the original skyline algorithm by 80%.
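
For readers unfamiliar with skyline queries, the nested loop structure mentioned in the abstract can be sketched on the CPU as follows. This is an illustrative sketch, not the authors' CUDA implementation: it assumes smaller values are better in every dimension, and the function names and sample data are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every dimension and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline_nested_loop(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates (O(n^2) comparisons)."""
    result = []
    for i, candidate in enumerate(points):
        # Inner loop: compare the candidate against every other point.
        dominated = any(
            dominates(other, candidate)
            for j, other in enumerate(points)
            if j != i
        )
        if not dominated:
            result.append(candidate)
    return result

# Hypothetical data: (price, distance to the beach) for five hotels.
hotels = [(50, 8), (60, 2), (70, 1), (80, 5), (55, 9)]
print(skyline_nested_loop(hotels))  # [(50, 8), (60, 2), (70, 1)]
```

Each outer-loop iteration is independent of the others, which is what makes the nested loop form a natural candidate for assigning one candidate point per GPU thread; whether the paper uses exactly this mapping is not stated in the abstract.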

GPU-accelerated Global Illumination for Point Set Rendering (GPU 가속을 이용한 점집합 렌더링을 위한 전역 조명기법)

  • Min, Heajung;Kim, Young J.
    • Journal of the Korea Computer Graphics Society
    • /
    • v.26 no.1
    • /
    • pp.7-15
    • /
    • 2020
  • When visualizing a point set that represents a smooth manifold surface, global illumination techniques can be used to render a realistic scene with various lighting effects. Thanks to the continuous demand for ray tracing and the development of graphics hardware, dedicated GPUs and programmable pipelines for ray tracing have been introduced in recent years. In this paper, real-time global illumination rendering is studied for a point-set model using ray-tracing GPUs. We apply the moving least-squares (MLS) method to approximate the point set by a smooth implicit surface, and render it with global illumination by performing massive ray-intersection tests against the surface and generating shading effects at the intersection points. As a result, a complicated point-set scene consisting of more than 0.5M points can be rendered in real time.

Co-simulation of MultiBody Dynamics and Plenteous Sphere of Contacted Particles Using NVIDIA GPGPU (NVIDIA 의 GPGPU 를 이용한 수 많은 구형 접촉 입자가 포함된 다물체 동역학 해석)

  • Park, Ji-Soo;Yoon, Joon-Shik;Choi, Jin-Hwan;Rhim, Sung-Soo
    • Transactions of the Korean Society of Mechanical Engineers A
    • /
    • v.36 no.4
    • /
    • pp.465-474
    • /
    • 2012
  • In this study, a dynamic simulation model that considers many spherical particles together with multibody dynamics (MBD) entities is developed. The spherical particles are solved using the Discrete Element Method (DEM) and simulated on a GPU board in a PC. A fast algorithm is used to calculate the Hertzian contact forces between the particles, and NVIDIA CUDA is used to increase the calculation speed. An explicit integration method is applied to solve the particle system. MBD entities are simulated by a recursive formulation, which reduces the constraints, and the implicit generalized alpha method is applied to solve the dynamic model. A new algorithm is developed to simulate the DEM and MBD models simultaneously. As numerical examples, a truck model and a gear model are developed. The results show that the proposed algorithm using a general-purpose GPU in a PC has many advantages.
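
As background for the Hertzian contact forces mentioned in the abstract, the sketch below evaluates the classical Hertz normal force between two elastic spheres, F = (4/3) E* sqrt(R*) δ^(3/2). It is a textbook formula rather than the paper's fast algorithm, it omits damping and friction, and the function name, parameter names, and material values are assumptions made for illustration.

```python
import math

def hertz_normal_force(r1: float, r2: float, distance: float,
                       young1: float, young2: float,
                       nu1: float, nu2: float) -> float:
    """Classical Hertz normal contact force between two elastic spheres (SI units)."""
    overlap = r1 + r2 - distance          # penetration depth (delta)
    if overlap <= 0.0:
        return 0.0                        # spheres are not in contact
    r_eff = (r1 * r2) / (r1 + r2)         # effective radius R*
    # Effective modulus E*: 1/E* = (1 - nu1^2)/E1 + (1 - nu2^2)/E2
    e_eff = 1.0 / ((1.0 - nu1 ** 2) / young1 + (1.0 - nu2 ** 2) / young2)
    return (4.0 / 3.0) * e_eff * math.sqrt(r_eff) * overlap ** 1.5

# Hypothetical example: two 10 mm steel-like spheres overlapping by 0.1 mm.
force = hertz_normal_force(0.01, 0.01, 0.0199, 2.0e11, 2.0e11, 0.3, 0.3)
print(f"normal force: {force:.1f} N")
```

In a DEM code this function would be evaluated for every contacting pair at every explicit time step, which is why the pairwise force calculation is the part typically offloaded to the GPU.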

A Study of the Performance Prediction Models of Mobile Graphics Processing Units

  • Kim, Cheong Ghil
    • Journal of the Semiconductor & Display Technology
    • /
    • v.18 no.1
    • /
    • pp.123-128
    • /
    • 2019
  • Mobile services are currently on the verge of full commercialization ahead of 5G mobile communication. An early goal may be to capture the 5G market through realistic media services built on VR (Virtual Reality) and AR (Augmented Reality), the technologies that users can experience most easily. This movement rests on advances in smart devices and on the high-quality graphics processing power of mobile application processors. Accordingly, mobile GPUs are growing in importance, and the main concern is a model that predicts power and performance so that high-quality mobile content runs smoothly. In many cases, mobile GPU performance has been characterized in terms of power consumption, using dynamic voltage and frequency scaling and throttling functions for power and heat management. This paper introduces several studies of mobile GPU performance prediction models that use user-friendly methods, unlike conventional power-centric performance prediction models.

Kinematic Wave Rainfall-Runoff Model Using CUDA FORTRAN (CUDA FORTRAN을 이용한 운동파 강우유출모형)

  • Kim, Boram;Kim, Dae-Hong
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2018.05a
    • /
    • pp.271-271
    • /
    • 2018
  • Graphics processing units (GPUs) can perform computations much faster than central processing units (CPUs) thanks to their many arithmetic logic units (ALUs) specialized for graphics processing and the associated instructions. Recently, many numerical models implemented in FORTRAN have come to require more computation and longer run times as modeling methods have become more realistic. In this study, a kinematic wave rainfall-runoff model based on general-purpose computing on graphics processing units (GPGPU) was implemented using CUDA (Compute Unified Device Architecture) FORTRAN. The results of the CUDA FORTRAN kinematic wave rainfall-runoff model were validated against those of a verified CPU-based kinematic wave rainfall-runoff model and showed good agreement. The CUDA FORTRAN model ran about 20 times faster than the CPU-based model. In addition, the computational efficiency of the CUDA FORTRAN version relative to the CPU version improved as the computational domain grew.

Scalable Prediction Models for Airbnb Listing in Spark Big Data Cluster using GPU-accelerated RAPIDS

  • Muralidharan, Samyuktha;Yadav, Savita;Huh, Jungwoo;Lee, Sanghoon;Woo, Jongwook
    • Journal of information and communication convergence engineering
    • /
    • v.20 no.2
    • /
    • pp.96-102
    • /
    • 2022
  • We aim to build predictive models for Airbnb listing prices using GPU-accelerated RAPIDS in a big data cluster. The Airbnb Listings datasets are used for the predictive analysis. Several machine-learning algorithms have been adopted to build models that predict the price of Airbnb listings. We compare the results of traditional and big data approaches to machine learning for price prediction and discuss the performance of the models. We built big data models using a Databricks Spark cluster, a distributed parallel computing system. Furthermore, we implemented models on multiple GPUs using RAPIDS in the Spark cluster. This model was developed using the XGBoost algorithm, whereas the other models were developed using traditional central processing unit (CPU)-based algorithms. This study compared all models in terms of accuracy metrics and computing time. We observed that the XGBoost model with RAPIDS using GPUs had the highest accuracy and computing time.

Implementation of AWS-based deep learning platform using streaming server and performance comparison experiment (스트리밍 서버를 이용한 AWS 기반의 딥러닝 플랫폼 구현과 성능 비교 실험)

  • Yun, Pil-Sang;Kim, Do-Yun;Jeong, Gu-Min
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.12 no.6
    • /
    • pp.591-596
    • /
    • 2019
  • In this paper, we implemented a deep learning operation structure that is less influenced by the performance of the local PC. In general, a deep learning model requires a large amount of computation and is heavily influenced by the performance of the PC that processes it. To reduce this limitation, we implemented the deep learning operation using AWS and a streaming server. First, deep learning operations were performed on AWS so that they would still work even if the performance of the local PC decreased. However, with AWS, the output lags behind the input. Second, we used a streaming server to improve the real-time performance of the deep learning model. Without the streaming server, real-time performance is poor because the images must be processed one by one or in stacked batches. We used the YOLO v3 model for the performance comparison experiments, and compared AWS instances with a local PC equipped with a GTX1080, a high-performance GPU. The simulation results show that the test time per image is 0.023444 seconds on the AWS p3 instance, which is similar to the 0.027099 seconds per image on the local PC with the GTX1080.
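
To put the two per-image test times in more familiar terms, they can be converted to throughput. This is a small sketch assuming that images are processed strictly one after another, so that throughput is the reciprocal of the per-image time; the variable names are illustrative.

```python
# Per-image test times reported in the abstract, in seconds.
time_aws_p3 = 0.023444
time_local_gtx1080 = 0.027099

# Assumed sequential processing: images per second = 1 / seconds per image.
fps_aws_p3 = 1.0 / time_aws_p3
fps_local_gtx1080 = 1.0 / time_local_gtx1080

# Relative difference of the AWS time with respect to the local PC time.
relative_gain = (time_local_gtx1080 - time_aws_p3) / time_local_gtx1080

print(f"AWS p3:        {fps_aws_p3:.1f} images/s")
print(f"Local GTX1080: {fps_local_gtx1080:.1f} images/s")
print(f"AWS p3 is {relative_gain:.1%} faster per image")
```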

Acceleration of Anisotropic Elastic Reverse-time Migration with GPUs (GPU를 이용한 이방성 탄성 거꿀 참반사 보정의 계산가속)

  • Choi, Hyungwook;Seol, Soon Jee;Byun, Joongmoo
    • Geophysics and Geophysical Exploration
    • /
    • v.18 no.2
    • /
    • pp.74-84
    • /
    • 2015
  • To yield physically meaningful images through elastic reverse-time migration, wavefield separation, which extracts P- and S-waves from vector wavefields reconstructed with the elastic wave equation, is a prerequisite. To extend elastic reverse-time migration to anisotropic media, not only an anisotropic modelling algorithm but also anisotropic wavefield separation is essential. Anisotropic wavefield separation uses pseudo-derivative filters determined by the vertical velocities and anisotropic parameters of the elastic media, and thus differs from the Helmholtz decomposition conventionally used for isotropic wavefield separation. Since applying these pseudo-derivative filters is computationally expensive, we have developed an efficient anisotropic wavefield separation algorithm that runs in parallel on GPUs (Graphics Processing Units). In addition, we have developed a highly efficient anisotropic elastic reverse-time migration algorithm that uses MPI (Message-Passing Interface) and incorporates the GPU-based anisotropic wavefield separation algorithm. To verify the efficiency and validity of the developed algorithm, a VTI elastic model based on Marmousi-II was built, and a synthetic multicomponent seismic data set was created using this model. The computational speed of migration was dramatically enhanced by using GPUs and MPI, and the accuracy of the image was also improved by adopting the anisotropic wavefield separation.