• Title/Summary/Keyword: GPU model

Search results: 164

Efficient Collaboration Method Between CPU and GPU for Generating All Possible Cases in Combination (조합에서 모든 경우의 수를 만들기 위한 CPU와 GPU의 효율적 협업 방법)

  • Son, Ki-Bong; Son, Min-Young; Kim, Young-Hak
    • KIPS Transactions on Computer and Communication Systems / v.7 no.9 / pp.219-226 / 2018
  • One systematic way to generate all possible cases of a combination is to construct a combination tree, which has time complexity O($2^n$). Combination trees serve various purposes, such as the graph isomorphism problem and the initial model for computing frequent item sets. However, algorithms that must enumerate every case of a combination are difficult to use in practice because of this high time complexity. Nevertheless, as data volumes grow and more studies attempt to exploit such data, the need to enumerate all cases keeps increasing. Recently, as GPU environments have become widespread and easy to access, various attempts have been made to reduce execution time by parallelizing algorithms whose serial versions have high time complexity. Because generating all cases of a combination is inherently sequential and its sub-tasks are biased in size, it is not well suited to parallel implementation; the efficiency of a parallel algorithm is maximized when all threads receive tasks of similar size. In this paper, we propose a method for efficient collaboration between the CPU and GPU to parallelize the problem of generating all cases. To evaluate the proposed algorithm, we analyze its time complexity theoretically and compare its measured execution time with those of other algorithms in CPU and GPU environments. Experimental results show that, compared to previous algorithms, the proposed CPU-GPU collaboration algorithm keeps the execution times of the CPU and GPU in balance, and its execution time improves remarkably as the number of elements increases.
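
To make the load-balancing point concrete, here is a minimal sketch (not the paper's algorithm; all names are hypothetical) in which every one of the $2^n$ subsets is assigned to exactly one GPU thread by interpreting the thread's global index as a bitmask over the n elements, so every thread receives work of essentially identical size:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread handles exactly one of the 2^n subsets by treating its
// global index as a bitmask over the n elements, so all threads get
// equal-sized work (hypothetical sketch, not the paper's algorithm).
__global__ void enumerate_subsets(int n, unsigned long long total,
                                  unsigned long long *popcountHist) {
    unsigned long long mask = blockIdx.x * (unsigned long long)blockDim.x
                            + threadIdx.x;
    if (mask >= total) return;
    // Example per-subset work: count the subset's size k and tally it,
    // standing in for whatever processing each combination needs.
    int k = __popcll(mask);
    atomicAdd(&popcountHist[k], 1ULL);
}

int main() {
    const int n = 20;                          // 2^20 subsets
    const unsigned long long total = 1ULL << n;
    unsigned long long *hist;
    cudaMallocManaged(&hist, (n + 1) * sizeof(unsigned long long));
    for (int i = 0; i <= n; ++i) hist[i] = 0;

    int threads = 256;
    int blocks  = (int)((total + threads - 1) / threads);
    enumerate_subsets<<<blocks, threads>>>(n, total, hist);
    cudaDeviceSynchronize();

    // hist[k] should equal C(n, k) for every k.
    printf("subsets of size %d: %llu\n", n / 2, hist[n / 2]);
    cudaFree(hist);
    return 0;
}
```

The difficulty the paper addresses is that tree-based generation orders are sequential and skewed; flattening the enumeration into an index space, as above, is one common way to restore balance.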

Development of a dose estimation code for BNCT with GPU accelerated Monte Carlo and collapsed cone Convolution method

  • Lee, Chang-Min; Lee, Hee-Seock
    • Nuclear Engineering and Technology / v.54 no.5 / pp.1769-1780 / 2022
  • A new dose calculation algorithm, called GPU-accelerated Monte Carlo and Collapsed Cone Convolution (GMCC), was developed to improve the calculation speed of a BNCT treatment planning system. The GPU-accelerated Monte Carlo routine in GMCC simulates neutron transport over the whole energy range, while the Collapsed Cone Convolution method calculates the gamma dose. The other dose components, due to alpha particles and protons, are calculated from the computed neutron flux and reaction data. The mathematical principles and the algorithm architecture are introduced. The accuracy and performance of the GMCC were verified by comparison with FLUKA results for a water phantom and a head CT voxel model. The neutron flux and absorbed dose obtained by the GMCC agreed well with the FLUKA results; for the head CT voxel model, the mean absolute percentage errors of the neutron flux and the absorbed dose were 3.98% and 3.91%, respectively. The GMCC calculated the absorbed dose 56 times faster than the FLUKA code. These results verify that the GMCC can be a good alternative to the Monte Carlo method in BNCT dose calculations.
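
GMCC's transport routine itself is not published here, but the general GPU Monte Carlo pattern it builds on, one particle history per thread with exponentially sampled free paths and atomic tally deposition, can be sketched as follows (one energy group, a 1D water-like slab, and all numerical values are illustrative assumptions):

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <curand_kernel.h>

// Toy one-group Monte Carlo: each thread tracks one particle through a
// 1D slab, sampling exponential free paths and depositing energy into
// per-bin tallies. GMCC's real transport is far more detailed.
__global__ void transport(int nHistories, float sigmaTotal, float slabCm,
                          int nBins, float *doseTally, unsigned long seed) {
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    if (id >= nHistories) return;

    curandState rng;
    curand_init(seed, id, 0, &rng);

    float x = 0.0f, weight = 1.0f;
    while (weight > 0.01f) {
        // Exponential free path: s = -ln(u) / Sigma_t
        x += -logf(curand_uniform(&rng)) / sigmaTotal;
        if (x >= slabCm) break;                    // particle escapes
        int bin = (int)(x / slabCm * nBins);
        atomicAdd(&doseTally[bin], 0.5f * weight); // deposit half the weight
        weight *= 0.5f;                            // implicit-capture survival
    }
}

int main() {
    const int nHist = 1 << 20, nBins = 50;
    float *tally;
    cudaMallocManaged(&tally, nBins * sizeof(float));
    for (int i = 0; i < nBins; ++i) tally[i] = 0.0f;

    transport<<<(nHist + 255) / 256, 256>>>(nHist, 0.5f, 30.0f,
                                            nBins, tally, 1234UL);
    cudaDeviceSynchronize();
    printf("dose in first bin: %f\n", tally[0]);
    cudaFree(tally);
    return 0;
}
```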

Object Tracking Based on Gaussian Mixture Model Algorithm by Using Cuda (Cuda를 이용한 가우시언 믹스처 모델 기반 객체 추적 알고리즘)

  • Kim, In-Su; Choi, Hyung-Il
    • Proceedings of the Korean Society of Computer Information Conference / 2011.01a / pp.273-275 / 2011
  • In this paper, we propose a Gaussian-mixture-based shadow removal algorithm for effective object tracking, together with a model that improves the computation time of an existing object-tracking algorithm by using NVIDIA's CUDA (Compute Unified Device Architecture), a GPGPU (General-Purpose GPU) architecture. The system is a GPU-based object-tracking algorithm built on a Gaussian mixture model: it reduces computation time by distributing the processing load appropriately between the CPU and GPU during foreground/background separation, and it maximizes system throughput for object segmentation and tracking in high-resolution images. After object extraction, a Kalman filter is used as the prediction model for effective tracking.
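
As a rough sketch of the per-pixel parallelism such a system exploits (simplified to a single Gaussian per pixel rather than a full mixture; all names are hypothetical):

```cuda
#include <cuda_runtime.h>

// Simplified per-pixel background model: one Gaussian per pixel instead
// of a full mixture. Each thread owns one pixel, which is what makes
// GPU foreground/background separation embarrassingly parallel.
__global__ void updateBackground(const unsigned char *frame,
                                 float *mean, float *var,
                                 unsigned char *foreground,
                                 int nPixels, float alpha) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nPixels) return;

    float x = (float)frame[i];
    float d = x - mean[i];
    // Mark as foreground if more than 2.5 sigma from the background mean.
    foreground[i] = (d * d > 6.25f * var[i]) ? 255 : 0;
    // Exponential running update of the background statistics.
    mean[i] += alpha * d;
    var[i]  += alpha * (d * d - var[i]);
}
```

A full mixture model would keep K such mean/variance/weight triples per pixel; the foreground mask would then feed the Kalman-filter tracker on the host.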


A study on the visualization of the sound field by using GPGPU (GPGPU에 의한 음장의 가시화에 관한 연구)

  • Lee, Chai-Bong
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.5 no.5 / pp.421-427 / 2010
  • To visualize the propagation of sound waves, we performed real-time processing using the fast computation of the GPU (Graphics Processing Unit). A simulation based on the discrete Huygens model was also implemented. The sound waves were visualized while varying the real-time processing, the reflecting surfaces within a two-dimensional virtual sound field, and the state of the sound source. Experimental results show that reflection and diffraction patterns of the sound waves could be identified at the reflecting objects.
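
The discrete Huygens model (the transmission line matrix method) updates every grid node independently, which maps naturally onto one GPU thread per node. A minimal sketch of the 2D scattering step under a lossless, homogeneous-medium assumption (the paper's reflecting surfaces and source handling are omitted):

```cuda
#include <cuda_runtime.h>

// Scattering step of the 2D discrete Huygens (TLM) model: each node
// holds four incident pulses (N, E, S, W); the scattered pulse on each
// branch is half the incident sum minus that branch's incident pulse.
// One thread per grid node. Arrays are laid out as in[4][h*w].
__global__ void tlmScatter(const float *in, float *out, int w, int h) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;
    int n = w * h, i = y * w + x;

    float sum = in[0 * n + i] + in[1 * n + i] + in[2 * n + i] + in[3 * n + i];
    for (int b = 0; b < 4; ++b)
        out[b * n + i] = 0.5f * sum - in[b * n + i];
    // A separate "connection" kernel would then pass each scattered
    // pulse to the neighbouring node's opposite branch.
}
```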

High-Performance Korean Morphological Analyzer Using the MapReduce Framework on the GPU

  • Cho, Shi-Won; Lee, Dong-Wook
    • Journal of Electrical Engineering and Technology / v.6 no.4 / pp.573-579 / 2011
  • To meet the scalability and performance requirements of data analyses, which often involve voluminous data, efficient parallel or concurrent algorithms and frameworks are essential. We present a high-performance Korean morphological analyzer that employs the MapReduce framework on the graphics processing unit (GPU). MapReduce is a programming framework introduced by Google to aid the development of web search applications on large numbers of central processing units (CPUs). GPUs are designed as special-purpose co-processors, and their programming interfaces are typically formulated for graphics applications. Compared to CPUs, GPUs have greater computational power and memory bandwidth, but they are harder to program because of their architectural design. The performance of the Korean morphological analyzer using the MapReduce framework on the GPU is evaluated against a CPU-based model, and the proposed analyzer shows promising, scalable performance for distributed computing with the GPU.
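
On a GPU, MapReduce is typically realized as a map kernel over the input followed by a parallel reduction. The sketch below shows the generic pattern, not the paper's analyzer: each byte of text maps to 1 where a space-delimited token begins, and each block tree-reduces its counts in shared memory (launch with a power-of-two block size and blockDim.x * sizeof(int) bytes of shared memory):

```cuda
#include <cuda_runtime.h>

// Generic GPU MapReduce pattern: "map" each input element to a value,
// then "reduce" within each block using a shared-memory tree sum.
// Here the map emits 1 where a space-delimited token begins.
__global__ void mapReduceTokens(const char *text, int len, int *blockSums) {
    extern __shared__ int sdata[];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;

    // Map: 1 if text[i] is a non-space preceded by a space (or at start).
    int v = 0;
    if (i < len && text[i] != ' ' && (i == 0 || text[i - 1] == ' ')) v = 1;
    sdata[tid] = v;
    __syncthreads();

    // Reduce: shared-memory tree sum within the block
    // (requires blockDim.x to be a power of two).
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) sdata[tid] += sdata[tid + s];
        __syncthreads();
    }
    if (tid == 0) blockSums[blockIdx.x] = sdata[0];
}
```

The host (or a second kernel) then sums blockSums to finish the reduction; a morphological analyzer would map to candidate morpheme matches rather than token counts.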

CPU-GPU2 Trigeneous Computing for Iterative Reconstruction in Computed Tomography

  • Oh, Chanyoung; Yi, Youngmin
    • IEIE Transactions on Smart Processing and Computing / v.5 no.4 / pp.294-301 / 2016
  • In this paper, we present methods to efficiently parallelize iterative 3D image reconstruction by exploiting trigeneous devices (three different types of device) at the same time: a CPU, an integrated GPU, and a discrete GPU. We first present a technique that exploits the single instruction, multiple data (SIMD) architectures in GPUs. We then propose a performance estimation model from which the optimal data partitioning across the trigeneous devices can easily be found. We found that performance varies significantly, by up to 6.23 times, depending on how the SIMD units in GPUs are accessed. Using the trigeneous devices and the proposed estimation model, we achieve optimal partitioning and throughput, corresponding to a further 9.4% improvement over discrete-GPU-only execution.
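
The partitioning idea reduces to a simple rule: once each device's throughput is estimated, give each device a share of the work proportional to its throughput so that all three finish together. A hypothetical host-side helper (the paper derives throughputs from its performance-estimation model; the numbers below are made up):

```cuda
#include <cstdio>

// Split `totalRows` of reconstruction work across three devices in
// proportion to their estimated throughputs, so all devices finish at
// roughly the same time (hypothetical helper, not the paper's code).
void partition(int totalRows, const double tput[3], int rows[3]) {
    double sum = tput[0] + tput[1] + tput[2];
    int assigned = 0;
    for (int d = 0; d < 2; ++d) {
        rows[d] = (int)(totalRows * tput[d] / sum);
        assigned += rows[d];
    }
    rows[2] = totalRows - assigned;   // remainder goes to the last device
}

int main() {
    double tput[3] = {1.0, 2.5, 6.0}; // CPU, integrated GPU, discrete GPU
    int rows[3];
    partition(960, tput, rows);
    printf("CPU=%d iGPU=%d dGPU=%d\n", rows[0], rows[1], rows[2]);
    return 0;
}
```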

Time Measurement on GPU-based LCTM Simulation (GPU 기반 LCTM 교통 시뮬레이션에서의 성능 측정)

  • Kyung, MinGi; Shin, In-soo; Cho, Min-Kyu; Min, Dugki
    • Proceedings of the Korea Information Processing Society Conference / 2019.10a / pp.141-143 / 2019
  • In this study, the LCTM (Lane Cell Transmission Model), a mesoscopic traffic simulation model, was implemented as a GPU-based parallel traffic simulation, and its execution time was measured. The paper discusses the considerations involved in parallelizing the LCTM traffic simulation, then analyzes and measures the factors that affect performance when the parallel traffic simulation is implemented on a GPU.
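
For time measurement of this kind, the standard CUDA idiom brackets the kernel launches with CUDA events; a minimal sketch (the kernel body is a placeholder for one LCTM update step):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void simulationStep() { /* placeholder for one LCTM update */ }

int main() {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    for (int step = 0; step < 1000; ++step)
        simulationStep<<<256, 256>>>();
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);       // wait until all steps finish

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("1000 steps took %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return 0;
}
```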

Parallel Self-Collision Detection for Large 3D Mesh Model using GPU (GPU를 이용한 대용량 3D 메쉬 모델에 대한 병렬 자체 충돌검사)

  • Park, Sung-Hun; Kim, Yangen; Choi, Yoo-Joo
    • Proceedings of the Korea Information Processing Society Conference / 2022.05a / pp.708-711 / 2022
  • This paper proposes a parallel self-collision detection method for large 3D mesh models using the GPU, with the goal of increasing the success rate of 3D printing. For robust self-collision detection, we propose a procedure consisting of a separating-axis test, a triangle-triangle intersection test, a mesh-connectivity test, and a partitioning technique for large meshes. To run this self-collision test quickly, we also present a GPU-based parallel implementation.
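
As a hedged sketch of the parallel structure (a broad phase only, not the paper's full pipeline), the kernel below assigns one triangle pair to each thread of a 2D grid and applies a conservative axis-aligned bounding-box overlap test; surviving pairs would then go on to the exact separating-axis and triangle-triangle tests:

```cuda
#include <cuda_runtime.h>

struct Aabb { float min[3], max[3]; };   // per-triangle bounding box

// Broad-phase self-collision test: thread (i, j) checks whether the
// bounding boxes of triangles i and j overlap. Surviving pairs would
// be forwarded to the exact triangle-triangle intersection test.
__global__ void broadPhase(const Aabb *box, int nTris, int *overlapCount) {
    int i = blockIdx.y * blockDim.y + threadIdx.y;
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nTris || j >= nTris || i >= j) return;  // each pair once

    bool overlap = true;
    for (int a = 0; a < 3; ++a)
        overlap &= box[i].min[a] <= box[j].max[a] &&
                   box[j].min[a] <= box[i].max[a];
    if (overlap)
        atomicAdd(overlapCount, 1);  // real code would record the pair
}
```

Triangles that share vertices always overlap under this test, which is one reason the paper includes a mesh-connectivity check.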

Content Based Dynamic Texture Analysis and Synthesis Based on SPIHT with GPU

  • Ghadekar, Premanand P.; Chopade, Nilkanth B.
    • Journal of Information Processing Systems / v.12 no.1 / pp.46-56 / 2016
  • Dynamic textures are videos that exhibit stationarity over time (i.e., their patterns repeat across a large number of frames), so these patterns can easily be tracked by a linear dynamic system. In this paper, a model is proposed that identifies the underlying linear dynamic system from wavelet coefficients rather than from the raw sequence. Content-based threshold filtering based on Set Partitioning in Hierarchical Trees (SPIHT) yields an alternative representation of the same frames that contains only low-frequency components. The main idea of this paper is to apply SPIHT-based threshold filtering to the different bands of the wavelet transform so that more of the significant information is captured in fewer parameters for singular value decomposition (SVD). This also allows more flexibility in component selection, as SVD is applied independently to the different wavelet bands of the frames of a dynamic texture. To minimize time complexity, the proposed model is implemented on a graphics processing unit (GPU). Test results show that the proposed dynamic system, together with the discrete wavelet transform and SPIHT, achieves a highly compact model with better visual quality than the available LDS, the Fourier descriptor model, and higher-order SVD (HOSVD).
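
For reference, the linear dynamic system referred to here is usually the standard formulation identified by truncated SVD of the observation matrix (with the wavelet coefficients playing the role of the observations in this paper):

```latex
% Dynamic-texture LDS: hidden state x_t, observed frame y_t
x_{t+1} = A x_t + v_t, \qquad y_t = C x_t + w_t
% Identification from Y = [y_1, \dots, y_\tau] via truncated SVD
% (first r components; X^{+} denotes the pseudoinverse):
Y \approx U_r \Sigma_r V_r^{\top}, \qquad
\hat{C} = U_r, \qquad \hat{X} = \Sigma_r V_r^{\top}, \qquad
\hat{A} = \hat{X}_{2:\tau}\,\hat{X}_{1:\tau-1}^{+}
```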

Implementation of Parallel Computer Generated Hologram Using Multi-GPGPU (다중 GPGPU를 이용한 컴퓨터 생성 홀로그램의 병렬화 구현)

  • Seo, Young-Ho; Lee, Yoon-Hyuk; Kim, Dong-Wook
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.5 / pp.1177-1186 / 2014
  • Computer-generated holography (CGH) mathematically models an optical phenomenon with a digital computer. Because it demands an enormous amount of computation, a fast, high-performance technique is needed. In this paper, we propose two parallelizations of the CGH calculation: the first parallelizes the CGH algorithm within a single GPU (graphics processing unit), and the second parallelizes the calculation across multiple GPUs. The proposed algorithm was implemented on a GTX 780 Ti GPU and computes a 1,024×1,024 hologram with 10K object points in about 24 ms.
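
The per-pixel independence that makes CGH so amenable to GPUs is easiest to see in the simple point-source form, where each hologram pixel is a sum of cosines over all object points. A hedged sketch with one thread per pixel (amplitude and initial-phase terms omitted; k is the wavenumber 2π/λ; parameters are illustrative, not the paper's):

```cuda
#include <cuda_runtime.h>
#include <math.h>

// Point-source CGH: each thread computes one hologram pixel as the sum
// of cos(k * r_j) over all object points j, where r_j is the distance
// from object point j to the pixel. Simplified for illustration.
__global__ void cghKernel(const float3 *obj, int nPoints,
                          float *hologram, int w, int h,
                          float pitch, float k) {
    int px = blockIdx.x * blockDim.x + threadIdx.x;
    int py = blockIdx.y * blockDim.y + threadIdx.y;
    if (px >= w || py >= h) return;

    float x = (px - w * 0.5f) * pitch;   // pixel position on hologram plane
    float y = (py - h * 0.5f) * pitch;

    float acc = 0.0f;
    for (int j = 0; j < nPoints; ++j) {  // accumulate every object point
        float dx = x - obj[j].x, dy = y - obj[j].y, dz = obj[j].z;
        acc += cosf(k * sqrtf(dx * dx + dy * dy + dz * dz));
    }
    hologram[py * w + px] = acc;
}
```

Multi-GPU parallelization, as in the paper's second scheme, then amounts to splitting the object points or the hologram rows across devices and, in the former case, summing the partial holograms at the end.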