• Title/Summary/Keyword: Graphics processing unit

Search Result 191, Processing Time 0.026 seconds

Molecular Interaction Interface Computing Based on Voxel Map (복셀맵을 기반으로 한 분자 간 상호작용 인터페이스의 계산)

  • Choi, Jihoon;Kim, Byungjoo;Kim, Ku-jin
    • Journal of the Korea Computer Graphics Society
    • /
    • v.18 no.3
    • /
    • pp.1-7
    • /
    • 2012
  • In this paper, we propose a method to compute the interface between protein molecules. When a molecules is represented as a set of spheres with van der Waals radii, the distance from a spatial point p to the molecule corresponds to the distance from p to the closet sphere. The molecular interface is composed of equi-distant points from two molecules. Our algorithm decomposes the space into a set of voxels, and then constructs a voxel map by storing the information of spheres intersecting each voxel. By using the voxel map, we compute the distance between a point and the molecule. We also use GPU for the parallel processing, and efficiently approximate the interface of a pair of molecules.

Power Modeling Approach for GPU Source Program

  • Li, Junke;Guo, Bing;Shen, Yan;Li, Deguang;Huang, Yanhui
    • Journal of Electrical Engineering and Technology
    • /
    • v.13 no.1
    • /
    • pp.181-191
    • /
    • 2018
  • Rapid development of information technology makes our environment become smarter and massive high performance computers are providing powerful computing for that. Graphics Processing Unit (GPU) as a typical high performance component is being widely used for both graphics and general-purpose applications. Although it can greatly improve computing power, it also delivers significant power consumption and need sufficient power supplies. To make high performance computing more sustainable, the important step is to measure it. Current power technologies for GPU have some drawbacks, such as they are not applicable for power estimation at the early stage. In this article, we present a novel power technology to correlate power consumption and the characteristics at the programmer perspective, and then to estimate power consumption of source program without prerunning. We conduct experiments on Nvidia's GT740 platform; the results show that our power model is more accurately than regression model and has an average error of 2.34% and the maximum error of 9.65%.

Real-time Full-view 3D Human Reconstruction using Multiple RGB-D Cameras

  • Yoon, Bumsik;Choi, Kunwoo;Ra, Moonsu;Kim, Whoi-Yul
    • IEIE Transactions on Smart Processing and Computing
    • /
    • v.4 no.4
    • /
    • pp.224-230
    • /
    • 2015
  • This manuscript presents a real-time solution for 3D human body reconstruction with multiple RGB-D cameras. The proposed system uses four consumer RGB/Depth (RGB-D) cameras, each located at approximately $90^{\circ}$ from the next camera around a freely moving human body. A single mesh is constructed from the captured point clouds by iteratively removing the estimated overlapping regions from the boundary. A cell-based mesh construction algorithm is developed, recovering the 3D shape from various conditions, considering the direction of the camera and the mesh boundary. The proposed algorithm also allows problematic holes and/or occluded regions to be recovered from another view. Finally, calibrated RGB data is merged with the constructed mesh so it can be viewed from an arbitrary direction. The proposed algorithm is implemented with general-purpose computation on graphics processing unit (GPGPU) for real-time processing owing to its suitability for parallel processing.

Design of a SIMT architecture GP-GPU Using Tile based on Graphic Pipeline Structure (타일 기반 그래픽 파이프라인 구조를 사용한 SIMT 구조 GP-GPU 설계)

  • Kim, Do-Hyun;Kim, Chi-Yong
    • Journal of IKEEE
    • /
    • v.20 no.1
    • /
    • pp.75-81
    • /
    • 2016
  • This paper proposes a design of the tile based on graphic pipeline to improve the graphic application performance in SIMT based GP-GPU. The proposed Tile based on graphics pipeline avoids unnecessary graphic processing operation, and processes the rasterization step in parallel. The massive data processing in parallel through SIMT architecture improve the computational performance, thereby improving the 3D graphic pipeline performance. The more vertex data of 3D model, the higher performance. The proposed structure was confirmed to improve processing performance of up to 3 times from about 1.18 times as compared to 'RAMP' and previous studies.

An Improved Hybrid Approach to Parallel Connected Component Labeling using CUDA

  • Soh, Young-Sung;Ashraf, Hadi;Kim, In-Taek
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.16 no.1
    • /
    • pp.1-8
    • /
    • 2015
  • In many image processing tasks, connected component labeling (CCL) is performed to extract regions of interest. CCL was usually done in a sequential fashion when image resolution was relatively low and there are small number of input channels. As image resolution gets higher up to HD or Full HD and as the number of input channels increases, sequential CCL is too time-consuming to be used in real time applications. To cope with this situation, parallel CCL framework was introduced where multiple cores are utilized simultaneously. Several parallel CCL methods have been proposed in the literature. Among them are NSZ label equivalence (NSZ-LE) method[1], modified 8 directional label selection (M8DLS) method[2], and HYBRID1 method[3]. Soh [3] showed that HYBRID1 outperforms NSZ-LE and M8DLS, and argued that HYBRID1 is by far the best. In this paper we propose an improved hybrid parallel CCL algorithm termed as HYBRID2 that hybridizes M8DLS with label backtracking (LB) and show that it runs around 20% faster than HYBRID1 for various kinds of images.

Real-World Physical Length Comparison in Virtual Environments (가상환경에서의 실세계 물리적 길이 비교)

  • Jung, Chul-Hee;Im, Chang-Hyuck;Lee, Min-Geun;Lee, Myeong-Won
    • Journal of the Korea Computer Graphics Society
    • /
    • v.13 no.3
    • /
    • pp.19-24
    • /
    • 2007
  • In this paper, we describe a method of defining an object's real length in order to compare objects' lengths precisely using all real length units in the real world. The browser in our study represents an object's length by referencing to the physical length property defined at modeling when it displays the object. Since objects' lengths are appropriately scaled according to these units, objects can be precisely and visually compared in sire using real world length units. The concept of defining the real length unit is extended to the X3D specification. The units are ranged from $10^{-24}(yotta)\;to\;10^{24}(yocto)$. In addition, we explain the method for processing LOD (Levels Of Detail) and for applying the property of LOLD (Levels of Length Detail) when objects with different LOLD are read into the browser.

  • PDF

Research on the Development of an Integral Imaging System Framework and an Improved Viewpoint Vector Rendering Method Utilizing GPU (GPU를 이용한 개선된 뷰포인트 벡터 렌더링 방식의 집적영상시스템 프레임워크에 관한 연구)

  • Lee, Bin-Na-Ra;Park, Kyoung-Shin;Cho, Yong-Joo
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.10 no.10
    • /
    • pp.1767-1772
    • /
    • 2006
  • Computer-generated integral imaging system is an auto-stereoscopic display system that users can see and feel the stereoscopic images when they see the pre-rendered elemental images through a lens array. The process of constructing elemental images using computer graphics is called image mapping. Viewpoint vector rendering (VVR) method is one of the image mapping algorithm specially designed for real-time graphics applications, which would not be affected by the size of the rendered objects or the number of elemental lenses used in the integral imaging system. This paper describes a new VVR framework which improved its rendering performance considerably. It also compares the previous VVR implementation with the new VVR work utilizing GPU and shows that newer implementation shows pretty big improvements over the old method.

Simple Spectral Calibration Method and Its Application Using an Index Array for Swept Source Optical Coherence Tomography

  • Jung, Un-Sang;Cho, Nam-Hyun;Kim, Su-Hwan;Jeong, Hyo-Sang;Kim, Jee-Hyun;Ahn, Yeh-Chan
    • Journal of the Optical Society of Korea
    • /
    • v.15 no.4
    • /
    • pp.386-393
    • /
    • 2011
  • In this study, we report an effective k-domain linearization method with a pre-calibrated indexed look-up table. The method minimizes k-domain nonlinear characteristics of a swept source optical coherence tomography (SS-OCT) system by using two arrays, a sample position shift index and an intensity compensation array. Two arrays are generated from an interference pattern acquired by connecting a Fabry-Perot interferometer (FPI) and an optical spectrum analyzer (OSA) to the system. At real time imaging, the sample position is modified by location movement and intensity compensation with two arrays for linearity of wavenumber. As a result of evaluating point spread functions (PSFs), the signal to noise ratio (SNR) is increased by 9.7 dB. When applied to infrared (IR) sensing card imaging, the SNR is increased by 1.29 dB and the contrast noise ratio (CNR) value is increased by 1.44. The time required for the linearization and intensity compensation is 30 ms for a multi thread method using a central processing unit (CPU) compared to 0.8 ms for compute unified device architecture (CUDA) processing using a graphics processing unit (GPU). We verified that our linearization method is appropriate for applying real time imaging of SS-OCT.

Analysis of GPU Performance and Memory Efficiency according to Task Processing Units (작업 처리 단위 변화에 따른 GPU 성능과 메모리 접근 시간의 관계 분석)

  • Son, Dong Oh;Sim, Gyu Yeon;Kim, Cheol Hong
    • Smart Media Journal
    • /
    • v.4 no.4
    • /
    • pp.56-63
    • /
    • 2015
  • Modern GPU can execute mass parallel computation by exploiting many GPU core. GPGPU architecture, which is one of approaches exploiting outstanding computational resources on GPU, executes general-purpose applications as well as graphics applications, effectively. In this paper, we investigate the impact of memory-efficiency and performance according to number of CTAs(Cooperative Thread Array) on a SM(Streaming Multiprocessors), since the analysis of relation between number of CTA on a SM and them provides inspiration for researchers who study the GPU to improve the performance. Our simulation results show that almost benchmarks increasing the number of CTAs on a SM improve the performance. On the other hand, some benchmarks cannot provide performance improvement. This is because the number of CTAs generated from same kernel is a little or the number of CTAs executed simultaneously is not enough. To precisely classify the analysis of performance according to number of CTA on a SM, we also analyze the relations between performance and memory stall, dram stall due to the interconnect congestion, pipeline stall at the memory stage. We expect that our analysis results help the study to improve the parallelism and memory-efficiency on GPGPU architecture.

Content Based Dynamic Texture Analysis and Synthesis Based on SPIHT with GPU

  • Ghadekar, Premanand P.;Chopade, Nilkanth B.
    • Journal of Information Processing Systems
    • /
    • v.12 no.1
    • /
    • pp.46-56
    • /
    • 2016
  • Dynamic textures are videos that exhibit a stationary property with respect to time (i.e., they have patterns that repeat themselves over a large number of frames). These patterns can easily be tracked by a linear dynamic system. In this paper, a model that identifies the underlying linear dynamic system using wavelet coefficients, rather than a raw sequence, is proposed. Content based threshold filtering based on Set Partitioning in a Hierarchical Tree (SPIHT) helps to get another representation of the same frames that only have low frequency components. The main idea of this paper is to apply SPIHT based threshold filtering on different bands of wavelet transform so as to have more significant information in fewer parameters for singular value decomposition (SVD). In this case, more flexibility is given for the component selection, as SVD is independently applied to the different bands of frames of a dynamic texture. To minimize the time complexity, the proposed model is implemented on a graphics processing unit (GPU). Test results show that the proposed dynamic system, along with a discrete wavelet and SPIHT, achieve a highly compact model with better visual quality, than the available LDS, Fourier descriptor model, and higher-order SVD (HOSVD).