• Title/Summary/Keyword: 워프 (warp)

A Novel Cooperative Warp and Thread Block Scheduling Technique for Improving the GPGPU Resource Utilization (GPGPU 자원 활용 개선을 위한 블록 지연시간 기반 워프 스케줄링 기법)

  • Thuan, Do Cong;Choi, Yong;Kim, Jong Myon;Kim, Cheol Hong
    • KIPS Transactions on Computer and Communication Systems / v.6 no.5 / pp.219-230 / 2017
  • General-purpose graphics processing units (GPGPUs) use a massively parallel architecture and multithreading to exploit parallelism. With programming models such as CUDA and OpenCL, GPGPUs have become a leading platform for exploiting the abundant thread-level parallelism of parallel applications. Unfortunately, modern GPGPUs cannot efficiently utilize their available hardware resources for many general-purpose applications. One of the primary reasons is the inefficiency of existing warp/thread block schedulers in hiding long-latency instructions, which results in lost opportunities to improve performance. This paper studies the effects of the hardware thread scheduling policy on GPGPU performance. We propose a novel warp scheduling policy that alleviates the drawbacks of the traditional round-robin policy. The proposed warp scheduler first classifies the warps of a thread block into two groups, warps with long latency and warps with short latency, and then schedules the warps with long latency before the warps with short latency. Furthermore, to support the proposed warp scheduler, we also propose a supplemental technique that dynamically reduces the number of streaming multiprocessors to which thread blocks are assigned when contention at the memory and interconnection network is high. In our experiments on a GPGPU platform with 15 streaming multiprocessors, the proposed warp scheduling policy provides an average IPC improvement of 7.5% over the baseline round-robin warp scheduling policy. This paper also shows that GPGPU performance can be improved by approximately 8.9% on average when the two proposed techniques are combined.
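
    The two-group ordering described in this abstract can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the abstract does not say how a warp's latency is estimated or where the long/short boundary lies, so the `est_latency` field, the `threshold` parameter, and all names below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Warp:
    warp_id: int
    est_latency: int  # hypothetical per-warp latency estimate, in cycles


def order_warps(warps, threshold):
    """Place long-latency warps ahead of short-latency warps.

    The relative order inside each group is preserved, so the base
    (for example round-robin) order still applies within a group.
    The threshold is an assumed tuning parameter.
    """
    long_group = [w for w in warps if w.est_latency >= threshold]
    short_group = [w for w in warps if w.est_latency < threshold]
    return long_group + short_group
```

    With four warps whose estimated latencies are 10, 400, 20, and 350 cycles and a threshold of 100, the resulting issue order would be warps 1, 3, 0, 2.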

MSHR-Aware Dynamic Warp Scheduler for High Performance GPUs (GPU 성능 향상을 위한 MSHR 활용률 기반 동적 워프 스케줄러)

  • Kim, Gwang Bok;Kim, Jong Myon;Kim, Cheol Hong
    • KIPS Transactions on Computer and Communication Systems / v.8 no.5 / pp.111-118 / 2019
  • Recent graphics processing units (GPUs) provide high throughput by using powerful hardware resources. However, massive memory accesses degrade GPU performance because of cache inefficiency. GPU performance can therefore be improved by reducing thread parallelism when the cache suffers from memory contention. In this paper, we propose a dynamic warp scheduler that controls thread parallelism according to the degree of cache contention. The greedy-then-oldest (GTO) warp issue policy usually exhibits lower parallelism than the loose round-robin (LRR) policy. Therefore, the proposed warp scheduler employs the LRR warp scheduling policy when Miss Status Holding Register (MSHR) utilization is low. When MSHR utilization is high, the GTO policy is employed instead in order to reduce thread parallelism. The proposed technique outperforms both the LRR and GTO policies because it dynamically selects the more efficient scheduling policy. According to our experimental results, the proposed technique improves IPC by 12.8% and 3.5% on average over LRR and GTO, respectively.
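
    As a rough sketch of the policy switch described in this abstract, the selection logic might look like the code below. The utilization threshold of 0.75, the use of a lower warp id to mean an older warp, and all function names are assumptions made for illustration; the abstract gives no such details.

```python
def pick_lrr(ready_ids, last_issued, num_warps):
    """Loose round robin: first ready warp after the last issued one."""
    for step in range(1, num_warps + 1):
        candidate = (last_issued + step) % num_warps
        if candidate in ready_ids:
            return candidate
    return None


def pick_gto(ready_ids, last_issued):
    """Greedy then oldest: stay on the last warp while it is ready,
    otherwise fall back to the oldest ready warp.
    Assumption: a lower warp id means an older warp."""
    if last_issued in ready_ids:
        return last_issued
    return min(ready_ids) if ready_ids else None


def pick_warp(ready_ids, last_issued, num_warps,
              mshr_in_use, mshr_total, threshold=0.75):
    """Use GTO when MSHR utilization is high, LRR when it is low.
    The threshold value is a hypothetical tuning parameter."""
    if mshr_total <= 0:
        raise ValueError("mshr_total must be positive")
    utilization = mshr_in_use / mshr_total
    if utilization >= threshold:
        return pick_gto(ready_ids, last_issued)
    return pick_lrr(ready_ids, last_issued, num_warps)
```

    Under low MSHR utilization the scheduler rotates across ready warps, keeping parallelism high; under high utilization it keeps issuing from one warp, which limits the number of warps competing for cache resources.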

A new warp scheduling technique for improving the performance of GPUs by utilizing MSHR information (GPU 성능 향상을 위한 MSHR 정보 기반 워프 스케줄링 기법)

  • Kim, Gwang Bok;Kim, Jong Myon;Kim, Cheol Hong
    • The Journal of Korean Institute of Next Generation Computing / v.13 no.3 / pp.72-83 / 2017
  • GPUs provide high throughput by hiding latency through the parallel execution of many warps. The MSHRs (Miss Status Holding Registers) of the L1 data cache track cache miss requests until the required data is returned from lower-level memory. In recent GPUs, excessive requests for cache resources cause GPU resources to be underutilized because of cache resource reservation failures. In this paper, we propose a new warp scheduling technique that reduces stall cycles under MSHR resource shortage. The cache miss rate of each warp is predicted based on the observation that each warp shows a similar cache miss rate over long periods. When the MSHRs are full, warps with low miss rates and computation-intensive warps are given high priority for issue. Our proposal improves GPU performance by using cache resources more efficiently, based on cache miss rate prediction and monitoring of the MSHR entries. According to our experimental results, the proposed scheduling technique reduces reservation fail cycles by 25.7% and increases IPC by 6.2% compared with the loose round-robin scheduler.
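
    The prioritization described in this abstract could be sketched as below. This is a simplified illustration under stated assumptions, not the paper's design: the miss rate is predicted here simply as the warp's observed misses divided by accesses so far, and the `stats` layout, the `fallback` callback, and all names are hypothetical.

```python
def predicted_miss_rate(misses, accesses):
    """Predict a warp's miss rate from its history so far.
    Warps with no memory accesses (compute-intensive) get 0.0."""
    return misses / accesses if accesses else 0.0


def pick_warp_mshr_aware(ready_ids, stats, mshr_free, fallback):
    """Pick the next warp to issue.

    stats maps warp id -> (misses, accesses), an assumed bookkeeping format.
    While MSHR entries are free, defer to the baseline policy (fallback).
    When the MSHRs are full, prefer the ready warp with the lowest
    predicted miss rate; ties go to the lowest warp id.
    """
    if not ready_ids:
        return None
    if mshr_free > 0:
        return fallback(ready_ids)
    return min(sorted(ready_ids),
               key=lambda w: predicted_miss_rate(*stats[w]))
```

    Favoring low-miss-rate warps while the MSHRs are full lets the core keep issuing instructions that are unlikely to need a new MSHR entry, rather than stalling on reservation failures.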

Analysis of Impact of Correlation Between Hardware Configuration and Branch Handling Methods Executing General Purpose Applications (범용 응용프로그램 실행 시 하드웨어 구성과 분기 처리 기법에 따른 GPU 성능 분석)

  • Choi, Hong Jun;Kim, Cheol Hong
    • The Journal of the Korea Contents Association / v.13 no.3 / pp.9-21 / 2013
  • Owing to the increased computing power and flexibility of GPUs, recent GPUs execute general-purpose parallel applications as well as graphics applications. Programmers can use GPGPUs through the APIs provided by GPU vendors. Unfortunately, the computational resources of a GPU are not fully utilized when executing general-purpose applications because of frequent branch instructions. To handle the branch problem, several warp formation techniques have been proposed. Intuitively, warp formation techniques that provide higher computational resource utilization would be expected to show higher performance. Contrary to this expectation, simulation results show that the warp formation technique providing better utilization performs worse than the one providing worse utilization. This is because a warp formation technique that provides high utilization causes a serious memory bottleneck through increased memory requests. Therefore, a warp formation technique that provides high computational utilization cannot guarantee high performance without proper hardware resources. For this reason, we analyze the correlation between hardware configuration and warp formation. Our simulation results provide guidelines for solving the underutilization problem caused by branch instructions when designing recent GPUs.

The Improvement of Meshwarp Algorithm for Rotational Pose Transformation of a Front Facial Image (정면 얼굴 영상의 회전 포즈 변형을 위한 메쉬워프 알고리즘의 개선)

  • Kim, Young-Won;Phan, Hung The;Oh, Seung-Taek;Jun, Byung-Hwan
    • Proceedings of the Korea Information Processing Society Conference / 2002.11a / pp.425-428 / 2002
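  • This paper proposes a new image-based rendering (IBR) technique that can perform rotational transformation using only a single frontal face image. To produce horizontal rotation without a three-dimensional geometric model, a standard mesh set is created for the frontal, left and right half-profile, and left and right profile face images of a particular person. For an arbitrary person to be transformed, only the mesh for the frontal image is created, and the remaining profile reference meshes are generated automatically from the standard mesh set. To produce a three-dimensional rotation effect, we propose an invertible meshwarp algorithm, which improves the existing two-pass meshwarp algorithm so that it allows the overlap and inversion of control points that can occur during rotational transformation. Using this algorithm, pose transformation by rotation was performed on frontal face images of men and women of various ages, and relatively natural pose transformation results were obtained.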

Acceleration of GPU-based Shear-Skew Warp Volume Rendering (GPU 기반 쉐아-스큐 워프 볼륨 렌더링 가속 기법)

  • Cho, Chang-Woo;Kim, Yoon-Ki;Jeong, Chang-Sung
    • Proceedings of the Korea Information Processing Society Conference / 2013.11a / pp.1418-1420 / 2013
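  • Unlike general-purpose CPUs, GPUs consist of hundreds of cores and have evolved into a form specialized for parallel processing. They are used in a growing number of areas, such as image and video processing, fluid dynamics simulation, medicine, and seismic analysis. Recently, many techniques for accelerating volume rendering with GPUs have been studied. This paper proposes a GPU-based shear-skew warp technique for accelerating volume rendering. By applying efficient memory usage, core activation, and bank conflict reduction techniques on the GPU, it produces the same result in less time than the existing CPU-based volume rendering technique.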

Pose Transformation of a Frontal Face Image by Invertible Meshwarp Algorithm (역전가능 메쉬워프 알고리즘에 의한 정면 얼굴 영상의 포즈 변형)

  • 오승택;전병환
    • Journal of KIISE:Software and Applications / v.30 no.1_2 / pp.153-163 / 2003
  • In this paper, we propose a new image-based rendering (IBR) technique for the pose transformation of a face that uses only a frontal face image and its mesh, without a three-dimensional model. To substitute for the 3D geometric model, we first build a standard mesh set of a particular person for several sides of the face: the front, left, right, half-left, and half-right sides. For the given person, we compose only the frontal mesh of the frontal face image to be transformed. The other meshes are generated automatically from the standard mesh set. The frontal face image is then geometrically transformed to give a different view by using the Invertible Meshwarp Algorithm, which is improved to tolerate the overlap or inversion of neighboring vertices in the mesh. The same warping algorithm is used to generate the opening and closing of both eyes and the mouth. To evaluate the transformation performance, we captured dynamic images of 10 persons rotating their heads horizontally and measured the location error of 14 main features between the corresponding original and transformed facial images. That is, for each feature point, the distance from the center of both eyes to that point is computed in the original and in the transformed image, and the differences between these distances are averaged. As a result, the average error in feature location is about 7.0% of the distance from the center of both eyes to the center of the mouth.
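
    The error measure described at the end of this abstract can be sketched as follows. This is one reading of the abstract, with assumptions: points are 2D pixel coordinates, the eye-to-mouth distance used for normalization is taken from the original image, and all function and parameter names are hypothetical.

```python
import math


def distance(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def feature_location_error(eye_center_orig, feats_orig,
                           eye_center_trans, feats_trans,
                           mouth_center_orig):
    """Average difference between eye-center-to-feature distances in the
    original and transformed images, normalized by the eye-center-to-mouth
    distance in the original image (an assumed choice of reference)."""
    if len(feats_orig) != len(feats_trans) or not feats_orig:
        raise ValueError("feature lists must be non-empty and equal length")
    diffs = [
        abs(distance(eye_center_orig, fo) - distance(eye_center_trans, ft))
        for fo, ft in zip(feats_orig, feats_trans)
    ]
    mean_diff = sum(diffs) / len(diffs)
    return mean_diff / distance(eye_center_orig, mouth_center_orig)
```

    A return value of 0.07 would correspond to the 7.0% figure reported above.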

A Study on the Synthesis of Facial Poses based on Warping (워핑 기법에 의한 얼굴의 포즈 합성에 관한 연구)

  • 오승택;서준원;전병환
    • Proceedings of the Korean Information Science Society Conference / 2001.04b / pp.499-501 / 2001
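  • In this paper, to achieve the three-dimensional facial representation that is central to realizing realistic avatars, we propose a method of implementing IBR (Image Based Rendering) using an improved mesh warp algorithm that allows overlapping meshes without using [see original] geometric information. In place of a 3D model, meshes created for the frontal, left and right half-profile, and left and right profile face images of [see original] person are used. For the [see original] frontal face image to be synthesized, only the frontal mesh is created, and the half-profile and profile meshes are [see original] based on the standard mesh. To evaluate the performance of facial pose synthesis, the normalized location errors of the main feature points were measured between real pose images of a face rotating horizontally and the synthesized [see original]. On average, the location error was only about 5% of the [see original] from the center of both eyes to the mouth.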

A Parametric Study of the Hemming Process by Finite Element Analysis (유한요소해석에 의한 헤밍 공정 변수연구)

  • Kim, Hyung-Jong;Choi, Won-Mog;Lim, Jae-Kyu;Park, Chun-Dal;Lee, Woo-Hong;Kim, Heon-Young
    • Transactions of the Korean Society of Mechanical Engineers A / v.28 no.2 / pp.149-157 / 2004
  • Implicit finite element analysis of the flat surface-straight edge hemming process is performed using the commercial code ABAQUS/Standard. Methods of finite element modeling for springback simulation and contact pair definition are discussed. An optimal mesh system is chosen through an error analysis based on the smoothing of discontinuities in the state variables. This study focuses on the influence of process parameters in flanging, pre-hemming, and main hemming on final hem quality, which is defined by turn-down, warp, and roll-in. The parameters considered in this parametric study are flange length, flange angle, flanging die corner radius, face angle and insertion angle of the pre-hemming punch, and over-stroke of the pre-hemming and main hemming punches.

Implementation of 2.5D Mapping System for Fashion Design (패션디자인을 위한 2.5D맵핑 시스템의 구현)

  • Lee, Min-Kyu;Kim, Young-Un;Cho, Jun-Ei;Han, Sung-Kuk;Jung, Sung-Tae;Lee, Yong-Ju;Jung, Suck-Tae
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.2 / pp.599-602 / 2005
  • In the fashion design field, this paper uses model pictures of finished clothes and drapes various materials (textile fabrics) directly onto them, so that new designs can be created and the resulting clothing can be checked by simulation without producing a sample directly. A database of model and material images is also constructed so that the mapping result can be checked in real time. To implement natural draping of the material image onto the model picture, we developed a 2.5D mapping system that uses a path extraction algorithm, a warp algorithm, and a lighting extraction and application algorithm.
