• Title/Summary/Keyword: Parallel pipeline

Hardware Implementation of Genetic Algorithm Processor for EHW (EHW를 위한 Genetic Algorithm Processor 구현)

  • Kim, Jin-Jung;Kim, Yong-Hun;Choi, Yun-Ho;Chung, Duck-Jin
    • Proceedings of the KIEE Conference / 1999.07g / pp.2827-2829 / 1999
  • Genetic algorithms (GAs) have been described as a method for solving large-scale optimization problems with complex constraints. Their major drawback, slowness, can be overcome by a hardware implementation of a genetic algorithm processor (GAP). In this study, we propose a GAP that effectively combines the advantages of survival-based GA, steady-state GA, and tournament selection. By making effective use of pipelined parallel processing and a handshaking protocol, the proposed GAP exhibits a 50% speed-up over a survival-based GA that runs one million crossovers per second (1 MHz). It is intended for high-speed processing tasks such as the central processor of EHW, robot control, and many optimization problems.
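The combination of steady-state replacement and tournament selection named in this abstract can be sketched in software as follows. This is a minimal illustration, not the paper's hardware design: the bit-string encoding, the `sum` fitness, the tournament size of 2, single-point crossover, the mutation rate, and every function name are assumptions made for the example.

```python
import random


def tournament(pop, fitness, rng, k=2):
    """Pick k random individuals and return the fittest (tournament selection)."""
    return max(rng.sample(pop, k), key=fitness)


def steady_state_step(pop, fitness, rng, mutation_rate=0.05):
    """One steady-state generation: create a single child and let it
    replace the worst individual only if the child is fitter."""
    p1 = tournament(pop, fitness, rng)
    p2 = tournament(pop, fitness, rng)
    cut = rng.randrange(1, len(p1))          # single-point crossover
    child = p1[:cut] + p2[cut:]
    child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
    worst = min(range(len(pop)), key=lambda i: fitness(pop[i]))
    if fitness(child) > fitness(pop[worst]):  # survival-based replacement
        pop[worst] = child


# Hypothetical toy problem: maximise the number of 1 bits.
rng = random.Random(0)
population = [[rng.randint(0, 1) for _ in range(16)] for _ in range(8)]
initial_best = max(sum(ind) for ind in population)
for _ in range(200):
    steady_state_step(population, sum, rng)
```

Because only one individual changes per step, selection, crossover, and replacement can overlap in time, which is the property a pipelined hardware design would exploit.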

Combinations Method and Parallel Pipeline Multiple Recognizer Structure for Recognizing Unconstrained Handwritten Numerals (무제약 필기체 숫자를 인식하기 위한 병렬 파이프라인 다중 인식기의 구조와 결합 방법)

  • 최용호;이호현;조범준
    • Proceedings of the Korea Multimedia Society Conference / 2002.05c / pp.223-228 / 2002
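  • There are many ways to recognize numerals, but published studies report that using multiple recognizers outperforms a single recognizer, so research on multiple recognition is active. Methods that use multiple recognizers fall broadly into two types: serial combination and parallel combination. Serial combination arranges the recognizers like a pipeline and recognizes sequentially, whereas parallel combination arranges the recognizers in parallel and combines their results. In this paper, we propose the structure and combination method of a parallel pipeline multiple recognizer for recognizing unconstrained handwritten numerals. Experiments on the Chosun University handwritten numeral data showed a relatively higher recognition rate than existing methods.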

KAWS: Coordinate Kernel-Aware Warp Scheduling and Warp Sharing Mechanism for Advanced GPUs

  • Vo, Viet Tan;Kim, Cheol Hong
    • Journal of Information Processing Systems / v.17 no.6 / pp.1157-1169 / 2021
  • Modern graphics processing unit (GPU) architectures offer significant hardware resource enhancements for parallel computing. However, without software optimization, GPUs continuously exhibit hardware resource underutilization. In this paper, we indicate the need to switch between warp scheduler schemes during different kernel execution periods to improve resource utilization. Existing warp schedulers are not aware of the kernel progress and therefore cannot provide an effective scheduling policy. In addition, we identified the potential for improving resource utilization in multiple-warp-scheduler GPUs by sharing stalling warps with selected warp schedulers. To address the efficiency issue of present GPUs, we coordinated a kernel-aware warp scheduler and a warp sharing mechanism (KAWS). The proposed warp scheduler tracks the execution progress of the running kernel and adapts to a more effective scheduling policy when the kernel progress reaches a point of resource underutilization. Meanwhile, the warp-sharing mechanism distributes stalling warps to different warp schedulers whose execution pipeline unit is ready. On average, our design achieves 7.97% higher performance than the traditional warp scheduler with marginal additional hardware overhead.

Analysis of Stray Current Interference between Underground Pipelines and DC Electric Railways (매설배관과 직류전기철도의 표유전류 간섭분석)

  • Ha Y.C.;Bae J.H.;Ha T.H.;Lee H.G.;Kim D.E.
    • Journal of the Korean Institute of Gas / v.10 no.3 s.32 / pp.41-47 / 2006
  • When an underground pipeline runs parallel to DC electric railways, it suffers from electrolytic corrosion caused by the stray current leaked from the railway negative returns, i.e., the rails. Perforation due to electrolytic corrosion may bring about large-scale accidents even under cathodically protected conditions. Traditionally, drainage bonding methods have been widely used to mitigate stray current interference. In particular, the increased adoption of the forced drainage method for gas pipelines makes the interference much more complicated. In this paper, we analyze the electric interference between pipelines and railways based on the results of field investigations carried out in Seoul and Busan.

Design of Stereo Image Match Processor for Real Time Stereo Matching (실시간 스테레오 정합을 위한 스테레오 영상 정합 프로세서 설계)

  • Kim, Yeon-Jae;Sim, Deok-Seon
    • Journal of the Institute of Electronics Engineers of Korea SC / v.37 no.2 / pp.50-59 / 2000
  • Stereo vision is a technique for extracting depth information from stereo images, which are two images that view an object or a scene from different locations. The most important procedure in stereo vision, called stereo matching, is to find the same points in the two stereo images. It is difficult to match stereo images in real time because stereo matching requires heavy computation. In this paper, we design a digital VLSI chip that performs stereo matching in real time, which we call the stereo image match processor (SIMP). To implement real-time stereo matching, a sliding memory and a minimum selection tree are presented. SIMP is designed with a pipeline architecture and parallel processing. SIMP takes 64-gray-level $64 \times 64$ stereo images and yields an 8-level $64 \times 64$ disparity map through 3-bit disparity and 12-bit address outputs. SIMP can process stereo images at a speed of 240 frames/sec.
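The idea of computing a matching cost for each candidate disparity and then selecting the minimum can be sketched for a single scanline as follows. This is an illustration under stated assumptions rather than the SIMP design: the abstract does not name the cost metric, so a sum of absolute differences (SAD) over a small window is assumed, and the function name, window size, and test signal are hypothetical. The 8 disparity levels match the 3-bit disparity output mentioned in the abstract.

```python
def disparity_row(left, right, levels=8, half_window=1):
    """Return the best disparity (0..levels-1) for each pixel of one scanline.

    Cost is an assumed sum of absolute differences over a window of
    2*half_window+1 pixels; indices are clamped at the image border.
    """
    n = len(left)
    result = []
    for x in range(n):
        costs = []
        for d in range(levels):
            cost = 0
            for k in range(-half_window, half_window + 1):
                xl = min(max(x + k, 0), n - 1)
                xr = min(max(x + k - d, 0), n - 1)
                cost += abs(left[xl] - right[xr])
            costs.append(cost)
        # Software stand-in for a minimum selection tree: pick the
        # disparity with the lowest cost (first one wins on ties).
        result.append(min(range(levels), key=costs.__getitem__))
    return result


# Hypothetical scanline with 64 gray levels (values 0..63); the left
# view is the right view shifted by two pixels.
right_row = [0, 10, 50, 20, 60, 5, 40, 30, 63, 15, 45, 25]
left_row = [right_row[max(x - 2, 0)] for x in range(len(right_row))]
disparities = disparity_row(left_row, right_row)
```

In hardware, the per-disparity costs would be evaluated in parallel and reduced by a comparator tree; the nested loops here compute the same result sequentially.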

Acceleration of Feature-Based Image Morphing Using GPU (GPU를 이용한 특징 기반 영상모핑의 가속화)

  • Kim, Eun-Ji;Yoon, Seung-Hyun;Lee, Jieun
    • Journal of the Korea Computer Graphics Society / v.20 no.2 / pp.13-24 / 2014
  • In this study, a graphics-processing-unit (GPU)-based acceleration technique is proposed for feature-based image morphing. This technique uses the depth buffer of the graphics hardware to efficiently calculate the shortest distance between a pixel and the control lines. The pairs of control lines between the source image and the destination image are determined by user input, and the distance function of each control line is rendered using two rectangles and two cones. The distance between each pixel and its nearest control line is stored in the depth buffer through the graphics pipeline, and this is used to conduct the morphing operation efficiently. The per-pixel morphing operation is parallelized using the compute unified device architecture (CUDA) to reduce the morphing time. We demonstrate the efficiency of the proposed technique using several experimental results.

Interactive Collision Detection for Deformable Models using Streaming AABBs

  • Zhang, Xinyu;Kim, Young-J.
    • 한국HCI학회:학술대회논문집 / 2007.02c / pp.306-317 / 2007
  • We present an interactive and accurate collision detection algorithm for deformable, polygonal objects based on the streaming computational model. Our algorithm can detect all possible pairwise primitive-level intersections between two severely deforming models at highly interactive rates. In our streaming computational model, we consider a set of axis-aligned bounding boxes (AABBs) that bound each of the given deformable objects as an input stream and perform massively parallel, pairwise overlap tests on the incoming streams. As a result, we are able to prevent performance stalls in the streaming pipeline that can be caused by the expensive indexing mechanism required by bounding volume hierarchy-based streaming algorithms. At run-time, as the underlying models deform over time, we employ a novel streaming algorithm to update the geometric changes in the AABB streams. Moreover, in order to get only the computed result (i.e., collision results between AABBs) without reading back the entire output streams, we propose a streaming en/decoding strategy that can be performed in a hierarchical fashion. After determining the overlapping AABBs, we perform primitive-level (e.g., triangle) intersection checking on a serial computational model such as CPUs. We implemented the entire pipeline of our algorithm using off-the-shelf graphics processors (GPUs), such as the nVIDIA GeForce 7800 GTX, for streaming computations, and Intel Dual Core 3.4G processors for serial computations. We benchmarked our algorithm with different models of varying complexities, ranging from 15K up to 50K triangles, under various deformation motions, and obtained timings of 30 to 100 FPS depending on the complexity of the models and their relative configurations. Finally, we made comparisons with a well-known GPU-based collision detection algorithm, CULLIDE [4], and observed about a threefold performance improvement over the earlier approach. We also made comparisons with a SW-based AABB culling algorithm [2] and observed about a twofold improvement.
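The pairwise AABB overlap test at the core of this approach can be sketched as follows. This is a sequential illustration only; the box representation as `(min_xyz, max_xyz)` tuples, the function names, and the sample boxes are assumptions for the example, and the paper performs these tests as streaming computations on the GPU rather than in a Python loop.

```python
from itertools import product


def aabb_overlap(a, b):
    """Two axis-aligned boxes overlap iff their intervals overlap on every axis."""
    (a_min, a_max), (b_min, b_max) = a, b
    return all(a_min[i] <= b_max[i] and b_min[i] <= a_max[i] for i in range(3))


def overlapping_pairs(boxes_a, boxes_b):
    """Test every box of object A against every box of object B.

    Each test is independent of the others, which is what makes the
    all-pairs formulation suitable for massively parallel execution.
    """
    return [
        (i, j)
        for (i, a), (j, b) in product(enumerate(boxes_a), enumerate(boxes_b))
        if aabb_overlap(a, b)
    ]


# Hypothetical boxes: only the first box of each object intersects.
object_a = [((0, 0, 0), (1, 1, 1)), ((5, 5, 5), (6, 6, 6))]
object_b = [((0.5, 0.5, 0.5), (2, 2, 2)), ((10, 10, 10), (11, 11, 11))]
pairs = overlapping_pairs(object_a, object_b)
```

The resulting index pairs are the candidates that would then be passed on to primitive-level (for example, triangle-triangle) intersection checks.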

Implementation of a 3D Graphics Hardwired T&L Accelerator based on a SoC Platform for a Mobile System (SoC 플랫폼 기반 모바일용 3차원 그래픽 Hardwired T&L Accelerator 구현)

  • Lee, Kwang-Yeob;Koo, Yong-Seo
    • Journal of the Institute of Electronics Engineers of Korea SD / v.44 no.9 / pp.59-70 / 2007
  • In this paper, we propose an effective T&L (Transform & Lighting) processor architecture for a real-time 3D graphics acceleration SoC (System on a Chip) in a mobile system. We designed floating-point arithmetic IPs for the T&L processor and verified them using a SoC platform. The designed T&L processor uses a 24-bit floating-point data format and a 16-bit fixed-point data format, and its pipeline keeps the Transform and Lighting processes balanced by exploiting the parallel computation of 3D graphics. The pipeline delay when processing only the Transform operation is almost the same as the delay when processing both the Transform and Lighting operations. The designed T&L processor was implemented and verified using the SoC platform. It operates at 80 MHz in a Xilinx Virtex-4 FPGA, and its processing speed was measured at 20M vertices/sec.

Hardware Design of High Performance In-loop Filter in HEVC Encoder for Ultra HD Video Processing in Real Time (UHD 영상의 실시간 처리를 위한 고성능 HEVC In-loop Filter 부호화기 하드웨어 설계)

  • Im, Jun-seong;Dennis, Gookyi;Ryoo, Kwang-ki
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.10a / pp.401-404 / 2015
  • This paper proposes a high-performance in-loop filter for an HEVC (High Efficiency Video Coding) encoder that processes Ultra HD video in real time. HEVC uses an in-loop filter consisting of a deblocking filter and SAO (Sample Adaptive Offset) to address the quantization error that causes image degradation. In the proposed in-loop filter encoder hardware architecture, the deblocking filter and SAO have a 2-level hybrid pipeline structure based on the $32 \times 32$ CTU to reduce the execution time. The deblocking filter uses a 6-stage pipeline structure, and the proposed efficient filtering order minimizes memory access and simplifies the reference memory structure. The SAO is implemented as a 2-stage pipeline for pixel classification and application of SAO parameters, and it uses two three-layered parallel buffers to simplify pixel processing and reduce operation cycles. The proposed in-loop filter encoder architecture is designed in Verilog HDL and implemented with 205K logic gates in a TSMC 0.13 µm process. At 110 MHz, the proposed in-loop filter encoder can support 4K Ultra HD video encoding at 30 fps in real time.

Development of TDR-based Water Leak Detection Sensor for Seawater Pipeline of Ship (시간영역반사계를 이용한 해수배관시스템의 누수 탐지용 센서 개발 연구)

  • Hwang, Hyun-Kyu;Shin, Dong-Ho;Kim, Heon-Hui;Lee, Jung-Hyung
    • Journal of the Korean Society of Marine Environment & Safety / v.28 no.6 / pp.1044-1053 / 2022
  • Time domain reflectometry (TDR) is a diagnostic technique for evaluating the physical integrity of cables and is applied to leak detection and localization in piping systems. In this study, a cable-shaped leak detection sensor based on the TDR technique was proposed for monitoring leakage in the seawater piping system of a ship's engine room. The cable sensor was developed using a twisted pair arrangement wound with an absorbent material. The availability and performance of the sensor for leak detection and localization were evaluated on a lab-scale pipeline setup. The developed sensor was installed on the pipes and flanges of the lab-scale setup, and various TDR waveforms were acquired and analyzed according to different variables, including the number of twists and the sheath thickness. The results indicated that the twisted cable sensor produced a clear and smooth signal compared to the TDR sensor with a parallel arrangement. The optimal number of twists was determined to be above 10 per unit length. The optimal sheath thickness that results in the desired sensitivity was determined to range from 80% to 120% of the diameter of the conductor. A linear regression analysis was carried out to estimate the location of the leakage, and the result was a determination coefficient of 0.9998, indicating a positive relationship with the actual leakage point. The proposed TDR-based leak detection method appears to be an effective method for monitoring leakage in a ship's seawater piping system.
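The linear regression and determination coefficient mentioned in this abstract can be computed as in the sketch below. It assumes an ordinary least-squares fit of actual leak position against TDR-estimated position; the abstract does not give the regression variables or the measurements, so the function name and the sample values are purely hypothetical placeholders and do not reproduce the reported 0.9998.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, plus R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot  # coefficient of determination
    return slope, intercept, r_squared


# Hypothetical, perfectly linear placeholder data (not from the paper):
# x = position estimated from the TDR waveform, y = actual leak position.
estimated = [1.0, 2.0, 3.0, 4.0]
actual = [2.0, 4.0, 6.0, 8.0]
slope, intercept, r_squared = fit_line(estimated, actual)
```

A determination coefficient close to 1 means the fitted line explains nearly all of the variance in the actual leak positions.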