• Title/Summary/Keyword: parallel search (병렬탐색)

A 4-way Pipelined Processing Architecture for Three-Step Search Block Matching Algorithm (3 단계 블록 매칭 알고리즘을 위한 4-경로 파이프라인 처리)

  • Jung, Sung-Tae;Lee, Sang-Seol;Nam, Kung-Moon
    • Journal of Korea Multimedia Society / v.7 no.8 / pp.1170-1182 / 2004
  • A novel 4-way pipelined processing architecture is presented for three-step search block-matching motion estimation. To enable 4-way pipelined processing, we developed a method that divides the current block and the search area into 4 subregions each and processes them concurrently. We also developed a memory partitioning method that allows pixel data from the 4 subregions to be accessed concurrently without memory conflicts. The architecture was designed and simulated in C and VHDL. Experimental results show that the proposed architecture achieves high performance for real-time motion estimation.
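
For orientation, the three-step search that the abstract builds on can be sketched in a few lines of Python. This is a plain sequential version of the textbook algorithm, not the paper's 4-way pipelined architecture; the function names, the list-of-lists frame layout, and the default block size and step are assumptions made for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in
    `cur` and the block displaced by (dx, dy) in `ref`."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def three_step_search(cur, ref, bx, by, n=4, step=4):
    """Return ((dx, dy), cost) for the block at (bx, by).

    Each round tests the 3 x 3 grid of candidates spaced `step` apart around
    the current best vector, then halves the step (4, 2, 1 by default).
    """
    height, width = len(ref), len(ref[0])
    best_dx, best_dy = 0, 0
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    while step >= 1:
        cx, cy = best_dx, best_dy  # centre of this round's search grid
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                mx, my = cx + dx, cy + dy
                # Skip candidates whose block would leave the reference frame.
                if bx + mx < 0 or by + my < 0:
                    continue
                if bx + mx + n > width or by + my + n > height:
                    continue
                cost = sad(cur, ref, bx, by, mx, my, n)
                if cost < best_cost:
                    best_cost, best_dx, best_dy = cost, mx, my
        step //= 2
    return (best_dx, best_dy), best_cost
```

The double loop inside `sad` is where a split into four subregions would apply: the partial sums over four sub-blocks are independent of one another, so they could in principle be accumulated concurrently and added at the end.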


Fixed-complexity Sphere Encoder for Multi-user MIMO Systems (다중 사용자 MIMO 시스템을 위한 고정 복잡도를 갖는 스피어 인코더)

  • Mohaisen, Manar;Han, Dong-Keol;Chang, Kyung-Hi
    • The Journal of Korean Institute of Communications and Information Sciences / v.35 no.7A / pp.632-638 / 2010
  • In this paper, we propose a fixed-complexity sphere encoder (FSE) for multi-user MIMO (MU-MIMO) systems. The proposed FSE achieves a scalable tradeoff between performance and complexity. Because it has a parallel tree-search structure, the proposed encoder can also be easily pipelined, leading to a large reduction in precoding latency. The complexity of the proposed encoder is analyzed, and we propose two techniques that reduce it. Simulation and analytical results demonstrate that in a $4\times4$ MU-MIMO system, the complexity of the proposed FSE is 16% that of the conventional QRD-M encoder (QRDM-E). The encoding throughput of the proposed encoder is 7.5 times that of the QRDM-E, with tolerable degradation in BER performance, while achieving the optimum diversity order.

An Efficient Algorithm for finding Optimal Spans to determine R=1/2 Rate Systematic Convolutional Self-Doubly Orthogonal Codes (R=1/2 Self-Doubly 조직 직교 길쌈부호를 찾는 효율적인 최적 스팬 알고리듬)

  • Doniyor, Atabaev;Suh, Hee-Jong
    • The Journal of the Korea institute of electronic communication sciences / v.10 no.11 / pp.1239-1244 / 2015
  • In this paper, a new method for finding optimal, short spans in Convolutional Self-Doubly Orthogonal (CDO) codes is proposed. The new algorithm is based on a parallel implicitly-exhaustive search, to which we apply dynamic search-space reduction methods in order to reduce the computation time needed to find the optimal span for R=1/2 rate CDO codes. The simulation results show that the new algorithm achieves better speedup and error-correction performance.

A Study on Optimal Scheduling with Directed Acyclic Graphs Task onto Multiprocessors (다중프로세서에서 비순환 타스크 그래프의 최적 스케쥴링에 관한 연구)

  • 조민환
    • Journal of the Korea Society of Computer and Information / v.4 no.4 / pp.40-46 / 1999
  • Task scheduling affects system execution time when a precedence-constrained task graph is mapped onto a multiprocessor system. This problem is known to be NP-hard, and many researchers have worked to obtain near-optimal schedules. We compared a modified critical path schedule with several other methods (CP, MH, DL Swapping). To test them, we randomly generated directed acyclic task graphs with many root nodes and terminal nodes. The simulation results indicate that the modified critical path algorithm is superior to the other scheduling algorithms.
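
As a point of reference, the baseline critical-path (CP) heuristic mentioned in the abstract can be sketched as a list scheduler. This is the plain textbook heuristic, not the paper's modified algorithm, and it assumes zero communication cost between processors; the function name, the dictionary-based graph encoding, and the tie-breaking rule are assumptions made for illustration.

```python
import heapq


def critical_path_schedule(duration, succ, num_procs):
    """List-schedule a task DAG on `num_procs` identical processors.

    duration: {task: execution time}
    succ:     {task: [successor tasks]}
    Returns ({task: (processor, start, finish)}, makespan).
    """
    # Priority = bottom level: longest path from the task to any terminal
    # node, counting the task's own execution time.
    level = {}

    def bottom_level(task):
        if task not in level:
            tails = (bottom_level(s) for s in succ.get(task, ()))
            level[task] = duration[task] + max(tails, default=0)
        return level[task]

    for task in duration:
        bottom_level(task)

    preds_left = {task: 0 for task in duration}
    for successors in succ.values():
        for s in successors:
            preds_left[s] += 1

    data_ready = {task: 0 for task in duration}  # latest predecessor finish
    ready = [(-level[t], t) for t in duration if preds_left[t] == 0]
    heapq.heapify(ready)
    proc_free = [0] * num_procs
    schedule = {}

    while ready:
        _, task = heapq.heappop(ready)  # highest bottom level first
        proc = min(range(num_procs), key=lambda i: proc_free[i])
        start = max(proc_free[proc], data_ready[task])
        finish = start + duration[task]
        schedule[task] = (proc, start, finish)
        proc_free[proc] = finish
        for s in succ.get(task, ()):
            data_ready[s] = max(data_ready[s], finish)
            preds_left[s] -= 1
            if preds_left[s] == 0:
                heapq.heappush(ready, (-level[s], s))

    makespan = max((f for _, _, f in schedule.values()), default=0)
    return schedule, makespan
```

A small usage example: with `duration = {"a": 2, "b": 3, "c": 1, "d": 2}` and `succ = {"a": ["c"], "b": ["d"], "c": ["d"]}`, calling `critical_path_schedule(duration, succ, 2)` places `a` and `b` on separate processors first, because both lie on a longest path to the terminal node `d`.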


Zero-tree packetization without additional memory using DFS (DFS를 이용한 추가 메모리를 요구하지 않는 제로트리 압축기법)

  • Kim, Chung-Kil;Lee, Joo-Kyong;Chung, Ki-Dong
    • The KIPS Transactions:PartB / v.10B no.5 / pp.575-578 / 2003
  • The SPIHT algorithm is a fast and effective wavelet-based technique for image compression. It uses a list structure to store the status information generated during set partitioning of the zero-tree. This usually requires a large amount of additional memory, depending on how high the bit rate is. In this paper, we therefore propose a new technique called MZP-DFS, which needs no additional memory when running the SPIHT algorithm. It traverses a spatial tree in DFS order and eliminates the additional memory by using test functions for encoding and the LSBs of the coefficients for decoding. This method yields nearly the same performance as SPIHT. It may be desirable for hardware implementation because no additional memory is required. Moreover, it exploits parallelism by processing each spatial tree independently, so it is well suited to real-time image compression.

Training Artificial Neural Networks and Convolutional Neural Networks using WFSO Algorithm (WFSO 알고리즘을 이용한 인공 신경망과 합성곱 신경망의 학습)

  • Jang, Hyun-Woo;Jung, Sung Hoon
    • Journal of Digital Contents Society / v.18 no.5 / pp.969-976 / 2017
  • This paper proposes a learning method for artificial neural networks and convolutional neural networks that uses the WFSO algorithm, which was developed as an optimization algorithm. Because the optimization algorithm searches using a number of candidate solutions, it is generally slow, but it rarely falls into a local optimum and is easy to parallelize. In addition, artificial neural networks with non-differentiable activation functions can be trained, and the structure and weights can be optimized at the same time. In this paper, we describe how to apply the WFSO algorithm to artificial neural network learning and compare its performance with that of the error back-propagation algorithm on multilayer artificial neural networks and convolutional neural networks.

Acceleration Method of Inter Prediction using Advanced SIMD (Advanced SIMD를 이용한 화면 간 예측 고속화방법)

  • Kim, Wan-Su;Lee, Jae-Heung
    • Journal of IKEEE / v.16 no.4 / pp.382-388 / 2012
  • A fast motion estimation method for H.264/AVC is presented in this paper. NEON, a parallel processing technology based on Advanced SIMD, is supported on the ARM Cortex-A9 dual-core platform. NEON is applied to full search, one of the various motion estimation methods, and the SAD operation count for each macroblock is reduced to 1/4. Pixel values of the corresponding macroblock are assigned to eight 16-bit NEON registers, and NEON intrinsic functions carry out 128-bit arithmetic operations at the same time. In this way, the exact motion vector with the minimum SAD value among the calculated SAD values can be determined. Experimental results show that performance improves by more than 30% on average, depending on the size of the image and the macroblock.

Sweet Spot Search of Antenna Beam using The Two ADALINE (두개의 ADALINE을 이용한 안테나 빔의 스위트 스폿 탐색)

  • Lee, Chang-Young;Choi, Kyu-Min;Kang, Seong-Ho;Chung, Sung-Boo;Eom, Ki-Hwan
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.1 / pp.705-708 / 2005
  • In this paper, we propose a method that searches for the sweet spot of an antenna beam and holds it for high-speed millimeter-wave transmission on a point-to-point link. We use TDD (Time Division Duplex) as the transfer method, which carries the antenna control data. The proposed method is composed of two ADALINEs used in parallel. The efficiency of the proposed method is verified by means of simulations on a point-to-point link with and without white Gaussian noise.


Improved Dispatching Algorithm for Satisfying both Quality and Due Date (품질과 납기를 동시에 만족하는 작업투입 개선에 관한 연구)

  • Yoon, Ji-Myoung;Ko, Hyo-Heon;Baek, Jong-Kwan;Kim, Sung-Shick
    • Journal of the Korea Academia-Industrial cooperation Society / v.9 no.6 / pp.1838-1855 / 2008
  • The manufacturing industry seeks improvements in the efficiency of the manufacturing process. This paper presents a method for effective real-time dispatching on parallel machines with multiple products that minimizes mean tardiness and maximizes product quality. The paper shows that using the Rolling Horizon Tabu search method in the real-time dispatching process can reduce mean tardiness to the minimum. The effectiveness of the presented method was examined in simulation and compared with other dispatching methods. Using this method, manufacturing companies can increase profits and improve customer satisfaction.

Evaluation of Alignment Methods for Genomic Analysis in HPC Environment (HPC 환경의 대용량 유전체 분석을 위한 염기서열정렬 성능평가)

  • Lim, Myungeun;Jung, Ho-Youl;Kim, Minho;Choi, Jae-Hun;Park, Soojun;Choi, Wan;Lee, Kyu-Chul
    • KIPS Transactions on Software and Data Engineering / v.2 no.2 / pp.107-112 / 2013
  • With the progress of NGS technologies, the volume of genome data has exploded recently. Analyzing such data effectively requires HPC techniques. In this paper, we organized a genome analysis pipeline to call SNPs from NGS data. To organize the pipeline efficiently in an HPC environment, we analyzed the CPU utilization pattern of each pipeline step. We found that sequence alignment is compute-intensive and suitable for parallelization. We also analyzed the performance of parallel open-source alignment tools and found that an alignment method utilizing a many-core processor can improve the performance of the genome analysis pipeline.