• Title/Summary/Keyword: Parallel Processing Structure

Parallel Approximate String Matching with k-Mismatches for Multiple Fixed-Length Patterns in DNA Sequences on Graphics Processing Units (GPU을 이용한 다중 고정 길이 패턴을 갖는 DNA 시퀀스에 대한 k-Mismatches에 의한 근사적 병열 스트링 매칭)

  • Ho, ThienLuan;Kim, HyunJin;Oh, SeungRohk
    • The Transactions of The Korean Institute of Electrical Engineers / v.66 no.6 / pp.955-961 / 2017
  • In this paper, we propose a parallel approximate string matching algorithm with k-mismatches for multiple fixed-length patterns (PMASM) in DNA sequences. PMASM is developed from parallel single-pattern approximate string matching algorithms to efficiently calculate the Hamming distances for multiple patterns of a fixed length. In the preprocessing phase of PMASM, all target patterns are binary encoded and stored in a look-up memory. With each input character from the input string, the Hamming distances between a substring and all patterns are updated at the same time based on the binary encoding information in the look-up memory. Moreover, PMASM uses graphics processing units (GPUs) to process the data computations in parallel. This paper presents three PMASM implementation methods on GPUs: thread PMASM, block-thread PMASM, and shared-mem PMASM. The shared-mem PMASM method shows how to make effective use of the GPU's parallel capacity, and it also exploits special features of the CUDA (Compute Unified Device Architecture) memory structure to optimize performance. In the experiments with DNA sequences, the proposed PMASM on GPU is 385, 77, and 64 times faster than the traditional naive algorithm, the shift-add algorithm, and the single-thread PMASM implementation on CPU, respectively. On the same NVIDIA GPU model, the proposed approach improves performance by up to 44% and 21% compared with the naive and shift-add algorithms, respectively.
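
The look-up idea described in this abstract can be illustrated with a small sequential sketch. It is not the PMASM implementation: it uses plain Python lists rather than the paper's binary encoding or GPU kernels, and the names `build_mismatch_table` and `k_mismatch_search` are hypothetical. It assumes the text and patterns use only the characters A, C, G, and T.

```python
from typing import Dict, List, Tuple


def build_mismatch_table(patterns: List[str],
                         alphabet: str = "ACGT") -> Dict[str, List[List[int]]]:
    """Preprocessing: table[c][j][p] is 1 when pattern p differs from c at position j."""
    m = len(patterns[0])
    if any(len(p) != m for p in patterns):
        raise ValueError("all patterns must have the same length")
    return {
        c: [[int(p[j] != c) for p in patterns] for j in range(m)]
        for c in alphabet
    }


def k_mismatch_search(text: str, patterns: List[str],
                      k: int) -> List[Tuple[int, int, int]]:
    """Return (start, pattern_index, distance) for every window within k mismatches."""
    m = len(patterns[0])
    table = build_mismatch_table(patterns)
    hits: List[Tuple[int, int, int]] = []
    for start in range(len(text) - m + 1):
        # One distance counter per pattern; each input character updates all of them.
        dist = [0] * len(patterns)
        for j in range(m):
            row = table[text[start + j]][j]
            dist = [d + r for d, r in zip(dist, row)]
        hits.extend((start, p, d) for p, d in enumerate(dist) if d <= k)
    return hits


if __name__ == "__main__":
    print(k_mismatch_search("ACGTACGA", ["ACGT", "ACGA"], 1))
```

Each text window is independent of the others, which is the property a GPU version could exploit by assigning windows to threads.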

Parallel Fuzzy Information Processing System - KAFA : KAist Fuzzy Accelerator -

  • Kim, Young-Dal;Lee, Hyung-Kwang;Park, Kyu-Ho
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 1993.06a / pp.981-984 / 1993
  • During the past decade, several dedicated hardware designs for fast fuzzy inference have been developed. Most of them are tied to a specific inference method and thus cannot support other inference methods. In this paper, we present a hardware architecture called KAFA (KAist Fuzzy Accelerator), which provides various fuzzy inference methods and fuzzy set operators. The architecture has a SIMD structure that consists of two parts: a system control/interface unit (Main Controller) and arithmetic units (FPEs). By using parallel processing technology, KAFA achieves high performance for fuzzy information processing. The speed of KAFA holds promise for the development of new fuzzy application systems.

The Implementation of Processor for Linearly shift Knapsack Public Key Crypto System (선형이동 Knapsack 공개키 암호시스템을 위한 프로세서 구현)

  • 백인천 (In Cheon Paik);차균현
    • The Journal of Korean Institute of Communications and Information Sciences / v.19 no.11 / pp.2291-2302 / 1994
  • This paper presents the design and implementation of a special-purpose processor for the linearly shift knapsack public key cryptography system. We heighten the density of the existing knapsack vector and shift the vectors linearly in order to implement the linearly shift knapsack system, which yields a stronger cryptosystem. Because this system requires parallel processing on each path, we propose a pipelined parallel structure and implement the system in VLSI. We also evaluate the system and compare it with other systems. The processing speed of the system is 550kb/s when the dimension is 100. By enlarging its structure, the system can be used where high-speed security is required.

A Parallel Algorithm for Merging Relaxed Min-Max Heaps (Relaxed min-max 힙을 병합하는 병렬 알고리즘)

  • Min, Yong-Sik
    • The Transactions of the Korea Information Processing Society / v.5 no.5 / pp.1162-1171 / 1998
  • This paper presents a data structure that implements a mergeable double-ended priority queue, namely an improved relaxed min-max-pair heap. Using this new data structure, we suggest a parallel algorithm to merge priority queues organized in two relaxed heaps of different sizes, n and k, respectively. The new data structure eliminates the blossomed tree and the lazying method used to merge the relaxed min-max heaps in [9]. As a result, employing max($2^{i-1}$,[(m+1/4)]) processors, this algorithm requires $O(\log(\log(n/k)) \times \log(n))$ time. Also, on the MasPar machine, this method achieves a 35.205-fold speedup with 64 processors when merging 8 million data items that consist of two relaxed min-max heaps of different sizes.

An Efficient Dynamic Load balancing Strategy for Tree-structured Computations (트리구조의 계산을 위한 효율적인 동적 부하분산 전략)

  • Hwang, In-Jae;Hong, Dong-Kweon
    • The KIPS Transactions:PartA / v.8A no.4 / pp.455-460 / 2001
  • For some applications, the computational structure changes dynamically during program execution. When this happens, static partitioning and allocation of tasks are not enough to achieve high performance on parallel computers. In this paper, we propose a dynamic load balancing algorithm that efficiently distributes computations with a dynamically growing tree structure to processors. We present an implementation technique for the algorithm on mesh architectures and analyze its complexity. We also demonstrate through experiments that our algorithm provides good-quality solutions.

Parallel Processing Based Decomposition Technique for Efficient Collaborative Optimization (효율적 분산협동최적설계를 위한 병렬처리 기반 분해 기법)

  • Park, Hyeong-Uk;Kim, Seong-Chan;Kim, Min-Su;Choe, Dong-Hun
    • Transactions of the Korean Society of Mechanical Engineers A / v.25 no.5 / pp.883-890 / 2001
  • In practical design studies, most designers solve large multidisciplinary problems with complex design systems. These multidisciplinary problems involve hundreds of analyses and thousands of variables. The order in which the processes are solved affects the speed of the total design cycle. Thus it is very important for the designer to reorder the original design processes to minimize the total computational cost. This is accomplished by decomposing a large multidisciplinary problem into several multidisciplinary analysis subsystems (MDASS) and processing them in parallel. This paper proposes a new strategy for parallel decomposition of multidisciplinary problems that uses a genetic algorithm to raise design efficiency, and shows the relationship between decomposition and multidisciplinary design optimization (MDO) methodology.

High Speed Turbo Product Code Decoding Algorithm (고속 Turbo Product 부호 복호 알고리즘 및 구현에 관한 연구)

  • Choi Duk-Gun;Lee In-Ki;Jung Ji-Won
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.6C / pp.442-449 / 2005
  • In this paper, we introduce three simplified high-speed decoding algorithms for a turbo product decoder. First, a parallel decoder structure is proposed in which the row and column decoders operate in parallel. Second, the HAD (Hard Decision Aided) algorithm is used as an early-stopping algorithm. Lastly, the P-Parallel TPC decoder is a parallel decoding scheme that processes P rows and P columns in parallel instead of decoding them one by one as in the original scheme.

Parallel Decoder Module for Digital-Information Translation of Optical Disc (광디스크 디지털 정보 전송을 위한 병렬구조 디코더 모듈)

  • Kim, Jong-Man;Kim, Yeong-Min;Shin, Dong-Yong;Seo, Bum-Su
    • Proceedings of the Korean Institute of Electrical and Electronic Material Engineers Conference / 2010.06a / pp.289-289 / 2010
  • A digital decoder that uses analog parallel processing circuit technology is designed, and its transmission characteristics are examined. The fast parallel Viterbi decoder system, which replaces the conventional digital Viterbi decoder, shows good propagation characteristics. We apply the proposed analog Viterbi decoder to decoding PR signals for DVD and analyze the circuit and signal characteristics.

A NEW PARALLEL ALGORITHM FOR ROOTING A TREE

  • Kim, Tae-Nam;Oh, Duk-Hwan;Lim, Eun-Ki
    • Journal of applied mathematics & informatics / v.5 no.2 / pp.427-432 / 1998
  • Given an undirected tree T and a vertex ${\gamma}$ in the tree, we consider the problem of transforming T into a rooted tree with ${\gamma}$ as its root. An optimal algorithm using the Euler tour and prefix sums has been developed [2,3]. We present another parallel algorithm that is also optimal on the EREW PRAM. Our approach reduces the given tree step by step by pruning and pointer jumping. That is, the tree structure is retained during processing, so that other tree computations can be carried out in parallel.
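
Pointer jumping, one of the primitives named in this abstract, can be sketched as follows. This is only an illustration of the primitive on an already rooted forest, simulated sequentially. It is not the paper's rooting algorithm, and the function name `pointer_jump_roots` is hypothetical. Each pass of the `while` loop stands for one synchronous parallel round in which every vertex replaces its pointer with its pointer's pointer.

```python
from typing import List


def pointer_jump_roots(parent: List[int]) -> List[int]:
    """Return the root of every vertex; parent[v] == v marks a root."""
    ptr = list(parent)
    while True:
        # One simulated parallel round: all vertices read the old pointers,
        # then all pointers are replaced at once.
        nxt = [ptr[ptr[v]] for v in range(len(ptr))]
        if nxt == ptr:
            return ptr
        ptr = nxt


if __name__ == "__main__":
    # A path 4 -> 3 -> 2 -> 1 -> 0, where vertex 0 is the root.
    print(pointer_jump_roots([0, 0, 1, 2, 3]))
```

Because every round halves the remaining distance to the root, the number of rounds grows logarithmically with the depth of the deepest vertex.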

A Design and Implementation of Parallelizing Compiler in Loop Structure (루프구조의 병렬화 컴파일러 설계 및 구현)

  • 송월봉
    • Journal of the Korea Computer Industry Society / v.3 no.8 / pp.981-988 / 2002
  • In this paper, a simple parallelizing compiler for sequential loops is presented. It provides a procedure for automatically converting a sequential loop into nested parallel DOALL loops at compile time. For this purpose, the source program of the Parafrase II parallel compiler is analyzed, and a new general method for extracting parallelism is implemented so that nested loops can be processed in parallel effectively.
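
The DOALL idea can be illustrated by hand: a loop whose iterations do not depend on one another can have all of its iterations run concurrently. The sketch below is a run-time illustration in Python, not the compile-time transformation the paper describes. The names `body`, `sequential_loop`, and `doall_loop` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def body(i: int) -> int:
    """Loop body that depends only on its own index, so iterations are independent."""
    return i * i + 1


def sequential_loop(n: int) -> List[int]:
    """The original sequential form of the loop."""
    out = [0] * n
    for i in range(n):
        out[i] = body(i)
    return out


def doall_loop(n: int, workers: int = 4) -> List[int]:
    """DOALL form: every iteration is submitted as an independent task."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in index order, matching the sequential loop.
        return list(pool.map(body, range(n)))


if __name__ == "__main__":
    print(doall_loop(8) == sequential_loop(8))
```

The threads here show only that the iterations can be reordered safely. Any actual speedup depends on the loop body and the runtime.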
