• Title/Summary/Keyword: Parallel Processing Algorithm


An Efficient Face Detection Method using Skin Color Information and Parallel Processing in Multi-Core SoC (멀티코어 SoC에서 피부색상 정보와 병렬처리를 이용한 효율적인 얼굴 검출 방법)

  • Kim, Hong-Hee;Lee, Jae-Heung
    • Journal of IKEEE / v.16 no.4 / pp.375-381 / 2012
  • In this paper, we present an implementation of the Viola-Jones algorithm on a multi-core SoC that uses skin color information and parallel processing. To reduce unnecessary operations and improve detection speed, we adopted a face detection algorithm that uses skin color to remove the background image. The algorithm is functionally divided into several parts, taking their size and dependencies into account, so that the divided functions can be processed in parallel. Experimental results on an SoC with a built-in multi-core Cortex-A9 show that it is about 1.8 times faster than the existing, undivided algorithm.
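The skin-color pre-filter described in this abstract can be illustrated with a minimal sketch. The abstract does not say which color space or thresholds the authors used; the YCbCr conversion (BT.601, full range) and the Cb/Cr ranges below are commonly cited defaults chosen here as assumptions, and the function names are hypothetical. A detector such as Viola-Jones would then scan only the returned box instead of the whole frame.

```python
def rgb_to_cbcr(r, g, b):
    """Convert 8-bit RGB to Cb and Cr chroma (BT.601, full range)."""
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr


def is_skin(pixel, cb_range=(77, 127), cr_range=(133, 173)):
    """Assumed rule: a pixel is skin if both chroma values fall in range."""
    cb, cr = rgb_to_cbcr(*pixel)
    return cb_range[0] <= cb <= cb_range[1] and cr_range[0] <= cr <= cr_range[1]


def skin_mask(image):
    """image is a list of rows, each row a list of (r, g, b) tuples."""
    return [[is_skin(p) for p in row] for row in image]


def candidate_box(mask):
    """Bounding box (top, left, bottom, right) of skin pixels, or None."""
    coords = [(y, x) for y, row in enumerate(mask) for x, hit in enumerate(row) if hit]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return min(ys), min(xs), max(ys), max(xs)


SKIN = (220, 170, 140)   # a light skin tone
BLUE = (0, 0, 255)       # background
image = [
    [BLUE, BLUE, BLUE],
    [BLUE, SKIN, SKIN],
    [BLUE, BLUE, BLUE],
]
box = candidate_box(skin_mask(image))
```

Discarding non-skin regions first shrinks the area the detector has to scan, which is the source of the "reduced unnecessary operations" the abstract mentions.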

DMRUT-MCDS: Discovery Relationships in the Cyber-Physical Integrated Network

  • Lu, Hongliang;Cao, Jiannong;Zhu, Weiping;Jiao, Xianlong;Lv, Shaohe;Wang, Xiaodong
    • Journal of Communications and Networks / v.17 no.6 / pp.558-567 / 2015
  • In recent years, we have seen a proliferation of mobile-network-enabled smart objects, such as smart-phones and smart-watches, that form a cyber-physical integrated network to connect the cyber and physical worlds through the capabilities of sensing, communicating, and computing. Discovery of the relationship between smart objects is a critical and nontrivial task in cyber-physical integrated network applications. Aiming to find the most stable relationship in the heterogeneous and dynamic cyber-physical network, we propose a distributed and efficient relationship-discovery algorithm, called dynamically maximizing remaining unchanged time with minimum connected dominant set (DMRUT-MCDS) for constructing a backbone with the smallest scale infrastructure. In our proposed algorithm, the impact of the duration of the relationship is considered in order to balance the size and sustain time of the infrastructure. The performance of our algorithm is studied through extensive simulations and the results show that DMRUT-MCDS performs well in different distribution networks.

The Mapping Method for Parallel Processing of SAR Data

  • In-Pyo Hong;Jae-Woo Joo;Han-Kyu Park
    • The Journal of Korean Institute of Communications and Information Sciences / v.26 no.11A / pp.1963-1970 / 2001
  • Analyzing the processing method and setting out the top-level HW configuration from the main parameters is an essential design step before implementing a SAR processor. This paper identifies the impact of the I/O and algorithm structure on parallel processing and suggests a practical mapping method for parallel processing of SAR data. A simulation of the E-SAR processor is also performed to examine the usefulness of the method, and the results are analyzed and discussed.


An Optimal Parallel Sort Algorithm for Minimum Data Movement (최소 자료 이동을 위한 최적 병렬 정렬 알고리즘)

  • Hong, Seong-Su;Sim, Jae-Hong
    • The Transactions of the Korea Information Processing Society / v.1 no.3 / pp.290-298 / 1994
  • In this paper we propose a parallel sorting algorithm with $O(n^{n} \log n)$ time complexity, $O(n^{x} \log n)$ cost (parallel running time * number of processors), and $O(n^{1-x} + n^{x})$ data movement complexity under the EREW-PRAM model. The methods for solving these problems are similar. The parallel algorithm finds pivots for partitioning the data into ordered subsets of approximately equal size by using encoding pointers.
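The pivot-based partitioning idea can be sketched as a generic sample sort. This is not the paper's PRAM algorithm and does not reproduce its "encoding pointers"; the sampling stride, the `parts` parameter, and the function name are assumptions made for illustration. Pivots taken from a sorted sample split the input into ordered buckets of roughly equal size, each bucket is sorted independently, and the results are concatenated without any merge step.

```python
from bisect import bisect_right
from concurrent.futures import ThreadPoolExecutor


def parallel_sample_sort(data, parts=4):
    """Sort by pivot partitioning; buckets are sorted by separate workers."""
    if len(data) < 2 or parts < 2:
        return sorted(data)

    # Take a regular sample and pick parts-1 pivots from its quantiles.
    stride = max(1, len(data) // (parts * 8))
    sample = sorted(data[::stride])
    pivots = [sample[(i * len(sample)) // parts] for i in range(1, parts)]

    # Bucket i receives the values v with pivots[i-1] <= v < pivots[i],
    # so every value in bucket i is <= every value in bucket i+1.
    buckets = [[] for _ in range(parts)]
    for value in data:
        buckets[bisect_right(pivots, value)].append(value)

    # Each bucket is independent, so the sorts can run concurrently.
    with ThreadPoolExecutor(max_workers=parts) as pool:
        sorted_buckets = list(pool.map(sorted, buckets))

    # Buckets are already in order relative to each other: just concatenate.
    return [value for bucket in sorted_buckets for value in bucket]


values = [(i * 7919) % 1000 for i in range(500)]
result = parallel_sample_sort(values, parts=4)
```

Because each element is moved once into its bucket and once into the output, partitioning by pivots keeps data movement low. In CPython the thread pool only illustrates the structure; true speedup would need processes or a runtime without a global interpreter lock.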


Proposal for Decoding-Compatible Parallel Deflate Algorithm by Inserting Control Header Composed of Non-Compressed Blocks (비 압축 블록으로 구성된 제어 헤더 삽입을 통한 압축 해제 호환성 있는 병렬 처리 Deflate 알고리즘 제안)

  • Kim Jung Hoon
    • KIPS Transactions on Software and Data Engineering / v.12 no.5 / pp.207-216 / 2023
  • To obtain a decoding-compatible parallel Deflate algorithm, this study proposes a control header in which the information essential for parallel compression and decompression is stored in the Disposed Bit Area (DBA) of non-compressed blocks, and which is inserted among the compressed blocks. This makes parallel compression and decompression possible while maintaining full compatibility with existing decoders. With this method, compression time was reduced by up to 71.2% compared with sequential processing, and parallel decompression time was reduced by up to 65.7%. Parallel decompression is generally regarded as impossible because of the structural limitations of the Deflate algorithm. However, a decoder equipped with the proposed method can perform high-speed parallel decompression at the algorithm level, and compatibility is preserved, so data compressed in parallel can still be decoded normally by existing decoder programs.
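Why such a header can stay invisible to existing decoders can be shown with a small sketch. In the Deflate format (RFC 1951), a non-compressed (stored) block has a 3-bit header, and the remaining bits up to the next byte boundary are ignored by the decoder. The sketch assumes that the paper's Disposed Bit Area corresponds to these ignored bits; the actual control header layout is not given in the abstract, and `stored_block` and `spare_bits` are hypothetical names.

```python
import struct
import zlib


def stored_block(payload: bytes, final: bool, spare_bits: int = 0) -> bytes:
    """Build one byte-aligned Deflate stored block (BTYPE = 00).

    spare_bits fills the 5 padding bits after the 3-bit block header.
    Decoders skip these bits, so they can carry side information
    (assumed here to be what the abstract calls the Disposed Bit Area).
    """
    if not 0 <= spare_bits < 32:
        raise ValueError("spare_bits must fit in 5 bits")
    if len(payload) > 0xFFFF:
        raise ValueError("a stored block holds at most 65535 bytes")
    # Bits are packed LSB first: bit 0 = BFINAL, bits 1-2 = BTYPE (00).
    header = (1 if final else 0) | (spare_bits << 3)
    length = len(payload)
    # LEN and its one's complement NLEN, both little-endian 16-bit.
    return bytes([header]) + struct.pack("<HH", length, length ^ 0xFFFF) + payload


# An empty stored block carrying 5 hidden bits, followed by real data.
stream = stored_block(b"", final=False, spare_bits=0b10101)
stream += stored_block(b"hello", final=True)

# wbits=-15 selects a raw Deflate stream without zlib or gzip framing.
decoded = zlib.decompress(stream, wbits=-15)
```

A standard decoder returns only `b"hello"`: the empty block contributes no output and its padding bits are skipped, which is the property that lets extra control information ride along without breaking compatibility.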

Design of a motion estimator with systolic array structure (Systolic array 구조를 갖는 움직임 추정기 설계)

  • 정대호;최석준;김환영
    • Journal of the Korean Institute of Telematics and Electronics C / v.34C no.10 / pp.36-42 / 1997
  • Worldwide, research on VLSI implementation of motion estimation algorithms has actively pursued the full (brute-force) search algorithm, aided by the development of systolic arrays capable of parallel and pipelined processing. However, because of processing-time limits in fields that handle huge quantities of data, such as high-definition television, the full search algorithm runs into many problems. In this paper, we design a motion estimator with a systolic array structure that applies a parallel scheme to serially input image data for fast processing, and verify that its processing time improves on the conventional full search algorithm.


Parallel Fuzzy Inference Method for Large Volumes of Satellite Images

  • Lee, Sang-Gu
    • International Journal of Fuzzy Logic and Intelligent Systems / v.1 no.1 / pp.119-124 / 2001
  • In pattern recognition on large volumes of remote sensing satellite images, the inference time increases greatly. In the case of the remote sensing data [5] with 4 wavebands, 778 training patterns are learned. Each land cover pattern is classified using 159,900 patterns, including the trained patterns. For the fuzzy classification, 778 fuzzy rules are generated. Each fuzzy rule has 4 fuzzy variables in the condition part. Therefore, a high-performance parallel fuzzy inference system is needed. In this paper, we propose a novel parallel fuzzy inference system on the T3E parallel computer, in which the fuzzy rules are distributed and executed simultaneously. The ONE_TO_ALL algorithm is used to broadcast the fuzzy input to all the nodes. The results of the MIN/MAX operations are transferred to the output processor by the ALL_TO_ONE algorithm. By processing the fuzzy rules in parallel, the parallel fuzzy inference algorithm extracts much parallelism and achieves a good speedup factor. This system can be used in a large expert system that has many inference variables in the condition and consequent parts.
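The rule-distribution scheme in this abstract (broadcast the input, evaluate rule subsets with MIN, gather with MAX) can be sketched as follows. This is an illustration under assumptions, not the T3E implementation: the triangular membership functions, the rule tuples, and all names are hypothetical, and a thread pool stands in for the processing nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def triangular(x, left, peak, right):
    """Triangular membership degree of x (assumes left < peak < right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def infer_chunk(rules, inputs):
    """One node: MIN over each rule's conditions, MAX per output label."""
    best = {}
    for conditions, label in rules:
        strength = min(triangular(x, *c) for x, c in zip(inputs, conditions))
        best[label] = max(best.get(label, 0.0), strength)
    return best


def parallel_infer(rules, inputs, workers=4):
    # Distribute the rules round-robin across the workers.
    chunks = [rules[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Every worker receives the same inputs (the ONE_TO_ALL broadcast).
        partials = list(pool.map(lambda chunk: infer_chunk(chunk, inputs), chunks))
    # Gather the partial results on one node (ALL_TO_ONE) with a final MAX.
    combined = {}
    for partial in partials:
        for label, strength in partial.items():
            combined[label] = max(combined.get(label, 0.0), strength)
    return combined


# Two toy rules, each with 4 conditions (one per waveband).
rules = [
    ([(0, 10, 20)] * 4, "water"),
    ([(10, 20, 30)] * 4, "forest"),
]
scores = parallel_infer(rules, [10, 10, 10, 10], workers=2)
label = max(scores, key=scores.get)
```

Since each rule's firing strength depends only on the broadcast input, the rules need no communication among themselves, which is why distributing them scales well as the rule count grows.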


A Parallel Finite Element Procedure for Contact-Impact Problems (충돌해석을 위한 병렬유한요소 알고리즘)

  • Har, Jason
    • Proceedings of the KSME Conference / 2003.11a / pp.1286-1290 / 2003
  • This paper presents a newly implemented parallel finite element procedure for contact-impact problems. Three sub-algorithms are included in the proposed parallel contact-impact procedure: parallel Belytschko-Lin-Tsay (BLT) shell element generation, a parallel explicit time integration scheme, and a parallel contact search algorithm based on the master-slave slide-line algorithm. The underlying focus of the algorithms is on their effectiveness and efficiency for inclusion in future finite element systems on parallel computers. Throughout this research, a prototype code named GT-PARADYN is developed on the IBM SP2, a distributed-memory computer. Some numerical examples are provided to demonstrate the timing results of the procedure and to discuss the accuracy and efficiency of the code.


An Analysis of Existing Studies on Parallel and Distributed Processing of the Rete Algorithm (Rete 알고리즘의 병렬 및 분산 처리에 관한 기존 연구 분석)

  • Kim, Jaehoon
    • The Journal of Korean Institute of Information Technology / v.17 no.7 / pp.31-45 / 2019
  • The core technologies for intelligent services today are deep learning (that is, neural networks) and parallel and distributed processing technologies such as GPU parallel computing and big data. However, for future intelligent services and knowledge-sharing services built on globally shared ontologies, there is a technology better suited than neural networks to representing and reasoning about knowledge: IF-THEN knowledge representation in RIF or SWRL, the standard rule languages of the Semantic Web, which can be inferred efficiently using the Rete algorithm. However, when the Rete algorithm running on a single computer processes 100,000 rules, its performance degrades badly, taking several tens of minutes, which is an obvious limitation. Therefore, in this paper, we analyze past and current studies on parallel and distributed processing of the Rete algorithm and examine which aspects should be considered to implement an efficient Rete algorithm.

Load Balancing Algorithm for Parallel Computing of Design Problem involving Multi-Disciplinary Analysis (다분야통합해석에 기반한 설계문제의 병렬처리를 위한 부하분산알고리즘)

  • Cho, Jae-Suk;Chu, Min-Sik;Song, Yong-Ho;Choi, Dong-Hoon
    • Proceedings of the Computational Structural Engineering Institute Conference / 2007.04a / pp.327-332 / 2007
  • An engineering design problem involving Multi-Disciplinary Analysis (MDA) generally requires a large amount of CPU time for the entire design process, so a Multiple Processing System (MPS) is essential to reduce the completion time. However, when conventional parallel processing techniques are applied, the nature of MDA requires that all of the CAE S/W needed for the MDA be installed on every server making up the MPS, which incurs a great expense in CAE S/W licenses. To solve this problem, we propose a Weight-based Multiqueue Load Balancing algorithm for a heterogeneous MPS in which server performance and the CAE S/W installed on each server differ from one another. To validate its performance, a computational experiment comparing the First Come First Serve algorithm with our proposed algorithm was carried out.
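One way to picture load balancing over servers that differ in speed and installed software is the sketch below. The abstract does not define the weights or queueing rules of the proposed algorithm, so this is an assumed illustration rather than the authors' method: each server keeps its own queue, a job may only go to a server that has the required tool, and among those the job goes to the server with the earliest estimated finish time. All names and numbers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    speed: float              # relative performance weight (assumed)
    software: frozenset       # CAE tools licensed on this server
    queue: list = field(default_factory=list)
    load: float = 0.0         # total queued work units

    def finish_time(self, extra_work: float = 0.0) -> float:
        """Estimated time to drain the queue plus extra_work."""
        return (self.load + extra_work) / self.speed


def assign(job: str, tool: str, work: float, servers: list) -> str:
    """Queue the job on the eligible server that would finish it soonest."""
    eligible = [s for s in servers if tool in s.software]
    if not eligible:
        raise ValueError(f"no server has {tool!r} installed")
    best = min(eligible, key=lambda s: s.finish_time(work))
    best.queue.append(job)
    best.load += work
    return best.name


servers = [
    Server("A", speed=2.0, software=frozenset({"cfd", "fea"})),
    Server("B", speed=1.0, software=frozenset({"fea"})),
]
jobs = [("j1", "fea", 4.0), ("j2", "fea", 2.0), ("j3", "cfd", 1.0), ("j4", "fea", 2.0)]
placement = [assign(name, tool, work, servers) for name, tool, work in jobs]
```

Unlike First Come First Serve, which hands each job to whichever server is next regardless of speed, this rule lets a slow server with a short queue absorb work only when that actually shortens the expected completion time, and it never requires every tool to be installed everywhere.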
