• Title/Abstract/Keywords: parallel algorithms

655 search results (processing time: 0.027 s)

A Study on Hybrid Image Coder Using a Reconfigurable Multiprocessor System (Study II: Parallel Algorithm Implementation)

  • 최상훈;이광기;김인;이용균;박규태
    • 전자공학회논문지B / Vol. 30B, No. 10 / pp. 13-26 / 1993
  • Motion picture algorithms are realized on the multiprocessor system presented in Study I. To process the algorithms as efficiently as possible, pipelining and geometrical parallel processing methods are employed, and the processing time, communication load, and efficiency of each algorithm are compared. The performance of the implemented system is compared and analysed with reference to the MPEG coding algorithm. Theoretical calculations and experimental results both show that geometrical partitioning is the more suitable parallel processing method for moving picture coding: it makes the algorithm easy to modify and expand, and its overall efficiency is higher than that of pipelining.

Study on the parallel processing algorithms with implicit integration method for real-time vehicle simulator development

  • 박민영;이정근;배대성
    • 한국정밀공학회: Conference Proceedings / Proceedings of the 한국정밀공학회 1995 Autumn Conference / pp. 497-500 / 1995
  • In this paper, a program for real-time simulation of a vehicle is developed. The program uses relative coordinates and the BDF (Backward Difference Formula) numerical integration method. Numerical tests showed that the proposed implicit method is more stable than the explicit method in carrying out the numerical integration for vehicle dynamics. Hardware requirements for real-time simulation are suggested. Parallel processing algorithms are developed using a DSP (digital signal processor).
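
The stability claim in the abstract above can be illustrated on a toy problem. The following is a minimal sketch, not the paper's vehicle model: it assumes the scalar test equation y' = λy with a hypothetical stiff coefficient λ = -50 and a deliberately large step h = 0.1, and compares forward (explicit) Euler with the first-order BDF, which is backward Euler.

```python
def explicit_euler(lam, y0, h, steps):
    """Forward Euler for y' = lam * y: y_next = y + h * lam * y."""
    y = y0
    for _ in range(steps):
        y = y + h * lam * y
    return y


def bdf1(lam, y0, h, steps):
    """First-order BDF (backward Euler) for y' = lam * y.

    The implicit update y_next = y + h * lam * y_next is linear here,
    so it can be solved in closed form: y_next = y / (1 - h * lam).
    """
    y = y0
    for _ in range(steps):
        y = y / (1.0 - h * lam)
    return y


# Hypothetical stiff test case (all values are illustrative assumptions).
lam, y0, h, steps = -50.0, 1.0, 0.1, 20
y_explicit = explicit_euler(lam, y0, h, steps)  # amplification factor -4 per step
y_implicit = bdf1(lam, y0, h, steps)            # amplification factor 1/6 per step
print(f"explicit: {y_explicit:.3e}, implicit: {y_implicit:.3e}")
```

The exact solution decays to zero. With this step size the explicit iterate grows without bound while the implicit one decays, which is the qualitative behaviour the abstract reports for its vehicle dynamics tests.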

Accelerating Fingerprint Enhancement Algorithm on GPGPU using OpenCL

  • 김대희;박능수
    • 전기학회논문지 / Vol. 65, No. 4 / pp. 666-672 / 2016
  • Recently, fingerprints have been widely used as a biometric to improve the security of financial mobile applications because of their user convenience and high recognition rate. However, to apply fingerprint algorithms to finance and security applications, their recognition rate and processing speed must be improved further. In this paper, we propose a parallel fingerprint enhancement algorithm for general-purpose computing on graphics processing units (GPGPU) using OpenCL. We analyze the parallelism in the fingerprint algorithm and explore the optimization parameters of the parallel algorithm to improve performance. The experimental results show that the parallel fingerprint enhancement algorithm on GPGPUs ran 29.4 to 69.2 times faster than the original algorithm on the host CPUs.

Discrete Cosine Transform Algorithms for the VLSI Parallel Implementation

  • 조남익;이상욱
    • 대한전자공학회논문지 / Vol. 25, No. 7 / pp. 851-858 / 1988
  • In this paper, we propose two different VLSI architectures for the parallel computation of the DCT (discrete cosine transform). First, it is shown that the DCT can be implemented on the existing systolic architecture for the DFT (discrete Fourier transform) with some modification. Second, a new prime factor DCT algorithm based on the prime factor DFT algorithm is proposed, and it is shown that the proposed algorithm can be implemented in parallel on the systolic architecture for the prime factor DFT. However, the proposed algorithm is applicable only to data lengths that can be decomposed into relatively prime, odd factors. It is also found that the proposed systolic architecture requires fewer multipliers than structures that implement FDCT (fast DCT) algorithms directly.

Sequential and Parallel Algorithms for Finding a Longest Non-negative Path in a Tree

  • 김성권
    • 한국정보과학회논문지:시스템및이론 / Vol. 33, No. 12 / pp. 880-884 / 2006
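  • Given a tree in which each edge has a weight (which may be positive, negative, or zero), we consider the problem of finding the longest path whose edge weights sum to a non-negative value. We present an $O(n \log n)$-time sequential algorithm and a CREW PRAM parallel algorithm that uses $O(\log^2 n)$ time and $O(n)$ processors for finding a longest non-negative path in a tree. Here, $n$ is the number of nodes in the tree.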
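
To make the problem statement above concrete, here is a brute-force baseline. It is not the paper's $O(n \log n)$ algorithm: it simply walks the tree from every node in $O(n^2)$ total time. It assumes that "length" means the number of edges on the path and that nodes are numbered from 0; the example tree is hypothetical.

```python
from collections import defaultdict


def longest_nonnegative_path(n, edges):
    """Return the largest edge count of any path whose weights sum to >= 0.

    n     -- number of nodes, labelled 0 .. n-1 (an assumption of this sketch)
    edges -- list of (u, v, weight) triples forming a tree
    """
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    best = 0  # a single node is a path with zero edges and weight 0
    for start in range(n):
        # Iterative DFS; in a tree each node is reached by exactly one path.
        stack = [(start, -1, 0, 0)]  # (node, parent, edge count, weight sum)
        while stack:
            node, parent, length, total = stack.pop()
            if total >= 0 and length > best:
                best = length
            for nxt, w in adj[node]:
                if nxt != parent:
                    stack.append((nxt, node, length + 1, total + w))
    return best


# Hypothetical example: the path 0-1-2-3 has weight 2 - 3 + 1 = 0.
edges = [(0, 1, 2), (1, 2, -3), (2, 3, 1), (1, 4, -1)]
result = longest_nonnegative_path(5, edges)
print(result)
```

A quadratic baseline like this is mainly useful for checking a faster implementation on small random trees.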

Scheduling for a Two-Machine, M-Parallel Flow Shop to Minimize Makespan

  • Lee, Dong Hoon;Lee, Byung Gun;Joo, Cheol Min;Lee, Woon Sik
    • 산업경영시스템학회지 / Vol. 23, No. 56 / pp. 9-18 / 2000
  • This paper considers the problem of two-machine, M-parallel flow shop scheduling to minimize makespan, and proposes a series of heuristic algorithms and a branch and bound algorithm. The processing times of each job on the two machines are the same on every line. Since each flow-shop line consists of two machines, Johnson's sequence is optimal for each flow-shop line. Heuristic algorithms are developed in this paper by combining a "list scheduling" method and a "local search with global evaluation" method. Numerical experiments show that the proposed heuristics can efficiently give optimal or near-optimal schedules with high accuracy.
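
The abstract above relies on Johnson's sequence being optimal for a single two-machine line. The sketch below shows only that textbook rule and a makespan calculation for one line. It does not cover the paper's assignment of jobs to the M parallel lines, and the job data are hypothetical.

```python
def johnson_sequence(jobs):
    """Order jobs for a two-machine flow shop using Johnson's rule.

    jobs -- dict mapping job name to (time on machine 1, time on machine 2)
    Jobs with p1 <= p2 go first in increasing p1; the remaining jobs go
    last in decreasing p2.
    """
    front = sorted((j for j, (p1, p2) in jobs.items() if p1 <= p2),
                   key=lambda j: jobs[j][0])
    back = sorted((j for j, (p1, p2) in jobs.items() if p1 > p2),
                  key=lambda j: jobs[j][1], reverse=True)
    return front + back


def makespan(jobs, order):
    """Completion time of the last job on machine 2 for the given order."""
    end1 = end2 = 0
    for j in order:
        p1, p2 = jobs[j]
        end1 += p1                   # machine 1 never idles
        end2 = max(end2, end1) + p2  # machine 2 waits for machine 1 if needed
    return end2


# Hypothetical job data for a single line.
jobs = {"A": (3, 6), "B": (5, 2), "C": (1, 2), "D": (6, 6), "E": (7, 5)}
order = johnson_sequence(jobs)
span = makespan(jobs, order)
print(order, span)
```

For these data the resulting makespan equals the lower bound given by the total machine-1 time plus the smallest machine-2 time (22 + 2), so no other order can do better.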

Distributed Structural Analysis Algorithms for Large-Scale Structures based on PCG Algorithms

  • 권윤한;박효선
    • 한국전산구조공학회논문집 / Vol. 12, No. 3 / pp. 385-396 / 1999
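  • The problems handled in engineering have recently been growing in scale, and the structural design of large-scale structures requires many structural analyses for member strength design and nodal displacement control. Analyzing a large-scale structure on a single personal computer requires a large amount of memory and a long computation time, so it is hard to use efficiently in the design of large-scale structures, which requires repeated analyses. As an alternative, this paper connects multiple personal computers over a network to form a high-performance parallel computing system and develops two types of distributed solution methods for the structural equations, suited to that system, using the iterative PCG (preconditioned conjugate gradient) algorithm. The distributed structural analysis methods were developed to minimize the number of communications and the communication volume among the computers during the analysis. Their performance was analyzed by applying them to the structural analysis of a large-scale three-dimensional truss structure and a 144-story braced tube structure.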
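
For readers unfamiliar with PCG, the following is a minimal single-machine sketch of the iteration. It assumes a Jacobi (diagonal) preconditioner and a small dense matrix, neither of which is stated in the abstract, and it does not reproduce the paper's two distributed variants. The comments mark the steps that would typically require communication in a distributed setting.

```python
def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite A.

    Uses conjugate gradients with a Jacobi preconditioner (an assumption
    of this sketch). A is a list of rows; b is a list of floats.
    """
    n = len(b)

    def matvec(v):  # matrix-vector product: needs neighbour data if distributed
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):  # inner product: a global reduction if distributed
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x


# Hypothetical 3x3 stiffness-like system.
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = pcg(A, b)
print(x)
```

Each iteration needs one matrix-vector product and a few inner products, which is why the number and volume of messages per iteration are natural quantities to minimize.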

Parallel Multithreaded Processing for Data Set Summarization on Multicore CPUs

  • Ordonez, Carlos;Navas, Mario;Garcia-Alvarado, Carlos
    • Journal of Computing Science and Engineering / Vol. 5, No. 2 / pp. 111-120 / 2011
  • Data mining algorithms should exploit new hardware technologies to accelerate computations. Such a goal is difficult to achieve in a database management system (DBMS) because of its complex internal subsystems and because data mining numeric computations on large data sets are difficult to optimize. This paper explores taking advantage of the existing multithreaded capabilities of multicore CPUs, as well as caching in RAM, to efficiently compute summaries of a large data set, a fundamental data mining problem. We introduce parallel algorithms working on multiple threads, which overcome the row aggregation processing bottleneck of accessing secondary storage while maintaining linear time complexity with respect to data set size. Our proposal is based on a combination of table scans and parallel multithreaded processing among multiple cores in the CPU. We introduce several database-style and hardware-level optimizations: caching row blocks of the input table, managing available RAM, interleaving I/O and CPU processing, and tuning the number of working threads. We experimentally benchmark our algorithms with large data sets on a DBMS running on a computer with a multicore CPU. We show that our algorithms outperform existing DBMS mechanisms in computing aggregations of multidimensional data summaries, especially as dimensionality grows. Furthermore, we show that local memory allocation (RAM block size) does not have a significant impact when the thread management algorithm distributes the workload among a fixed number of threads. Our proposal is unique in the sense that we do not modify or require access to the DBMS source code; instead, we extend the DBMS with analytic functionality by developing User-Defined Functions.
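
The block-wise, multithreaded aggregation described above can be sketched outside a DBMS as follows. This is only an illustration under stated assumptions: the abstract does not define which summaries are computed, so the sketch assumes a row count, per-dimension sums, and per-dimension sums of squares; the function names, block size, and thread count are hypothetical; and Python threads stand in for the paper's User-Defined Functions.

```python
from concurrent.futures import ThreadPoolExecutor


def summarize_block(rows):
    """Partial summary of one block: (count, per-dimension sums, sums of squares)."""
    d = len(rows[0])
    linear = [sum(r[k] for r in rows) for k in range(d)]
    squares = [sum(r[k] * r[k] for r in rows) for k in range(d)]
    return len(rows), linear, squares


def summarize(rows, block_size=2, workers=2):
    """Split rows into blocks, summarize blocks on worker threads, then merge."""
    blocks = [rows[i:i + block_size] for i in range(0, len(rows), block_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(summarize_block, blocks))
    d = len(rows[0])
    count = sum(p[0] for p in partials)
    linear = [sum(p[1][k] for p in partials) for k in range(d)]
    squares = [sum(p[2][k] for p in partials) for k in range(d)]
    return count, linear, squares


# Hypothetical two-dimensional data set.
data = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
count, linear_sum, square_sum = summarize(data)
print(count, linear_sum, square_sum)
```

Because the partial summaries are plain sums, they can be merged in any order, so the work per row stays constant and the total cost remains linear in the number of rows.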

Genetic algorithms with a permutation approach to the parallel machines scheduling problem

  • Han, Yong-Ho
    • 경영과학 / Vol. 14, No. 2 / pp. 47-61 / 1997
  • This paper considers the parallel machines scheduling problem characterized as a multi-objective combinatorial problem. As this problem is NP-complete, genetic algorithms are applied instead of the traditional analytical approach. The purpose of this study is to show how the problem can be solved effectively by using genetic algorithms with a permutation approach. First, a permutation representation that can effectively represent the chromosome is introduced for this problem. Next, a schedule builder that combines scheduling theories with a simple heuristic approach is suggested. Finally, through computational experiments applying the genetic algorithm to test problems, we show that the niche formation method does not contribute to getting better solutions and that the PMX crossover operator is the best of the four selected recombination operators, at least for our problem, in terms of both solution performance and operational convenience.
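
The PMX (partially mapped crossover) operator named above can be sketched as follows. This is the standard textbook form of the operator on a plain permutation with explicitly supplied cut points; how the paper encodes jobs and machines in the chromosome is not given in the abstract, so the parents below are hypothetical.

```python
def pmx(parent1, parent2, cut1, cut2):
    """Partially mapped crossover on two permutations of the same items.

    The child copies parent1[cut1:cut2] in place, then takes the remaining
    positions from parent2, relocating genes that would otherwise collide
    with the copied segment.
    """
    size = len(parent1)
    child = [None] * size
    child[cut1:cut2] = parent1[cut1:cut2]
    segment = set(parent1[cut1:cut2])

    # Place parent2's segment genes that were displaced by the copied segment.
    for i in range(cut1, cut2):
        gene = parent2[i]
        if gene in segment:
            continue
        pos = i
        while cut1 <= pos < cut2:
            # Follow the mapping until it leads outside the copied segment.
            pos = parent2.index(parent1[pos])
        child[pos] = gene

    # Everything still empty comes straight from parent2.
    for i in range(size):
        if child[i] is None:
            child[i] = parent2[i]
    return child


# Hypothetical parents (for example, job orderings).
p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [9, 3, 7, 8, 2, 6, 5, 1, 4]
child = pmx(p1, p2, 3, 7)
print(child)
```

The child is always a valid permutation, which is what makes PMX convenient for permutation-encoded scheduling problems: no repair step is needed after recombination.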

Two-Step Suboptimal Filters for Linear Dynamic Systems

  • Ahn, Jun-Il;Minhas, Rashid;Shin, Vladimir
    • 제어로봇시스템학회: Conference Proceedings / 제어로봇시스템학회 ICCAS 2005 / pp. 16-21 / 2005
  • This paper considers the problem of state estimation in linear continuous-time systems with a multi-sensor environment and observation uncertainties. We propose two suboptimal filtering algorithms for these types of systems. The filtering algorithms consist of two steps: the local optimal Kalman estimates are computed in the first step, and these local estimates are linearly fused in the second step. The two-step filtering algorithms require less memory than the optimal Kalman and adaptive Lainiotis-Kalman filters. Because the proposed filters have a parallel structure, parallel computers can be used for their design. The examples show the effect of common noise on the performance of the fusion of local Kalman estimates based on observations from different sensors in the presence of uncertainties.