• Title/Summary/Keyword: Parallel Algorithms


A Study on Hybrid Image Coder Using a Reconfigurable Multiprocessor System (Study II: Parallel Algorithm Implementation) (재구성 가능한 다중 프로세서 시스템을 이용한 혼합 영상 부호화기 구현에 관한 연구(연구 II : 병렬 알고리즘 구현))

  • Choi, Sang-Hoon;Lee, Kwang-Kee;Kim, In;Lee, Yong-Kyun;Park, Kyu-Tae
    • Journal of the Korean Institute of Telematics and Electronics B, v.30B no.10, pp.13-26, 1993
  • Motion picture algorithms are implemented on the multiprocessor system presented in Study I. To process the algorithms as efficiently as possible, pipelining and geometrical parallel processing methods are employed, and the processing time, communication load, and efficiency of each algorithm are compared. The performance of the implemented system is compared and analysed with reference to the MPEG coding algorithm. Theoretical calculations and experimental results both show that geometrical partitioning is the more suitable parallel processing method for moving picture coding: it makes the algorithm easy to modify and extend, and its overall efficiency is higher than that of pipelining.


Study on the parallel processing algorithms with implicit integration method for real-time vehicle simulator development (실시간 차량 시뮬레이터 개발을 위한 암시적 적분기법을 이용한 병렬처리 알고리즘에 관한 연구)

  • 박민영;이정근;배대성
    • Proceedings of the Korean Society of Precision Engineering Conference, 1995.10a, pp.497-500, 1995
  • In this paper, a program for real-time simulation of a vehicle is developed. The program uses relative coordinates and the BDF (Backward Difference Formula) numerical integration method. Numerical tests showed that the proposed implicit method is more stable than the explicit method in carrying out the numerical integration for vehicle dynamics. Hardware requirements for real-time simulation are suggested. Parallel processing algorithms are developed with a DSP (digital signal processor).

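
The stability claim in the abstract above can be illustrated on a small stiff test problem. The following is a minimal sketch, assuming the scalar equation y' = -lam * y, a first-order BDF (backward Euler) step, and a forward Euler step for comparison. The test equation, step size, and function names are illustrative assumptions and are not taken from the paper, which treats full vehicle dynamics.

```python
def explicit_euler(lam, y0, h, steps):
    """Forward Euler for y' = -lam * y (explicit method)."""
    y = y0
    for _ in range(steps):
        y = y + h * (-lam * y)
    return y


def backward_euler(lam, y0, h, steps):
    """First-order BDF for y' = -lam * y (implicit method).

    The implicit update y_new = y + h * (-lam * y_new) is linear here,
    so it can be solved in closed form: y_new = y / (1 + h * lam).
    """
    y = y0
    for _ in range(steps):
        y = y / (1.0 + h * lam)
    return y


# Illustrative values: h * lam = 5 is far outside the explicit stability limit.
lam, y0, h, steps = 50.0, 1.0, 0.1, 20
y_explicit = explicit_euler(lam, y0, h, steps)   # grows in magnitude each step
y_implicit = backward_euler(lam, y0, h, steps)   # decays toward zero
```

With this step size the explicit iterate is multiplied by -4 at every step, while the implicit iterate is divided by 6, which is why an implicit formula allows the larger fixed steps that real-time simulation tends to need.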

Accelerating Fingerprint Enhancement Algorithm on GPGPU using OpenCL (OpenCL을 이용한 GPGPU 기반 지문개선 알고리즘 가속화)

  • Kim, Daehee;Park, Neungsoo
    • The Transactions of The Korean Institute of Electrical Engineers, v.65 no.4, pp.666-672, 2016
  • Fingerprints have recently become widely used as a biometric to improve the security of financial mobile applications because of their user convenience and high recognition rate. However, to apply fingerprint algorithms to finance and security applications, their recognition rate and processing speed have to be improved further. In this paper, we propose a parallel fingerprint enhancement algorithm for general-purpose computing on graphics processing units (GPGPU) using OpenCL. We analyze the parallelism in the fingerprint algorithm and explore the optimization parameters of the parallel algorithm to improve performance. The experimental results showed that execution of the parallel fingerprint enhancement algorithm on GPGPUs was accelerated by 29.4 to 69.2 times compared with execution of the original algorithm on the host CPUs.

Discrete Cosine Transform Algorithms for the VLSI Parallel Implementation (VLSI 병렬 연산을 위한 여현 변환 알고리듬)

  • 조남익;이상욱
    • Journal of the Korean Institute of Telematics and Electronics, v.25 no.7, pp.851-858, 1988
  • In this paper, we propose two different VLSI architectures for the parallel computation of the DCT (discrete cosine transform). First, it is shown that the DCT algorithm can be implemented on the existing systolic architecture for the DFT (discrete Fourier transform) with some modification. Second, a new prime factor DCT algorithm based on the prime factor DFT algorithm is proposed, and it is shown that the proposed algorithm can be implemented in parallel on the systolic architecture for the prime factor DFT. However, the proposed algorithm is applicable only to data lengths that can be decomposed into relatively prime and odd numbers. It is also found that the proposed systolic architecture requires fewer multipliers than structures that implement FDCT (fast DCT) algorithms directly.

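
The first result in the abstract above, that a DCT can run on DFT hardware with some modification, rests on a standard identity between the two transforms. The sketch below, under stated assumptions, computes an unnormalized DCT-II from a length-2N DFT of the zero-padded input followed by one complex twiddle multiplication per output. It shows the textbook identity only; it is not the paper's systolic mapping or its prime factor algorithm, and all function names are illustrative.

```python
import cmath
import math


def dft(x):
    """Direct O(n^2) DFT, standing in for a DFT hardware block."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
        for k in range(n)
    ]


def dct2_direct(x):
    """Unnormalized DCT-II from its definition, used as the reference."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi * (2 * t + 1) * k / (2 * n)) for t in range(n))
        for k in range(n)
    ]


def dct2_via_dft(x):
    """Unnormalized DCT-II obtained from a length-2n DFT.

    Zero-pad x to length 2n, take the DFT, multiply bin k by
    exp(-i*pi*k/(2n)), and keep the real part of the first n bins.
    """
    n = len(x)
    spectrum = dft(list(x) + [0.0] * n)
    return [
        (cmath.exp(-1j * cmath.pi * k / (2 * n)) * spectrum[k]).real
        for k in range(n)
    ]


samples = [1.0, 2.0, 3.0, 4.0, 5.0]
reference = dct2_direct(samples)
from_dft = dct2_via_dft(samples)
```

The only work added on top of the DFT is the per-bin twiddle factor, which is the kind of small modification that lets an existing DFT structure be reused.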

Sequential and Parallel Algorithms for Finding a Longest Non-negative Path in a Tree (트리에서 가장 긴 비음수 경로를 찾는 직렬 및 병렬 알고리즘)

  • Kim, Sung-Kwon
    • Journal of KIISE:Computer Systems and Theory, v.33 no.12, pp.880-884, 2006
  • In an edge-weighted tree (positive, negative, or zero weights are possible), we want to find a longest path such that the sum of the weights of the edges in the path is non-negative. To find a longest non-negative path of a tree, we present a sequential algorithm with $O(n \log n)$ time and a CREW PRAM parallel algorithm with $O(\log^2 n)$ time and $O(n)$ processors, where $n$ is the number of nodes in the tree.
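
To make the problem statement above concrete, here is a brute-force reference that checks every path in the tree. It is a minimal sketch, assuming that path length is measured in number of edges and that nodes are numbered 0 to n-1. It runs in quadratic time and is neither the paper's sequential algorithm nor its CREW PRAM algorithm; the function name and input format are illustrative.

```python
from collections import defaultdict


def longest_nonnegative_path(n, edges):
    """Return the maximum number of edges on a path whose weight sum is >= 0.

    n     -- number of nodes, labelled 0 .. n-1
    edges -- list of (u, v, weight) tuples forming a tree
    """
    adjacency = defaultdict(list)
    for u, v, w in edges:
        adjacency[u].append((v, w))
        adjacency[v].append((u, w))

    best = 0  # the empty path (a single node) has weight 0
    for start in range(n):
        # Depth-first walk from `start`, carrying edge count and weight sum.
        stack = [(start, -1, 0, 0)]
        while stack:
            node, parent, length, total = stack.pop()
            if total >= 0 and length > best:
                best = length
            for neighbour, w in adjacency[node]:
                if neighbour != parent:
                    stack.append((neighbour, node, length + 1, total + w))
    return best


# A chain 0 - 1 - 2 - 3 with weights 2, -5, 4: the whole chain sums to 1.
chain = [(0, 1, 2), (1, 2, -5), (2, 3, 4)]
chain_answer = longest_nonnegative_path(4, chain)
```

A checker like this is mainly useful for validating a faster implementation on small random trees.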

Scheduling for a Two-Machine, M-Parallel Flow Shop to Minimize Makespan

  • Lee, Dong Hoon;Lee, Byung Gun;Joo, Cheol Min;Lee, Woon Sik
    • Journal of Korean Society of Industrial and Systems Engineering, v.23 no.56, pp.9-18, 2000
  • This paper considers the problem of two-machine, M-parallel flow shop scheduling to minimize makespan, and proposes a series of heuristic algorithms and a branch and bound algorithm. Each job's processing times on the two machines are the same regardless of the line to which it is assigned. Since each flow-shop line consists of two machines, Johnson's sequence is optimal for each flow-shop line. Heuristic algorithms are developed in this paper by combining a "list scheduling" method and a "local search with global evaluation" method. Numerical experiments show that the proposed heuristics can efficiently give optimal or near-optimal schedules with high accuracy.

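
The abstract above relies on Johnson's sequence being optimal on each two-machine line. The sketch below shows Johnson's rule for a single line, assuming jobs are given as (name, time on machine 1, time on machine 2). It does not cover the assignment of jobs to the M parallel lines, which is the part the paper's heuristics and branch and bound address, and the job data are made up for illustration.

```python
def johnson_sequence(jobs):
    """Order jobs for one two-machine flow-shop line by Johnson's rule.

    Jobs with p1 <= p2 go first in increasing p1;
    the remaining jobs go last in decreasing p2.
    """
    first = sorted((job for job in jobs if job[1] <= job[2]), key=lambda job: job[1])
    last = sorted((job for job in jobs if job[1] > job[2]),
                  key=lambda job: job[2], reverse=True)
    return first + last


def makespan(sequence):
    """Completion time of the last job on machine 2 for a given order."""
    done_m1 = 0
    done_m2 = 0
    for _, p1, p2 in sequence:
        done_m1 += p1                          # machine 1 never idles
        done_m2 = max(done_m2, done_m1) + p2   # machine 2 waits for machine 1
    return done_m2


# Hypothetical job data: (name, p1, p2).
jobs = [("A", 3, 6), ("B", 5, 2), ("C", 1, 2), ("D", 6, 6), ("E", 7, 5)]
ordered = johnson_sequence(jobs)
ordered_makespan = makespan(ordered)
```

In a parallel-line setting, a routine like `makespan` could serve as the per-line evaluation inside a list-scheduling or local-search loop, with the overall objective being the largest makespan over all lines.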

Distributed Structural Analysis Algorithms for Large-Scale Structures based on PCG Algorithms (대형구조물의 분산구조해석을 위한 PCG 알고리즘)

  • 권윤한;박효선
    • Journal of the Computational Structural Engineering Institute of Korea, v.12 no.3, pp.385-396, 1999
  • In the structural design of large-scale structures with several thousand degrees of freedom, a large number of structural calculations and a large amount of data storage are required to obtain the forces and displacements of the members. However, a current computational environment with a single microprocessor, such as a personal computer or a workstation, cannot deliver high efficiency in the structural analysis and design process for large-scale structures. In this paper, a high-performance parallel computing system consisting of personal computers interconnected by a network is proposed for efficient structural analysis. Two distributed structural analysis algorithms are developed in the form of a distributed or parallel preconditioned conjugate gradient (DPCG) method. To enhance the performance of the developed algorithms, the number of communications and the size of the data to be communicated are minimized. These algorithms are applied to the structural analyses of three large space structures as well as a 144-story tube-in-tube framed structure.

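
For readers unfamiliar with the method named in the abstract above, the following is a minimal serial sketch of preconditioned conjugate gradients for a symmetric positive definite system. It assumes a Jacobi (diagonal) preconditioner and dense list-of-lists storage; the paper's preconditioner, data layout, and distribution scheme are not stated in the abstract, so everything here is illustrative.

```python
def matvec(matrix, vector):
    """Dense matrix-vector product."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def pcg(matrix, rhs, tol=1e-10, max_iter=1000):
    """Solve matrix * x = rhs with Jacobi-preconditioned conjugate gradients."""
    n = len(rhs)
    inv_diag = [1.0 / matrix[i][i] for i in range(n)]  # Jacobi preconditioner
    x = [0.0] * n
    r = list(rhs)                                      # residual for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        ap = matvec(matrix, p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Small symmetric positive definite system standing in for a stiffness matrix.
stiffness = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
load = [1.0, 2.0, 3.0]
displacement = pcg(stiffness, load)
```

In a distributed version, the matrix-vector product and the inner products are the steps that generally require data exchange between processors, which is consistent with the abstract's emphasis on reducing the number and size of communications.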

Parallel Multithreaded Processing for Data Set Summarization on Multicore CPUs

  • Ordonez, Carlos;Navas, Mario;Garcia-Alvarado, Carlos
    • Journal of Computing Science and Engineering, v.5 no.2, pp.111-120, 2011
  • Data mining algorithms should exploit new hardware technologies to accelerate computations. Such a goal is difficult to achieve in a database management system (DBMS) because of its complex internal subsystems and because data mining numeric computations on large data sets are difficult to optimize. This paper explores taking advantage of the existing multithreaded capabilities of multicore CPUs, as well as caching in RAM, to efficiently compute summaries of a large data set, a fundamental data mining problem. We introduce parallel algorithms working on multiple threads, which overcome the row aggregation processing bottleneck of accessing secondary storage while maintaining linear time complexity with respect to data set size. Our proposal is based on a combination of table scans and parallel multithreaded processing among multiple cores in the CPU. We introduce several database-style and hardware-level optimizations: caching row blocks of the input table, managing available RAM, interleaving I/O and CPU processing, and tuning the number of working threads. We experimentally benchmark our algorithms with large data sets on a DBMS running on a computer with a multicore CPU. We show that our algorithms outperform existing DBMS mechanisms in computing aggregations of multidimensional data summaries, especially as dimensionality grows. Furthermore, we show that local memory allocation (RAM block size) does not have a significant impact when the thread management algorithm distributes the workload among a fixed number of threads. Our proposal is unique in the sense that we do not modify or require access to the DBMS source code; instead, we extend the DBMS with analytic functionality by developing User-Defined Functions.
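
The block-wise, multithreaded aggregation described above can be sketched as follows. This is a minimal illustration, assuming the summary consists of a row count, per-dimension sums, and per-dimension sums of squares; the abstract does not define the summary, so that choice, the block size, and all names are assumptions. The sketch runs in plain Python rather than inside a DBMS as a User-Defined Function, and Python threads do not give true CPU parallelism for this kind of loop, so it shows the structure of the computation rather than its speed.

```python
from concurrent.futures import ThreadPoolExecutor


def summarize_block(rows):
    """Summarize one block of rows: (count, per-column sums, per-column sums of squares)."""
    dims = len(rows[0])
    linear = [0.0] * dims
    quadratic = [0.0] * dims
    for row in rows:
        for j, value in enumerate(row):
            linear[j] += value
            quadratic[j] += value * value
    return len(rows), linear, quadratic


def merge(a, b):
    """Combine two partial summaries; the operation is associative."""
    return (
        a[0] + b[0],
        [x + y for x, y in zip(a[1], b[1])],
        [x + y for x, y in zip(a[2], b[2])],
    )


def summarize(rows, block_size=2, workers=4):
    """Split a non-empty table into row blocks, summarize them on a thread pool, merge."""
    blocks = [rows[i:i + block_size] for i in range(0, len(rows), block_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(summarize_block, blocks))
    total = partials[0]
    for part in partials[1:]:
        total = merge(total, part)
    return total


table = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
count, sums, squares = summarize(table)
```

Because each block is scanned once and partial summaries merge in constant time per dimension, the total work stays linear in the number of rows, in line with the complexity claim in the abstract.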

Genetic algorithms with a permutation approach to the parallel machines scheduling problem

  • Han, Yong-Ho
    • Korean Management Science Review, v.14 no.2, pp.47-61, 1997
  • This paper considers the parallel machines scheduling problem characterized as a multi-objective combinatorial problem. Because this problem belongs to the class of NP-complete problems, genetic algorithms are applied instead of the traditional analytical approach. The purpose of this study is to show how the problem can be effectively solved by using genetic algorithms with a permutation approach. First, a permutation representation that can effectively represent the chromosome is introduced for this problem. Next, a schedule builder that combines scheduling theories with a simple heuristic approach is suggested. Finally, through computer experiments applying the genetic algorithm to test problems, we show that the niche formation method does not contribute to better solutions and that, at least for our problem, the PMX crossover operator is the best of the four selected recombination operators in terms of both solution performance and operational convenience.

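
The PMX (partially mapped crossover) operator mentioned in the abstract above works on permutation chromosomes. Below is a minimal sketch of the standard operator producing one child, assuming the cut points are supplied by the caller and the chromosome is a plain list of job labels. How the paper maps a permutation to a schedule through its schedule builder is not described in the abstract and is not modelled here.

```python
def pmx(parent1, parent2, cut1, cut2):
    """Partially mapped crossover producing one child permutation.

    The child keeps parent1[cut1:cut2] in place. Genes of parent2 that
    fall in that window but are missing from the child are relocated by
    following the position mapping; all other positions copy parent2.
    """
    size = len(parent1)
    child = [None] * size
    child[cut1:cut2] = parent1[cut1:cut2]

    for i in range(cut1, cut2):
        gene = parent2[i]
        if gene in child[cut1:cut2]:
            continue  # already present through the copied segment
        position = i
        # Follow the mapping until it leads outside the copied segment.
        while cut1 <= position < cut2:
            position = parent2.index(parent1[position])
        child[position] = gene

    for i in range(size):
        if child[i] is None:
            child[i] = parent2[i]
    return child


# Illustrative parents: two orderings of nine jobs.
parent_a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
parent_b = [9, 3, 7, 8, 2, 6, 5, 1, 4]
offspring = pmx(parent_a, parent_b, 3, 7)
```

The child is always a valid permutation, so no repair step is needed after crossover, which is one reason operators of this kind are convenient for sequencing problems.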

Two-Step Suboptimal Filters for Linear Dynamic Systems

  • Ahn, Jun-Il;Minhas, Rashid;Shin, Vladimir
    • 제어로봇시스템학회:학술대회논문집, 2005.06a, pp.16-21, 2005
  • This paper considers the problem of state estimation in linear continuous-time systems with a multi-sensor environment and observation uncertainties. We propose two suboptimal filtering algorithms for these types of systems. The filtering algorithms consist of two steps: the local optimal Kalman estimates are computed in the first step, and these local estimates are linearly fused in the second step. The two-step filtering algorithms require less memory than the optimal Kalman and adaptive Lainiotis-Kalman filters. Because of the parallel structure of the proposed filters, parallel computers can be used for their design. The examples demonstrate the effect of common noise on the performance of fusing local Kalman estimates that are based on observations from different sensors in the presence of uncertainties.
