• Title/Summary/Keyword: parallel algorithms

Unit Commitment Using Parallel Genetic Algorithms and Parallel Tabu Search (병렬 유전알고리즘과 병렬 타부탐색법을 이용한 발전기 기동정지계획)

  • Cho, Deok-Hwan;Kang, Hyun-Tae;Kwon, Jung-Uk;Kim, Hyung-Su;Hwang, Gi-Hyun;Park, June-Ho
    • Proceedings of the KIEE Conference / 2001.07a / pp.327-329 / 2001
  • This paper applies a parallel genetic algorithm and parallel tabu search to find an optimal solution to the unit commitment problem. The proposed method first searches globally using the parallel genetic algorithm and then searches locally using tabu search, which has good local search characteristics, to reduce computation time. The method combines the benefits of both approaches and thus improves performance. To demonstrate its usefulness, we ran simulations on a 10-unit system. Numerical results show improvements in cost and computation time compared with previously obtained results.

Correct Implementation of Sub-warp Parallel Prefix Operations based on GPU Hardware Architecture (GPU 하드웨어 아키텍처 기반 sub-warp 단위 병렬 프리픽스(prefix) 연산의 정확한 구현)

  • Park, Taejung
    • Journal of Digital Contents Society / v.18 no.3 / pp.613-619 / 2017
  • This paper presents CUDA (Compute Unified Device Architecture) code that produces correct GPU parallel segmented prefix operation results for large data arrays when the segment length is less than 32. Mark Harris and Michael Garland had published CUDA code for this task. This paper shows that their code does not generate correct results when the local segment length is less than 32, discusses the cause of the problem, and presents CUDA code that generates correct results. The segmented parallel prefix operation presented in this paper can be applied as a building block in various large parallel processing algorithms, including k-nearest neighbor search.
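
As a rough illustration of what a segmented prefix operation with segment length below 32 computes, the following Python sketch compares a sequential reference against a lockstep, doubling-offset scan of the kind a sub-warp of GPU lanes would run. It is not the paper's CUDA code, nor Harris and Garland's. The function names, the choice of addition as the operator, and the assumption of equal-length segments aligned to the start of the array are all hypothetical.

```python
def segmented_inclusive_scan(values, segment_length):
    """Sequential reference: inclusive prefix sum restarting at each segment."""
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    out = []
    running = 0
    for index, value in enumerate(values):
        if index % segment_length == 0:
            running = 0  # a new segment starts here
        running += value
        out.append(running)
    return out


def subwarp_scan_lockstep(values, segment_length):
    """Doubling-offset scan, emulating lanes that advance in lockstep.

    Each lane adds the value held `offset` lanes to its left, but only if
    that neighbour lies inside the same segment (lane >= offset).
    """
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    data = list(values)
    offset = 1
    while offset < segment_length:
        previous = list(data)  # snapshot: every lane reads before any lane writes
        for index in range(len(data)):
            lane = index % segment_length  # position inside the segment
            if lane >= offset:
                data[index] = previous[index] + previous[index - offset]
        offset *= 2
    return data


print(segmented_inclusive_scan([1, 2, 3, 4, 5, 6, 7], 3))
print(subwarp_scan_lockstep([1, 2, 3, 4, 5, 6, 7], 3))
```

Both calls print `[1, 3, 6, 4, 9, 15, 7]`. The `lane >= offset` guard is what keeps a partial sum from leaking across a segment boundary.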

An Improving Method of Restructuring Parallel Programs for Data Race Detection

  • Ha, Keum-Sook;Lee, Sung woo;Yoo, Kee-Young
    • Proceedings of the IEEK Conference / 2000.07b / pp.715-718 / 2000
  • Although shared-memory parallel programs are designed to be deterministic in both their final results and intermediate states, races that occur when different processes access a common memory location in an order not guaranteed by synchronization can cause unintended non-deterministic executions. Detecting races, particularly first data races, is therefore important for debugging explicit shared-memory parallel programs. It is possible that all data races reported by other on-the-fly algorithms would disappear once the first races were removed. To detect races in parallel programs with nested loops and inter-thread coordination, the order of synchronization operations in an execution instance must be guaranteed. In this paper, we propose an improved restructuring method that guarantees the ordering of an execution instance and preserves the semantics of the original program. This method requires O(np) time and (s + up) space, where n is the total number of operations, s is the number of synchronization operations, and p is the degree of parallelism in the execution. The method also makes on-the-fly detection for parallel programs with nested loops and inter-thread coordination easier in terms of space and time complexity.

Parallel Algorithm of Improved FunkSVD Based on Spark

  • Yue, Xiaochen;Liu, Qicheng
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.5 / pp.1649-1665 / 2021
  • To address the low accuracy of the traditional FunkSVD algorithm and to improve its computational efficiency, this paper proposes a parallel algorithm of improved FunkSVD based on Spark (SP-FD). The traditional FunkSVD algorithm is improved using the RMSProp algorithm. The improved algorithm not only solves the problem of decreased accuracy caused by iterative oscillations but also alleviates the impact of data sparseness, thereby improving accuracy. The improved algorithm is parallelized with the Spark big data computing framework, which uses RDDs for iterative calculation and stores intermediate data in distributed memory to speed up the iteration. The Cartesian product operation in the improved FunkSVD algorithm is divided into blocks to enable parallel calculation, which further improves speed. Experiments on three standard data sets, measuring accuracy, execution time, and speedup, show that the SP-FD algorithm improves recommendation accuracy and shortens calculation time compared with the traditional FunkSVD and several other algorithms, and that it shows good parallel performance in a cluster environment with multiple nodes.
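
To make the combination of FunkSVD and RMSProp concrete, here is a minimal single-machine sketch in plain Python. It is an assumption-laden illustration rather than the SP-FD algorithm: it does not use Spark, and the per-rating update order, the hyperparameter values, the regularization term, and all function names are hypothetical. It only shows the general idea of scaling each latent-factor gradient step by a running average of squared gradients.

```python
import math
import random


def predict(P, Q, user, item):
    """Predicted rating: dot product of the user and item factor vectors."""
    return sum(p * q for p, q in zip(P[user], Q[item]))


def rmse(ratings, P, Q):
    """Root-mean-square error over the observed (user, item, rating) triples."""
    squared = [(r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings]
    return math.sqrt(sum(squared) / len(squared))


def funksvd_rmsprop(ratings, num_users, num_items, rank=2, epochs=300,
                    learning_rate=0.01, decay=0.9, regularization=0.02,
                    epsilon=1e-8, seed=0):
    """Factorize observed ratings as P @ Q^T using RMSProp-scaled updates."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(num_users)]
    Q = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(num_items)]
    # Running averages of squared gradients, one per factor entry.
    cache_P = [[0.0] * rank for _ in range(num_users)]
    cache_Q = [[0.0] * rank for _ in range(num_items)]
    for _ in range(epochs):
        for user, item, rating in ratings:
            error = rating - predict(P, Q, user, item)
            for k in range(rank):
                p, q = P[user][k], Q[item][k]
                # Gradients of the regularized squared error (up to a constant).
                grad_p = -error * q + regularization * p
                grad_q = -error * p + regularization * q
                cache_P[user][k] = decay * cache_P[user][k] + (1 - decay) * grad_p ** 2
                cache_Q[item][k] = decay * cache_Q[item][k] + (1 - decay) * grad_q ** 2
                # RMSProp step: divide by the root of the running average.
                P[user][k] -= learning_rate * grad_p / (math.sqrt(cache_P[user][k]) + epsilon)
                Q[item][k] -= learning_rate * grad_q / (math.sqrt(cache_Q[item][k]) + epsilon)
    return P, Q


# Toy data: (user, item, rating) triples, invented for illustration only.
toy_ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
               (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P, Q = funksvd_rmsprop(toy_ratings, num_users=3, num_items=3)
print("training RMSE:", rmse(toy_ratings, P, Q))
```

Dividing by the running root-mean-square gradient keeps the step size roughly uniform across factor entries, which is the damping effect on oscillation that the abstract attributes to RMSProp.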

Force Distribution Algorithms For Singularity-Free 3-DOF Parallel Haptic Device With Redundant Actuation

  • Kim, Tae-Ju;Chung, Goo-Bong;Yi, Byung-Ju;Seo, Il-Hong
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 2003.10a / pp.1598-1602 / 2003
  • A parallel-type mechanism provides more accurate and stiffer motion than a serial-type mechanism. However, when it is used as a haptic device, force reflection performance can deteriorate because of singular points in the workspace. In this paper, we propose a redundantly actuated parallel 3-DOF haptic device that is singularity-free in the workspace and has improved force reflection capability. In addition, we propose a new force distribution algorithm that can reflect forces at both high and low resolution using two sets of actuators of different sizes. The redundant actuators are attached to the base frame to minimize the inertia of the system. Moreover, a wire and gear reduction system is employed to achieve high force reflection along with a soft feel. We confirm the force reflection capability through simulation.

A Restricted Neighborhood Generation Scheme for Parallel Machine Scheduling (병렬 기계 스케줄링을 위한 제한적 이웃해 생성 방안)

  • Shin, Hyun-Joon;Kim, Sung-Shick
    • IE interfaces / v.15 no.4 / pp.338-348 / 2002
  • In this paper, we present a restricted tabu search (RTS) algorithm that schedules jobs on identical parallel machines to minimize the maximum lateness of jobs. Jobs have release times and due dates, and sequence-dependent setup times exist between jobs. The RTS algorithm consists of two main parts. The first is the MATCS (Modified Apparent Tardiness Cost with Setups) rule, which provides an efficient initial schedule for the RTS. The second is a search heuristic that employs a restricted neighborhood generation scheme, eliminating inefficient job moves when searching for the best neighboring schedule. The search heuristic greatly reduces the tabu search effort while still producing final schedules of good quality. Experimental results show that the proposed algorithm finds better solutions more quickly than existing heuristic algorithms such as the RHP (Rolling Horizon Procedure) heuristic, basic tabu search, and simulated annealing.
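
The objective described above can be evaluated for any candidate schedule with a short routine such as the following Python sketch. It is only an evaluation function, not the RTS algorithm or the MATCS rule. The function name, the data layout, and the timing convention (a setup begins as soon as the machine is free, the first job on a machine needs no setup, and a job cannot start before its release time) are assumptions made for illustration.

```python
def max_lateness(schedule, processing, release, due, setup):
    """Return the maximum lateness (completion time minus due date).

    schedule   -- list of job sequences, one per identical machine
    processing -- processing[j] is the processing time of job j
    release    -- release[j] is the earliest start time of job j
    due        -- due[j] is the due date of job j
    setup      -- setup[i][j] is the setup time when job j follows job i
    """
    worst = float("-inf")
    for machine_jobs in schedule:
        clock = 0      # time at which the machine becomes free
        previous = None
        for job in machine_jobs:
            ready = clock
            if previous is not None:
                ready += setup[previous][job]  # sequence-dependent setup
            start = max(ready, release[job])   # wait for the release time
            clock = start + processing[job]
            worst = max(worst, clock - due[job])
            previous = job
    return worst


# Toy instance with four jobs and two machines (values invented).
processing = [3, 2, 4, 1]
release = [0, 1, 0, 5]
due = [4, 4, 6, 7]
setup = [[0, 1, 1, 1],
         [1, 0, 1, 1],
         [1, 1, 0, 1],
         [1, 1, 1, 0]]

print(max_lateness([[0, 1], [2, 3]], processing, release, due, setup))   # 2
print(max_lateness([[0], [2, 1, 3]], processing, release, due, setup))   # 3
```

The second call evaluates a neighboring schedule obtained by moving job 1 to the other machine. A neighborhood search would call such a function on each candidate move, which is why restricting the set of moves reduces the search effort.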

A Synchronous/Asynchronous Hybrid Parallel Power Iteration for Large Eigenvalue Problems by the MPMD Methodology (MPMD 방식의 동기/비동기 병렬 혼합 멱승법에 의한 거대 고유치 문제의 해법)

  • Park, Pil-Seong
    • The KIPS Transactions:PartA / v.11A no.1 / pp.67-74 / 2004
  • Most of today's parallel numerical schemes use synchronous algorithms, in which processors that finish their tasks earlier than others must wait at synchronization points to ensure correct computation. Hence the overall performance of the system depends on the speed of the slowest processor. In this paper, we devise a synchronous/asynchronous hybrid algorithm that accelerates convergence when finding the dominant eigenpair of a large matrix, reducing the idle time of faster processors by using the MPMD programming methodology.
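
For readers unfamiliar with the underlying method, the following is a minimal sequential power iteration in plain Python. It is a baseline sketch only: it contains none of the paper's synchronous/asynchronous hybrid scheme or MPMD structure, and the function name, stopping rule, and default parameters are assumptions.

```python
import math


def power_iteration(matrix, iterations=200, tolerance=1e-12):
    """Approximate the dominant eigenpair of a square matrix (list of rows)."""
    n = len(matrix)
    vector = [1.0 / math.sqrt(n)] * n  # normalized starting guess
    eigenvalue = 0.0
    for _ in range(iterations):
        # Multiply by the matrix and renormalize.
        product = [sum(row[j] * vector[j] for j in range(n)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            raise ValueError("iterate collapsed to the zero vector")
        next_vector = [x / norm for x in product]
        # Rayleigh quotient of the normalized iterate estimates the eigenvalue.
        applied = [sum(row[j] * next_vector[j] for j in range(n)) for row in matrix]
        next_eigenvalue = sum(a * b for a, b in zip(next_vector, applied))
        converged = abs(next_eigenvalue - eigenvalue) < tolerance
        vector, eigenvalue = next_vector, next_eigenvalue
        if converged:
            break
    return eigenvalue, vector


value, vec = power_iteration([[2.0, 1.0], [1.0, 3.0]])
print(value)  # close to (5 + sqrt(5)) / 2, the larger eigenvalue
```

In a synchronous parallel version, each matrix-vector product would be split across processors, and every processor would wait for the slowest one before normalizing. That waiting is the idle time the abstract refers to.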

Parallel Genetic Algorithm using Fuzzy Logic (퍼지 논리를 이용한 병렬 유전 알고리즘)

  • An, Young-Hwa;Kwon, Key-Ho
    • The KIPS Transactions:PartA / v.13A no.1 s.98 / pp.53-56 / 2006
  • Genetic algorithms (GA), which are based on the ideas of natural selection and natural genetics, have proven successful in solving difficult problems that are not easily solved by conventional methods. The classical GA takes a long time when the population is large. The parallel genetic algorithm (PGA) is an extension of the classical GA. The important aspects of a PGA are migration and the GA operations. This paper presents PGAs that use fuzzy logic. Experimental results show that the proposed methods perform well compared with the classical method.

Improved Iterative Decoding of Parallel and Serially Concatenated Trellis Coded Modulation (병렬 및 직렬적으로 연접된 트렐리스 부호화 변조 기법을 위한 향상된 반복적 복호 기법)

  • You, Cheol-Woo;Seo, Dong-Sun
    • Journal of IKEEE / v.11 no.4 / pp.198-204 / 2007
  • For parallel and serially concatenated trellis coded modulation (TCM), improved iterative decoding schemes with a simple mechanism are proposed, and their performance is compared with that of conventional decoding schemes. Simulation results show that the proposed schemes provide a considerable decoding gain in additive white Gaussian noise (AWGN) channels and Rayleigh fading channels, even though they can be implemented by a simple modification of conventional decoding algorithms.

A Parallel Algorithm For Rectilinear Steiner Tree Using Associative Processor (연합 처리기를 이용한 직교선형 스타이너 트리의 병렬 알고리즘)

  • Park, Taegeun
    • Journal of the Korean Institute of Telematics and Electronics B / v.32B no.8 / pp.1057-1063 / 1995
  • This paper describes an approach for constructing a Rectilinear Steiner Tree (RST) derivable from a Minimum Spanning Tree (MST) using an Associative Processor (AP). We propose a fast parallel algorithm built on the AP's basic algorithms, which can be realized with the processing capability of rudimentary logic and the selective matching capability of Content-Addressable Memory (CAM). The main idea behind the proposed algorithm is to maximize the overlaps between consecutive edges in the MST, thus minimizing the cost of the RST. An efficient parallel linear algorithm with O(n) complexity is proposed to construct an RST using an algorithm that finds an MST, where n is the number of nodes. A node insertion method is introduced to allow the Z-type layout. The routing process, which depends only on neighboring edges, and the no-rerouting strategy both help to speed up finding an RST.
