• Title/Abstract/Keywords: distributed parallel algorithm

Implementation Of Asymmetric Communication For Asynchronous Iteration By the MPMD Method On Distributed Memory Systems (분산 메모리 시스템에서의 MPMD 방식의 비동기 반복 알고리즘을 위한 비대칭 전송의 구현)

  • Park Pil-Seong
    • Journal of Internet Computing and Services / v.4 no.5 / pp.51-60 / 2003
  • Asynchronous iteration is a way to reduce the performance degradation that some parallel algorithms suffer because of load imbalance or transmission delay between computing nodes. It requires asymmetric communication between nodes of different speeds. To implement such asynchronous communication on distributed memory systems, we suggest an MPMD method that creates an additional, separate server process on each computing node, and compare it with an SPMD method that creates a single process per node.

Optimal Design of Direct-Driven Wind Generator Using Mesh Adaptive Direct Search(MADS) (MADS를 이용한 직접구동형 풍력발전기 최적설계)

  • Park, Ji-Seong;An, Young-Jun;Lee, Cheol-Gyun;Kim, Jong-Wook;Jung, Sang-Yong
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers / v.23 no.12 / pp.48-57 / 2009
  • This paper presents the optimal design of a direct-driven PM wind generator using MADS (Mesh Adaptive Direct Search). The design, which combines MADS with FEM (Finite Element Method), has been performed to maximize the Annual Energy Production (AEP) over the whole wind speed range, characterized by a statistical model of the wind speed distribution. In particular, the newly applied MADS reduces the computation time compared with a Genetic Algorithm (GA) implemented with the parallel computing method.

Static Allocation of C++ Objects to CORBA-based Distributed Systems (C++ 객체의 CORBA 기반 분산 시스템으로의 정적 할당)

  • 최승훈
    • Journal of Internet Computing and Services / v.1 no.2 / pp.69-88 / 2000
  • One of the most important factors in the performance of distributed systems is the effective distribution of software components. There has been a lot of research on partitioning and allocating task-based systems, while studies on allocating the objects of an object-oriented system to distributed object environments are relatively few. In this paper, we define a graph model for partitioning an existing C++ application and allocating its C++ objects to a CORBA-based distributed system. In addition, we propose a distributed object allocation algorithm based on this graph model. The performance of distributed systems is determined by the concurrency between objects, the load balance among the processors, and the communication cost on the network. To search for solutions that optimize these three factors simultaneously, the object allocation algorithm of this paper is based on the Niched Pareto Genetic Algorithm (NPGA). We performed an experiment on a typical C++ application and CORBA system to demonstrate the effectiveness of our graph model and our object allocation algorithm.

Support vector machines for big data analysis (빅 데이터 분석을 위한 지지벡터기계)

  • Choi, Hosik;Park, Hye Won;Park, Changyi
    • Journal of the Korean Data and Information Science Society / v.24 no.5 / pp.989-998 / 2013
  • Big data, which has recently attracted attention in industry and academia, cannot be analyzed by the batch processing algorithms developed in data mining because big data, by definition, cannot be loaded and processed in the memory of a single system. An urgent issue is therefore to develop various learning algorithms that can be applied to big data. In this paper, we review various algorithms for support vector machines in the literature. In particular, we introduce online and parallel processing algorithms that are expected to be useful in big data classification, and compare the strengths, weaknesses, and performance of those algorithms through simulations for linear classification.
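To make the idea of an online algorithm for a linear support vector machine concrete, the following is a minimal sketch that uses a Pegasos-style stochastic subgradient update on the hinge loss. It is not taken from the paper: the choice of update rule, the function names `train_online_svm` and `predict`, the regularization value `lam`, and the toy data are all assumptions made for illustration. The point is only that each example is seen once and then discarded, so the full data set never has to fit in memory.

```python
import random

def train_online_svm(stream, dim, lam=0.1):
    """Pegasos-style online training of a linear SVM without a bias term.

    `stream` yields (x, y) pairs with y in {-1, +1}; each pair is used
    once and never stored, so memory use is O(dim).
    """
    w = [0.0] * dim
    for t, (x, y) in enumerate(stream, start=1):
        eta = 1.0 / (lam * t)                      # decaying step size
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        w = [(1.0 - 1.0 / t) * wi for wi in w]     # shrink: (1 - eta * lam) * w
        if margin < 1.0:                           # hinge loss is active
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w

def predict(w, x):
    """Return +1 or -1 according to the sign of the linear score."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1

# Toy, linearly separable data (hypothetical, for illustration only).
positives = [(2.0, 1.0), (3.0, -1.0), (2.5, 0.5), (1.5, -0.5)]
data = [(p, 1) for p in positives] + [((-p[0], -p[1]), -1) for p in positives]

rng = random.Random(0)
stream = (rng.choice(data) for _ in range(400))    # simulated data stream
w = train_online_svm(stream, dim=2)
print([predict(w, x) == y for x, y in data])
```

A parallel variant could, under similar assumptions, train one such model per data partition and combine the weight vectors, but that step is not shown here.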

Comparison of Genetic Algorithms and Simulated Annealing for Multiprocessor Task Allocation (멀티프로세서 태스크 할당을 위한 GA과 SA의 비교)

  • Park, Gyeong-Mo
    • The Transactions of the Korea Information Processing Society / v.6 no.9 / pp.2311-2319 / 1999
  • We present two heuristic algorithms for the task allocation problem (an NP-complete problem) in parallel computing. The problem is to find an optimal mapping of multiple communicating tasks of a parallel program onto the multiple processing nodes of a distributed-memory multicomputer. The purpose of mapping these tasks onto the nodes of the target architecture is to minimize parallel execution time without sacrificing solution quality. Many heuristic approaches have been employed to obtain a satisfactory mapping. Our heuristics are based on genetic algorithms and simulated annealing. We formulate an objective function as the total computational cost of a mapping configuration, and evaluate the performance of our heuristic algorithms. We compare the solution quality and running times of the random, greedy, genetic, and annealing algorithms. Our experimental findings from a simulation study of the allocation algorithms are presented.

Performance Improvement of Network Based Parallel Genetic Algorithm by Exploiting Server's Computing Power (서버의 계산능력을 활용한 네트워크기반 병렬유전자알고리즘의 성능향상)

  • 송봉기;김용성;성길영;우종호
    • Journal of the Institute of Electronics Engineers of Korea CI / v.41 no.4 / pp.67-72 / 2004
  • This paper proposes a method for improving the convergence speed toward the optimal solution of a parallel genetic algorithm in the network-based client-server model. Unlike existing methods, which obtain the global elite only by evaluating the local elites in the server, the proposed method obtains it by evaluating the local elites and then improving its fitness by applying a genetic algorithm during the idle time of the server. By using the chromosome improved in the server for the clients' genetic algorithm processing, the convergence speed toward the optimal solution is increased. The improvement of fitness at the server during the interval of chromosome migration is (equation omitted)$(F_{max}(g)-F_{max}(g-1))$, where $F_{max}(g)$ is the maximum fitness of the g-th generation and G is the number of generations improved by the server. As the number of clients increases and G decreases, the improvement of fitness goes down. However, the improvement of fitness is still better than that of existing methods.
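The client-server scheme described in this abstract can be sketched roughly as follows. This is a single-process simulation under several assumptions that are not in the abstract: the OneMax objective in `fitness`, the operators in `mutate` and `crossover`, all parameter values, and the migration rule (replace each client's worst individual) are hypothetical stand-ins. The only part that mirrors the abstract is that the server runs a few GA generations on the collected local elites instead of merely picking the best one.

```python
import random

def fitness(chrom):
    # OneMax: number of 1 bits (an assumed stand-in objective).
    return sum(chrom)

def mutate(chrom, rate, rng):
    return [1 - g if rng.random() < rate else g for g in chrom]

def crossover(a, b, rng):
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def evolve(pop, generations, rng, rate=0.02):
    """Elitist GA: the best individual always survives unchanged."""
    for _ in range(generations):
        pop = sorted(pop, key=fitness, reverse=True)
        parents = pop[: max(2, len(pop) // 2)]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rate, rng)
            for _ in range(len(pop) - 1)
        ]
        pop = [pop[0]] + children
    return pop

def run(num_clients=4, pop_size=20, length=40, epochs=5,
        client_gens=5, server_gens=3, seed=0):
    rng = random.Random(seed)
    clients = [[[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
               for _ in range(num_clients)]
    history = []  # (best local elite fitness, server-improved fitness) per epoch
    for _ in range(epochs):
        # Clients evolve independently (sequentially here, in parallel in practice).
        clients = [evolve(pop, client_gens, rng) for pop in clients]
        local_elites = [max(pop, key=fitness) for pop in clients]
        # Server uses its idle time to run a small GA on the local elites.
        global_elite = max(evolve(local_elites, server_gens, rng), key=fitness)
        history.append((max(map(fitness, local_elites)), fitness(global_elite)))
        # Migration: the improved chromosome replaces each client's worst member.
        for pop in clients:
            worst = min(range(len(pop)), key=lambda i: fitness(pop[i]))
            pop[worst] = global_elite
    return history

history = run()
for local_best, server_best in history:
    print(local_best, server_best)
```

Because `evolve` is elitist, the server-side value in each epoch can never be lower than the best local elite it received; how much higher it gets depends on `server_gens`, which plays the role of G in the abstract.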

A Genetic-Based Optimization Model for Clustered Node Allocation System in a Distributed Environment (분산 환경에서 클러스터 노드 할당 시스템을 위한 유전자 기반 최적화 모델)

  • Park, Kyeong-mo
    • The KIPS Transactions:PartA / v.10A no.1 / pp.15-24 / 2003
  • In this paper, an optimization model for clustered node allocation systems in a distributed computing environment is presented. In the presented model, which uses a distributed file system framework, the dynamics of system behavior over time are taken into account across the nodes, and the cluster monitor node is therefore given the ability to check the feasibility of the current clustered node allocation. The cluster monitor node of the node allocation system, which distributes parallel modules to clustered nodes, provides a good allocation solution using Genetic Algorithms (GA). As part of the experimental study, the effects on solution quality and computation time of varying GA parameters, such as the encoding scheme, the genetic operators (crossover, mutation), the population size, and the number of node modules, are presented along with comparative findings.

Efficient distributed consensus optimization based on patterns and groups for federated learning (연합학습을 위한 패턴 및 그룹 기반 효율적인 분산 합의 최적화)

  • Kang, Seung Ju;Chun, Ji Young;Noh, Geontae;Jeong, Ik Rae
    • Journal of Internet Computing and Services / v.23 no.4 / pp.73-85 / 2022
  • In the era of the 4th industrial revolution, where automation and connectivity are maximized with artificial intelligence, the importance of data collection and utilization for model updates is increasing. To create a model using artificial intelligence technology, it is usually necessary to gather data in one place so that the model can be updated, but this can infringe on users' privacy. In this paper, we introduce federated learning, a distributed machine learning method that can update models cooperatively without directly sharing data stored in a distributed manner, and introduce a study to optimize distributed consensus among participants without an existing server. In addition, we propose a pattern- and group-based distributed consensus optimization algorithm that uses an algorithm for generating patterns and groups based on the Kirkman Triple System, and performs parallel updates and communication. This algorithm guarantees more privacy than the existing distributed consensus optimization algorithm and reduces the communication time until the model converges.
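As a small illustration of how a Kirkman Triple System can provide communication groups, the sketch below builds the order-9 system from the lines of the affine plane over Z_3: four rounds, each splitting nine participants into three groups of three, with every pair of participants meeting in exactly one group. This is not the authors' algorithm. The function names, the restriction to nine participants, and the use of plain group averaging as the "update" are assumptions for illustration only.

```python
from itertools import combinations

def kirkman_schedule_9():
    """Return 4 rounds of 3 disjoint triples over participants 0..8.

    Participant (x, y) in Z_3 x Z_3 gets id 3 * x + y. Rounds are the
    parallel classes of lines y = m * x + b (m = 0, 1, 2) and x = c.
    """
    def pid(x, y):
        return 3 * x + y

    rounds = []
    for m in range(3):
        rounds.append([
            tuple(sorted(pid(x, (m * x + b) % 3) for x in range(3)))
            for b in range(3)
        ])
    rounds.append([tuple(pid(c, y) for y in range(3)) for c in range(3)])
    return rounds

def is_kirkman(rounds, n=9):
    """Check that each round partitions 0..n-1 and each pair meets exactly once."""
    pair_count = {pair: 0 for pair in combinations(range(n), 2)}
    for groups in rounds:
        if sorted(p for g in groups for p in g) != list(range(n)):
            return False
        for g in groups:
            for pair in combinations(sorted(g), 2):
                pair_count[pair] += 1
    return all(count == 1 for count in pair_count.values())

def group_average(values, rounds):
    """In each round, every group replaces its members' values by the group mean."""
    values = list(values)
    for groups in rounds:          # groups within a round could run in parallel
        for g in groups:
            mean = sum(values[p] for p in g) / len(g)
            for p in g:
                values[p] = mean
    return values

rounds = kirkman_schedule_9()
print(is_kirkman(rounds))
print(group_average(range(9), rounds[:2]))
```

In this nine-participant toy case, any two groups taken from different rounds share exactly one participant, so two rounds of group averaging already give every participant the global mean. No participant ever communicates with more than two others in a round.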

Peak Power Minimization for Clustered VLIW Architectures (분산된 VLIW 구조에서의 최대 전력 최소화 방법)

  • 서재원;김태환;정기석
    • Journal of KIISE:Computer Systems and Theory / v.30 no.5_6 / pp.258-264 / 2003
  • VLIW architecture has emerged as one of the most effective architectures for multimedia applications. In multimedia applications, there is ample potential for parallelizing the execution of multiple operations because such applications typically involve data-intensive processing, which often has limited data and/or control dependencies. As the degree of instruction-level parallelism increases, non-clustered VLIW architectures scale poorly because of the tremendous register port pressure. Therefore, a clustered VLIW architecture is preferred over a non-clustered one when a higher degree of parallelism is possible, as in the case of multimedia processing. However, having multiple clusters in an architecture implies that the amount of hardware is quite large, and therefore power consumption becomes a crucial issue. In this paper, we propose an algorithm to minimize the peak power consumption while incurring little or no delay penalty. The effectiveness of our algorithm has been verified by various sets of experiments, and up to a 30.7% reduction in peak power consumption is observed compared with results that are optimized to minimize resources only.