• 제목/요약/키워드: Resilient distributed datasets

검색결과 3건 처리시간 0.019초

Cooperative Coevolution Differential Evolution Based on Spark for Large-Scale Optimization Problems

  • Tan, Xujie;Lee, Hyun-Ae;Shin, Seong-Yoon
    • Journal of information and communication convergence engineering
    • /
    • 제19권3호
    • /
    • pp.155-160
    • /
    • 2021
  • Differential evolution is an efficient algorithm for solving continuous optimization problems. However, its performance deteriorates rapidly, and the runtime increases exponentially when differential evolution is applied for solving large-scale optimization problems. Hence, a novel cooperative coevolution differential evolution based on Spark (known as SparkDECC) is proposed. The divide-and-conquer strategy is used in SparkDECC. First, the large-scale problem is decomposed into several low-dimensional subproblems using the random grouping strategy. Subsequently, each subproblem can be addressed in a parallel manner by exploiting the parallel computation capability of the resilient distributed datasets model in Spark. Finally, the optimal solution of the entire problem is obtained using the cooperation mechanism. The experimental results on 13 high-benchmark functions show that the new algorithm performs well in terms of speedup and scalability. The effectiveness and applicability of the proposed algorithm are verified.

A Hybrid Mechanism of Particle Swarm Optimization and Differential Evolution Algorithms based on Spark

  • Fan, Debin;Lee, Jaewan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제13권12호
    • /
    • pp.5972-5989
    • /
    • 2019
  • With the onset of the big data age, data is growing exponentially, and the issue of how to optimize large-scale data processing is especially significant. Large-scale global optimization (LSGO) is a research topic with great interest in academia and industry. Spark is a popular cloud computing framework that can cluster large-scale data, and it can effectively support the functions of iterative calculation through resilient distributed datasets (RDD). In this paper, we propose a hybrid mechanism of particle swarm optimization (PSO) and differential evolution (DE) algorithms based on Spark (SparkPSODE). The SparkPSODE algorithm is a parallel algorithm, in which the RDD and island models are employed. The island model is used to divide the global population into several subpopulations, which are applied to reduce the computational time by corresponding to RDD's partitions. To preserve population diversity and avoid premature convergence, the evolutionary strategy of DE is integrated into SparkPSODE. Finally, SparkPSODE is conducted on a set of benchmark problems on LSGO and show that, in comparison with several algorithms, the proposed SparkPSODE algorithm obtains better optimization performance through experimental results.

Spark 애플리케이션 기반의 성능 분석 (A Performance Analysis Based on Spark Application)

  • 정영교;이병준;조영주;윤희용
    • 한국컴퓨터정보학회:학술대회논문집
    • /
    • 한국컴퓨터정보학회 2016년도 제53차 동계학술대회논문집 24권1호
    • /
    • pp.79-80
    • /
    • 2016
  • 아파치 스파크는 효율적으로 대용량 데이터를 처리하기 위해 분산 메모리 추상화를 사용하는 오픈 소스 분산 데이터 처리 플랫폼이다. 하지만 아파치 스파크 플랫폼의 특정 작업의 성능은 입력 데이터의 유형과 크기, 디자인 및 알고리즘의 구현 및 컴퓨팅 능력에 따라 메모리 사용량 및 I/O 비용이 크게 달라질 수 있다는 문제점이 있다. 이러한 문제점을 해결하기 위하여 본 논문에서는 아파치 스파크 플랫폼에 대한 높은 정밀도 작업 성능을 예측할 수 있도록 CPU core수의 증가에 따른 WordCount 시뮬레이션을 비교 평가 하였다.

  • PDF