• Title/Summary/Keyword: Distributed and Parallel Algorithms

Search Result 77, Processing Time 0.029 seconds

Parallel Algorithm of Improved FunkSVD Based on Spark

  • Yue, Xiaochen;Liu, Qicheng
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.5
    • /
    • pp.1649-1665
    • /
    • 2021
  • In view of the low accuracy of the traditional FunkSVD algorithm, and in order to improve the computational efficiency of the algorithm, this paper proposes a parallel algorithm of improved FunkSVD based on Spark (SP-FD). Using RMSProp algorithm to improve the traditional FunkSVD algorithm. The improved FunkSVD algorithm can not only solve the problem of decreased accuracy caused by iterative oscillations but also alleviate the impact of data sparseness on the accuracy of the algorithm, thereby achieving the effect of improving the accuracy of the algorithm. And using the Spark big data computing framework to realize the parallelization of the improved algorithm, to use RDD for iterative calculation, and to store calculation data in the iterative process in distributed memory to speed up the iteration. The Cartesian product operation in the improved FunkSVD algorithm is divided into blocks to realize parallel calculation, thereby improving the calculation speed of the algorithm. Experiments on three standard data sets in terms of accuracy, execution time, and speedup show that the SP-FD algorithm not only improves the recommendation accuracy, shortens the calculation interval compared to the traditional FunkSVD and several other algorithms but also shows good parallel performance in a cluster environment with multiple nodes. The analysis of experimental results shows that the SP-FD algorithm improves the accuracy and parallel computing capability of the algorithm, which is better than the traditional FunkSVD algorithm.

A Global Framework for Parallel and Distributed Application with Mobile Objects (이동 객체 기반 병렬 및 분산 응용 수행을 위한 전역 프레임워크)

  • Han, Youn-Hee;Park, Chan-Yeol;Hwang, Chong-Sun;Jeong, Young-Sik
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.6 no.6
    • /
    • pp.555-568
    • /
    • 2000
  • The World Wide Web has become the largest virtual system that is almost universal in scope. In recent research, it has become effective to utilize idle hosts existing in the World Wide Web for running applications that require a substantial amount of computation. This novel computing paradigm has been referred to as the advent of global computing. In this paper, we implement and propose a mobile object-based global computing framework called Tiger, whose primary goal is to present novel object-oriented programming libraries that support distribution, dispatching, migration of objects and concurrency among computational activities. The programming libraries provide programmers with access, location and migration transparency for distributed and mobile objects. Tiger's second goal is to provide a system supporting requisites for a global computing environment - scalability, resource and location management. The Tiger system and the programming libraries provided allow a programmer to easily develop an objectoriented parallel and distributed application using globally extended computing resources. We also present the improvement in performance gained by conducting the experiment with highly intensive computations such as parallel fractal image processing and genetic-neuro-fuzzy algorithms.

  • PDF

Performance Evaluation of Hash Join Algorithms Supporting Dynamic Load Balancing for a Database Sharing System (데이타베이스 공유 시스템에서 동적 부하분산을 지원하는 해쉬 조인 알고리즘들의 성능 평가)

  • Moon, Ae-Kyung;Cho, Haeng-Rae
    • The Transactions of the Korea Information Processing Society
    • /
    • v.6 no.12
    • /
    • pp.3456-3468
    • /
    • 1999
  • Most of previous parallel join algorithms assume a database partition system(DPS), where each database partition is owned by a single processing node. While the DPS is novel in the sense that it can interconnect a large number of nodes and support a geographically distributed environment, it may suffer from poor facility for load balancing and system availability compared to the database sharing system(DSS). In this paper, we propose a dynamic load balancing strategy by exploiting the characteristics of the DSS, and then extend the conventional hash join algorithms to the DSS by using the dynamic load balancing strategy. With simulation studies under a wide variety of system configurations and database workloads, we analyze the effects of the dynamic load balancing strategy and differences in the performances of hash join algorithms in the DSS.

  • PDF

Multi-objective optimization using a two-leveled symbiotic evolutionary algorithm (2 계층 공생 진화알고리듬을 이용한 다목적 최적화)

  • Sin, Gyeong-Seok;Kim, Yeo-Geun
    • Proceedings of the Korean Operations and Management Science Society Conference
    • /
    • 2006.11a
    • /
    • pp.573-576
    • /
    • 2006
  • This paper deals with multi-objective optimization problem of finding a set of well-distributed solutions close to the true Pareto optimal solutions. In this paper, we present a two-leveled symbiotic evolutionary algorithm to efficiently solve the problem. Most of the existing multi-objective evolutionary algorithms (MOEAs) operate one population that consists of individuals representing the complete solution to the problem. The proposed algorithm maintains several populations, each of which represents a partial solution to the entire problem, and has a structure with two levels. The parallel search and the structure are intended to improve the capability of searching diverse and good solutions. The performance of the proposed algorithm is compared with those of the existing algorithms in terms of convergence and diversity. The experimental results confirm the effectiveness of the proposed algorithm.

  • PDF

Scalable Prediction Models for Airbnb Listing in Spark Big Data Cluster using GPU-accelerated RAPIDS

  • Muralidharan, Samyuktha;Yadav, Savita;Huh, Jungwoo;Lee, Sanghoon;Woo, Jongwook
    • Journal of information and communication convergence engineering
    • /
    • v.20 no.2
    • /
    • pp.96-102
    • /
    • 2022
  • We aim to build predictive models for Airbnb's prices using a GPU-accelerated RAPIDS in a big data cluster. The Airbnb Listings datasets are used for the predictive analysis. Several machine-learning algorithms have been adopted to build models that predict the price of Airbnb listings. We compare the results of traditional and big data approaches to machine learning for price prediction and discuss the performance of the models. We built big data models using Databricks Spark Cluster, a distributed parallel computing system. Furthermore, we implemented models using multiple GPUs using RAPIDS in the spark cluster. The model was developed using the XGBoost algorithm, whereas other models were developed using traditional central processing unit (CPU)-based algorithms. This study compared all models in terms of accuracy metrics and computing time. We observed that the XGBoost model with RAPIDS using GPUs had the highest accuracy and computing time.

Sort-Based Distributed Parallel Data Cube Computation Algorithm using MapReduce (맵리듀스를 이용한 정렬 기반의 데이터 큐브 분산 병렬 계산 알고리즘)

  • Lee, Suan;Kim, Jinho
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.49 no.9
    • /
    • pp.196-204
    • /
    • 2012
  • Recently, many applications perform OLAP(On-Line Analytical Processing) over a very large volume of data. Multidimensional data cube is regarded as a core tool in OLAP analysis. This paper focuses on the method how to efficiently compute data cubes in parallel by using a popular parallel processing tool, MapReduce. We investigate efficient ways to implement PipeSort algorithm, a well-known data cube computation method, on the MapReduce framework. The PipeSort executes several (descendant) cuboids at the same time as a pipeline by scanning one (ancestor) cuboid once, which have the same sorting order. This paper proposed four ways implementing the pipeline of the PipeSort on the MapReduce framework which runs across 20 servers. Our experiments show that PipeMap-NoReduce algorithm outperforms the rest algorithms for high-dimensional data. On the contrary, Post-Pipe stands out above the others for low-dimensional data.

High-Performance Korean Morphological Analyzer Using the MapReduce Framework on the GPU

  • Cho, Shi-Won;Lee, Dong-Wook
    • Journal of Electrical Engineering and Technology
    • /
    • v.6 no.4
    • /
    • pp.573-579
    • /
    • 2011
  • To meet the scalability and performance requirements of data analyses, which often involve voluminous data, efficient parallel or concurrent algorithms and frameworks are essential. We present a high-performance Korean morphological analyzer which employs the MapReduce framework on the graphics processing unit (GPU). MapReduce is a programming framework introduced by Google to aid the development of web search applications on a large number of central processing units (CPUs). GPUs are designed as a special-purpose co-processor. Their programming interfaces are typically formulated for graphics applications. Compared to CPUs, GPUs have greater computation power and memory bandwidth; however, GPUs are more difficult to program because of the design of their architectures. The performance of the Korean morphological analyzer using the MapReduce framework on the GPU is evaluated in comparison with the CPU-based model. The proposed Korean Morphological analyzer shows promising scalable performance on distributed computing with the GPU.

Auto-Tuning of Reference Model Based PID Controller Using Immune Algorithm

  • Kim, Dong-Hwa;Park, Jin-Ill
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.12 no.3
    • /
    • pp.246-254
    • /
    • 2002
  • In this paper auto-tuning scheme of PID controller based on the reference model has been studied for a Process control system by immune algorithm. Up to this time, many sophisticated tuning algorithms have been tried in order to improve the PID controller performance under such difficult conditions. Also, a number of approaches have been proposed to implement mixed control structures that combine a PID controller with fuzzy logic. However, in the actual plant, they are manually tuned through a trial and error procedure, and the derivative action is switched off. Therefore, it is difficult to tune. Since the immune system possesses a self organizing and distributed memory, it is thus adaptive to its external environment and allows a PDP (Parallel Distributed Processing) network to complete patterns against the environmental situation. Simulation results reveal that reference model basd tuning by immune network suggested in this paper is an effective approach to search for optimal or near optimal process control.

Two-Phase Distributed Evolutionary algorithm with Inherited Age Concept

  • Kang, Young-Hoon;Z. Zenn Bien
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 2001.10a
    • /
    • pp.101.4-101
    • /
    • 2001
  • Evolutionary algorithm has been receiving a remarkable attention due to the model-free and population-based parallel search attributes and much successful results are coming out. However, there are some problems in most of the evolutionary algorithms. The critical one is that it takes much time or large generations to search the global optimum in case of the objective function with multimodality. Another problem is that it usually cannot search all the local optima because it pays great attention to the search of the global optimum. In addition, if the objective function has several global optima, it may be very difficult to search all the global optima due to the global characteristics of the selection methods. To cope with these problems, at first we propose a preprocessing process, grid-filtering algorithm(GFA), and propose a new distributed evolutionary ...

  • PDF

Performance Evaluation of Scheduling Algorithms according to Communication Cost in the Grid System of Co-allocation Environment (Co-allocation 환경의 그리드 시스템에서 통신비용에 따른 스케줄링 알고리즘의 성능 분석)

  • Kang, Oh-Han;Kang, Sang-Seong;Kim, Jin-Suk
    • The KIPS Transactions:PartA
    • /
    • v.14A no.2
    • /
    • pp.99-106
    • /
    • 2007
  • Grid computing, a mechanism which uses heterogeneous systems that are geographically distributed, draws attention as a new paradigm for the next generation operation of parallel and distributed computing. The importance of grid computing concerning communication cost is very huge because grid computing furnishes uses with integrated virtual computing service, in which a number of computer systems are connected by a high-speed network. Therefore, to reduce the execution time, the scheduling algorithm in grid environment should take communication cost into consideration as well as computing ability of resources. However, most scheduling algorithms have not only ignored the communication cost by assuming that all tasks were dealt in one cluster, but also did not consider the overhead of communication cost when the tasks were processed in a number of clusters. In this paper, the functions of original scheduling algorithms are analyzed. More importantly, the functions of algorithms are compared and analyzed with consideration of communication cost within the co allocation environment, in which a task is performed separately in many clusters.