• Title/Summary/Keyword: distributed computing cluster

An Internet-based computing framework for the simulation of multi-scale response of structural systems

  • Chen, Hung-Ming;Lin, Yu-Chih
    • Structural Engineering and Mechanics / v.37 no.1 / pp.17-37 / 2011
  • This paper presents a new Internet-based computational framework for the realistic simulation of the multi-scale response of structural systems. The framework involves two levels of parallel processing: multiple local distributed computing environments are connected by the Internet to form a cluster-to-cluster distributed computing environment. To use such an environment for realistic simulation, the simulation task of a structural system is separated into a simulation of a simplified global model and several detailed component models at various scales. These related multi-scale simulation tasks are distributed among clusters and connected to form a multi-level hierarchy, and the Internet is used to coordinate the geographically distributed tasks. This paper also presents the development of a software framework that supports the multi-level hierarchical simulation approach in a cluster-to-cluster distributed computing environment. The architectural design of the program also allows several multi-scale models to be integrated as clients and servers under a single platform. Such integration can combine geographically distributed computing resources to produce realistic simulations of structural systems.

A Data Transfer Method of the Sub-Cluster Group based on the Distributed and Shared Memory (분산 공유메모리를 기반으로 한 서브 클러스터 그룹의 자료전송방식)

  • Lee, Kee-Jun
    • The KIPS Transactions:PartA / v.10A no.6 / pp.635-642 / 2003
  • The rapid development of network technology provides the foundation for building fast, inexpensive cluster systems. Conventional cluster systems are generally built above a fixed performance level on stable, high-speed local networks. A multi-distributed web cluster group is a web cluster model that uses low-cost, low-speed system nodes on a network and achieves high performance, high efficiency, and high availability through effective job division, cooperation among system nodes, parallel execution of a given task, and the shared memory of the SC-Server. To this end, the multi-distributed web cluster group builds sub-cluster groups, each binding multiple system nodes into a single virtual network, and uses the web distributed shared memory of the system nodes for effective data transmission within sub-cluster groups. Because the presented model applies load balancing and parallel computing to the large-scale work requested by users, it can maximize processing efficiency.

A Token Based Protocol for Mutual Exclusion in Mobile Ad Hoc Networks

  • Sharma, Bharti;Bhatia, Ravinder Singh;Singh, Awadhesh Kumar
    • Journal of Information Processing Systems / v.10 no.1 / pp.36-54 / 2014
  • Resource sharing is a major advantage of distributed computing. However, a distributed computing system may have some physical or virtual resource that can be accessed by only a single process at a time. The mutual exclusion problem is to ensure that no more than one process at a time is allowed to access such a shared resource. The article proposes a token-based mutual exclusion algorithm for clustered mobile ad hoc networks (MANETs). The mechanism used to handle token passing at the inter-cluster level differs from that at the intra-cluster level. This makes our algorithm message-efficient and thus suitable for MANETs. In the interest of efficiency, we implemented a centralized token passing scheme at the intra-cluster level. Centralized schemes are inherently failure prone. Thus, we present an intra-cluster token passing scheme that is able to tolerate a failure. In order to enhance reliability, we applied a distributed token circulation scheme at the inter-cluster level. More importantly, the message complexity of the proposed algorithm is independent of N, the total number of nodes in the system. Also, under a heavy load, it turns out to be inversely proportional to n, the average number of nodes per cluster. We substantiate our claim with a correctness proof, complexity analysis, and simulation results. Finally, we present a simple approach to make our protocol fault tolerant.
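
As a rough illustration of the centralized intra-cluster token passing that the abstract above mentions, the following minimal Python sketch may help. It is not the authors' protocol: the `ClusterCoordinator` class, its method names, and the FIFO queueing policy are all assumptions made for illustration, and the sketch omits failure handling and the inter-cluster token circulation.

```python
from collections import deque


class ClusterCoordinator:
    """Hypothetical cluster head that owns the single token for its cluster."""

    def __init__(self):
        self.holder = None      # node currently inside its critical section
        self.waiting = deque()  # FIFO queue of pending requests (assumed policy)

    def request(self, node):
        """A node asks for the token; returns True if it is granted at once."""
        if self.holder is None:
            self.holder = node
            return True
        self.waiting.append(node)
        return False

    def release(self, node):
        """The holder returns the token; the next waiting node, if any, gets it."""
        if self.holder != node:
            raise ValueError(f"{node} does not hold the token")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


coordinator = ClusterCoordinator()
grants = [coordinator.request(n) for n in ("a", "b", "c")]
next_holder = coordinator.release("a")
```

Because only the coordinator hands out the token, at most one node in the cluster holds it at any time, which is the mutual exclusion property the abstract describes.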

Effects of Hypervisor on Distributed Big Data Processing in Virtualizated Cluster Environment (가상화 클러스터 환경에서 빅 데이터 분산 처리 성능에 하이퍼바이저가 미치는 영향)

  • Chung, Haejin;Nah, Yunmook
    • KIISE Transactions on Computing Practices / v.22 no.2 / pp.89-94 / 2016
  • Recently, cluster computing environments have been shifting toward virtualized cluster environments. This change has a great impact on the performance of large-volume distributed processing. Therefore, many domestic and international IT companies have invested heavily in research on cluster environments. In this paper, we show how the hypervisor affects the performance of distributed processing of a large volume of data. We present a performance comparison of MapReduce processing in two virtualized cluster environments, one built using the Xen hypervisor and the other built using the container-based Docker. Our results show that Docker is faster than Xen.

RDP: A storage-tier-aware Robust Data Placement strategy for Hadoop in a Cloud-based Heterogeneous Environment

  • Muhammad Faseeh Qureshi, Nawab;Shin, Dong Ryeol
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.9 / pp.4063-4086 / 2016
  • Cloud computing is a robust technology that helps resolve many parallel distributed computing issues in the modern Big Data environment. Hadoop is an ecosystem that processes large data sets in a distributed computing environment. HDFS is the filesystem of Hadoop, which distributes data blocks to the cluster nodes. Data block placement has become a bottleneck to overall performance in a Hadoop cluster. The current placement policy assumes that all Datanodes have equal computing capacity to process data blocks. This computing capacity includes the availability of the same storage media and the same processing performance on each node. As a result, Hadoop cluster performance is affected by unbalanced workloads, inefficient storage-tier use, network traffic congestion, and HDFS integrity issues. This paper proposes a storage-tier-aware Robust Data Placement (RDP) scheme, which systematically resolves unbalanced workloads, reduces network congestion to an optimal state, utilizes the storage tier in a useful manner, and minimizes HDFS integrity issues. The experimental results show that the proposed approach reduced the unbalanced workload issue to 72%. Moreover, the presented approach resolves the storage-tier compatibility problem to 81% by predicting storage for block jobs, and improves overall data block placement by 78% through pre-calculated computing capacity allocations and execution of map files over the respective Namenode and Datanodes.
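
To make the idea of capacity-aware placement in the abstract above more concrete, here is a small Python sketch. It is not the RDP scheme from the paper: the `place_blocks` function, the capacity scores, and the greedy rule are hypothetical, chosen only to show how placement can stop assuming that all nodes are equal.

```python
def place_blocks(num_blocks, capacities):
    """Greedily assign blocks so that load stays proportional to node capacity.

    capacities maps a node name to an assumed relative capacity score
    (for example, higher for an SSD-backed node than for an HDD-backed one).
    """
    load = {node: 0 for node in capacities}
    for _ in range(num_blocks):
        # Pick the node whose load, after taking this block, would be the
        # lowest relative to its capacity. Ties go to the first node listed.
        target = min(load, key=lambda n: (load[n] + 1) / capacities[n])
        load[target] += 1
    return load


capacities = {"ssd-node": 3.0, "hdd-node": 1.0}  # hypothetical scores
placement = place_blocks(8, capacities)
```

With these assumed scores the faster node receives about three times as many blocks as the slower one, whereas a capacity-blind policy would split them evenly.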

On the Performance of Oracle Grid Engine Queuing System for Computing Intensive Applications

  • Kolici, Vladi;Herrero, Albert;Xhafa, Fatos
    • Journal of Information Processing Systems / v.10 no.4 / pp.491-502 / 2014
  • In this paper we present research results on computing-intensive applications running on modern high-performance architectures, from the perspective of high computational needs. Computing-intensive applications are an important family of applications in the distributed computing domain. They have been an object of study under different distributed computing paradigms and infrastructures. Such applications are distinguished by their demanding need for CPU computing, independently of the amount of data associated with the problem instance. Among computing-intensive applications are those based on simulations, which aim to make maximum use of system resources to process large simulation computations. In this research work, we consider an application that simulates scheduling and resource allocation in a Grid computing system using Genetic Algorithms. In such an application, a rather large number of simulations is needed to extract statistically meaningful results about the simulated behavior. We study the performance of Oracle Grid Engine for such an application running on a cluster with high computing capacity. Several scenarios were generated to measure the response time and queuing time under different workloads and numbers of nodes in the cluster.

Mutual Authentication Protocol for Safe Data Transmission of Multi-distributed Web Cluster Model (다중 분산 웹 클러스터모델의 안전한 데이터 전송을 위한 상호 인증 프로토콜)

  • Lee, Kee-Jun;Kim, Chang-Won;Jeong, Chae-Yeong
    • The KIPS Transactions:PartC / v.8C no.6 / pp.731-740 / 2001
  • The multi-distributed web cluster model extends the conventional cluster system: it processes large-scale work requested by users in parallel by binding a number of system nodes on an open network into a single virtual network. Because of its structure, the model exposes internal system nodes to unauthorized third parties, and deliberate interference or attacks on the cooperative work among system nodes can make normal job execution impossible. This paper presents a mutual authentication protocol, based on a key distribution method, for authenticating the system nodes involved in registering, requesting, and cooperating on service code blocks and in collecting the results. It then designs SNKDC, which safely and effectively manages and distributes the symmetric keys of all system nodes. SNKDC distributes the symmetric keys that system nodes need to perform their work, and the system nodes transmit packets encrypted with the key provided. This prevents encrypted packets exchanged between system nodes from being decoded by a third party and prevents information leakage through false messages.

Scalable Prediction Models for Airbnb Listing in Spark Big Data Cluster using GPU-accelerated RAPIDS

  • Muralidharan, Samyuktha;Yadav, Savita;Huh, Jungwoo;Lee, Sanghoon;Woo, Jongwook
    • Journal of information and communication convergence engineering / v.20 no.2 / pp.96-102 / 2022
  • We aim to build predictive models for Airbnb prices using GPU-accelerated RAPIDS in a big data cluster. The Airbnb Listings datasets are used for the predictive analysis. Several machine-learning algorithms have been adopted to build models that predict the price of Airbnb listings. We compare the results of traditional and big data approaches to machine learning for price prediction and discuss the performance of the models. We built big data models using a Databricks Spark cluster, a distributed parallel computing system. Furthermore, we implemented models on multiple GPUs using RAPIDS in the Spark cluster. The model was developed using the XGBoost algorithm, whereas other models were developed using traditional central processing unit (CPU)-based algorithms. This study compared all models in terms of accuracy metrics and computing time. We observed that the XGBoost model with RAPIDS using GPUs had the highest accuracy and computing time.

Container-based Cluster Management System for User-driven Distributed Computing (사용자 맞춤형 분산 컴퓨팅을 위한 컨테이너 기반 클러스터 관리 시스템)

  • Park, Ju-Won;Hahm, Jaegyoon
    • KIISE Transactions on Computing Practices / v.21 no.9 / pp.587-595 / 2015
  • Several fields of science have traditionally demanded large-scale workflow support, which requires thousands of central processing unit (CPU) cores. To support such large-scale scientific workflows, large-capacity cluster systems such as supercomputers are widely used. However, because users require a diversity of software packages and configurations, a system administrator has difficulty preparing a service environment in real time. In this paper, we present a container-based cluster management platform and introduce an implementation case that minimizes performance reduction and dynamically provides the distributed computing environment desired by users. This paper offers the following contributions. First, a container-based virtualization technology is integrated with a resource and job management system to extend its applicability to large-scale scientific workflows. Second, an implementation case in which Docker and HTCondor are interlocked is introduced. Lastly, performance comparisons between Docker and native execution are presented, using two widely known benchmark tools and a Monte-Carlo simulation implemented in various programming languages.

Implementation of AIoT Edge Cluster System via Distributed Deep Learning Pipeline

  • Jeon, Sung-Ho;Lee, Cheol-Gyu;Lee, Jae-Deok;Kim, Bo-Seok;Kim, Joo-Man
    • International journal of advanced smart convergence / v.10 no.4 / pp.278-288 / 2021
  • Recent IoT systems are cloud-based, so the large, continuous streams of data collected from sensor nodes are processed on a data server through the cloud. However, in the centralized configuration of large-scale cloud computing, computational processing must be performed at a physical location where data collection and processing take place, and the need for edge computers to reduce the network load of the cloud system is gradually expanding. In this paper, a cluster system consisting of 6 inexpensive Raspberry Pi boards was constructed to perform fast data processing. We propose a "Kubernetes cluster system (KCS)" that handles large-scale data collection and analysis through model distribution and a data pipeline method. To evaluate performance, an ensemble deep learning model was built, and the accuracy, processing performance, and processing time of the proposed KCS and of model distribution were compared and analyzed. As a result, the ensemble model was excellent in accuracy, but the KCS implemented as a data pipeline proved to be superior in processing speed.