• Title/Summary/Keyword: Parallel computing

DMRUT-MCDS: Discovery Relationships in the Cyber-Physical Integrated Network

  • Lu, Hongliang;Cao, Jiannong;Zhu, Weiping;Jiao, Xianlong;Lv, Shaohe;Wang, Xiaodong
    • Journal of Communications and Networks / v.17 no.6 / pp.558-567 / 2015
  • In recent years, we have seen a proliferation of mobile-network-enabled smart objects, such as smartphones and smartwatches, that form a cyber-physical integrated network connecting the cyber and physical worlds through sensing, communication, and computing capabilities. Discovering the relationships between smart objects is a critical and nontrivial task in cyber-physical integrated network applications. Aiming to find the most stable relationships in a heterogeneous and dynamic cyber-physical network, we propose a distributed and efficient relationship-discovery algorithm, called dynamically maximizing remaining unchanged time with minimum connected dominant set (DMRUT-MCDS), for constructing a backbone with the smallest-scale infrastructure. The algorithm takes the duration of each relationship into account in order to balance the size of the infrastructure against the time it can be sustained. The performance of the algorithm is studied through extensive simulations, and the results show that DMRUT-MCDS performs well in networks with different distributions.
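
The backbone in this abstract is built on a connected dominating set (the abstract's "connected dominant set"): a subset of nodes such that every node is either in the subset or adjacent to a member of it, and the subset itself is connected. The sketch below only checks that property for a candidate backbone. It is not the DMRUT-MCDS algorithm, and the function name and adjacency-dictionary format are assumptions made for illustration.

```python
def is_connected_dominating_set(adj, backbone):
    """Check whether `backbone` is a connected dominating set of the graph.

    adj: dict mapping each node to a list of its neighbours (undirected graph).
    backbone: iterable of candidate backbone nodes (hypothetical input format).
    """
    backbone = set(backbone)
    if not backbone:
        return False

    # Domination: every node outside the backbone needs a backbone neighbour.
    for node, neighbours in adj.items():
        if node not in backbone and not (set(neighbours) & backbone):
            return False

    # Connectivity: a traversal restricted to backbone nodes must reach them all.
    start = next(iter(backbone))
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for neighbour in adj[current]:
            if neighbour in backbone and neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == backbone
```

On a five-node path a-b-c-d-e, the set {b, c, d} passes both checks, while {b, d} dominates every node but fails the connectivity check.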

New execution model for CAPE using multiple threads on multicore clusters

  • Do, Xuan Huyen;Ha, Viet Hai;Tran, Van Long;Renault, Eric
    • ETRI Journal / v.43 no.5 / pp.825-834 / 2021
  • Based on its simplicity and user-friendly characteristics, OpenMP has become the standard model for programming on shared-memory architectures. Checkpointing-aided parallel execution (CAPE) is an approach that utilizes the discontinuous incremental checkpointing technique (DICKPT) to translate and execute OpenMP programs on distributed-memory architectures automatically. Currently, CAPE implements the OpenMP execution model by utilizing the DICKPT to distribute parallel jobs and their data to slave machines, and then collects the results after executing these distributed jobs. Although this model has been proven to be effective in terms of performance and compatibility with OpenMP on distributed-memory systems, it cannot fully exploit the capabilities of multicore processors. This paper presents a novel execution model for CAPE that utilizes two levels of parallelism. In the proposed model, we add another level of parallelism in the form of multithreaded processes on slave machines with the goal of better exploiting their multicore CPUs. Initial experimental results presented near the end of this paper demonstrate that this model provides significantly enhanced CAPE performance.

PESA: Prioritized experience replay for parallel hybrid evolutionary and swarm algorithms - Application to nuclear fuel

  • Radaideh, Majdi I.;Shirvan, Koroush
    • Nuclear Engineering and Technology / v.54 no.10 / pp.3864-3877 / 2022
  • We propose a new approach called PESA (Prioritized replay Evolutionary and Swarm Algorithms) combining prioritized replay of reinforcement learning with hybrid evolutionary algorithms. PESA hybridizes different evolutionary and swarm algorithms, such as particle swarm optimization, evolution strategies, simulated annealing, and differential evolution, with a modular design that can accommodate other algorithms. PESA hybridizes three algorithms by storing their solutions in a shared replay memory, then applying prioritized replay to frequently redistribute data between the constituent algorithms based on their fitness and priority values, which significantly enhances sample diversity and algorithm exploration. Additionally, greedy replay is used implicitly to improve PESA exploitation close to the end of evolution. PESA's balance of exploration and exploitation during search, together with parallel computing, results in excellent problem-agnostic performance over the wide range of experiments and problems presented in this work. PESA also shows very good scalability with the number of processors in solving an expensive problem of optimizing nuclear fuel in nuclear power plants. PESA's competitive performance and modularity over all experiments allow it to join the family of evolutionary algorithms as a new hybrid algorithm, unleashing the power of parallel computing for expensive optimization.
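
To make the shared-memory idea concrete, the sketch below samples solutions from a replay memory with probability tied to their fitness rank. The abstract does not give PESA's priority formula, so the rank-based rule used here (priority = (1/rank)^alpha, the rank-based variant known from prioritized experience replay) is an assumption, as are the function names, the `alpha` parameter, and the convention that larger fitness is better.

```python
import random


def replay_probabilities(fitnesses, alpha=1.0):
    """Rank-based sampling probabilities (assumed rule, not taken from the paper).

    The best solution (largest fitness) gets rank 1 and the highest priority.
    alpha = 0 gives uniform sampling; larger alpha is greedier.
    """
    order = sorted(range(len(fitnesses)), key=lambda i: fitnesses[i], reverse=True)
    priorities = [0.0] * len(fitnesses)
    for rank, index in enumerate(order, start=1):
        priorities[index] = (1.0 / rank) ** alpha
    total = sum(priorities)
    return [p / total for p in priorities]


def sample_from_memory(memory, k, alpha=1.0, rng=None):
    """Draw k (solution, fitness) pairs from the shared memory, with replacement."""
    rng = rng or random.Random()
    probabilities = replay_probabilities([fitness for _, fitness in memory], alpha)
    return rng.choices(memory, weights=probabilities, k=k)
```

In a hybrid scheme of this kind, each constituent algorithm would append its solutions to `memory` and periodically draw a fresh batch with `sample_from_memory`. Raising `alpha` toward the end of a run moves the sampling toward the greedy behaviour the abstract mentions.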

Parallel Contact Treatment and Parallel Performance of Impact Simulation Based on Lagrangian Scheme (Lagrangian 기법에 의한 충돌 해석 시 접촉처리의 병렬화 및 병렬효율 평가)

  • Back, Seung-Hoon;Kim, Seung-Jo;Lee, Min-Hyung
    • Transactions of the Korean Society of Mechanical Engineers A / v.30 no.11 s.254 / pp.1447-1454 / 2006
  • Evaluating the parallel performance of a high-speed impact simulation is not an easy task, both because parallel explicit codes are difficult to develop and because large numbers of processors are not readily accessible. In this paper, the parallel performance of a new Lagrangian FEM impact code run on a cluster supercomputer is described for the high-speed range. In the case of a metal sphere impacting an oblique plate, the overall speed-up continues to increase up to 128 CPUs. Examination of the elapsed time of each part reveals that most of the inefficiency comes from load imbalance in the contact treatment.
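
The quantities discussed in this abstract can be written down with the usual definitions: speed-up S = T1 / Tp, parallel efficiency E = S / p, and a load-imbalance measure. The helper below is a minimal sketch under those definitions. The imbalance metric (maximum over mean per-processor time, minus one) is one common choice and is an assumption here, since the abstract does not say how imbalance was measured.

```python
def speedup(t_serial, t_parallel):
    """Speed-up S = T1 / Tp."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Parallel efficiency E = S / p."""
    return speedup(t_serial, t_parallel) / n_procs


def load_imbalance(per_proc_times):
    """Assumed metric: max / mean - 1, so 0.0 means perfectly balanced."""
    mean_time = sum(per_proc_times) / len(per_proc_times)
    return max(per_proc_times) / mean_time - 1.0
```

With hypothetical timings of 128 s on one processor and 2 s on 128 processors, the speed-up is 64 and the efficiency is 0.5. If one processor spends far longer in the contact phase than the average, `load_imbalance` is large and every other processor waits for it at the next synchronization point.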

A dynamic analysis algorithm for RC frames using parallel GPU strategies

  • Li, Hongyu;Li, Zuohua;Teng, Jun
    • Computers and Concrete / v.18 no.5 / pp.1019-1039 / 2016
  • In this paper, a parallel algorithm for nonlinear dynamic analysis of three-dimensional (3D) reinforced concrete (RC) frame structures on a graphics processing unit (GPU) platform is proposed. Time integration is performed using the Newmark method for nonlinear implicit dynamic analysis, and parallelization strategies are presented. Correspondingly, a parallel Preconditioned Conjugate Gradients (PCG) solver on the GPU is introduced for the repeated solution of the equilibrium equations at each time step. The RC frames were simulated using a fiber beam model to capture the nonlinear behavior of concrete and reinforcing bars. The parallel finite element program is developed using Compute Unified Device Architecture (CUDA). The accuracy of the GPU-based parallel program, in both single and double precision, was verified by comparison with ABAQUS. The numerical results demonstrated that the proposed algorithm can take full advantage of the parallel architecture of the GPU and speeds up the computation compared with a CPU.
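
For readers unfamiliar with PCG, the following is a minimal serial sketch of the preconditioned conjugate gradient iteration for a symmetric positive definite system A x = b. The abstract does not name the preconditioner, so the Jacobi (diagonal) preconditioner here is an assumption, and the dense list-of-lists matrix format and function names are chosen only to keep the example self-contained. It is not the authors' CUDA implementation.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Dense matrix-vector product for a list-of-lists matrix."""
    return [dot(row, x) for row in A]


def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    Uses a Jacobi preconditioner (an assumed choice for this sketch).
    """
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x
```

Each iteration consists of one matrix-vector product, a few inner products, and element-wise vector updates. In an implicit Newmark scheme, a solve of this kind would be repeated at every time step.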

Running Large-scale Mobile Software using PDA Cluster Computing (PDA 클러스터 컴퓨팅을 활용한 대용량 모바일 소프트웨어 실행)

  • Min, Hye-Rhyn;Lee, Jong-Woo
    • Journal of Digital Contents Society / v.10 no.2 / pp.249-258 / 2009
  • As wireless internet markets grow, many mobile applications are being actively developed. Under these circumstances, mobile devices such as cell phones and PDAs play an important role in satisfying users' need for ubiquitous computing. Due to hardware limitations, however, mobile devices such as PDAs cannot run large-scale software by themselves. The main goal of this paper is to make large-scale applications runnable on PDAs. To accomplish this, we used the PDA-JPVM cluster computing engine, which we developed previously. By running the applications and evaluating their performance, we found that large-scale Java software can easily run on hardware-limited PDAs. The performance evaluation results are also presented.

A domain decomposition method applied to queuing network problems

  • Park, Pil-Seong
    • Communications of the Korean Mathematical Society / v.10 no.3 / pp.735-750 / 1995
  • We present a domain decomposition algorithm for solving large sparse linear systems of equations arising from queuing networks. Such techniques are attractive since the problems in subdomains can be solved independently by parallel processors. Many of the methods proposed so far use some form of the preconditioned conjugate gradient method to deal with one large interface problem between subdomains. In this paper, however, we propose a "nested" domain decomposition method in which the subsystems governing the interfaces are small enough to be easily solved by direct methods on machines with many parallel processors. Convergence of the algorithms is also shown.

Finite element analysis of welding process by parallel computation (병렬 처리를 이용한 용접 공정 유한 요소 해석)

  • 임세영;김주완;최강혁;임재혁
    • Proceedings of the KWS Conference / 2003.11a / pp.156-158 / 2003
  • An implicit finite element implementation of Leblond's transformation plasticity constitutive equations, which are widely used for welded steel structures, is proposed in the framework of parallel computing. The implementation is based upon the multiplicative decomposition of the deformation gradient and a hyperelastic formulation. We examine the efficiency of parallel computation for the finite element analysis of a welded structure using a domain-wise multi-frontal solver.

Three dimensional finite element analysis of arc-welding process via parallel computing (아크 용접 공정의 3차원 병렬처리 유한 요소 해석)

  • 임세영;김주완;김현규;조영삼
    • Proceedings of the KWS Conference / 2002.05a / pp.161-163 / 2002
  • An implicit finite element implementation of Leblond's transformation plasticity constitutive equations, which are widely used for welded steel structures, is proposed in the framework of parallel computing. The implementation is based upon the updated Lagrangian formulation. We examine the efficiency of parallel computation for the finite element analysis of a welded structure using a multi-frontal solver.

A Computer Program for System Reliability Prediction (시스템의 신뢰성(信賴性) 예측(豫測)을 위한 컴퓨터 프로그램)

  • Kim, Yeong-Hwi;Choe, Mun-Gi
    • Journal of Korean Institute of Industrial Engineers / v.1 no.2 / pp.51-56 / 1975
  • A computer program for computing complex system reliability is described. The program is composed of three phases: the Phase I program reduces all series, parallel, and series-parallel components, yielding an irreducible non-series-parallel system. The Phase II program enumerates all possible paths from the source to the sink of the graph. The Phase III program then computes system reliability from the information obtained by the Phase II program. The program is based on a modified version of the algorithm published in [6]. An example of the use of the computer program is given.
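
The three phases described in this abstract can be sketched as follows. The algorithm from [6] is not reproduced in the abstract, so the last step here uses plain inclusion-exclusion over the enumerated paths as a stand-in. The function names, the edge-dictionary format, and the assumptions of independent components and an undirected graph are all illustrative.

```python
from itertools import combinations


def series(reliabilities):
    """Phase I helper: a series block works only if every component works."""
    product = 1.0
    for r in reliabilities:
        product *= r
    return product


def parallel(reliabilities):
    """Phase I helper: a parallel block fails only if every component fails."""
    failure = 1.0
    for r in reliabilities:
        failure *= 1.0 - r
    return 1.0 - failure


def enumerate_paths(edges, source, sink):
    """Phase II: all simple source-to-sink paths, each as a frozenset of edge names.

    edges: dict mapping an edge (component) name to its (node, node) pair.
    """
    paths = []

    def walk(node, visited, used):
        if node == sink:
            paths.append(frozenset(used))
            return
        for name, (u, v) in edges.items():
            if u == node:
                nxt = v
            elif v == node:
                nxt = u
            else:
                continue
            if nxt not in visited:
                walk(nxt, visited | {nxt}, used + [name])

    walk(source, {source}, [])
    return paths


def system_reliability(paths, rel):
    """Phase III (stand-in method): inclusion-exclusion over the path sets."""
    total = 0.0
    for k in range(1, len(paths) + 1):
        sign = 1.0 if k % 2 == 1 else -1.0
        for combo in combinations(paths, k):
            term = 1.0
            for edge in frozenset().union(*combo):
                term *= rel[edge]
            total += sign * term
    return total


# Bridge network: the classic irreducible non-series-parallel example.
bridge = {"e1": ("s", "a"), "e2": ("s", "b"), "e3": ("a", "b"),
          "e4": ("a", "t"), "e5": ("b", "t")}
bridge_paths = enumerate_paths(bridge, "s", "t")
bridge_reliability = system_reliability(bridge_paths, {name: 0.9 for name in bridge})
```

For the bridge network with every component at reliability 0.9, four paths are found and the result agrees with the closed form 2p^2 + 2p^3 - 5p^4 + 2p^5 = 0.97848. Inclusion-exclusion grows exponentially with the number of paths, which is why reducing series and parallel blocks first, as in Phase I, pays off.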