• Title/Summary/Keyword: Code Parallelization

On-line Trace Based Automatic Parallelization of Java Programs on Multicore Platforms

  • Sun, Yu;Zhang, Wei
    • Journal of Computing Science and Engineering / v.6 no.2 / pp.105-118 / 2012
  • We propose two new approaches that automatically parallelize Java programs at runtime. Both rely on trace information collected during program execution and dynamically recompile Java bytecode so that it can be executed in parallel. One approach uses trace information to improve traditional loop parallelization; the other parallelizes traces instead of loop iterations. We also describe a cost/benefit model that makes parallelization decisions, as well as a parallel execution environment for running the parallelized programs. These techniques are based on Jikes RVM. We evaluate the approaches by parallelizing sequential Java programs and comparing their performance with that of manually parallelized code. According to the experimental results, the approaches have low overhead and achieve speedups competitive with manually parallelized code. Moreover, trace parallelization can exploit parallelism beyond loop iterations.
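
The abstract above does not state the cost/benefit model itself. As a rough sketch only, a decision of this kind could compare the time saved by running a trace on several cores against the cost of recompiling it and starting threads. The function name, its parameters, and the ideal-speedup formula are all assumptions made for illustration, not the authors' model.

```python
def should_parallelize(seq_time_ms, expected_runs, cores,
                       recompile_cost_ms, spawn_cost_ms):
    """Return True when the estimated benefit exceeds the estimated cost.

    Hypothetical model: assumes ideal speedup on `cores` cores.
    """
    if cores < 2:
        return False
    # Time saved per run under ideal speedup, minus thread start-up overhead.
    saved_per_run = seq_time_ms - seq_time_ms / cores - spawn_cost_ms
    # Total saving over all expected runs versus one-time recompilation cost.
    return saved_per_run * expected_runs > recompile_cost_ms
```

Under these assumptions a short trace is rejected because thread start-up outweighs the saving, while a long, frequently executed trace is accepted.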

A Study on the Automatic Parallelization Method and Tool Development

  • Shin, Woochang
    • International Journal of Internet, Broadcasting and Communication / v.12 no.3 / pp.87-94 / 2020
  • Computer hardware has recently been evolving toward more computing cores rather than higher clock speeds. To make full use of parallel hardware, the running program must also be parallelized. However, software developers are accustomed to sequential programs and in most cases write programs that operate sequentially; they also find it difficult to design and develop parallel software. We propose a method to automatically convert a sequential C/C++ program into a parallelized program, and we develop a parallelization tool that supports it. The tool supports Open Multi-Processing (OpenMP) and the Parallel Patterns Library (PPL) as parallel frameworks. Perfect automatic parallelization is difficult because of dynamic features of C/C++ such as pointer operations and polymorphism. This study therefore focuses on verifying the conditions for parallelization rather than on fully automatic parallelization, and on giving developers detailed advice when parallelization is not possible.

Parallelization and application of SACOS for whole core thermal-hydraulic analysis

  • Gui, Minyang;Tian, Wenxi;Wu, Di;Chen, Ronghua;Wang, Mingjun;Su, G.H.
    • Nuclear Engineering and Technology / v.53 no.12 / pp.3902-3909 / 2021
  • The SACOS series of subchannel analysis codes has been developed by XJTU-NuTheL over many years and is used for the thermal-hydraulic safety analysis of various reactor cores. To achieve fine, pin-level analysis of the whole core, this study develops the input preprocessing and parallel capabilities of the code. The preprocessing supports modeling of rectangular and hexagonal assemblies with less error-prone input; the parallelization is based on the domain decomposition method with a hybrid of MPI and OpenMP. For domain decomposition, a more flexible method is proposed that determines an appropriate task division of the core domain according to the number of processors on the server. The code parallelization was verified with different numbers of processors by evaluating the calculation time for several PWR assembly problems. Subsequent analysis results for rectangular- and hexagonal-assembly cores imply that the code can be used to model and perform pin-level core safety analysis with acceptable computational efficiency.
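
The abstract above does not describe how the task division is computed. As a minimal sketch of the general idea of dividing a domain according to the processor count, the hypothetical function below splits a one-dimensional list of assembly indices into contiguous blocks whose sizes differ by at most one. A real core layout is two-dimensional, so the method used in SACOS is likely different; `divide_domain` and its parameters are assumptions for illustration.

```python
def divide_domain(n_assemblies, n_procs):
    """Split assembly indices 0..n_assemblies-1 into contiguous blocks."""
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    # Never create more blocks than there are assemblies.
    n_blocks = min(n_procs, n_assemblies)
    blocks = []
    start = 0
    for rank in range(n_blocks):
        # The first (n_assemblies % n_blocks) blocks get one extra assembly.
        extra = 1 if rank < n_assemblies % n_blocks else 0
        size = n_assemblies // n_blocks + extra
        blocks.append(range(start, start + size))
        start += size
    return blocks
```

For example, `divide_domain(10, 4)` yields blocks of sizes 3, 3, 2 and 2, so no processor receives more than one assembly beyond any other.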

Parallel LDPC Decoding on a Heterogeneous Platform using OpenCL

  • Hong, Jung-Hyun;Park, Joo-Yul;Chung, Ki-Seok
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.6 / pp.2648-2668 / 2016
  • Modern mobile devices are equipped with various accelerated processing units to handle computationally intensive applications; therefore, Open Computing Language (OpenCL) has been proposed to fully take advantage of the computational power in heterogeneous systems. This article introduces a parallel software decoder of Low Density Parity Check (LDPC) codes on an embedded heterogeneous platform using an OpenCL framework. The LDPC code is one of the most popular and strongest error correcting codes for mobile communication systems. Each step of LDPC decoding has different parallelization characteristics. In the proposed LDPC decoder, steps suitable for task-level parallelization are executed on the multi-core central processing unit (CPU), and steps suitable for data-level parallelization are processed by the graphics processing unit (GPU). To improve the performance of OpenCL kernels for LDPC decoding operations, explicit thread scheduling, vectorization, and effective data transfer techniques are applied. The proposed LDPC decoder achieves high performance and high power efficiency by using heterogeneous multi-core processors on a unified computing framework.

Information Technology and Computational Fluid Dynamics (정보통신기술과 전산유체역학)

  • Cho Kum Won;Park Hyungwoo;Lee Sangsan
    • Journal of computational fluids engineering / v.6 no.3 / pp.51-56 / 2001
  • As information technology (IT) has developed, applied engineering has advanced rapidly. The CFD field, which depends heavily on computing power, is an outstanding example. This paper describes research trends at the KISTI Supercomputing Center, which conducts IT-based CFD research. The representative efforts are the National Grid Project, the construction and development of the TeraCluster, and a plan to support supercomputer users' parallelization.

Improvement and verification of the DeCART code for HTGR core physics analysis

  • Cho, Jin Young;Han, Tae Young;Park, Ho Jin;Hong, Ser Gi;Lee, Hyun Chul
    • Nuclear Engineering and Technology / v.51 no.1 / pp.13-30 / 2019
  • This paper presents recent improvements in the DeCART code for HTGR analysis. A new 190-group DeCART cross-section library based on ENDF/B-VII.0 was generated using the KAERI library processing system for HTGR. Two methods for the eigen-mode adjoint flux calculation were implemented. An azimuthal angle discretization method based on Gaussian quadrature was implemented to reduce the error from the azimuthal angle discretization. A two-level parallelization using MPI and OpenMP was adopted for massively parallel computations. A quadratic depletion solver was implemented to reduce the error involved in the Gd depletion. A module to generate equivalent group constants was implemented for the nodal codes. The geometry-handling capabilities of the DeCART code were improved, including an approximate treatment of a cylindrical outer boundary, an explicit border model, the R-G-B checker-board model, and a super-cell model for a hexagonal geometry. The new and improved functionalities were verified by comparing the DeCART solutions with Monte Carlo reference solutions obtained using the McCARD code on various numerical benchmarks: the OECD/MHTGR-350 benchmark phase III problems, two-dimensional high-temperature gas-cooled reactor benchmark problems derived from the MHTGR-350 reference design, and numerical benchmark problems based on the compact nuclear power source experiment.

Absorbing Boundary Conditions and Parallelization for Waveguide Electromagnetic Analysis Using Finite Element Method (유한요소법을 이용한 도파관 전자기 해석의 흡수경계조건 고찰 및 병렬화)

  • Park, Woobin;Kim, Moonseong;Lee, Woochan
    • Journal of Internet Computing and Services / v.23 no.3 / pp.67-76 / 2022
  • Power and signal transmission using electromagnetic waves is essential in modern times, and a guiding structure is needed to transmit electromagnetic waves efficiently along the desired path. This paper performs electromagnetic simulations of 2-D and 3-D waveguides with an in-house code based on the finite element method. The accuracy of the analysis was verified by comparing it with the results of HFSS, a representative electromagnetic wave simulation software. In addition, the performance of the Absorbing Boundary Condition (ABC), which is essential for truncating the infinite computational domain in computational electromagnetics, was analyzed. Finally, parallelization was applied to accelerate the simulation, and a performance improvement was demonstrated.

The Procedure Transformation using Data Dependency Elimination Methods (자료 종속성 제거 방법을 이용한 프로시저 변환)

  • Jang, Yu-Suk;Park, Du-Sun
    • The KIPS Transactions:PartA / v.9A no.1 / pp.37-44 / 2002
  • Most research on transforming sequential programs into parallel programs has been based on loop structure transformation. However, most programs also contain implicit interprocedural parallelism. This paper suggests a way of extracting parallelism from loops with procedure calls using the data dependency elimination method. Most parallelization of loops with procedure calls has extracted parallelism only from uniform code. In this paper, we propose an interprocedural transformation that can be applied to both uniform and nonuniform code. We show examples of uniform, nonuniform, and complex code parallelization. We then evaluate the performance of the various transformation methods on the CRAY-T3E system. The comparison results show that the proposed algorithm outperforms other conventional methods.
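
The abstract above does not give the interprocedural algorithm. To illustrate only what eliminating a data dependency means, the sketch below uses induction variable substitution, a standard textbook technique swapped in here rather than the paper's method. The function names and the optional `order` parameter are assumptions for illustration.

```python
def scale_sequential(b):
    """Original loop: each iteration needs k from the previous iteration."""
    a = [0] * len(b)
    k = 0
    for i in range(len(b)):
        k += 2                 # loop-carried dependence on k
        a[i] = k * b[i]
    return a


def scale_independent(b, order=None):
    """Transformed loop: k is computed from i, so iterations are independent."""
    a = [0] * len(b)
    indices = range(len(b)) if order is None else order
    for i in indices:
        k = 2 * (i + 1)        # closed form replaces the running update
        a[i] = k * b[i]
    return a
```

Because the transformed loop gives the same result for any iteration order, its iterations could be distributed across processors.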

MPI-OpenMP Hybrid Parallelization for Multibody Peridynamic Simulations (다물체 페리다이나믹 해석을 위한 MPI-OpenMP 혼합 병렬화)

  • Lee, Seungwoo;Ha, Youn Doh
    • Journal of the Computational Structural Engineering Institute of Korea / v.33 no.3 / pp.171-178 / 2020
  • In this study, we develop MPI-OpenMP hybrid parallelization for multibody peridynamic simulations. Peridynamics is suitable for analyzing complicated dynamic fractures and various discontinuities. However, compared with a conventional finite element method, nonlocal interactions in peridynamics cost more time and memory. In multibody peridynamic analysis, the costs increase due to the additional interactions that occur when computing the nonlocal contact and ghost interlayer models between adjacent bodies. The costs become excessive when further refinement and smaller time steps are required in cases of high-velocity impact fracturing or similar instances. Thus, high computational efficiency and performance can be achieved by parallelization and optimization of multibody peridynamic simulations. The analytical code is developed using an Intel Fortran MPI compiler and OpenMP in NURION of the KISTI HPC center and parallelized through MPI-OpenMP hybrid parallelization. Further parallelization is conducted by hybridizing with OpenMP threads in each MPI process. We also try to minimize communication operations by model-based decomposition of MPI processes. The numerical results for the impact fracturing of multiple bodies show that the computing performance improves significantly with MPI-OpenMP hybrid parallelization.