• Title/Abstract/Keywords: Code Parallelization

35 search results; processing time 0.019 seconds

On-line Trace Based Automatic Parallelization of Java Programs on Multicore Platforms

  • Sun, Yu;Zhang, Wei
    • Journal of Computing Science and Engineering
    • /
    • Vol. 6, No. 2
    • /
    • pp.105-118
    • /
    • 2012
  • We propose two new approaches that automatically parallelize Java programs at runtime. Both rely on trace information collected during program execution and dynamically recompile the Java bytecode that can be executed in parallel. One approach uses trace information to improve traditional loop parallelization; the other parallelizes traces instead of loop iterations. We also describe a cost/benefit model that makes intelligent parallelization decisions, as well as a parallel execution environment for running the parallelized programs. These techniques are based on Jikes RVM. We evaluate our approach by parallelizing sequential Java programs and comparing its performance with that of manually parallelized code. According to the experimental results, our approach has low overhead and achieves speedups competitive with the manually parallelized code. Moreover, trace parallelization can exploit parallelism beyond loop iterations.

A Study on the Automatic Parallelization Method and Tool Development

  • Shin, Woochang
    • International Journal of Internet, Broadcasting and Communication
    • /
    • Vol. 12, No. 3
    • /
    • pp.87-94
    • /
    • 2020
  • Computer hardware has recently been evolving toward more computing cores rather than higher clock speeds. To make full use of parallel hardware, the programs running on it must also be parallelized. However, software developers are accustomed to sequential programming and in most cases write programs that run sequentially; they also find it difficult to design and develop parallel software. We propose a method for automatically converting a sequential C/C++ program into a parallelized one and develop a parallelization tool that supports it. The tool supports Open Multi-Processing (OpenMP) and the Parallel Patterns Library (PPL) as parallel frameworks. Fully automatic parallelization is difficult because of dynamic features of C/C++ such as pointer operations and polymorphism. This study therefore focuses on verifying the conditions for parallelization rather than on fully automatic parallelization, and on giving developers detailed advice when parallelization is not possible.
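
The idea of checking parallelization conditions and reporting advice can be illustrated with a minimal sketch. It is not the tool described in this abstract: the function name `check_loop`, the single-array loop model, and the brute-force search are all assumptions made for illustration, and Python is used only for brevity (the tool itself targets C/C++).

```python
def check_loop(n_iters, write_index, read_index):
    """Check the loop `for i in range(n_iters): a[write_index(i)] = f(a[read_index(i)])`.

    Hypothetical helper: returns a list of advice strings. An empty list means
    no cross-iteration conflict was found, so iterations could run in any order.
    """
    advice = []
    writers = {}  # element index -> first iteration that writes it
    for i in range(n_iters):
        w = write_index(i)
        if w in writers:
            advice.append(f"iterations {writers[w]} and {i} both write a[{w}]")
        else:
            writers[w] = i
    for j in range(n_iters):
        r = read_index(j)
        i = writers.get(r)
        if i is not None and i != j:
            advice.append(f"iteration {j} reads a[{r}], written by iteration {i}")
    return advice


# a[i] = f(a[i]): every iteration touches only its own element.
independent = check_loop(8, lambda i: i, lambda i: i)
# a[i] = f(a[i - 1]): each iteration depends on the previous one.
carried = check_loop(8, lambda i: i, lambda i: i - 1)
```

Here `independent` is empty, while `carried` lists one message per dependent iteration, which is the kind of feedback a developer could act on.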

Parallelization and application of SACOS for whole core thermal-hydraulic analysis

  • Gui, Minyang;Tian, Wenxi;Wu, Di;Chen, Ronghua;Wang, Mingjun;Su, G.H.
    • Nuclear Engineering and Technology
    • /
    • Vol. 53, No. 12
    • /
    • pp.3902-3909
    • /
    • 2021
  • The SACOS series of subchannel analysis codes has been developed by XJTU-NuTheL over many years and is used for thermal-hydraulic safety analysis of various reactor cores. To achieve fine, pin-level whole-core analysis, this study develops input preprocessing and parallel capabilities for the code. The preprocessing supports modeling of rectangular and hexagonal assemblies with less error-prone input; the parallelization is based on domain decomposition with a hybrid of MPI and OpenMP. For domain decomposition, a more flexible method is proposed that determines an appropriate task division of the core domain according to the number of processors on the server. By evaluating calculation times for several PWR assembly problems, the code parallelization was verified with different numbers of processors. Subsequent analysis results for rectangular- and hexagonal-assembly cores indicate that the code can be used to model and perform pin-level core safety analysis with acceptable computational efficiency.
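
The abstract does not give the decomposition rule, so the following is only a generic sketch of dividing a domain according to the number of processors. The function name `block_partition` and the one-dimensional, contiguous-block scheme are assumptions for illustration, not the SACOS method.

```python
def block_partition(n_items, n_procs):
    """Split range(n_items) into n_procs contiguous half-open blocks.

    Hypothetical helper: block sizes differ by at most one item, with the
    larger blocks assigned to the lowest ranks.
    """
    if n_procs <= 0:
        raise ValueError("n_procs must be positive")
    base, extra = divmod(n_items, n_procs)
    blocks = []
    start = 0
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)
        blocks.append((start, start + size))
        start += size
    return blocks


# Ten items (for example, assemblies) shared among four processes.
blocks = block_partition(10, 4)
```

In a hybrid scheme, each block would typically be owned by one MPI process, with threads sharing the work inside the block.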

Parallel LDPC Decoding on a Heterogeneous Platform using OpenCL

  • Hong, Jung-Hyun;Park, Joo-Yul;Chung, Ki-Seok
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 10, No. 6
    • /
    • pp.2648-2668
    • /
    • 2016
  • Modern mobile devices are equipped with various accelerated processing units to handle computationally intensive applications; therefore, Open Computing Language (OpenCL) has been proposed to fully take advantage of the computational power in heterogeneous systems. This article introduces a parallel software decoder of Low Density Parity Check (LDPC) codes on an embedded heterogeneous platform using an OpenCL framework. The LDPC code is one of the most popular and strongest error correcting codes for mobile communication systems. Each step of LDPC decoding has different parallelization characteristics. In the proposed LDPC decoder, steps suitable for task-level parallelization are executed on the multi-core central processing unit (CPU), and steps suitable for data-level parallelization are processed by the graphics processing unit (GPU). To improve the performance of OpenCL kernels for LDPC decoding operations, explicit thread scheduling, vectorization, and effective data transfer techniques are applied. The proposed LDPC decoder achieves high performance and high power efficiency by using heterogeneous multi-core processors on a unified computing framework.

Information Technology and Computational Fluid Dynamics

  • 조금원;박형우;이상산
    • 한국전산유체공학회지
    • /
    • Vol. 6, No. 3
    • /
    • pp.51-56
    • /
    • 2001
  • As information technology (IT) has developed, applied engineering has advanced rapidly. Computational fluid dynamics (CFD), which depends heavily on computing power, is a prominent example. This paper describes research trends at the KISTI Supercomputing Center, which conducts IT-based CFD research. Representative efforts are the National Grid Project, the construction and development of the TeraCluster, and a plan to support supercomputer users in parallelizing their codes.

Improvement and verification of the DeCART code for HTGR core physics analysis

  • Cho, Jin Young;Han, Tae Young;Park, Ho Jin;Hong, Ser Gi;Lee, Hyun Chul
    • Nuclear Engineering and Technology
    • /
    • Vol. 51, No. 1
    • /
    • pp.13-30
    • /
    • 2019
  • This paper presents the recent improvements in the DeCART code for HTGR analysis. A new 190-group DeCART cross-section library based on ENDF/B-VII.0 was generated using the KAERI library processing system for HTGR. Two methods for the eigen-mode adjoint flux calculation were implemented. An azimuthal angle discretization method based on the Gaussian quadrature was implemented to reduce the error from the azimuthal angle discretization. A two-level parallelization using MPI and OpenMP was adopted for massive parallel computations. A quadratic depletion solver was implemented to reduce the error involved in the Gd depletion. A module to generate equivalent group constants was implemented for the nodal codes. The capabilities of the DeCART code were improved for geometry handling including an approximate treatment of a cylindrical outer boundary, an explicit border model, the R-G-B checker-board model, and a super-cell model for a hexagonal geometry. The newly improved and implemented functionalities were verified against various numerical benchmarks such as OECD/MHTGR-350 benchmark phase III problems, two-dimensional high temperature gas cooled reactor benchmark problems derived from the MHTGR-350 reference design, and numerical benchmark problems based on the compact nuclear power source experiment by comparing the DeCART solutions with the Monte-Carlo reference solutions obtained using the McCARD code.

Absorbing Boundary Conditions and Parallelization for Waveguide Electromagnetic Analysis Using Finite Element Method

  • 박우빈;김문성;이우찬
    • 인터넷정보학회논문지
    • /
    • Vol. 23, No. 3
    • /
    • pp.67-76
    • /
    • 2022
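  • Today, transmitting power and signals by electromagnetic waves is essential, and a guided structure is needed to deliver electromagnetic waves efficiently along a desired path. In this paper, we first apply the finite element method (FEM), a wave-propagation analysis technique, to 2-D/3-D waveguides, one type of guided structure, and perform electromagnetic simulations with in-house code that we wrote ourselves. We then verify the accuracy of the analysis by comparing the results of the in-house code with those of HFSS, a representative commercial electromagnetic simulation package. In addition, we analyze the performance of the absorbing boundary condition (ABC), which is essential for truncating an infinite analysis domain in electromagnetic analysis, and then present the performance improvement obtained by applying parallelization techniques.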

The Procedure Transformation using Data Dependency Elimination Methods

  • 장유숙;박두순
    • 정보처리학회논문지A
    • /
    • Vol. 9A, No. 1
    • /
    • pp.37-44
    • /
    • 2002
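  • Existing studies on extracting parallelism from sequential programs have concentrated on transformations within a single procedure. However, most programs contain latent interprocedural parallelism. This paper proposes a way to extract parallelism from loops containing procedure calls by using data dependency elimination methods. Parallelization of loops containing procedure calls has mostly been studied only for code with uniform data dependence distances. This paper presents an interprocedural transformation method applicable to code with both uniform and nonuniform data dependence distances. The proposed algorithm was evaluated on a CRAY T3E, and the results showed the proposed method to be effective.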
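
The distinction between uniform and nonuniform dependence distances can be shown with a small sketch. It only enumerates distances by brute force for a single-array loop; the function names `dependence_distances` and `classify` are hypothetical, and nothing here reproduces the transformation proposed in the paper.

```python
def dependence_distances(n_iters, write_index, read_index):
    """Collect flow-dependence distances for the loop
    `for i in range(n_iters): a[write_index(i)] = f(a[read_index(i)])`.

    A distance j - i is recorded whenever iteration i writes the element
    that a later iteration j reads.
    """
    distances = set()
    for i in range(n_iters):
        for j in range(i + 1, n_iters):
            if write_index(i) == read_index(j):
                distances.add(j - i)
    return distances


def classify(distances):
    """Label a set of distances as independent, uniform, or nonuniform."""
    if not distances:
        return "independent"
    return "uniform" if len(distances) == 1 else "nonuniform"


# a[i] = f(a[i - 2]): every dependence spans exactly two iterations.
uniform = dependence_distances(10, lambda i: i, lambda i: i - 2)
# a[2 * i] = f(a[i]): the distance grows with the iteration number.
nonuniform = dependence_distances(10, lambda i: 2 * i, lambda i: i)
```

With a uniform distance the same reordering applies to every iteration, whereas nonuniform distances vary from one iteration to the next and are harder to handle.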

MPI-OpenMP Hybrid Parallelization for Multibody Peridynamic Simulations

  • 이승우;하윤도
    • 한국전산구조공학회논문집
    • /
    • Vol. 33, No. 3
    • /
    • pp.171-178
    • /
    • 2020
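  • In this study, MPI-OpenMP hybrid parallelization of a multibody peridynamic analysis code was carried out. Peridynamic models are well suited to simulating complex dynamic fracture behavior and discontinuities, but because they compute interactions between nodes over a nonlocal region, they require more computation time than finite element models. In addition, in multibody peridynamic analysis of multilayered structures, the computational burden grows because of interactions among multiple bodies through the added nonlocal contact model and the virtual interlayer bonding model. Furthermore, analyzing complex dynamic fracture behavior such as high-velocity impact fracture requires fine nodal spacing and small time steps, so developing a high-performance analysis code through code optimization and parallelization is essential. The analysis code was developed using the Intel Fortran MPI compiler and OpenMP and was run on Nurion at the Supercomputing Center of the Korea Institute of Science and Technology Information (KISTI). Key factors for optimizing the multibody analysis code were analyzed, and an MPI-OpenMP hybrid parallel processing structure was applied by analyzing the subroutines in which model dependencies arise and identifying the data exchanged between processes. The performance of the developed parallel code was confirmed through simulations of multibody impact fracture.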