• Title/Summary/Keyword: OpenMP Programs

A Detection Tool of First Races in OpenMP Programs with Directives (OpenMP 디렉티브 프로그램의 최초경합 탐지를 위한 도구)

  • Kang, Mun-Hye; Ha, Ok-Kyoon; Jun, Yong-Kee
    • Journal of KIISE: Computer Systems and Theory / v.37 no.1 / pp.1-7 / 2010
  • Detecting data races is important for debugging programs with OpenMP directives, because races cause unintended non-deterministic executions of the program. For effective debugging it is especially important to detect the first data races to occur, because removing them may make other races that they affect appear or disappear. Previous race detection tools cannot guarantee that the races they detect are the first to occur. This paper presents a tool that detects the first races to occur in programs with nested parallelism, using a two-pass on-the-fly technique. To demonstrate the tool's functionality, we compare it empirically with previous tools on a set of synthetic programs with OpenMP directives.

A Preprocessor for Detecting Potential Races in Shared Memory Parallel Programs with Internal Nondeterminism (내부적 비결정성을 가진 공유 메모리 병렬 프로그램에서 잠재적 경합탐지를 위한 전처리기)

  • Kim, Young-Joo; Jung, Min-Sub; Jun, Yong-Kee
    • The KIPS Transactions: Part A / v.17A no.1 / pp.9-18 / 2010
  • Races in shared-memory parallel programs such as OpenMP programs must be detected for debugging, because they cause unintended non-deterministic results. Previous work that verifies the existence of such races on-the-fly is limited to programs without internal non-determinism. For programs with internal non-determinism, such work needs at least N! execution instances for each critical section to verify the existence of races, where N is the degree of maximum parallelism. This paper presents a preprocessor that statically analyzes the locations of non-deterministic accesses using program slicing and, using the analyzed information, can detect apparent races as well as potential races in a single execution. The suggested tool deterministically monitors the non-deterministic accesses that occur in OpenMP programs, so it can verify the existence of races with any race detection protocol that is applicable to programs with critical sections. To evaluate the tool empirically, we experimented with a set of benchmark programs: synthetic programs that involve non-deterministic accesses, the OpenMP Microbenchmark, the NAS Parallel Benchmark, and OpenMP application programs.
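
One way to read the N! figure in this abstract: if N threads each pass through the same critical section once, the section can be executed in any of N! thread orders, and a detector that observes a single order per run would need one run per order. The following is a minimal sketch of that counting argument only, not the paper's preprocessor; the function name `critical_section_orders` and the thread labels `T0`, `T1`, ... are assumptions made for illustration.

```python
from itertools import permutations
from math import factorial


def critical_section_orders(num_threads):
    """Return every order in which the threads can pass through one
    critical section, assuming each thread enters it exactly once."""
    threads = [f"T{i}" for i in range(num_threads)]  # hypothetical labels
    return list(permutations(threads))


if __name__ == "__main__":
    for n in range(1, 6):
        orders = critical_section_orders(n)
        # The number of distinct orders equals N! for N threads.
        assert len(orders) == factorial(n)
        print(n, len(orders))
```

The count grows quickly with N, which is why covering every order by repeated execution becomes impractical.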

A Verification Tool of Data Races in Programs with OpenMP Directives (OpenMP 디렉티브 프로그램을 위한 자료경합 검증도구)

  • Kim, Young-Joo; Jun, Yong-Kee
    • Journal of KIISE: Computer Systems and Theory / v.34 no.9 / pp.395-406 / 2007
  • Races in programs with OpenMP directives must be detected for debugging, because they may cause unexpected results through non-deterministic executions. However, Intel's Thread Checker, a well-known existing tool for detecting races, is not practical: it does not verify the existence of races, and its race detection cost is known to be too high. This paper presents a web-based tool that verifies the existence of races with optimal functionality and performance, using the results of a property analysis of the OpenMP program as well as the user requirements. Experiments with synthetic programs show that the tool is practical in both functionality and performance: it verifies the existence of races and its time consumption grows as $O(n)$, whereas Thread Checker cannot verify the existence of races and its time consumption grows as $O(n^2)$, where $n$ is the total number of accesses.

Implementation and Translation of Major OpenMP Directives for Chip Multiprocessor without using OS (단일 칩 다중 프로세서상에서 운영체제를 사용하지 않은 OpenMP 구현 및 주요 디렉티브 변환)

  • Jeun, Woo-Chul; Ha, Soon-Hoi
    • Journal of KIISE: Computer Systems and Theory / v.34 no.4 / pp.145-157 / 2007
  • OpenMP is an attractive parallel programming model for chip multiprocessors, because there is no standard parallel programming method for them and parallel programs are easy to write in OpenMP. However, chip multiprocessor systems can have various architectures depending on the target application programs, so OpenMP must be implemented differently for each system. In this paper, we propose an implementation and an effective translation of major OpenMP directives for a chip multiprocessor without an OS, improving performance without special hardware and without extending the OpenMP directives. We present experimental results on our target platform, CT3400.

An Efficient Tool for Verifying Races in OpenMP Directive Programs without Interthread Synchronization (스레드 동기화가 없는 OpenMP 디렉티브 프로그램을 위한 효율적인 경합검증 도구)

  • Ha, Ok-Kyoon; Kang, Moon-Hye; Kim, Young-Joo; Jun, Yong-Ki
    • Journal of KIISE: Computing Practices and Letters / v.14 no.3 / pp.301-305 / 2008
  • Races must be detected for debugging programs with OpenMP directives, because they may cause unintended non-deterministic results. Intel Thread Checker, an existing tool that can detect races, cannot verify the existence of races, is often time-consuming, and tends to require a large amount of space. To solve these problems, we developed a tool that verifies the existence of races using user requirements and an analyzed model of the program. However, that tool does not perform optimally on programs that have no synchronization for inter-thread coordination. This paper presents an optimal tool that applies the optimum labeling and protocol to program models without inter-thread coordination. For synthetic programs without inter-thread synchronization, the tool verifies races over 250 times faster on average than the previous tool, even as the maximum parallelism increases while the total number of accesses stays the same.

An Empirical Comparison of Monitoring Filtering Techniques for Dynamic Data Race Detection in Parallel Programs with OpenMP Directives (OpenMP 디렉티브 병렬프로그램에서의 동적 자료경합 탐지를 위한 감시 필터링 기술의 실험적 비교)

  • Cho, Ahra; Ha, Ok-Kyoon
    • Proceedings of the Korean Society of Computer Information Conference / 2016.07a / pp.1-2 / 2016
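  • Data race detection in multithreaded parallel programs is well known to be difficult because of the non-deterministic interactions among concurrently executing threads. Detecting data races with dynamic analysis has the drawback of incurring additional overhead for monitoring program execution and analyzing all conflicting memory operations. Monitoring filtering techniques have been introduced as a way to reduce this overhead. This paper empirically compares the practicality and efficiency of two monitoring filtering techniques for dynamic data race detection that are applicable to parallel programs with OpenMP directives.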


Performance and Scalability of OpenMP Programs on Chip-MultiThreading Server (칩 멀티쓰레딩 서버에서 OpenMP 프로그램의 성능과 확장성)

  • Lee, Myung-Ho; Kim, Yong-Kyu
    • The KIPS Transactions: Part A / v.13A no.2 s.99 / pp.137-146 / 2006
  • Shared-memory multiprocessor (SMP) systems that adopt chip-level multithreading (CMT) technology are becoming mainstream servers for both commercial and High Performance Computing (HPC) applications. OpenMP has become the standard paradigm for parallelizing applications on SMP systems, mostly because of its ease of use. As the demand for computing power in HPC applications grows rapidly, obtaining high performance and scalability for applications parallelized with the OpenMP API will become more important. In this paper, we study the performance and scalability of SPEC OMPL, a standard OpenMP benchmark suite of HPC applications parallelized using OpenMP, on the Sun Fire E25K server, which adopts CMT technology. We also study the effect of CMT on SPEC OMPL.

Effective Race Visualization for Debugging OpenMP Programs (OpenMP프로그램의 디버깅을 위한 효과적 경합 시각화)

  • 김금희; 김영주; 전용기
    • Proceedings of the Korean Information Science Society Conference / 2004.10a / pp.13-15 / 2004
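  • A race is an error that occurs when threads executing in an OpenMP program access the same shared variable, with at least one write event, without proper synchronization. Because races lead to non-deterministic execution results, they must be detected for debugging. Existing debugging tools for race detection cannot provide effective visualization, because the space available for visualizing a program's complex execution structure and debugging information is limited. This paper proposes a race visualization tool that overcomes this spatial constraint through three-dimensional visualization and abstraction of threads, events, and the like. Because the proposed tool provides abstract visualization information, it makes programs easier to understand and provides an effective race debugging environment.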

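
The definition of a race given in this abstract (same shared variable, at least one write, no proper synchronization) can be written as a small predicate. This is a minimal sketch under stated assumptions, not the paper's tool: the `Access` record and `is_race` function are hypothetical names, the two accesses are assumed to be logically concurrent, and "proper synchronization" is modeled only as holding a common lock.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """One access event to a shared variable (hypothetical record)."""
    thread: int
    variable: str
    is_write: bool
    locks: frozenset = frozenset()  # assumption: locks held during the access


def is_race(a, b):
    """Return True if two logically concurrent accesses form a race:
    different threads, same variable, at least one write, no common lock."""
    return (
        a.thread != b.thread
        and a.variable == b.variable
        and (a.is_write or b.is_write)
        and not (a.locks & b.locks)
    )


if __name__ == "__main__":
    write_x = Access(thread=0, variable="x", is_write=True)
    read_x = Access(thread=1, variable="x", is_write=False)
    print(is_race(write_x, read_x))
```

A fuller model would also account for ordering imposed by barriers and fork/join structure, which this sketch deliberately leaves out.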

An Analysis of Race Detection Tool for OpenMP Programs (OpenMP 프로그램을 위한 경합탐지 도구의 분석)

  • 김영주; 강문혜; 전용기
    • Proceedings of the Korean Information Science Society Conference / 2003.10a / pp.478-480 / 2003
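  • Races in shared-memory OpenMP programs lead to unintended non-deterministic execution results, so a tool that detects races effectively is needed. This study analyzes Intel's Thread Checker, a race detection tool for OpenMP programs, using a set of kernel programs developed according to whether nested parallelism is present and how access events are distributed. The analysis shows that the tool detects races on the fly by executing threads sequentially, treating a nested thread as the same thread as its parent, and maintaining at least one read and one write access event. When an access event occurs, the tool checks it for races against previous access events and then decides whether to keep that access event. It therefore verifies the existence of races as long as there are no nested threads, whose logical concurrency relations it cannot reflect.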

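
The check-then-keep behaviour this abstract describes (compare each new access event with the retained earlier ones, then decide what to retain) can be illustrated with a small on-the-fly loop. This is a minimal sketch, not Thread Checker's implementation: the function `detect_races`, the `(thread, variable, is_write)` event layout, and the policy of retaining only the most recent read and the most recent write per variable are all assumptions for illustration, and accesses from different thread ids are assumed to be unsynchronized.

```python
def detect_races(events):
    """Scan access events in observed order and report conflicts.

    events: iterable of (thread, variable, is_write) tuples.
    Returns a list of (variable, earlier_thread, later_thread) tuples.
    """
    history = {}  # variable -> most recent reader and writer thread ids
    races = []
    for thread, variable, is_write in events:
        entry = history.setdefault(variable, {"read": None, "write": None})
        # A write conflicts with a retained read or write;
        # a read conflicts only with a retained write.
        kinds = ("read", "write") if is_write else ("write",)
        for kind in kinds:
            previous = entry[kind]
            if previous is not None and previous != thread:
                races.append((variable, previous, thread))
        # Assumed retention policy: keep only the latest access of each kind.
        entry["write" if is_write else "read"] = thread
    return races


if __name__ == "__main__":
    trace = [(0, "x", False), (1, "x", True), (1, "y", True), (1, "y", False)]
    print(detect_races(trace))
```

Because only one read and one write are retained per variable, a sketch like this can miss conflicts with overwritten entries; that trade-off between history size and completeness is what the access-retention decision controls.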

New execution model for CAPE using multiple threads on multicore clusters

  • Do, Xuan Huyen; Ha, Viet Hai; Tran, Van Long; Renault, Eric
    • ETRI Journal / v.43 no.5 / pp.825-834 / 2021
  • Based on its simplicity and user-friendly characteristics, OpenMP has become the standard model for programming on shared-memory architectures. Checkpointing-aided parallel execution (CAPE) is an approach that utilizes the discontinuous incremental checkpointing technique (DICKPT) to translate and execute OpenMP programs on distributed-memory architectures automatically. Currently, CAPE implements the OpenMP execution model by utilizing the DICKPT to distribute parallel jobs and their data to slave machines, and then collects the results after executing these distributed jobs. Although this model has been proven to be effective in terms of performance and compatibility with OpenMP on distributed-memory systems, it cannot fully exploit the capabilities of multicore processors. This paper presents a novel execution model for CAPE that utilizes two levels of parallelism. In the proposed model, we add another level of parallelism in the form of multithreaded processes on slave machines with the goal of better exploiting their multicore CPUs. Initial experimental results presented near the end of this paper demonstrate that this model provides significantly enhanced CAPE performance.