• Title/Summary/Keyword: Parallel Program

Evaluation Indicators for Learning Company Participating Work-Study Parallel Program (일학습병행 학습기업 평가지표)

  • Dong-Wook Kim;Hwan Young Choi
    • Journal of Practical Engineering Education / v.15 no.1 / pp.223-232 / 2023
  • The work-study parallel program has been promoted as a key policy to resolve the mismatch between industrial sites and school education and to realize a competency-centered society; as of December 2022, 16,664 companies had participated in the training. Learning companies play a very important role as education and training providers that conduct field training. In this study, to evaluate learning companies participating in the work-study parallel program, the authors derive the important factors that determine the quality of on-site education and training by analyzing the cognitive structure of experts in charge of the company, and present evaluation indicators for learning companies. The study is thus intended to lay the foundation for improving the quality of the work-study parallel program.

Tuning the Performance of Haskell Parallel Programs Using GC-Tune (GC-Tune을 이용한 Haskell 병렬 프로그램의 성능 조정)

  • Kim, Hwamok;An, Hyungjun;Byun, Sugwoo;Woo, Gyun
    • KIISE Transactions on Computing Practices / v.23 no.8 / pp.459-465 / 2017
  • Although the performance of computer hardware is increasing with the development of manycore technologies, software lacks a proportional increase in throughput. Functional languages can be a viable alternative for improving the performance of parallel programs, since such languages have inherent parallelism in evaluating pure expressions without side effects. Specifically, Haskell is notably popular for parallel programming because it provides easy-to-use parallel constructs based on monads. However, the scalability of parallel programs in Haskell tends to fluctuate as the number of cores increases, and the garbage collector is suspected to be the source of this fluctuation because it affects both the space and the time needed to execute the programs. This paper uses the tuning tool GC-Tune to improve the scalability of performance. Our experiment was conducted with a parallel plagiarism detection program, and the scalability improved. Specifically, the fluctuation range of the speedup was narrowed by 39% compared to the original execution of the program without any tuning.

Distributed Parallel Computing Environment for Java (자바를 위한 분산된 병렬 컴퓨팅 환경)

  • 이상윤;김승호
    • Journal of the Institute of Electronics Engineers of Korea CI / v.41 no.6 / pp.23-37 / 2004
  • Since a Java thread is an object that is treated as an independent process within one execution space in a multiprocessing environment, it can serve as an independent unit of parallel processing. Java's thread and synchronization mechanisms make it easy to write parallel application programs. Consequently, there are many results that apply Java's support for parallel processing to distributed computing environments. In this paper, we introduce a system environment that supports parallel execution of the threads contained in a legacy Java program. The system, named TORB (Transparent Object Request Broker), enables parallel execution of a legacy Java program after a simple conversion process, since it supports programming transparency. TORB is an extended version of a distributed programming tool previously published by our research team, which had only the typical distributed processing feature of executing a specified function on a specified computer.

A Study on the Automatic Parallelization Method and Tool Development

  • Shin, Woochang
    • International Journal of Internet, Broadcasting and Communication / v.12 no.3 / pp.87-94 / 2020
  • Recently, computer hardware has been evolving toward increasing the number of computing cores rather than increasing the clock speed. To use the performance of parallel hardware to the maximum, the running program must also be parallelized. However, software developers are accustomed to sequential programs and, in most cases, write programs that operate sequentially; they also have considerable difficulty designing and developing parallel software. We propose a method to automatically convert a sequential C/C++ program into a parallelized program, and develop a parallelization tool that supports it. The tool supports Open Multi-Processing (OpenMP) and the Parallel Patterns Library (PPL) as parallel frameworks. Perfect automatic parallelization is difficult due to dynamic features such as pointer operations and polymorphism in the C/C++ languages. This study therefore focuses on verifying the conditions for parallelization rather than on fully automatic parallelization, and on providing detailed advice to developers when parallelization is not possible.
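
To make "verifying the conditions of parallelization" concrete, here is a minimal sketch of one classical check, Bernstein's conditions, applied to loop iterations. It is not the tool described in the abstract: the function name `find_conflict` and the representation of each iteration as a pair of read and write sets are assumptions made for illustration, and a real C/C++ analyzer would also have to reason about pointers and aliasing.

```python
from itertools import combinations


def find_conflict(accesses):
    """Check Bernstein's conditions over a list of loop iterations.

    `accesses` holds one (reads, writes) pair of sets per iteration, where
    each set contains memory locations such as ("a", 3) for a[3].
    Returns None if every pair of iterations is independent, otherwise
    (i, j, shared_locations) for the first conflicting pair found.
    """
    for (i, (reads_i, writes_i)), (j, (reads_j, writes_j)) in combinations(
        enumerate(accesses), 2
    ):
        # Two iterations conflict if one writes what the other reads or writes.
        shared = (writes_i & reads_j) | (reads_i & writes_j) | (writes_i & writes_j)
        if shared:
            return i, j, sorted(shared)
    return None


# for i in 0..3: a[i] = b[i] + 1   (iterations touch disjoint locations)
independent = [({("b", i)}, {("a", i)}) for i in range(4)]

# for i in 1..3: a[i] = a[i-1] + 1 (each iteration reads the previous result)
carried = [({("a", i - 1)}, {("a", i)}) for i in range(1, 4)]
```

Here `find_conflict(independent)` reports no conflict, so the loop could be annotated for parallel execution, while `find_conflict(carried)` names the location that ties two iterations together, which is the kind of detail a tool can pass back to the developer as advice.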

PERFORMANCE ENHANCEMENT OF PARALLEL MULTIFRONTAL SOLVER ON BLOCK LANCZOS METHOD

  • Byun, Wan-Il;Kim, Seung-Jo
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.13 no.1 / pp.13-20 / 2009
  • IPSAP, a finite element analysis program, has been developed for high-performance parallel computing. The program consists of various analysis modules, including stress, vibration, and thermal analysis modules. The M-orthogonal block Lanczos algorithm with shift-invert transformation is used to solve eigenvalue problems in the vibration module, and the multifrontal algorithm, one of the most efficient direct linear equation solvers, is applied to the factorization and triangular system solving phases of this block Lanczos iteration routine. In this study, the performance enhancement procedures for IPSAP consist of the following stages: 1) minimizing communication volume in the factorization phase by modifying the parallel matrix subroutines; 2) minimizing idle time in the triangular system solving phase by using a partial inverse of the frontal matrix and the LCM (least common multiple) concept.

Efficient Parallel CUDA Random Number Generator on NVIDIA GPUs (NVIDIA GPU 상에서의 난수 생성을 위한 CUDA 병렬프로그램)

  • Kim, Youngtae;Hwang, Gyuhyeon
    • Journal of KIISE / v.42 no.12 / pp.1467-1473 / 2015
  • In this paper, we implemented a parallel random number generation program using an LCG (Linear Congruential Generator) on GPUs, which are known for high-performance computing. Random numbers are important in all fields requiring randomness, and the LCG is one of the most widely used methods for generating pseudo-random numbers. We explain the parallel program using the NVIDIA CUDA model and MPI (Message Passing Interface), and show uniform distribution and performance results. We also used a Monte Carlo algorithm to calculate $\pi$, comparing our parallel random number generator with cuRAND, a CUDA library, and showed that our program is much more efficient. Finally, we compared the performance results on multiple GPUs with ideal speedups.
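
The abstract does not say how the LCG sequence is divided among parallel workers. One common approach is leapfrogging, sketched below in plain Python rather than CUDA: each worker jumps ahead by the number of workers, so the streams together reproduce the sequential sequence exactly. The constants are the widely published Numerical Recipes values and the function names are hypothetical; neither is taken from the paper.

```python
# Illustrative LCG constants (Numerical Recipes values); an assumption,
# not the parameters used in the paper.
A, C, M = 1664525, 1013904223, 2**32


def lcg_next(x):
    """One sequential LCG step: x -> (A*x + C) mod M."""
    return (A * x + C) % M


def skip_coefficients(k):
    """Return (a_k, c_k) such that x_{n+k} = (a_k * x_n + c_k) mod M."""
    a_k, c_k = 1, 0
    for _ in range(k):
        # Composing one more step: A*(a_k*x + c_k) + C.
        a_k, c_k = (a_k * A) % M, (c_k * A + C) % M
    return a_k, c_k


def leapfrog_stream(seed, worker, num_workers, count):
    """Values x_{w+1}, x_{w+1+P}, x_{w+1+2P}, ... for worker w of P workers."""
    a_p, c_p = skip_coefficients(num_workers)
    x = seed
    for _ in range(worker + 1):  # advance to this worker's first value
        x = lcg_next(x)
    values = []
    for _ in range(count):
        values.append(x)
        x = (a_p * x + c_p) % M  # jump ahead by num_workers steps at once
    return values


def estimate_pi(values):
    """Monte Carlo estimate of pi from consecutive pairs of LCG outputs."""
    points = [(values[i] / M, values[i + 1] / M) for i in range(0, len(values) - 1, 2)]
    inside = sum(1 for u, v in points if u * u + v * v <= 1.0)
    return 4.0 * inside / len(points)
```

On a GPU, each thread would play the role of one `worker`; the per-worker estimates from `estimate_pi` can then be averaged on the host.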

Implementation of Embedded Micro Web Server for Web based Remote Hardware Control and Monitor (웹 기반 하드웨어 원격감시 및 제어를 위한 초소형 내장형 웹 서버 시스템의 구현)

  • Han, Kyong-Ho
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers / v.20 no.6 / pp.104-110 / 2006
  • In this paper, we propose a micro web server implementation on a StrongARM processor running embedded Linux. The parallel port, which connects the parallel I/O, is controlled via the HTTP protocol from a web browser. Together with Linux, the micro web server program and the port control program are installed in on-board memory, and the port control program is accessed from the web browser through CGI. The processor's parallel input port is monitored and its parallel output port is controlled from remote hosts via HTTP. The proposed embedded micro web server can be used in remote automation systems and in distributed control over the Internet using a web browser.

Assessment of computational performance for a vector parallel implementation: 3D probabilistic model discrete cracking in concrete

  • Paz, Carmen N.M.;Alves, Jose L.D.;Ebecken, Nelson F.F.
    • Computers and Concrete / v.2 no.5 / pp.345-366 / 2005
  • This work presents an assessment of the computational performance of a vector-parallel implementation of a probabilistic model for concrete cracking in 3D. The paper continues the code optimization efforts reported in earlier works (Paz et al., 2002a,b and 2003). The probabilistic crack approach is based on the direct Monte Carlo method. Cracking is accounted for by means of 3D interface elements, and the approach assumes that all nonlinearities are restricted to the interface elements that model cracks. The heterogeneity governs the overall cracking behavior and the related size effects on concrete fracture. The computational kernels in the implementation are the inexact Newton iterative driver, which solves the nonlinear problem, and a preconditioned conjugate gradient (PCG) driver, which solves the linearized equations using an element-by-element (EBE) strategy to compute matrix-vector products. In particular, the paper analyzes code behavior using OpenMP directives on parallel vector processors (PVP) such as the CRAY SV1 and CRAY T94. The impact of the memory architecture on code performance, and some strategies devised to circumvent this issue, are addressed through numerical experiments.
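
The element-by-element idea can be illustrated with a small serial sketch: the global stiffness matrix is never assembled, and each matrix-vector product inside PCG is accumulated from the element matrices. Everything here is an assumption made for illustration, including the function names, the Jacobi (diagonal) preconditioner, and the 1D spring-chain test problem; the paper's own code targets 3D interface elements with OpenMP on vector machines.

```python
from math import sqrt


def ebe_matvec(elements, x):
    """y = K x computed element by element, without assembling K."""
    y = [0.0] * len(x)
    for dofs, ke in elements:
        local = [x[d] for d in dofs]  # gather element values
        for i, di in enumerate(dofs):
            y[di] += sum(ke[i][j] * local[j] for j in range(len(dofs)))  # scatter
    return y


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def pcg_ebe(elements, f, tol=1e-10, max_iter=1000):
    """Jacobi-preconditioned conjugate gradient using EBE products."""
    n = len(f)
    diag = [0.0] * n  # diagonal of K, also accumulated per element
    for dofs, ke in elements:
        for i, di in enumerate(dofs):
            diag[di] += ke[i][i]
    x = [0.0] * n
    r = list(f)  # residual for the zero initial guess
    z = [ri / di for ri, di in zip(r, diag)]
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if sqrt(dot(r, r)) <= tol:
            break
        kp = ebe_matvec(elements, p)
        alpha = rz / dot(p, kp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * kpi for ri, kpi in zip(r, kp)]
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


def spring_chain(n, k=1.0):
    """n springs in series, fixed at the left end; one element per spring."""
    first = [((0,), [[k]])]  # spring between the fixed wall and dof 0
    rest = [((i, i + 1), [[k, -k], [-k, k]]) for i in range(n - 1)]
    return first + rest
```

For a chain of five unit springs with a unit load at the free end, `pcg_ebe(spring_chain(5), [0, 0, 0, 0, 1.0])` should recover displacements close to 1, 2, 3, 4, 5. The loop over elements in `ebe_matvec` is the part that a vector or OpenMP implementation would distribute, with care taken over elements that share a degree of freedom.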

Parallelization of a Purely Functional Bisimulation Algorithm

  • Ahn, Ki Yung
    • Journal of the Korea Society of Computer and Information / v.26 no.1 / pp.11-17 / 2021
  • In this paper, we demonstrate a performance boost obtained by parallelizing a purely functional bisimulation algorithm on a multicore machine. The key idea of this parallelization is to exploit the referential transparency of purely functional programs in order to minimize refactoring of the original implementation, which contains no parallel constructs. Both the original and the parallel implementations are written in Haskell, a purely functional programming language. The change from the original program to the parallel program is minuscule and preserves almost all of the original program structure. Through benchmarks, we show that the proposed parallelization doubles the performance of the bisimulation test compared to the original non-parallel implementation. We also show that a similar performance boost is possible for a memoized version of the bisimulation implementation.

A Fast Transmission of Mobile Agents Using Binomial Trees (바이노미얼 트리를 이용한 이동 에이전트의 빠른 전송)

  • Cho, Soo-Hyun;Kim, Young-Hak
    • The KIPS Transactions:PartA / v.9A no.3 / pp.341-350 / 2002
  • As network environments have improved and Internet use has increased, mobile agent technologies have become widely used in information retrieval, network management, electronic commerce, and parallel/distributed processing. Recently, many researchers have studied parallel/distributed processing based on mobile agents. SPMD is a parallel processing method that transmits a program to all the computers participating in the parallel environment and has each perform the work on different data. Transmitting the program quickly to all the computers is therefore an important factor in reducing total execution time. In this paper, we consider a parallel environment consisting of a mobile agent system and propose a new method that quickly transmits mobile agent code to all the computers using binomial trees, in order to perform SPMD parallel processing efficiently. The proposed method is compared with other methods through experimental evaluation on IBM's Aglets and achieves much better performance. This paper also deals with fault tolerance for failures that can occur while transmitting a mobile agent using binomial trees.
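
The benefit of a binomial tree can be seen from the send schedule alone: every host that already holds the code forwards it in each round, so the number of informed hosts doubles per round. The sketch below computes the standard binomial-tree broadcast schedule; the function name and the numbering of hosts from 0 are assumptions for illustration, and the paper's Aglets-based implementation and its fault-tolerance handling are not reproduced.

```python
def binomial_broadcast_schedule(num_hosts):
    """Return the rounds of a binomial-tree broadcast from host 0.

    Each round is a list of (sender, receiver) pairs that can be
    transmitted at the same time.
    """
    rounds = []
    informed = 1  # hosts 0 .. informed-1 already hold the agent code
    while informed < num_hosts:
        # Every informed host s forwards the code to host s + informed.
        sends = [
            (s, s + informed) for s in range(informed) if s + informed < num_hosts
        ]
        rounds.append(sends)
        informed = min(2 * informed, num_hosts)
    return rounds
```

For eight hosts the schedule has three rounds, `[(0, 1)]`, `[(0, 2), (1, 3)]`, and `[(0, 4), (1, 5), (2, 6), (3, 7)]`, whereas a single source sending to each host in turn would need seven. In general the round count grows as the base-2 logarithm of the number of hosts, rounded up.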