• Title/Summary/Keyword: Parallel Program

Parallel Implementation of Nonlinear Analysis Program of PSC Frame Using MPI (MPI를 이용한 PSC 프레임 비선형해석 프로그램의 병렬화)

  • 이재석;최규천
    • Proceedings of the Computational Structural Engineering Institute Conference
    • /
    • 2001.04a
    • /
    • pp.61-68
    • /
    • 2001
  • A parallel nonlinear analysis program for prestressed concrete frames is migrated to a PC cluster system and to a massively parallel processing system, the CRAY T3E, using MPI. The PC cluster is configured with Pentium III-class PCs and Fast Ethernet. The CRAY T3E system is composed of a set of nodes, each containing one Processing Element (PE) and a memory subsystem, connected by a distributed-memory interconnect network. Parallel algorithms are implemented for the element-wise processing parts, including calculation of the stiffness matrix and element stresses, determination of material states, checking for material failure, and calculation of unbalanced loads. Parallel performance of the migrated program is evaluated through typical numerical examples. (A minimal MPI sketch of this element-wise partitioning follows this entry.)

  • PDF
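
The following is a minimal, hypothetical sketch (not code from the paper) of the element-wise partitioning idea in C with MPI: each rank processes its own block of elements and the per-element contributions to the unbalanced load vector are summed with MPI_Allreduce. NUM_ELEMS, NUM_DOFS, and compute_element_residual() are invented placeholders, not names from the paper.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_ELEMS 1000
#define NUM_DOFS  3000

/* Placeholder for the real element routine (stiffness, stresses, material state). */
static void compute_element_residual(int elem, double *contrib) {
    int dof = (elem * 3) % NUM_DOFS;            /* fake connectivity */
    contrib[dof] += 1.0e-3 * (elem + 1);        /* fake internal-force term */
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Block distribution of elements over ranks. */
    int chunk = (NUM_ELEMS + size - 1) / size;
    int first = rank * chunk;
    int last  = first + chunk < NUM_ELEMS ? first + chunk : NUM_ELEMS;

    double *local  = calloc(NUM_DOFS, sizeof(double));
    double *global = calloc(NUM_DOFS, sizeof(double));

    for (int e = first; e < last; e++)
        compute_element_residual(e, local);

    /* Assemble the global unbalanced load vector on every rank. */
    MPI_Allreduce(local, global, NUM_DOFS, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        printf("global[0] = %g (assembled from %d ranks)\n", global[0], size);

    free(local); free(global);
    MPI_Finalize();
    return 0;
}
```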

A dynamic analysis algorithm for RC frames using parallel GPU strategies

  • Li, Hongyu;Li, Zuohua;Teng, Jun
    • Computers and Concrete
    • /
    • v.18 no.5
    • /
    • pp.1019-1039
    • /
    • 2016
  • In this paper, a parallel algorithm for nonlinear dynamic analysis of three-dimensional (3D) reinforced concrete (RC) frame structures on the graphics processing unit (GPU) platform is proposed. Time integration is performed using the Newmark method for nonlinear implicit dynamic analysis, and parallelization strategies are presented. Correspondingly, a parallel Preconditioned Conjugate Gradient (PCG) solver on the GPU is introduced for the repeated solution of the equilibrium equations at each time step. The RC frames were simulated using a fiber beam model to capture the nonlinear behavior of concrete and reinforcing bars. The parallel finite element program was developed using the Compute Unified Device Architecture (CUDA). The accuracy of the GPU-based parallel program, in both single and double precision, was verified against ABAQUS. The numerical results demonstrated that the proposed algorithm can take full advantage of the parallel architecture of the GPU and speed up the computation compared with the CPU.
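
As context for the solver mentioned above, the sketch below is a plain serial Jacobi-preconditioned conjugate gradient in C for a dense symmetric positive definite system. It only illustrates the iteration that the paper maps onto the GPU with CUDA; the dense storage, the diagonal preconditioner, and all names are assumptions for the example, not details from the paper.

```c
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

static void matvec(int n, const double *A, const double *x, double *y) {
    for (int i = 0; i < n; i++) {
        double s = 0.0;
        for (int j = 0; j < n; j++) s += A[i*n + j] * x[j];
        y[i] = s;
    }
}

/* Solve A x = b for SPD A, starting from the initial guess in x. */
void pcg(int n, const double *A, const double *b, double *x, int maxit, double tol) {
    double *r = malloc(n * sizeof *r), *z = malloc(n * sizeof *z);
    double *p = malloc(n * sizeof *p), *Ap = malloc(n * sizeof *Ap);

    matvec(n, A, x, Ap);
    for (int i = 0; i < n; i++) r[i] = b[i] - Ap[i];
    for (int i = 0; i < n; i++) z[i] = r[i] / A[i*n + i];   /* Jacobi preconditioner */
    for (int i = 0; i < n; i++) p[i] = z[i];
    double rz = 0.0;
    for (int i = 0; i < n; i++) rz += r[i] * z[i];

    for (int it = 0; it < maxit; it++) {
        matvec(n, A, p, Ap);
        double pAp = 0.0;
        for (int i = 0; i < n; i++) pAp += p[i] * Ap[i];
        double alpha = rz / pAp;
        double rnorm = 0.0;
        for (int i = 0; i < n; i++) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
            rnorm += r[i] * r[i];
        }
        if (sqrt(rnorm) < tol) break;
        double rz_new = 0.0;
        for (int i = 0; i < n; i++) {
            z[i] = r[i] / A[i*n + i];
            rz_new += r[i] * z[i];
        }
        double beta = rz_new / rz;
        for (int i = 0; i < n; i++) p[i] = z[i] + beta * p[i];
        rz = rz_new;
    }
    free(r); free(z); free(p); free(Ap);
}
```

On the GPU, each of these vector updates, dot products, and the matrix-vector product becomes a parallel kernel or reduction, which is the part the paper accelerates.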

Design and Analysis for Parallel Operation of Power MOSFETs Using SPICE (SPICE를 이용한 MOSFET의 병렬운전 특성해석 및 설계)

  • 김윤호;윤병도;강영록
    • The Transactions of the Korean Institute of Electrical Engineers
    • /
    • v.43 no.2
    • /
    • pp.251-258
    • /
    • 1994
  • To apply power MOSFETs to high-power circuits, parallel operation of the devices must be considered because of their low individual power rating. In practical applications, this means that design methods for parallel operation are required. However, it is very difficult to investigate parallel-operation problems by directly changing the internal parameters of a MOSFET. In this paper, therefore, the effects of the internal parameters on parallel operation are investigated using the SPICE program, which is widely used and known to be reliable. The results show that the gate resistance and gate capacitances are the parameters that affect the dynamic switching behavior, while the drain and source resistances are the parameters that affect the steady-state current imbalance. Based on this investigation, design methods for the parallel operation of MOSFETs are suggested, which in turn contributes to the practical use of power MOSFETs.

  • PDF

Design and Implementation of Visual Environment for Parallel Object-Oriented Programming (병렬 객체지향 프로그래밍을 위한 시각 환경의 설계 및 구현)

  • Choe, Suk-Yeong
    • The Transactions of the Korea Information Processing Society
    • /
    • v.6 no.2
    • /
    • pp.485-496
    • /
    • 1999
  • Comparing with sequential programming, parallel programming has additional complexity due to the consideration of parallelism, communication and synchronization of processes. A synergism between users and compliers should be established, each assisting the other to produce high quality parallel programs. On the above underlying philosophy, we developed a parallel Object-Oriented specification language, POOSL, as preliminary works. However, it is still likely to hard for users to write parallel program because users have to consider grammar of POOSL and to write text-based parallel program. It would be more desirable to provide users wit visual environment for effective parallel programming. Therefore, we propose a visual programming environment. VEPO(Visual environment for Parallel Object-Oriented Programming), based on POOSL in order that users can develop parallel programs more easily and conveniently. It aims at supporting a programming environment in which users can represent their programs more naturally and visually I parallel manner with object-oriented concept and essential steps during parallel program development such as program specification, compilation, execution and animation of execution are integrated. VEPO has useful features for parallel processing. Especially, complicated parallel codes for synchronization and communication of processes are automatically generated in the translation phase, so users can be relieved of writing error-prone parallel codes. The system is targeted to the transputer-based parallel system, MC-3. The graphic user interface of VEPO was implemented using Visual C++. Visual programs descirbed on VEPO are translated into Inmos C and executed on MC-3.

  • PDF

Implementation and Performance Evaluation of Parallel Programming Translator for High Performance Fortran (High Performance Fortran 병렬 프로그래밍 변환기의 구현 및 성능 평가)

  • Kim, Jung-Gwon;Hong, Man-Pyo;Kim, Dong-Gyu
    • The Transactions of the Korea Information Processing Society
    • /
    • v.6 no.4
    • /
    • pp.901-915
    • /
    • 1999
  • Parallel computers are known to offer excellent performance per cost while providing scalability and high performance. However, parallel machines have enjoyed limited success because of the difficulty of parallel programming and the lack of portability between machines. Recently, researchers have sought to develop data-parallel languages that provide machine-independent programming systems. A data-parallel language such as High Performance Fortran provides a basis for writing a parallel program in a global name space by partitioning data and computation and generating message-passing functions. In this paper, we describe the Parallel Programming Translator (PPTran), a source-to-source data-parallel compiler that generates MPI SPMD parallel programs from HPF input programs through four phases, namely data dependence analysis, data partitioning, computation partitioning, and code generation with explicit message passing, and we verify the performance of PPTran. (An illustrative sketch of the kind of SPMD code such a translator emits follows this entry.)

  • PDF
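
For illustration only (this is not PPTran output), the sketch below shows the style of MPI SPMD code a data-parallel translator typically generates for a BLOCK-distributed one-dimensional array: each rank owns a block plus halo cells, exchanges halos with its neighbors, and executes only its local portion of the global loop A(i) = A(i-1) + A(i+1). All names and sizes are invented, and the global size is assumed to divide evenly among the ranks.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define GLOBAL_N 1024

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int local_n = GLOBAL_N / size;                        /* BLOCK distribution */
    double *a    = calloc(local_n + 2, sizeof(double));   /* [0], [local_n+1] are halos */
    double *anew = calloc(local_n + 2, sizeof(double));
    for (int i = 1; i <= local_n; i++)
        a[i] = rank * local_n + i - 1;                    /* global index as dummy data */

    int left  = rank > 0        ? rank - 1 : MPI_PROC_NULL;
    int right = rank < size - 1 ? rank + 1 : MPI_PROC_NULL;

    /* Explicit message passing generated from the global name space. */
    MPI_Sendrecv(&a[1], 1, MPI_DOUBLE, left, 0,
                 &a[local_n + 1], 1, MPI_DOUBLE, right, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&a[local_n], 1, MPI_DOUBLE, right, 1,
                 &a[0], 1, MPI_DOUBLE, left, 1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    /* Owner-computes rule: each rank runs only its block of the global loop. */
    for (int i = 1; i <= local_n; i++)
        anew[i] = a[i - 1] + a[i + 1];

    if (rank == 0) printf("rank 0: anew[1] = %g\n", anew[1]);
    free(a); free(anew);
    MPI_Finalize();
    return 0;
}
```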

A Functional Design of Programmable Logic Controller Based on Parallel Architecture (병렬 구조에 의한 가변 논리제어장치의 기능적 설계)

  • 이정훈;신현식
    • The Transactions of the Korean Institute of Electrical Engineers
    • /
    • v.40 no.8
    • /
    • pp.836-844
    • /
    • 1991
  • The PLC (programmable logic controller) is widely used for factory control. A PLC system receives a ladder diagram, drawn by the user to implement hardware logic, converts the ladder diagram into a sequence program executable on the PLC, and executes the sequence program indefinitely unless the user stops it. The sequence program processes on/off signal data and tolerates a one-scan delay and the loss of pulse-type signals shorter than one scan time, so no data dependency exists. By applying these characteristics to a multiprocessor architecture, we functionally design a parallel PLC and evaluate the performance improvement. The parallel PLC consists of a central processing module, N general processing modules, and a shared memory in a master-slave configuration. Each module executes its allocated sequence program under the control of the central processing module. Performance improvement can be expected from parallel processing, and reliability from relocation of the sequence program when an error occurs in a processing module. (A rough thread-based sketch of this scan-cycle partitioning follows this entry.)

  • PDF
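
The sketch below is a rough, hypothetical illustration of the scan-cycle partitioning described above, written with POSIX threads on a single machine rather than the paper's multiprocessor hardware: a master latches the input image, N workers evaluate their allocated rungs, and barriers delimit each scan. NUM_WORKERS, NUM_RUNGS, and evaluate_rung() are invented for the example.

```c
#include <pthread.h>
#include <stdio.h>

#define NUM_WORKERS 4
#define NUM_RUNGS   64
#define NUM_SCANS   10

static int inputs[NUM_RUNGS];     /* latched input image (shared memory) */
static int outputs[NUM_RUNGS];    /* output image */
static pthread_barrier_t scan_start, scan_end;

static void evaluate_rung(int r) { outputs[r] = !inputs[r]; }   /* placeholder logic */

static void *worker(void *arg) {
    int id = *(int *)arg;
    for (int s = 0; s < NUM_SCANS; s++) {
        pthread_barrier_wait(&scan_start);                 /* wait for latched inputs */
        for (int r = id; r < NUM_RUNGS; r += NUM_WORKERS)
            evaluate_rung(r);                              /* this module's rungs */
        pthread_barrier_wait(&scan_end);                   /* scan finished */
    }
    return NULL;
}

int main(void) {
    pthread_t tid[NUM_WORKERS];
    int ids[NUM_WORKERS];
    pthread_barrier_init(&scan_start, NULL, NUM_WORKERS + 1);
    pthread_barrier_init(&scan_end,   NULL, NUM_WORKERS + 1);
    for (int i = 0; i < NUM_WORKERS; i++) {
        ids[i] = i;
        pthread_create(&tid[i], NULL, worker, &ids[i]);
    }

    for (int s = 0; s < NUM_SCANS; s++) {                  /* central processing module */
        for (int r = 0; r < NUM_RUNGS; r++) inputs[r] = (r + s) & 1;   /* latch inputs */
        pthread_barrier_wait(&scan_start);                 /* release processing modules */
        pthread_barrier_wait(&scan_end);                   /* wait for all modules */
        /* outputs[] would be committed to the field devices here */
    }
    for (int i = 0; i < NUM_WORKERS; i++) pthread_join(tid[i], NULL);
    printf("completed %d scans; outputs[0]=%d\n", NUM_SCANS, outputs[0]);
    return 0;
}
```

Because rungs within a scan carry no data dependency, the static round-robin assignment needs no locking; only the two barriers per scan are required.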

A Representation for Multithreaded Data-parallel Programs : PCFG(Parallel Control Flow Graph) (다중스레드 데이타 병렬 프로그램의 표현 : PCFG(Parallel Control Flow Graph))

  • 김정환
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.29 no.12
    • /
    • pp.655-664
    • /
    • 2002
  • In many data-parallel applications, massive parallelism can easily be extracted through data distribution, but this often causes very long communication latency. This paper shows that task parallelism extracted from data-parallel programs can be exploited to hide such communication latency. Unlike most previous research, in which task parallelism has not been considered together with data parallelism, this paper describes the exploitation of task parallelism in the context of data parallelism. The PCFG (Parallel Control Flow Graph) is proposed to represent a multithreaded program consisting of a few task threads, each of which can include a few data-parallel loops. The paper also describes how a PCFG is constructed from a source data-parallel program through an HDG (Hierarchical Dependence Graph) and how the multithreaded program can be constructed from the PCFG.
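
Since the paper's exact definitions are not reproduced in the abstract, the C declarations below are only a guessed sketch of what a PCFG-like representation might contain: task threads, control-flow nodes that may be data-parallel loops, and cross-thread dependence edges that allow communication to be overlapped with computation. Every identifier here is hypothetical.

```c
#include <stdio.h>

typedef enum { NODE_SEQUENTIAL, NODE_DATA_PARALLEL_LOOP, NODE_COMMUNICATION } PCFGNodeKind;

typedef struct PCFGNode {
    PCFGNodeKind kind;
    int id;
    struct PCFGNode **succ;        /* control-flow successors inside one task thread */
    int nsucc;
    int loop_lower, loop_upper;    /* iteration space, used for data-parallel loops */
} PCFGNode;

typedef struct {
    PCFGNode *entry;               /* entry node of one task thread's control flow */
    int nnodes;
} TaskThread;

typedef struct {
    int from_thread, from_node;    /* cross-thread dependence edge */
    int to_thread, to_node;
} DepEdge;

typedef struct {
    TaskThread *threads;           /* a few task threads extracted from the program */
    int nthreads;
    DepEdge *deps;
    int ndeps;
} PCFG;

int main(void) {
    /* Build a trivial PCFG: one task thread holding one data-parallel loop. */
    PCFGNode loop = { NODE_DATA_PARALLEL_LOOP, 0, NULL, 0, 0, 1023 };
    TaskThread t  = { &loop, 1 };
    PCFG g        = { &t, 1, NULL, 0 };
    printf("threads=%d, entry kind=%d, iterations=%d..%d\n",
           g.nthreads, g.threads[0].entry->kind,
           g.threads[0].entry->loop_lower, g.threads[0].entry->loop_upper);
    return 0;
}
```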

Implementation of parallel blocked LU decomposition program for utilizing cache memory on GP-GPUs (GP-GPU의 캐시메모리를 활용하기 위한 병렬 블록 LU 분해 프로그램의 구현)

  • Kim, Youngtae;Kim, Doo-Han;Yu, Myoung-Han
    • Journal of Internet Computing and Services
    • /
    • v.14 no.6
    • /
    • pp.41-47
    • /
    • 2013
  • GP-GPUs are general-purpose GPUs for numerical computation based on the multiple threads originally intended for graphics processing. GP-GPUs provide cache memory in the form of shared memory which, unlike typical cache memory, user programs can access directly. In this research, we implemented a parallel blocked LU decomposition program to utilize the cache memory of GP-GPUs. The parallel blocked LU decomposition program, written in Nvidia CUDA C, runs 7~8 times faster than a non-blocked LU decomposition program in the same GP-GPU computing environment.
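
To make the blocking idea concrete, the sketch below is a plain C, CPU-side blocked LU decomposition without pivoting; the paper's version maps each block step to CUDA kernels that stage tiles in GPU shared memory, which is not shown here. The block size, the in-place row-major layout, and the function names are assumptions for the example.

```c
#include <stdio.h>

/* Right-looking blocked LU without pivoting: factor row-major n x n A in place
   into unit-lower L and upper U, processing the matrix in blocks of size B. */
void blocked_lu(double *A, int n, int B) {
    for (int k = 0; k < n; k += B) {
        int kb = (n - k < B) ? n - k : B;

        /* 1) unblocked LU of the diagonal block A[k..k+kb, k..k+kb] */
        for (int j = k; j < k + kb; j++)
            for (int i = j + 1; i < k + kb; i++) {
                A[i*n + j] /= A[j*n + j];
                for (int c = j + 1; c < k + kb; c++)
                    A[i*n + c] -= A[i*n + j] * A[j*n + c];
            }

        /* 2) row panel: U12 = L11^{-1} * A12 (forward substitution) */
        for (int j = k + kb; j < n; j++)
            for (int i = k + 1; i < k + kb; i++)
                for (int c = k; c < i; c++)
                    A[i*n + j] -= A[i*n + c] * A[c*n + j];

        /* 3) column panel: L21 = A21 * U11^{-1} */
        for (int i = k + kb; i < n; i++)
            for (int j = k; j < k + kb; j++) {
                for (int c = k; c < j; c++)
                    A[i*n + j] -= A[i*n + c] * A[c*n + j];
                A[i*n + j] /= A[j*n + j];
            }

        /* 4) trailing update: A22 -= L21 * U12 (the bulk of the work, tiled on GPU) */
        for (int i = k + kb; i < n; i++)
            for (int j = k + kb; j < n; j++)
                for (int c = k; c < k + kb; c++)
                    A[i*n + j] -= A[i*n + c] * A[c*n + j];
    }
}

int main(void) {
    /* tiny usage example: factor a 4x4 diagonally dominant matrix, block size 2 */
    double A[16] = { 4, 1, 0, 0,
                     1, 4, 1, 0,
                     0, 1, 4, 1,
                     0, 0, 1, 4 };
    blocked_lu(A, 4, 2);
    printf("U[0][0]=%g  L[1][0]=%g\n", A[0], A[4]);
    return 0;
}
```

The benefit of blocking is that steps 1-4 operate on tiles small enough to be held in fast memory (GPU shared memory in the paper, CPU cache here) while they are reused.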

Development of a CNC Machine using a Parallel Mechanism (병렬기구 공작기계의 프로그램 개발)

  • 박근우
    • Proceedings of the Korean Society of Machine Tool Engineers Conference
    • /
    • 2000.04a
    • /
    • pp.679-684
    • /
    • 2000
  • This paper presents the development of a system and program for a parallel-type CNC machine. The system consists of a parallel manipulator, a PC (Personal Computer), a DMC (DSP Motion Controller), and machining tools. The control program, implemented in C/C++, involves inverse/direct kinematics, velocity mapping, the Jacobian, and so on. The controller computes the kinematic formulation in real time and generates motion through the DMC. A monitor, which has access to program and sensory information, displays the status of the manipulator. (A generic inverse-kinematics sketch for a parallel mechanism follows this entry.)

  • PDF
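
As a generic illustration of the inverse kinematics mentioned above (the paper's actual mechanism and geometry are not given in the abstract), the sketch below computes strut lengths for a Stewart-Gough-type parallel manipulator from a platform pose using l_i = |p + R*b_i - a_i|, where a_i and b_i are base and platform attachment points. All geometry values are placeholders.

```c
#include <math.h>
#include <stdio.h>

/* R = Rz(rz) * Ry(ry) * Rx(rx): yaw-pitch-roll rotation matrix. */
static void rotation_zyx(double rx, double ry, double rz, double R[3][3]) {
    double cx = cos(rx), sx = sin(rx), cy = cos(ry), sy = sin(ry);
    double cz = cos(rz), sz = sin(rz);
    R[0][0] = cz*cy; R[0][1] = cz*sy*sx - sz*cx; R[0][2] = cz*sy*cx + sz*sx;
    R[1][0] = sz*cy; R[1][1] = sz*sy*sx + cz*cx; R[1][2] = sz*sy*cx - cz*sx;
    R[2][0] = -sy;   R[2][1] = cy*sx;            R[2][2] = cy*cx;
}

/* Given base points a_i, platform points b_i, and pose (x,y,z,roll,pitch,yaw),
   compute the six strut lengths. */
static void inverse_kinematics(double base[6][3], double plat[6][3],
                               const double pose[6], double length[6]) {
    double R[3][3];
    rotation_zyx(pose[3], pose[4], pose[5], R);
    for (int i = 0; i < 6; i++) {
        double leg[3];
        for (int r = 0; r < 3; r++) {
            double rb = R[r][0]*plat[i][0] + R[r][1]*plat[i][1] + R[r][2]*plat[i][2];
            leg[r] = pose[r] + rb - base[i][r];          /* p + R*b_i - a_i */
        }
        length[i] = sqrt(leg[0]*leg[0] + leg[1]*leg[1] + leg[2]*leg[2]);
    }
}

int main(void) {
    double base[6][3], plat[6][3], len[6];
    for (int i = 0; i < 6; i++) {                        /* placeholder attachment points */
        double ang = i * 1.0471975512;                   /* 60 degrees in radians */
        base[i][0] = 0.8*cos(ang); base[i][1] = 0.8*sin(ang); base[i][2] = 0.0;
        plat[i][0] = 0.4*cos(ang); plat[i][1] = 0.4*sin(ang); plat[i][2] = 0.0;
    }
    double pose[6] = { 0.0, 0.0, 0.9, 0.02, -0.01, 0.05 };   /* x, y, z, roll, pitch, yaw */
    inverse_kinematics(base, plat, pose, len);
    for (int i = 0; i < 6; i++)
        printf("strut %d length = %.4f\n", i, len[i]);
    return 0;
}
```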

A Study on Generation of Parallel Task in High Performance Language (고성능 언어에서의 병렬 태스크 생성에 관한 연구)

  • Park, Sung-Soon;Koo, Mi-Soon
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.6
    • /
    • pp.1636-1651
    • /
    • 1997
  • In a task-parallel language like Fortran M, the programmer writes a task-parallel program using the parallel constructs provided. When data dependencies exist between called procedures, as in many applications, it is difficult for the programmer to write the program according to those dependencies. It is therefore desirable for the compiler to detect implicit parallelism and transform a program into parallelized form using task-parallel constructs such as the PROCESSES block or PROCESSDO loop of Fortran M. However, current task-parallel language compilers do not provide this capability. In this paper, we analyze the cases according to their dependence relations and detect implicit parallelism that can be transformed into task-parallel constructs such as the PROCESSES block and PROCESSDO loop of Fortran M. Also, for cases in which a program can be parallelized with either a PROCESSES block or a PROCESSDO loop, we analyze which construct is more effective under various conditions.

  • PDF