• Title/Summary/Keyword: parallel computers

RECENT ADVANCES IN DOMAIN DECOMPOSITION METHODS FOR TOTAL VARIATION MINIMIZATION

  • LEE, CHANG-OCK;PARK, JONGHO
    • Journal of the Korean Society for Industrial and Applied Mathematics
    • v.24 no.2
    • pp.161-197
    • 2020
  • Total variation minimization is a standard tool in mathematical imaging and has been studied extensively over the last decades. To process large-scale images in real time, it is essential to design parallel algorithms that use distributed-memory computers efficiently. The aim of this paper is to illustrate recent advances in domain decomposition methods for total variation minimization as parallel algorithms. Domain decomposition methods are well suited to parallel computation because they solve a large-scale problem by dividing it into smaller problems that are treated in parallel, and they are already widely used in structural mechanics. Unlike problems arising in structural mechanics, the energy functionals of total variation minimization problems are in general nonlinear, nonsmooth, and nonseparable. Hence, designing efficient domain decomposition methods for total variation minimization is quite challenging. We describe various existing approaches to domain decomposition methods for total variation minimization in a unified view. We address how the direction of research on the subject has changed over the past few years and suggest several interesting topics for further research.
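
To see what "nonseparable" means here, the following is a minimal sketch (not taken from the paper) that uses the 1D anisotropic discrete total variation, i.e. the sum of absolute differences between neighboring samples. The function names and the sample signal are illustrative assumptions. Splitting the signal into two subdomains leaves a coupling term at the interface, so the two subproblems cannot simply be minimized independently.

```python
def total_variation(u):
    """Discrete 1D total variation: sum of |u[i+1] - u[i]|."""
    return sum(abs(b - a) for a, b in zip(u, u[1:]))


def split_energy(u, cut):
    """Return (TV of left part, TV of right part, interface coupling term).

    Assumes 0 < cut < len(u) so that both subdomains are non-empty.
    """
    left, right = u[:cut], u[cut:]
    coupling = abs(right[0] - left[-1])  # jump across the subdomain boundary
    return total_variation(left), total_variation(right), coupling


signal = [0, 0, 1, 4, 4, 2]  # hypothetical 1D "image"
tv_left, tv_right, coupling = split_energy(signal, cut=3)
print(total_variation(signal), tv_left, tv_right, coupling)
```

The total energy equals the two subdomain energies plus the coupling term, and that last term depends on unknowns from both subdomains.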

Realtime Air Diffusion Prediction System

  • Kim Youngtae;Kim Tae Kook;Oh Jai-Ho
    • Korean Society for Computational Fluids Engineering: Conference Proceedings
    • 2003.10a
    • pp.88-90
    • 2003
  • We implement the Realtime Air Diffusion Prediction System, which is designed for air diffusion simulations with four-dimensional data assimilation. To run in real time, we parallelize the system using MPI (Message Passing Interface) on distributed-memory parallel computers and build a cluster computer that links high-performance PCs with high-speed interconnection networks. We use 162­CPU nodes and a Myrinet network for the cluster.

An Implementation of a Home Automation Server Based on Linux (리눅스를 기반으로 한 홈오토메이션 서버의 구현)

  • Sung, Han-Yong;Kim, Kyu-Chil;Bang, Chul-Won;Kim, Yong-Seok
    • Journal of Industrial Technology
    • v.22 no.B
    • pp.141-146
    • 2002
  • It has become common to use computers to control electronic devices and security facilities in newly constructed buildings and houses. Many home devices on the market can be controlled by computers, but they are expensive and are managed by specialized companies. This paper focuses on personal computers, which are available in most homes and can be used to control home electronic appliances and home security facilities. We implemented a home automation server based on Linux. The standard parallel port of a personal computer is used to connect sensors and actuators, so the cost of the server is very low. Moreover, the server is connected to the Internet, so the home security facilities and home automation systems can be controlled and monitored from anywhere.

Design of Format Converter for Pixel-Parallel Image Processing (화소-병렬 영상처리를 위한 포맷 변환기 설계)

  • 김현기;이천희
    • Journal of the Korea Society for Simulation
    • v.10 no.3
    • pp.59-70
    • 2001
  • Typical low-level image processing tasks require thousands of operations per pixel for each input image. Traditional general-purpose computers are not capable of performing such tasks in real time. Yet important features of traditional computers are not exploited by low-level image processing tasks. Since storage requirements are limited to a small number of low-precision integer values per pixel, large hierarchical memory systems are not necessary. The mismatch between the demands of low-level image processing tasks and the characteristics of conventional computers motivates investigation of alternative architectures. The structure of the tasks suggests employing an array of processing elements, one per pixel, sharing instructions issued by a single controller. In this paper, we implement various image processing filters using the format converter. We also extend conventional grayscale image processing to color image processing. The design method is based on realizing a large processor-per-pixel array with integrated circuit technology. The format converter design implements the control path efficiently and can take advantage of advanced technology without complicated controller hardware.
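
The processor-per-pixel idea can be sketched in software as follows. This is a sequential simulation under stated assumptions, not the authors' hardware design: the names `broadcast` and `smooth`, the 4-neighborhood access, and the zero boundary rule are all hypothetical. A single controller issues one instruction, and every processing element applies it to its own pixel using only small integer values.

```python
def broadcast(image, instruction):
    """Controller step: every processing element (one per pixel) executes
    the same instruction on its own pixel and its 4-neighborhood."""
    h, w = len(image), len(image[0])

    def pixel(r, c):
        # Pixels outside the array read as 0 (an assumed boundary rule).
        return image[r][c] if 0 <= r < h and 0 <= c < w else 0

    # All elements read the old image and write a new one, which mimics
    # lockstep execution even though this loop runs sequentially.
    return [
        [instruction(pixel(r, c), pixel(r - 1, c), pixel(r + 1, c),
                     pixel(r, c - 1), pixel(r, c + 1))
         for c in range(w)]
        for r in range(h)
    ]


def smooth(center, north, south, west, east):
    """Low-precision integer smoothing filter over the 4-neighborhood."""
    return (4 * center + north + south + west + east) // 8


image = [[0, 0, 0],
         [0, 8, 0],
         [0, 0, 0]]
result = broadcast(image, smooth)
print(result)
```

Swapping `smooth` for another per-pixel function changes the filter without touching the controller loop, which is the point of sharing one instruction stream across the array.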

Parallel O.C. Algorithm for Optimal design of Plane Frame Structures (평면골조의 최적설계를 위한 병렬 O.C. 알고리즘)

  • 김철용;박효선;박성무
    • Proceedings of the Computational Structural Engineering Institute Conference
    • 2000.04b
    • pp.466-473
    • 2000
  • The Optimality Criteria (O.C.) algorithm, based on the derivation of reciprocal approximations, has been applied to structural optimization of large-scale structures. However, the computational cost of serial analysis of large-scale structures with a large number of degrees of freedom and members is too high for it to be adopted in the solution process of the O.C. algorithm. Thus, a parallel version of the O.C. algorithm on a network of personal computers is presented in this paper. Parallelism in the O.C. algorithm may be classified into two parts: the analysis part and the optimizer part. As the first step in developing the parallel algorithm, a parallel structural analysis algorithm is developed and used in the O.C. algorithm. The algorithm is applied to the optimal design of a 54-story plane frame structure.

A Study on the Efficient m-step Parallel Generalization

  • Kim, Sun-Kyung
    • Proceedings of the Korea Society of Information Technology Applications Conference
    • 2005.11a
    • pp.13-16
    • 2005
  • It is desirable to have methods for specific problems whose communication costs are low compared with their computation costs, and in specific applications, algorithms need to be developed and mapped onto parallel computer architectures. Main memory access in shared-memory systems or global communication in message-passing systems degrades computation speed. In this paper, we find that the m-step generalization of the block Lanczos method enhances parallel properties by forming m simultaneous search direction vector blocks. QR factorization, which slows execution on parallel computers, is not necessary in the m-step block Lanczos method. The m-step method minimizes synchronization points, which in turn minimizes global communication compared with the standard methods.

Development of Parallel Eigenvalue Solution Algorithm with Substructuring Techniques (부구조기법을 이용한 병렬 고유치해석 알고리즘 개발)

  • 김재홍;성창원;박효선
    • Proceedings of the Computational Structural Engineering Institute Conference
    • 1999.10a
    • pp.411-420
    • 1999
  • A computational model and a new eigenvalue solution algorithm for large-scale structures are presented in the form of parallel computation. The computational load and data storage required during the solution process are drastically reduced by distributing the computational load evenly across processors. As the parallel computational model, multiple personal computers are connected by 10 Mbit/s Ethernet cards. In this study, substructuring techniques and the static condensation method are adopted for modeling a large-scale structure. To reduce the size of the eigenvalue problem, the interface degrees of freedom and one lateral degree of freedom are selected as the master degrees of freedom in each substructure. The performance of the proposed parallel algorithm is demonstrated by applying it to the dynamic analysis of two-dimensional structures.
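
Static condensation of a stiffness matrix onto master degrees of freedom can be sketched as below. This is a generic textbook form, K_mm - K_ms K_ss^{-1} K_sm, and not the authors' implementation; the helper names, the small dense solver, and the two-spring example are assumptions made for illustration. Only the stiffness matrix is condensed here, while the eigenvalue problem in the abstract would also involve a mass matrix.

```python
def solve(a, b):
    """Solve a @ x = b (b is a matrix) by Gauss-Jordan elimination with
    partial pivoting. Assumes a is square and nonsingular."""
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]  # augmented [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def condense(k, masters):
    """Reduce stiffness matrix k to the master DOFs: K_mm - K_ms K_ss^{-1} K_sm."""
    slaves = [i for i in range(len(k)) if i not in masters]

    def sub(rows, cols):
        return [[k[i][j] for j in cols] for i in rows]

    k_mm, k_ms = sub(masters, masters), sub(masters, slaves)
    k_sm, k_ss = sub(slaves, masters), sub(slaves, slaves)
    x = solve(k_ss, k_sm)  # K_ss^{-1} K_sm
    return [[k_mm[i][j] - sum(k_ms[i][s] * x[s][j] for s in range(len(slaves)))
             for j in range(len(masters))]
            for i in range(len(masters))]


# Hypothetical example: two springs in series (stiffness ka, kb), middle node condensed out.
ka, kb = 2.0, 6.0
k = [[ka, -ka, 0.0], [-ka, ka + kb, -kb], [0.0, -kb, kb]]
k_reduced = condense(k, masters=[0, 2])
print(k_reduced)
```

For this example the reduced matrix is that of a single spring with the series stiffness ka*kb/(ka+kb), which is a convenient sanity check.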

Research for Efficient Massive File I/O on Parallel Programs (병렬 프로그램에서의 효율적인 대용량 파일 입출력 방식의 비교 연구)

  • Hwang, Gyuhyeon;Kim, Youngtae
    • Journal of Internet Computing and Services
    • v.18 no.2
    • pp.53-60
    • 2017
  • Because processors handle input and output independently on distributed-memory computers, different file input/output methods are used. In this paper, we implement and compare various file I/O methods to show their efficiency on distributed-memory parallel computers. The implemented I/O methods are as follows: (i) parallel I/O using NFS, (ii) sequential I/O on the host processor with domain decomposition, and (iii) MPI-IO. For performance analysis, we used a separate file server and multiple processors on one or two computational servers. The results show that file I/O with NFS is the most efficient for input and that sequential output with domain decomposition is the most efficient for output. The MPI-IO result unexpectedly shows the lowest performance.
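
Method (ii) can be sketched as follows: the host reads the data sequentially and then splits it into contiguous blocks, one per process. This is a minimal illustration under assumptions, not the authors' code; the function names and the 1D block layout are hypothetical, and the actual transfer to other processes (for example with MPI_Scatterv) is left out so the sketch runs without an MPI installation.

```python
def block_range(n_items, n_procs, rank):
    """Half-open index range [start, stop) owned by `rank` in a 1D block
    decomposition; the first `n_items % n_procs` ranks get one extra item."""
    base, extra = divmod(n_items, n_procs)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def scatter_from_host(data, n_procs):
    """Host-side step: split sequentially read data into per-process blocks."""
    return [data[slice(*block_range(len(data), n_procs, rank))]
            for rank in range(n_procs)]


data = list(range(10))  # stands in for values read from a file by the host
blocks = scatter_from_host(data, n_procs=3)
print(blocks)
```

For output, the same index ranges tell the host where each gathered block belongs before it writes the file sequentially.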

Implementation and Performance Evaluation of Socket and RMI based Java Message Passing Systems (소켓 및 RMI 기반 자바 메시지 전달 시스템의 구현 및 성능평가)

  • Bang, Seung-Jun;Ahn, Jin-Ho
    • Journal of Internet Computing and Services
    • v.8 no.5
    • pp.11-20
    • 2007
  • This paper designs and implements a message passing library called JMPI (Java Message Passing Interface), which complies with MPJ (Message Passing in Java), the MPI standard specification for the Java language. The library provides graphical user interface tools that let administrators configure parallel computing environments very simply and let JMPI applications be executed conveniently. We also implement two versions of the system, using sockets and RMI, both typical communication mechanisms for distributed systems, and, with three benchmark applications, compare their performance with that of an existing system, JPVM, as the number of computers increases. Experimental results show that our systems outperform JPVM in various respects and that the most efficient speedup is obtained by increasing the number of computers while taking network traffic into account. Finally, as the number of computers increases, using RMI to transmit a message is more effective than using object streams attached to sockets.
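
The socket-based variant sends serialized objects over a stream. A rough analogue is sketched below in Python rather than Java, so it is not JMPI's API: `send_msg`, `recv_msg`, the 4-byte length prefix, and the use of `pickle` in place of Java object streams are all assumptions made for illustration.

```python
import pickle
import socket
import struct


def send_msg(sock, obj):
    """Serialize obj and send it with a 4-byte big-endian length prefix."""
    payload = pickle.dumps(obj)
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, since recv may return fewer than requested."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("socket closed before message was complete")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock):
    """Receive one length-prefixed message and deserialize it."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return pickle.loads(recv_exact(sock, length))


# A connected socket pair stands in for two processes on different computers.
a, b = socket.socketpair()
send_msg(a, {"tag": 0, "data": [1, 2, 3]})
received = recv_msg(b)
a.close()
b.close()
print(received)
```

The length prefix is needed because a stream socket has no message boundaries. `pickle` should only be used between trusted peers, since unpickling untrusted data can execute arbitrary code.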
