• Title/Summary/Keyword: message-passing


A NOVEL PARALLEL METHOD FOR SPECKLE MASKING RECONSTRUCTION USING THE OPENMP

  • LI, XUEBAO;ZHENG, YANFANG
    • Journal of The Korean Astronomical Society
    • /
    • v.49 no.4
    • /
    • pp.157-162
    • /
    • 2016
  • High-resolution reconstruction techniques, such as speckle masking, have been developed to enhance the spatial resolution of observational images from ground-based solar telescopes. Near real-time reconstruction performance has been achieved on a high-performance cluster using the Message Passing Interface (MPI). However, much of the time in such a speckle reconstruction is spent reconstructing solar subimages. We design and implement a novel parallel method for speckle masking reconstruction of a solar subimage on a shared-memory machine using OpenMP. Tests are performed to verify the correctness of our code. We present the details of several parallel reconstruction steps. The parallel implementation of the various modules is substantially faster than the single-threaded serial implementation, with a speedup of about 2.5 for one subimage reconstruction. Timing results for reconstructing one 256×256-pixel subimage show a clear advantage as the number of threads increases. This parallel method can be valuable for real-time reconstruction of solar images, especially after porting to a high-performance cluster.

PERFORMANCE ANALYSIS OF THE PARALLEL CUPID CODE IN DISTRIBUTED MEMORY SYSTEM BASED ETHERNET AND INFINIBAND NETWORK (이더넷과 인피니밴드 네트워크 기반의 분산 메모리 시스템에서 병렬성능 분석)

  • Jeon, B.J.;Choi, H.G.
    • Journal of computational fluids engineering
    • /
    • v.19 no.2
    • /
    • pp.24-29
    • /
    • 2014
  • In this study, the parallel performance of the CUPID code is investigated on both Ethernet and InfiniBand network systems to examine the effects of cache memory and network speed. The bi-conjugate gradient solver of the CUPID code is parallelized using the domain decomposition method and the Message Passing Interface (MPI). The parallel performance of the Ethernet system is shown to be worse than that of the InfiniBand system because of its slower network and smaller cache memory. The parallel performance of each system deteriorates for a small problem because of communication overhead, but the InfiniBand system still outperforms the Ethernet system thanks to its much faster network. For a large problem, the parallel performance depends less on the network system.
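
The domain decomposition idea in the abstract above can be illustrated with a minimal sketch. It is not the CUPID solver: it only shows the matrix-vector product that Krylov solvers such as bi-conjugate gradient repeat at every iteration, for an assumed 1-D Laplacian stencil, with the subdomains simulated as list slices in a single process instead of MPI ranks. The function names are hypothetical.

```python
# Sketch: 1-D domain decomposition with halo exchange, simulated in one process.
# Assumption: each "rank" is a list slice; a real code would exchange halos via MPI.

def serial_matvec(x):
    """y = A x for the 1-D Laplacian stencil (-1, 2, -1) with zero boundaries."""
    n = len(x)
    y = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y.append(2.0 * x[i] - left - right)
    return y


def split(x, nparts):
    """Split x into nparts contiguous, non-empty chunks of near-equal size."""
    n = len(x)
    if not 1 <= nparts <= n:
        raise ValueError("need 1 <= nparts <= len(x)")
    bounds = [n * p // nparts for p in range(nparts + 1)]
    return [x[bounds[p]:bounds[p + 1]] for p in range(nparts)]


def decomposed_matvec(x, nparts):
    """Same product, computed subdomain by subdomain."""
    chunks = split(x, nparts)
    result = []
    for p, local in enumerate(chunks):
        # Halo exchange: one value from each neighbouring subdomain.
        left_halo = chunks[p - 1][-1] if p > 0 else 0.0
        right_halo = chunks[p + 1][0] if p < nparts - 1 else 0.0
        padded = [left_halo] + local + [right_halo]
        # Local stencil application on the owned points only.
        result.extend(2.0 * padded[i] - padded[i - 1] - padded[i + 1]
                      for i in range(1, len(padded) - 1))
    return result
```

In this sketch every subdomain receives at most two halo values per product regardless of its size, so the ratio of communication to computation grows as the problem shrinks, which is consistent with the overhead the abstract reports for small problems.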

A framework for parallel processing in multiblock flow computations (다중블록 유동해석에서 병렬처리를 위한 시스템의 구조)

  • Park, Sang-Geun;Lee, Geon-U
    • Transactions of the Korean Society of Mechanical Engineers B
    • /
    • v.21 no.8
    • /
    • pp.1024-1033
    • /
    • 1997
  • The past several years have witnessed ever-increasing acceptance and adoption of parallel processing, both for high-performance scientific computing and for more general-purpose applications. Furthermore, with the increasing need to perform complex flow calculations efficiently, the message-passing model on distributed networks has emerged as an important alternative to expensive supercomputers. This work attempts to provide a generic framework that enables the parallelization of all CFD-related work using the master-slave model. The framework consists of (1) input geometry, (2) domain decomposition, (3) grid generation, (4) flow computations, (5) flow visualization, and (6) output display as sequential components, but performs the computations for (2) to (5) in parallel on a workstation cluster. The flow computations are parallelized by running multiple copies of the flow code, each solving a PDE on a different spatial region on a different processor, while flow data are exchanged across the region boundaries and the solution is time-stepped. The Parallel Virtual Machine (PVM) is used for distributed communication in this work.

Task Creation and Assignment based on Object Caching for Parallel Spatial Join (병렬공간 조인을 위한 객체 캐쉬 기반 태스크 생성 및 할당)

  • 서영덕;김진덕;홍봉희
    • Journal of KIISE:Software and Applications
    • /
    • v.26 no.10
    • /
    • pp.1178-1178
    • /
    • 1999
  • The execution time of a spatial join increases exponentially with the number of spatial objects. Recently, there have been many attempts to improve the performance of spatial joins using parallel processing schemes. When a parallel spatial join is executed on a parallel machine with a shared-disk architecture, the disk bottleneck becomes worse than in a sequential spatial join. This paper presents task creation and assignment algorithms that reduce the disk bottleneck caused by simultaneous access to the shared disk and minimize message passing between processors. It proposes object caching, which is a higher level of abstraction than page caching, and uses it to create and assign tasks according to temporal and spatial locality so as to minimize disk access time. Object caching improves performance by 50%. Task creation and assignment using locality yield gains of 30% and 20%. In the overall performance evaluation, the proposed algorithms achieve a 7.2-fold speedup over sequential execution of spatial joins.

Design of Fault-tolerant Mutual Exclusion Protocol in Asynchronous Distributed Systems (비동기적 분산 시스템에서 결함허용 상호 배제 프로토콜의 설계)

  • Park, Sung-Hoon
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.11 no.1
    • /
    • pp.182-189
    • /
    • 2010
  • This paper defines the quorum-based fault-tolerant mutual exclusion problem in a message-passing asynchronous system and determines a failure detector that solves the problem. This failure detector, which we call the modal failure detector star and denote by $M^*$, is strictly weaker than the perfect failure detector P but strictly stronger than the eventually perfect failure detector ◇P. The paper shows that the problem is solvable with $M^*$ in any environment.

Developing a Bioinformatics Tool for Peptide Nucleic Acid (PNA) antisense Technique Utilizing Parallel Computing System (Peptide Nucleic Acid(PNA)를 이용한 antisense 기법에 적용할 병렬 컴퓨팅용 Bioinformatics tool 개발)

  • Kim Seong-Jo;Jeon Ho-Sang;Hong Seung-Pyo;Kim Hyon-Chang;Kim Han-Jip;Min Churl-K
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2006.06a
    • /
    • pp.43-45
    • /
    • 2006
  • Unlike RNA interference, whose use is limited to eukaryotic cells, the Peptide Nucleic Acid (PNA) technique is applicable to both eukaryotic and prokaryotic cells. PNA has been proven to be an effective agent for blocking gene expression and has several advantages over other antisense techniques. Here we developed parallel computing software that provides ideal sequences for designing PNA oligos that avoid off-target effects. Our location-finding algorithm applies a new approach to find a target gene in the whole genome sequence. The Message Passing Interface (MPI) was used for parallel computing to reduce the calculation time. The software will help biologists design more accurate and effective antisense PNA by minimizing the chance of off-target effects.


Factor Graph-based Multipath-assisted Indoor Passive Localization with Inaccurate Receiver

  • Hao, Ganlin;Wu, Nan;Xiong, Yifeng;Wang, Hua;Kuang, Jingming
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.10 no.2
    • /
    • pp.703-722
    • /
    • 2016
  • Passive wireless devices have increasing civilian and military applications, especially in scenarios involving wearable devices and the Internet of Things. In this paper, we study indoor localization of a target equipped with a radio-frequency identification (RFID) device in ultra-wideband (UWB) wireless networks. With a known room layout, deterministic multipath components, including the line-of-sight (LOS) signal and the signals reflected via multipath propagation, are employed to locate the target with one transmitter and a single inaccurate receiver. A factor graph corresponding to the joint posterior position distribution of the target and receiver is constructed. However, because of the mixed distribution in the factor node of the likelihood function, the messages have no tractable expressions when belief propagation is applied directly on the factor graph. To address this, we approximate the messages by Gaussian distributions by minimizing the Kullback-Leibler divergence (KLD) between them. Accordingly, a parametric message passing algorithm for indoor passive localization is derived, in which only the means and variances of Gaussian distributions have to be updated. The performance of the proposed algorithm and the impact of critical parameters are evaluated by Monte Carlo simulations, which demonstrate superior localization accuracy and robustness to the statistics of multipath channels.
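
As a rough illustration of why only means and variances need updating, the sketch below collapses a one-dimensional Gaussian mixture to a single Gaussian and multiplies two Gaussian messages. It assumes the divergence is KL(p‖q) with p the mixture and q the Gaussian approximation, in which case the minimizer is obtained by matching the first two moments; the abstract does not state the direction, and the function names are hypothetical rather than taken from the paper.

```python
# Sketch: Gaussian approximation of a 1-D mixture message by moment matching.
# Assumption: minimizing KL(p || q) over Gaussians q, which matches mean and variance.

def moment_match(weights, means, variances):
    """Return (mean, variance) of the Gaussian closest to the mixture in KL(p || q)."""
    total = sum(weights)
    w = [wk / total for wk in weights]  # normalise the mixture weights
    mean = sum(wk * mk for wk, mk in zip(w, means))
    second = sum(wk * (vk + mk * mk) for wk, mk, vk in zip(w, means, variances))
    return mean, second - mean * mean


def gaussian_product(m1, v1, m2, v2):
    """Combine two Gaussian messages: their product is Gaussian up to a scale factor."""
    precision = 1.0 / v1 + 1.0 / v2     # precisions add
    var = 1.0 / precision
    mean = var * (m1 / v1 + m2 / v2)    # precision-weighted mean
    return mean, var
```

For example, an equal-weight mixture of N(-1, 1) and N(1, 1) collapses to mean 0 and variance 2, and the product of N(0, 1) and N(2, 1) has mean 1 and variance 0.5.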

Efficient Parallel CUDA Random Number Generator on NVIDIA GPUs (NVIDIA GPU 상에서의 난수 생성을 위한 CUDA 병렬프로그램)

  • Kim, Youngtae;Hwang, Gyuhyeon
    • Journal of KIISE
    • /
    • v.42 no.12
    • /
    • pp.1467-1473
    • /
    • 2015
  • In this paper, we implement a parallel random number generation program on GPUs, which are known for high-performance computing, using an LCG (Linear Congruential Generator). Random numbers are important in every field that relies on randomness, and the LCG is one of the most widely used methods for generating pseudo-random numbers. We describe the parallel program, which uses the NVIDIA CUDA model and MPI (Message Passing Interface), and present its uniform distribution and performance results. We also use a Monte Carlo algorithm to calculate $\pi$, comparing our parallel random number generator with cuRAND, a CUDA library, and show that our program is much more efficient. Finally, we compare the multi-GPU performance results with ideal speedups.
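
One common way to parallelize an LCG is the leapfrog scheme, sketched below in plain Python. The abstract does not say which scheme the authors use, so this is an assumption for illustration only, as are the function names; the constants are the widely published Numerical Recipes LCG parameters, not values from the paper. Each stream jumps ahead by the number of streams using a precomputed affine map, so the streams interleave into exactly the serial sequence.

```python
# Sketch: leapfrog parallelisation of an LCG, x -> (A*x + C) mod M.
# Assumption: Numerical Recipes constants; the paper's parameters are not given.
A, C, M = 1664525, 1013904223, 2**32


def lcg_step(x):
    return (A * x + C) % M


def serial_sequence(seed, count):
    """First `count` outputs of the serial generator."""
    out, x = [], seed
    for _ in range(count):
        x = lcg_step(x)
        out.append(x)
    return out


def skip_map(k):
    """Return (a_k, c_k) so that x -> a_k*x + c_k (mod M) equals k LCG steps."""
    a_k, c_k = 1, 0   # identity map (zero steps)
    a, c = A, C       # map for 2^j steps, squared on each loop pass
    while k > 0:
        if k & 1:
            a_k, c_k = (a * a_k) % M, (a * c_k + c) % M
        a, c = (a * a) % M, (a * c + c) % M
        k >>= 1
    return a_k, c_k


def leapfrog_stream(seed, rank, nstreams, count):
    """Outputs rank, rank+P, rank+2P, ... of the serial sequence (P = nstreams)."""
    x = seed
    for _ in range(rank + 1):
        x = lcg_step(x)
    a_p, c_p = skip_map(nstreams)
    out = []
    for _ in range(count):
        out.append(x)
        x = (a_p * x + c_p) % M
    return out


def estimate_pi(values):
    """Monte Carlo estimate of pi from consecutive pairs scaled to [0, 1)."""
    pairs = list(zip(values[0::2], values[1::2]))
    if not pairs:
        raise ValueError("need at least two values")
    inside = sum(1 for x, y in pairs if (x / M) ** 2 + (y / M) ** 2 <= 1.0)
    return 4.0 * inside / len(pairs)
```

On a GPU, each thread would play the role of one `rank`; the statistical quality of the leapfrogged streams depends on the LCG parameters and stride, and is not assessed here.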

Design and Fabrication FM-VMS using Watermarking Method (워터마킹 기법을 이용한 FM-VMS 설계 및 구현)

  • Moon, Byeong-Sup;Park, Bum-Jin;Weon, Young-Su;Kim, Cheol-Seong
    • The Journal of the Korea Contents Association
    • /
    • v.10 no.12
    • /
    • pp.43-50
    • /
    • 2010
  • In this paper, we designed and built a mobile unit system that uses an FM frequency to deliver the traffic information provided to the VMS to motorists in real time, and we evaluated its quality. The results confirm that audio and text information converted from traffic information can be linked with VMS information, and that the FM-broadcast traffic information reaches motorists passing through the area. This study is expected to raise the effectiveness of VMS for its users and to offer an opportunity to build low-cost transport infrastructure using VMS.

Automatic real-time system of the global 3-D MHD model: Description and initial tests

  • Park, Geun-Seok;Choi, Seong-Hwan;Cho, Il-Hyun;Baek, Ji-Hye;Park, Kyung-Sun;Cho, Kyung-Suk;Choe, Gwang-Son
    • Bulletin of the Korean Space Science Society
    • /
    • 2009.10a
    • /
    • pp.26.2-26.2
    • /
    • 2009
  • The Solar and Space Weather Research Group (SOS) at the Korea Astronomy and Space Science Institute (KASI) has been constructing the Space Weather Prediction Center since 2007. As part of the project, we are developing an automatic real-time system for global 3-D magnetohydrodynamics (MHD) simulation. The MHD simulation model of Earth's magnetosphere is based on the modified leap-frog scheme by T. Ogino and was parallelized using the Message Passing Interface (MPI). Our work focuses on automating the simulation of the 3-D MHD model and the visualization of the simulation results. We used a PC cluster for computation and the Virtual Reality Modeling Language (VRML) file format to visualize the MHD simulation. The system can show the variation of Earth's magnetosphere caused by the solar wind in quasi real time. For data assimilation we used four parameters from ACE data: the density, pressure, and velocity of the solar wind, and the z component of the interplanetary magnetic field (IMF). In this paper, we performed some initial tests and made an animation. The automatic real-time system will be a valuable tool for understanding the configuration of the solar-terrestrial environment in space weather research.
