• Title/Summary/Keyword: Speedup


Design and Implementation of the DEVS-based Distributed Simulation Environment: D-DEVSim++

  • 김기형
    • Journal of the Korea Society for Simulation / v.5 no.2 / pp.41-58 / 1996
  • The Discrete Event Systems Specification (DEVS) formalism specifies a discrete event system in a hierarchical, modular form. This paper presents D-DEVSim++, a distributed simulation environment for models specified by the DEVS formalism. D-DEVSim++ employs a new simulation scheme that is a hybrid of the hierarchical simulation and Time Warp mechanisms. The scheme can exploit both the hierarchical scheduling parallelism and the inherent parallelism of DEVS models. The hierarchical scheduling parallelism is investigated through analysis. The performance of the proposed methodology is evaluated through benchmark simulation on a 5-dimensional hypercube parallel machine. The results indicate that the methodology can achieve significant speedup. It is also shown that the analyzed speedup for the hierarchical scheduling time corresponds to the experimental results.


An Implementation of High-Speed Parallel Processing System for Neural Network Design by Using the Multicomputer Network

  • 김진호;최흥문
    • Journal of the Korean Institute of Telematics and Electronics B / v.30B no.5 / pp.120-128 / 1993
  • In this paper, an implementation of a high-speed parallel processing system for neural network design on a multicomputer network is presented. Linear speedup expandability is increased by reducing the synchronization penalty and the communication overhead. We also present parallel processing models and their performance evaluation models for each of the parallelization methods of the neural network. The results of character recognition experiments with a neural network based on the proposed system show that the proposed approach has higher linear speedup expandability than other systems. The proposed parallel processing models and performance evaluation models could be used effectively for the design and performance estimation of neural networks on a multicomputer network.


AN EFFICIENT CODING METHODS FOR THE TWO COMPOSITION TYPES OF THE KOREAN ALPHABET ON A MASPAR MACHINE

  • Min, Yong-Sik
    • Journal of applied mathematics & informatics / v.5 no.1 / pp.191-200 / 1998
  • There are two types of composition systems for the Korean alphabet: a combined system and a composite system. This paper describes an efficient coding method for both types. Using this coding method with the combined system yields about 10.5% code-length savings per Korean character, while it yields about 45% savings with the composite system. In other words, the coding method produces a better result (i.e., 34.5 percentage points better) with the composite system than with the combined system. The simulation was performed on a MasPar machine with 64 processors. The results show that the combined system achieved a 45.851-fold speedup while the composite system achieved a 47.274-fold speedup.
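
As a rough way to read the reported figures, parallel efficiency can be computed as speedup divided by processor count. The sketch below assumes that standard definition; the function name `parallel_efficiency` is illustrative and does not come from the paper.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Return speedup per processor (1.0 would be ideal linear speedup)."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# Figures reported in the abstract: 64 processors on a MasPar machine.
combined = parallel_efficiency(45.851, 64)
composite = parallel_efficiency(47.274, 64)

print(f"combined system:  {combined:.3f}")
print(f"composite system: {composite:.3f}")
```

Under this definition, the two reported speedups correspond to efficiencies of roughly 72% and 74%.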

Time Complexity Measurement on CUDA-based GPU Parallel Architecture of Morphology Operation

  • Izmantoko, Yonny S.;Choi, Heung-Kook
    • Journal of Korea Multimedia Society / v.16 no.4 / pp.444-452 / 2013
  • The operation time of a function or procedure always needs to be optimized. Parallelizing the operation is the general method of reducing it, and one of the most powerful ways to parallelize is to use a GPU. In the image processing field, one of the most commonly used operations is the morphology operation. Three types of morphology operation kernels (naïve, global, and shared) are presented in this paper. All kernels are written in CUDA and run in parallel on the GPU. Four morphology operations (erosion, dilation, opening, and closing) using a square structuring element are tested on MRI images of different sizes to measure the speedup of the GPU implementation over the CPU implementation. The results show that the speedup of dilation is similar for all kernels. However, for erosion, opening, and closing, the shared kernel is faster than the other kernels.
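
For orientation, the four operations named above can be sketched as a plain CPU reference in Python. This is a minimal illustration and not the paper's CUDA kernels: the helper names, the `radius` parameter (a square element of side `2 * radius + 1`), and the choice to clamp the window at image borders are all assumptions.

```python
from typing import Callable, List

Image = List[List[int]]


def _morph(img: Image, radius: int, reduce_fn: Callable) -> Image:
    """Apply reduce_fn over a square window centred on every pixel."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Assumption: the window is clamped to the image bounds.
            window = [
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            out[y][x] = reduce_fn(window)
    return out


def erode(img: Image, radius: int = 1) -> Image:
    return _morph(img, radius, min)  # erosion: minimum over the window


def dilate(img: Image, radius: int = 1) -> Image:
    return _morph(img, radius, max)  # dilation: maximum over the window


def opening(img: Image, radius: int = 1) -> Image:
    return dilate(erode(img, radius), radius)  # erosion, then dilation


def closing(img: Image, radius: int = 1) -> Image:
    return erode(dilate(img, radius), radius)  # dilation, then erosion
```

Each output pixel depends only on a small neighbourhood of the input, which is why these operations map naturally onto one GPU thread per pixel.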

PHDCM : Efficient Compression of Hangul Text in Parallel

  • Min, Yong-Sik
    • The Journal of the Acoustical Society of Korea / v.14 no.2E / pp.50-56 / 1995
  • This paper describes an efficient coding method for Korean characters using a three-state transition graph. To our knowledge, this is the first achievement of its kind. This new method, called the Parallel Hangul Dynamic Coding Method (PHDCM), compresses about 3.5 bits per Korean character, which is more than 1 bit shorter than the conventional codes introduced thus far to achieve extensive code compression. When we ran the method on a MasPar machine, which is an SIMD SM (EFEW-PRAM) machine, it achieved a 49.314-fold speedup with 64 processors on an input of 10 million Korean characters.


Context-free Marker-controlled Watershed Transform for Over-segmentation Reduction

  • Seo, Kyung-Seok;Cho, Sang-Hyun;Park, Chang-Joon;Park, Heung-Moon
    • Proceedings of the IEEK Conference / 2000.07a / pp.482-485 / 2000
  • A modified watershed transform is proposed that is context-free marker-controlled and minima imposition-free, in order to reduce over-segmentation and to speed up the transform. In contrast to conventional methods, in which a priori knowledge such as flat zones, zones of homogeneous texture, and morphological distance is required for marker extraction, context-free marker extraction is proposed using an attention operator based on the GST (generalized symmetry transform). By using the context-free marker, the proposed watershed transform exploits marker-constrained labeling to speed up the computation and to reduce over-segmentation. It does so by eliminating unnecessary geodesic reconstruction such as minima imposition, and thereby the need for region-merging post-processing. The simulation results show that the proposed method can extract context-free markers inside objects from a complex background that includes multiple objects, and that it efficiently reduces over-segmentation and computation time.


Speedup of Sequential Program Execution on a Network of Shared Workstations

  • Cho, Sung-Hyun;Jun, Sung-Syck
    • Journal of Electrical Engineering and information Science / v.2 no.6 / pp.183-190 / 1997
  • We present competition protocols that speed up the execution of sequential programs on a network of shared workstations by exploiting their wasted computing capacity in the background, without interfering with the processes of workstation owners. To argue that competition protocols are preferable to migration protocols in this situation, we derive closed-form solutions for the speedup of competition protocols and migration protocols, and simulate both protocols under comparable overhead assumptions. Based on our analytic and simulation results, we show that competitive execution is superior to process migration, and that competitive execution can finish sequential programs significantly faster than noncompetitive execution, especially when the foreground load is sufficiently high.


IMAGE SYNTHESIS FOR DYNAMIC SCENES

  • Feng, Chen-Chin;Chang, Su-Yuan;Yang, Shi-Nine
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 1999.06a / pp.15.1-21 / 1999
  • The radiosity method is a global illumination model for image synthesis. It computes all energy interactions among diffuse elements in a virtual environment. One of its major drawbacks is its time-consuming computation. Existing radiosity algorithms for static scenes are difficult to apply to dynamic environments. In this paper we propose a hierarchical scene partition scheme to speed up the link update computations in dynamic environments. Since the proposed spatial data structure is global, it can be used not only to speed up the culling of non-affected links after a geometry change, but also to accelerate the subsequent visibility computation. Several empirical tests are given to show the efficiency of our improved algorithm.

Effect of Wind Speed up by Seawall on a Wind Turbine

  • Ha, Young-Cheol;Lee, Bong-Hee;Kim, Hyun-Goo
    • Journal of the Korean Solar Energy Society / v.33 no.3 / pp.1-8 / 2013
  • To identify the positive or negative effect of a seawall on a wind turbine, a wind tunnel experiment was conducted with a 1/100 scaled-down model of the Goonsan wind farm, which is located on the west coast along a seawall. Wind speedup due to the slope of the seawall contributed an increase of about 3% in area-averaged wind speed on the rotor plane of a wind turbine, which is anticipated to augment wind power generation. From the turbulence measurement and flow visualization, it was confirmed that there would be no negative effect due to flow separation because its influence is confined below the sweeping height of the wind turbine blades.

Quality of Coverage Analysis on Distributed Stochastic Steady-State Simulations

  • Lee, Jong-Suk-R.;Park, Hyoung-Woo;Jeong, Hae-Duck-J.
    • The KIPS Transactions:PartA / v.9A no.4 / pp.519-524 / 2002
  • In this paper we study the quality of sequential coverage analysis under a scenario of distributed stochastic simulation known as MRIP (Multiple Replications In Parallel) in terms of the confidence intervals of coverage and the speedup. The estimator based on the F-distribution was applied to the sequential coverage analysis of steady-state means in simulations of the $M/M/1/\infty$, $M/D/1/\infty$, and $M/H_{2}/1/\infty$ queueing systems on a single processor and on multiple processors. By using multiple processors under the MRIP scenario, the time for collecting the many replications needed in sequential coverage analysis is reduced. One can also easily collect more replications by running the simulation on distributed computers or clusters linked by a local area network.