• Title/Summary/Keyword: Parallel computing


Task Allocation Strategy for Distributed/Parallel Computing based on Realtime Network Monitoring

  • 정재홍;김수자;박복자;송은하;정영식
    • Proceedings of the Korean Information Science Society Conference / 2003.10c / pp.631-633 / 2003
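  • PDP (Parallel/Distributed Processing Scheme on Web), an Internet-based distributed/parallel processing framework, uses idle hosts in the network to process large jobs in parallel. This paper presents a way to cope with constantly changing network conditions by monitoring the network environment in which the resources that receive these subtasks operate. In particular, it proposes applying the predictions obtained from network monitoring to PDP's task allocation algorithm, so that allocation adapts to the causes of task delay such as network overload and faults, with the aim of improving overall job throughput.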

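The allocation idea in the abstract above can be illustrated with a minimal sketch. This is not the PDP algorithm itself: it assumes that monitoring yields recent per-subtask completion times for each host, predicts the next value with an exponentially weighted moving average, and hands out subtasks in proportion to predicted speed. The names `predict_time` and `allocate`, the smoothing factor, and the sample data are all hypothetical.

```python
def predict_time(samples, alpha=0.5):
    """Exponentially weighted moving average of observed per-subtask times (seconds)."""
    estimate = samples[0]
    for s in samples[1:]:
        estimate = alpha * s + (1 - alpha) * estimate
    return estimate


def allocate(num_subtasks, history, alpha=0.5):
    """Split num_subtasks across hosts in proportion to predicted speed (1 / time)."""
    speed = {h: 1.0 / predict_time(s, alpha) for h, s in history.items()}
    total = sum(speed.values())
    # Floor of each host's proportional share.
    shares = {h: int(num_subtasks * v / total) for h, v in speed.items()}
    # Hand out the remainder to the fastest hosts first.
    leftover = num_subtasks - sum(shares.values())
    for h in sorted(speed, key=speed.get, reverse=True)[:leftover]:
        shares[h] += 1
    return shares


# Hypothetical monitoring data: seconds per subtask, oldest first.
history = {
    "host-a": [1.0, 1.0, 1.0],   # stable link
    "host-b": [1.0, 2.0, 4.0],   # link is getting congested
}
plan = allocate(12, history)
```

Because the prediction weights recent samples more heavily, a host whose link is degrading receives fewer subtasks before it becomes a bottleneck.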

Design of an efficient routing algorithm on the WK-recursive network

  • Chung, Il-Yong
    • Smart Media Journal / v.11 no.9 / pp.39-46 / 2022
  • The WK-recursive network proposed by Vecchia and Sanges [1] is widely used in the design and implementation of local area networks and parallel processing architectures. It provides a high degree of regularity and scalability, which suits the design and realization of distributed systems involving a large number of computing elements. This paper investigates message routing on the WK-recursive network, which is key to the network's performance. We present an efficient shortest-path algorithm on the WK-recursive network that is simpler than that of Chen and Duh [2] in terms of design complexity.

Realizing TDNN for Word Recognition on a Wavefront Toroidal Mesh-array Neurocomputer

  • Hong Jeong;Jeong, Cha-Gyun;Kim, Myung-Won
    • Journal of Electrical Engineering and Information Science / v.1 no.1 / pp.98-107 / 1996
  • In this paper, we propose a scheme that maps the time-delay neural network (TDNN) onto EMIND-II, a neurocomputer with a wavefront toroidal mesh-array structure. This neurocomputer is scalable, consists of many time-shared virtual neurons, is equipped with programmable on-chip learning, and is versatile enough to build many types of neural networks. We also define the programming model of this array and derive parallel algorithms for the TDNN on EMIND-II. In addition, the computational complexities of the parallel and serial algorithms are compared. Finally, we introduce an application of this neurocomputer to word recognition.

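For readers unfamiliar with the TDNN, a single time-delay layer can be sketched in plain Python as below. This is a generic serial formulation, assuming a `tanh` activation, and it says nothing about how EMIND-II actually maps the computation onto its mesh. The function name `tdnn_layer` and the sample weights are hypothetical.

```python
import math


def tdnn_layer(frames, weights, biases):
    """Apply one time-delay layer.

    frames:  list of T input frames, each a list of F features.
    weights: weights[j][d][i] for output unit j, delay d, input feature i.
    biases:  one bias per output unit.
    Returns T - D + 1 output frames, where D is the number of delays.
    """
    num_delays = len(weights[0])
    outputs = []
    for t in range(len(frames) - num_delays + 1):
        frame_out = []
        for w_unit, b in zip(weights, biases):
            total = b
            # The same weights are reused at every time position t.
            for d in range(num_delays):
                for w, x in zip(w_unit[d], frames[t + d]):
                    total += w * x
            frame_out.append(math.tanh(total))
        outputs.append(frame_out)
    return outputs


# Four frames of two features, one output unit looking at two consecutive frames.
frames = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
weights = [[[1.0, 0.0], [0.0, 1.0]]]
out = tdnn_layer(frames, weights, [0.0])
```

Each output frame depends only on a fixed window of input frames, so the iterations over `t` and over output units are independent of one another, which is the property a parallel mapping can exploit.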

An Application-Level Fault Tolerant System For Synchronous Parallel Computation

  • Park, Pil-Seong
    • Journal of Internet Computing and Services / v.9 no.5 / pp.185-193 / 2008
  • The MTBF (mean time between failures) of large-scale parallel systems is known to be only on the order of several hours, so large computations sometimes waste a huge amount of CPU time. However, MPI (Message Passing Interface), a de facto standard for message-passing parallel programming, offers no means of handling such failures. In this paper, we propose an application-level fault-tolerant computation system, built purely on the current MPI standard without any non-standard fault-tolerant MPI library, that can be used for general scientific synchronous parallel computation.

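The general pattern behind application-level fault tolerance is checkpoint and restart handled by the application itself. The sketch below shows that pattern in a single Python process using only the standard library; it is not the authors' system and involves no MPI. In a synchronous parallel code one would presumably write such a checkpoint per process at a synchronization point, but that is an assumption. The names `save_checkpoint`, `load_checkpoint`, and `run` are hypothetical.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write state atomically so a crash mid-write never corrupts the last good copy."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, path)  # atomic rename on the same filesystem


def load_checkpoint(path, default):
    """Return the saved state, or default when no checkpoint exists yet."""
    if not os.path.exists(path):
        return default
    with open(path, "rb") as f:
        return pickle.load(f)


def run(path, total_steps, every=10, fail_at=None):
    """Sum 0..total_steps-1, checkpointing every `every` steps; fail_at simulates a crash."""
    state = load_checkpoint(path, {"step": 0, "acc": 0})
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated node failure")
        state["acc"] += state["step"]
        state["step"] += 1
        if state["step"] % every == 0:
            save_checkpoint(path, state)
    return state["acc"]


ckpt = os.path.join(tempfile.mkdtemp(), "state.pkl")
try:
    run(ckpt, 100, fail_at=57)
except RuntimeError:
    pass                      # the job died; only work since step 50 is lost
result = run(ckpt, 100)       # restart resumes from the last checkpoint
```

The checkpoint interval trades checkpoint overhead against the amount of work lost per failure, which matters when the MTBF is only a few hours.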

PERFORMANCE ENHANCEMENT OF PARALLEL MULTIFRONTAL SOLVER ON BLOCK LANCZOS METHOD

  • Byun, Wan-Il;Kim, Seung-Jo
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.13 no.1 / pp.13-20 / 2009
  • IPSAP, a finite element analysis program, has been developed for high-performance parallel computing. The program consists of various analysis modules, including stress, vibration, and thermal analysis. The M-orthogonal block Lanczos algorithm with shift-invert transformation is used to solve eigenvalue problems in the vibration module. The multifrontal algorithm, one of the most efficient direct linear equation solvers, is applied to the factorization and triangular system solving phases of this block Lanczos iteration. In this study, the performance of IPSAP is enhanced in two stages: 1) minimizing communication volume in the factorization phase by modifying the parallel matrix subroutines, and 2) minimizing idle time in the triangular system solving phase through partial inversion of the frontal matrix and the LCM (least common multiple) concept.

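For reference, the standard shift-invert transformation for a generalized eigenproblem can be written as follows. The symbols $K$, $M$, $\sigma$, and $\theta$ (stiffness matrix, mass matrix, shift, transformed eigenvalue) are notation chosen here; the abstract does not give the authors' exact formulation.

```latex
\begin{aligned}
K x &= \lambda M x, \\
(K - \sigma M)\, x &= (\lambda - \sigma)\, M x, \\
(K - \sigma M)^{-1} M x &= \theta x, \qquad \theta = \frac{1}{\lambda - \sigma}, \\
\lambda &= \sigma + \frac{1}{\theta}.
\end{aligned}
```

Eigenvalues $\lambda$ near the shift $\sigma$ become the largest $|\theta|$, which Lanczos iterations find first. Each iteration applies $(K - \sigma M)^{-1}$ to a block of vectors, which is why a direct solver's factorization and triangular solves dominate the cost.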

Realtime Tide and Storm-Surge Computations for the Yellow Sea Using the Parallel Finite Element Model

  • Byun, Sang-Shin;Choi, Byung-Ho;Kim, Kyeong-Ok
    • Journal of the Korea Institute of Military Science and Technology / v.12 no.1 / pp.29-36 / 2009
  • Realtime tide and storm-surge computations for the Yellow Sea were conducted using the Parallel Finite Element Model. For these computations, a high-resolution grid system was constructed with a minimum node interval of 100 m in Gyeonggi Bay. In the modeling, eight main tidal constituents were analyzed, and the results agreed well with the observed data. The realtime tide computation with the eight main tidal constituents and the storm-surge simulation for Typhoon Sarah (1959) were also conducted on a parallel computing system of MPI-based Linux clusters. The results showed good performance in simulating Typhoon Sarah and a reduction in computation time.

Performance Enhancement of Parallel Prime Sieving with Hybrid Programming and Pipeline Scheduling

  • Ryu, Seung-yo;Kim, Dongseung
    • KIPS Transactions on Computer and Communication Systems / v.4 no.10 / pp.337-342 / 2015
  • We develop a new parallelization method for the Sieve of Eratosthenes algorithm that improves both computation speed and energy efficiency. Pipeline scheduling is included for better load balancing after proper workload partitioning. The method runs on multicore CPUs with a hybrid parallel programming model that uses both message passing and multithreading. Experiments performed on both small-scale clusters and a PC with a mobile processor show significant improvements in execution time and energy consumption.
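The workload partitioning mentioned above is commonly done with a segmented sieve, sketched below. This is a generic illustration rather than the authors' method: it omits their pipeline scheduling and message passing, and the names `base_primes`, `sieve_segment`, and `parallel_sieve`, as well as the default segment size, are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from math import isqrt


def base_primes(limit):
    """Plain Sieve of Eratosthenes: all primes <= limit."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if is_prime[p]:
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n, flag in enumerate(is_prime) if flag]


def sieve_segment(lo, hi, primes):
    """Primes in [lo, hi) given every prime <= sqrt(hi - 1)."""
    flags = bytearray([1]) * (hi - lo)
    for p in primes:
        if p * p >= hi:
            break
        # First multiple of p inside the segment, but never p itself.
        start = max(p * p, (lo + p - 1) // p * p)
        for m in range(start, hi, p):
            flags[m - lo] = 0
    return [lo + i for i, flag in enumerate(flags) if flag]


def parallel_sieve(n, workers=4, segment=1000):
    """All primes <= n, sieving independent segments on a thread pool."""
    root = isqrt(n)
    small = base_primes(root)
    bounds = [(lo, min(lo + segment, n + 1)) for lo in range(root + 1, n + 1, segment)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda b: sieve_segment(b[0], b[1], small), bounds)
        return small + [p for part in parts for p in part]


primes = parallel_sieve(100)
```

Every segment needs only the small primes up to the square root of `n`, so segments can be sieved independently. The thread pool here only illustrates the partitioning: in CPython, pure-Python threads do not run this work in parallel, so a real speedup would need processes or native code.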

Effect of Representation Methods on Time Complexity of Genetic Algorithm based Task Scheduling for Heterogeneous Network Systems

  • Kim, Hwa-Sung
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.1 no.1 / pp.35-53 / 1997
  • This paper analyzes the time complexity of Genetic Algorithm based Task Scheduling (GATS), which is designed to schedule parallel programs with diverse embedded parallelism types on heterogeneous network systems. The analysis is based on two representation methods (REIA, REIS) proposed in this paper to encode the scheduling information. A heterogeneous network system consists of a set of loosely coupled parallel and vector machines connected via a high-speed network. The objective of heterogeneous network computing is to solve computationally intensive problems that have several types of parallelism on a suite of high-performance parallel machines in a manner that best utilizes the capabilities of each machine. Therefore, when scheduling in heterogeneous network systems, the parallelism characteristics of tasks and machines must be matched carefully to obtain greater speedup. This paper shows how this parallelism type matching affects the time complexity of GATS.

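To make the role of a chromosome representation concrete, the sketch below uses the simplest possible encoding: one gene per task holding the index of the machine that runs it. This is not REIA or REIS, which the abstract does not describe, and the tiny genetic algorithm is a generic one rather than GATS. The `exec_time` table, which stands in for how well each task's parallelism type matches each machine, and all function names are hypothetical.

```python
import random


def makespan(chromosome, exec_time):
    """Finish time of the busiest machine; chromosome[i] is the machine running task i."""
    load = [0.0] * len(exec_time[0])
    for task, machine in enumerate(chromosome):
        load[machine] += exec_time[task][machine]
    return max(load)


def evolve(exec_time, pop_size=20, generations=50, mutation=0.1, seed=0):
    """Tiny GA: binary tournament selection, one-point crossover, mutation, elitism."""
    rng = random.Random(seed)
    n_tasks, n_machines = len(exec_time), len(exec_time[0])

    def fitness(c):
        return makespan(c, exec_time)

    pop = [[rng.randrange(n_machines) for _ in range(n_tasks)] for _ in range(pop_size)]
    for _ in range(generations):
        best = min(pop, key=fitness)
        children = [best[:]]                          # elitism: keep the best as-is
        while len(children) < pop_size:
            a = min(rng.sample(pop, 2), key=fitness)  # binary tournament
            b = min(rng.sample(pop, 2), key=fitness)
            cut = rng.randrange(1, n_tasks)           # one-point crossover
            child = a[:cut] + b[cut:]
            child = [rng.randrange(n_machines) if rng.random() < mutation else g
                     for g in child]
            children.append(child)
        pop = children
    return min(pop, key=fitness)


# Rows are tasks, columns are machines; machine 0 suits tasks 0-1, machine 1 suits tasks 2-3.
exec_time = [[1.0, 5.0], [1.0, 5.0], [5.0, 1.0], [5.0, 1.0]]
best = evolve(exec_time)
```

With this encoding, evaluating one chromosome costs time linear in the number of tasks. A richer representation changes that per-evaluation cost, which is the kind of effect a time-complexity analysis of the encoding has to account for.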

A Study on Improvement of Low-power Memory Architecture in IoT/edge Computing

  • Cho, Doosan
    • Journal of the Korean Society of Industry Convergence / v.24 no.1 / pp.69-77 / 2021
  • Low-cost design methodologies are widely used for IoT devices. In such networked devices, memory is composed of flash memory, SRAM, DRAM, and so on, and because the device processes a large amount of data, memory design is an important factor in system performance. Each device therefore selects optimized design factors such as function, performance, and cost according to market demand. The memory architectures available to low-cost IoT devices are very limited, being restricted to configurations of SRAM, flash memory, and DRAM. To process as much data as possible in the same space, an architecture that supports parallel processing units is usually provided. Such a parallel architecture provides high performance at low cost, but it requires precise software techniques for mapping instructions and data onto the architecture. This paper proposes an instruction/data mapping method that supports optimized parallel processing performance. The proposed method optimizes system performance by actively exploiting hardware and software parallelism.

Trend Analysis of Leading Overseas Supercomputing Centers and Its Implications

  • Choe, Jae-Yeong
    • Journal of Scientific & Technological Knowledge Infrastructure / s.13 / pp.40-47 / 2004
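  • To revolutionize science and engineering through cyberinfrastructure, the NSF recently gathered the opinions of experts from industry, academia, and research institutes and proposed the ACP (Advanced Cyberinfrastructure Program) in the report of the Blue-Ribbon Advisory Panel on Cyberinfrastructure (January 2003). The goal of the ACP is to bring about innovative advances in science and engineering research by applying information technology. This article analyzes representative supercomputing centers in the United States and Europe: NCSA, EPCC (Edinburgh Parallel Computing Centre, UK), and HLRS (High Performance Computing Center Stuttgart, Germany).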
