• 제목/요약/키워드: Parallel interface

검색결과 439건 처리시간 0.034초

Grid-Enabled Parallel Simulation Based on Parallel Equation Formulation

  • Andjelkovic, Bojan;Litovski, Vanco B.;Zerbe, Volker
    • ETRI Journal
    • /
    • 제32권4호
    • /
    • pp.555-565
    • /
    • 2010
  • Parallel simulation is an efficient way to cope with long runtimes and high computational requirements in simulations of modern complex integrated electronic circuits and systems. This paper presents an algorithm for parallel simulation based on parallelization in equation formulation and simultaneous calculation of matrix contributions for nonlinear analog elements. In addition, the paper describes the development of a grid interface for a parallel simulator that enables a designer to perform simulations on distant computer clusters. Performances of the developed parallel simulation algorithm are evaluated by simulation of a microelectromechanical system.

PERFORMANCE OF A KNIGHT TOUR PARALLEL ALGORITHM ON MULTI-CORE SYSTEM USING OPENMP

  • VIJAYAKUMAR SANGAMESVARAPPA;VIDYAATHULASIRAMAN
    • Journal of applied mathematics & informatics
    • /
    • 제41권6호
    • /
    • pp.1317-1326
    • /
    • 2023
  • Today's computers, desktops and laptops were build with multi-core architecture. Developing and running serial programs in this multi-core architecture fritters away the resources and time. Parallel programming is the only solution for proper utilization of resources available in the modern computers. The major challenge in the multi-core environment is the designing of parallel algorithm and performance analysis. This paper describes the design and performance analysis of parallel algorithm by taking the Knight Tour problem as an example using OpenMP interface. Comparison has been made with performance of serial and parallel algorithm. The comparison shows that the proposed parallel algorithm achieves good performance compared to serial algorithm.

기존 시스템 환경에서의 병렬 미디어 서버의 설계 및 구현 (Design and Implementation of parallel Media server in current system environment)

  • 김경훈;류재상;김서균;남지승
    • 대한전자공학회:학술대회논문집
    • /
    • 대한전자공학회 2000년도 하계종합학술대회 논문집(3)
    • /
    • pp.97-100
    • /
    • 2000
  • As network resources have become faster and demands for multimedia service through network have increased, the demand for Media server system has increased. These kinds of media server solve their bottle neck problem of internal storage device by using parallel system which takes advantage of fast network resource. Many vendors have suggested each of their media server system to solve these problem radically, but most of them require major modification of infra component and additional drawback has added. For example, storage mechanism for specific media requires new file system which is totally different from traditional one, and algorithm for enhancing performance may not suit for traditional operating system environment. In this paper, we designed a parallel media server based on web interface of traditional system and implemented a program for media server. Implemented server system performs parallel processing through web interface without any modification of traditional system, and controls which is related to merging load by distributed data is charged only to client and control server and consequently load of storage server can be minimized. And also, data transfer protocol for streaming media includes Retransfer algorithm and client Admission control policy relevant to performance of whole system.

  • PDF

BCI 시스템의 성능 개선을 위한 병렬 모델 특징 추출 (Parallel Model Feature Extraction to Improve Performance of a BCI System)

  • ;박승민;심귀보
    • 제어로봇시스템학회논문지
    • /
    • 제19권11호
    • /
    • pp.1022-1028
    • /
    • 2013
  • It is well knowns that based on the CSP (Common Spatial Pattern) algorithm, the linear projection of an EEG (Electroencephalography) signal can be made to spaces that optimize the discriminant between two patterns. Sharing disadvantages from linear time invariant systems, CSP suffers from the non-stationary nature of EEGs causing the performance of the classification in a BCI (Brain-Computer Interface) system to drop significantly when comparing the training data and test data. The author has suggested a simple idea based on the parallel model of CSP filters to improve the performance of BCI systems. The model was tested with a simple CSP algorithm (without any elaborate regularizing methods) and a perceptron learning algorithm as a classifier to determine the improvement of the system. The simulation showed that the parallel model could improve classification performance by over 10% compared to conventional CSP methods.

NGN 자원제어 스킴의 고속화 방안에 관한 연구 (Study of High-Speed NGN Resource Control Schemes)

  • 차영욱;한태만;정유현
    • 정보처리학회논문지C
    • /
    • 제16C권3호
    • /
    • pp.383-392
    • /
    • 2009
  • 차세대 네트워크(NGN)는 QoS가 지원되는 광 대역 전달 망에서 세션 및 비-세션 서비스를 지원하기 위한 패킷 기반의 융합 망이다. NGN에서 네트워크 사용자에 따라 차별화된 서비스를 제공하기 위해서는 QoS 기반 자원제어가 이루어져 한다. 본 논문에서는 세션 및 자원제어 지연을 최소화하기 위하여 NGN 자원제어 인터페이스에 병행형 제어 스킴을 정의하였다. 시뮬레이션을 통하여 기존 및 제안된 NGN 제어 스킴들의 제어지연과 완료율을 측정 및 분석하였다. 두 단계 자원 제어에서 도착율 120 까지는 병행형과 순차형의 완료율이 100%를 달성하였으며, 병행형의 제어지연이 순차형에 비하여 약 21.5% 개선되었음을 확인하였다.

One-node and two-node hybrid coarse-mesh finite difference algorithm for efficient pin-by-pin core calculation

  • Song, Seongho;Yu, Hwanyeal;Kim, Yonghee
    • Nuclear Engineering and Technology
    • /
    • 제50권3호
    • /
    • pp.327-339
    • /
    • 2018
  • This article presents a new global-local hybrid coarse-mesh finite difference (HCMFD) method for efficient parallel calculation of pin-by-pin heterogeneous core analysis. In the HCMFD method, the one-node coarse-mesh finite difference (CMFD) scheme is combined with a nodal expansion method (NEM)-based two-node CMFD method in a nonlinear way. In the global-local HCMFD algorithm, the global problem is a coarse-mesh eigenvalue problem, whereas the local problems are fixed source problems with boundary conditions of incoming partial current, and they can be solved in parallel. The global problem is formulated by one-node CMFD, in which two correction factors on an interface are introduced to preserve both the surface-average flux and the net current. Meanwhile, for accurate and efficient pin-wise core analysis, the local problem is solved by the conventional NEM-based two-node CMFD method. We investigated the numerical characteristics of the HCMFD method for a few benchmark problems and compared them with the conventional two-node NEM-based CMFD algorithm. In this study, the HCMFD algorithm was also parallelized with the OpenMP parallel interface, and its numerical performances were evaluated for several benchmarks.

Assessment of computational performance for a vector parallel implementation: 3D probabilistic model discrete cracking in concrete

  • Paz, Carmen N.M.;Alves, Jose L.D.;Ebecken, Nelson F.F.
    • Computers and Concrete
    • /
    • 제2권5호
    • /
    • pp.345-366
    • /
    • 2005
  • This work presents an assessment of the computational performance of a vector-parallel implementation of probabilistic model for concrete cracking in 3D. This paper shows the continuing efforts towards code optimization as reported in earlier works Paz, et al. (2002a,b and 2003). The probabilistic crack approach is based on the direct Monte Carlo method. Cracking is accounted by means of 3D interface elements. This approach considers that all nonlinearities are restricted to interface elements modeling cracks. The heterogeneity governs the overall cracking behavior and related size effects on concrete fracture. Computational kernels in the implementation are the inexact Newton iterative driver to solve the non-linear problem and a preconditioned conjugate gradient (PCG) driver to solve linearized equations, using an element by element (EBE) strategy to compute matrix-vector products. In particular the paper analyzes code behavior using OpenMP directives in parallel vector processors (PVP), such as the CRAY SV1 and CRAY T94. The impact of the memory architecture on code performance, and also some strategies devised to circumvent this issue are addressed by numerical experiment.

MPI-GWAS: a supercomputing-aided permutation approach for genome-wide association studies

  • Paik, Hyojung;Cho, Yongseong;Cho, Seong Beom;Kwon, Oh-Kyoung
    • Genomics & Informatics
    • /
    • 제20권1호
    • /
    • pp.14.1-14.4
    • /
    • 2022
  • Permutation testing is a robust and popular approach for significance testing in genomic research that has the advantage of reducing inflated type 1 error rates; however, its computational cost is notorious in genome-wide association studies (GWAS). Here, we developed a supercomputing-aided approach to accelerate the permutation testing for GWAS, based on the message-passing interface (MPI) on parallel computing architecture. Our application, called MPI-GWAS, conducts MPI-based permutation testing using a parallel computing approach with our supercomputing system, Nurion (8,305 compute nodes, and 563,740 central processing units [CPUs]). For 107 permutations of one locus in MPI-GWAS, it was calculated in 600 s using 2,720 CPU cores. For 107 permutations of ~30,000-50,000 loci in over 7,000 subjects, the total elapsed time was ~4 days in the Nurion supercomputer. Thus, MPI-GWAS enables us to feasibly compute the permutation-based GWAS within a reason-able time by harnessing the power of parallel computing resources.

분산 메모리 시스템에서 압력방정식의 해법을 위한 MPI와 Hybrid 병렬 기법의 비교 (Comparison of Message Passing Interface and Hybrid Programming Models to Solve Pressure Equation in Distributed Memory System)

  • 전병진;최형권
    • 대한기계학회논문집B
    • /
    • 제39권2호
    • /
    • pp.191-197
    • /
    • 2015
  • 본 연구에서는 분산 메모리시스템에서의 압력 방정식의 병렬해법을 위하여 MPI(Message Passing Interface)와 하이브리드 병렬기법을 사용하였다. 두 모델은 영역분할 기법을 활용하며, 하이브리드 기법은 성능이 양호한 두 가지 영역분할에 대해 수행하였다. 두 병렬기법의 성능을 비교하기 위해서 다양한 문제 크기에 대해 최대 96개의 쓰레드를 사용하여 속도향상을 측정하였다. 병렬 성능은 캐쉬 메모리에 따른 문제의 크기 및 MPI 통신, OpenMP 지시어의 부하에 대해 영향을 받음을 확인하였다. 문제의 크기가 작은 경우에는 쓰레드가 증가할수록 MPI 통신 및 OpenMP 지시어 부하에 대한 비율이 상대적으로 크기 때문에 병렬 성능이 좋지 않으며, MPI 통신 부하보다는 OpenMP 지시어 부하가 상대적으로 크므로 MPI 병렬 기법의 병렬 성능이 더 우수하다. 문제의 크기가 큰 경우에는 캐쉬 메모리의 활용도가 높고 MPI 통신 및 OpenMP 지시어 부하에 대한 비율이 낮아 병렬 성능이 좋으며, OpenMP 지시어보다 MPI 통신에 의한 부하가 더 지배적이어서 하이브리드 병렬 성능이 MPI 병렬 성능보다 더 양호하다.

CPU 클러스터 구축 및 3차원 공간분할 병렬 FDTD 알고리즘 구현 (Construction of a CPU Cluster and Implementation of a 3-D Domain Decomposition Parallel FDTD Algorithm)

  • 박성민;추광욱;주세훈;박윤미;김기백;정경영
    • 한국전자파학회논문지
    • /
    • 제25권3호
    • /
    • pp.357-364
    • /
    • 2014
  • 본 연구에서는 빠르게 전자파 해석을 수행할 수 있는 병렬 유한차분 시간영역(Finite-Difference Time-Domain: FDTD) 알고리즘을 구현하기 위하여 CPU 클러스터를 구축하였다. 병렬 FDTD 알고리즘은 단일 프로세서를 이용한 FDTD 알고리즘에 비해 해석 시간을 크게 줄일 수 있으며, 전기적으로 매우 큰 구조물에 대한 전자파 해석도 가능하다. 본 연구팀에서는 CPU 클러스터 기반의 병렬 FDTD 알고리즘에서 요구되는 프로세스 간의 통신을 위해 MPI(Message Passing Interface) 라이브러리를 이용하였으며, 3차원 공간분할을 적용하여 프로세스 간의 통신 시간을 최소화하였다. 단일 프로세서를 이용한 FDTD 알고리즘 대비 CPU 클러스터 기반의 병렬 FDTD 알고리즘의 계산속도 향상도를 기본 모드와 하이퍼 모드에서 분석하였으며, 전기적으로 매우 큰 콘크리트 구조물의 전자파 해석을 하였다.