• Title/Summary/Keyword: Parallel I/O

On the parallel merging algorithm (Heap 병합 병렬 알고리즘)

  • 민용식
    • The Journal of the Acoustical Society of Korea
    • /
    • v.12 no.2
    • /
    • pp.5-13
    • /
    • 1993
  • This paper proposes and analyzes a parallel algorithm for merging two heaps on an SIMD-SM-R parallel computer. To construct the algorithm, we divide the problem into two subproblems. For the first, selecting node p with a LEVEL-FIND function, Wyllie(19) gives a method with time complexity O(log n), whereas ours takes O(log(n/k)). For the second, merging two subheaps, our algorithm takes O(log(n/k)*log(n)) using max(2**(i-1), ⌈(m+1)/4⌉) processors, whereas the methods of Dekel and Sahni(4) and of Hong(18) take O(log m). The EPU of our parallel algorithm is also close to 1, so it achieves an optimal speed-up ratio.

A Parallel I/O System on Workstation Clustering Environment for Irregular Applications (비정형 응용을 위한 워크스테이션 클러스터링 환경에서의 병렬 입출력 시스템)

  • No, Jae-Chun;Park, Sung-Soon;Choudhary, Alok
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.27 no.5
    • /
    • pp.496-505
    • /
    • 2000
  • Clusters of workstations (COW) are becoming an attractive option for parallel scientific computing, a field formerly reserved to the MPPs, because their cost-performance ratio is usually better than that of comparable MPPs, and their hardware and software can be easily enhanced to the latest generations. In this paper we present the design and implementation of our runtime library for clusters of workstations, called "Collective I/O Clustering". The library provides a friendly programming model for the I/O of irregular applications on clusters of workstations, being completely integrated with the underlying communication and I/O system. In the collective I/O clustering, two I/O configurations are possible. In the first I/O configuration, all processors allocated can act as I/O servers as well as compute nodes. In the second I/O configuration, only a subset of processors can act as I/O servers. The compression and software caching facilities have been incorporated into the collective I/O clustering to optimize the communication and I/O costs. All the performance results were obtained on the IBM-SP machine, located at Argonne National Labs.

Placement and Performance Analysis of I/O Resources for Torus Multicomputer (토러스 다중컴퓨터를 위한 입출력 자원의 배치와 성능 분석)

  • 안중석
    • Journal of the Korea Society for Simulation
    • /
    • v.6 no.2
    • /
    • pp.89-104
    • /
    • 1997
  • The performance bottleneck of parallel computer systems has mostly been the I/O devices, because of the disparity between processor speed and I/O speed. An I/O node placement strategy is therefore required that minimizes the number of I/O nodes, the I/O access time, and the I/O traffic in the interconnection network. In this paper, we propose an optimal distance-k embedding algorithm and analyze its effect on system performance when it is applied to an n x n torus architecture. Using software simulation, we show that this algorithm yields an efficient I/O node placement. The placement produced by the proposed algorithm shows the highest performance among the I/O node placements compared, in all cases. This is because the I/O nodes are uniformly distributed over the whole network, which reduces traffic in the interconnection network.
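
As a rough illustration of what a uniformly distributed, distance-k placement means on an n x n torus, the Python sketch below checks whether every node lies within k hops of some I/O node. It is not the paper's embedding algorithm: the function names and the sample placement rule (i + 2j) mod 5 = 0 are assumptions chosen for illustration. On a 5 x 5 torus that rule selects 5 I/O nodes such that every node is at most one hop from one of them.

```python
from itertools import product


def torus_distance(a, b, n):
    """Hop count between nodes a and b on an n x n torus (wraparound links)."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    return min(dx, n - dx) + min(dy, n - dy)


def covers(io_nodes, n, k):
    """True if every node is within k hops of at least one I/O node."""
    return all(
        any(torus_distance(node, io, n) <= k for io in io_nodes)
        for node in product(range(n), repeat=2)
    )


# Hypothetical placement rule, used only for illustration.
n = 5
io_nodes = [(i, j) for i, j in product(range(n), repeat=2) if (i + 2 * j) % 5 == 0]
```

Calling `covers(io_nodes, n, 1)` checks the distance-1 property for this placement; a single I/O node at (0, 0) fails the same check.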

Two-level Prefetching method for I/O bandwidth enhancement in Parallel File System (병렬파일 시스템에서 I/O 대역폭 개선을 위한 이단 선반입 기법)

  • HwangBo, Jun-Hyung;Cho, Jong-Hyun;Lee, Yoon-Young;Seo, Dae-Wha
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2000.10a
    • /
    • pp.657-660
    • /
    • 2000
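  • Parallel file systems provide parallel I/O to mitigate the performance degradation caused by slow disk I/O. Prefetching, which overlaps computation with disk I/O, can reduce this degradation further. In I/O-intensive programs, however, prefetching can push demand beyond the I/O bandwidth that the system provides. In the worst case, conventional prefetching is then not only no longer the best way to improve performance, but can itself become an overload. Taking this situation into account, this paper proposes a two-level prefetching method for improving I/O bandwidth and thereby improving performance.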

A Methodology to Simulate I/O-Intensive Applications (I/O 집약적인 응용의 시뮬레이션 방법론)

  • Eom, Hyeon-Sang
    • The KIPS Transactions:PartA
    • /
    • v.13A no.5 s.102
    • /
    • pp.445-454
    • /
    • 2006
  • We introduce a family of simulators for I/O-intensive distributed or parallel applications, and a methodology that permits selecting the most efficient simulator meeting a given user-defined accuracy requirement. This methodology consists of a series of tests to choose an appropriate simulation based on the attributes of the application. In addition, each simulator provides two estimates of application execution time: the minimum expected time and the maximum. We present the results of applying our methodology to existing applications, and show that we can accurately simulate applications tens to hundreds of times faster than the application execution times.

Optimizing LRU Lock Management in the Linux Kernel for Improving Parallel Write Throughput in Many-Core CPU Systems (매니코어 CPU 시스템의 병렬 쓰기 성능 향상을 위한 리눅스 커널의 LRU 관리 최적화 기법)

  • Eun-Kyu Byun;Gibeom Gu;Kwang-Jin Oh;Jiwoo Bang
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.12 no.7
    • /
    • pp.209-216
    • /
    • 2023
  • Modern HPC systems are equipped with many-core CPUs with dozens of cores. When parallel I/O is performed on such a system, scalability is limited by the LRU lock management policy of the Linux system. This study proposes an improved FinerLRU to solve this problem. The new FinerLRU improves the parallel write performance of file systems that use the buffer cache through finer-grained lock management, increasing the number of LRU locks up to the maximum number of cores. The proposed method was implemented in Linux 5.18.11, and its performance was measured on two types of CPUs with different characteristics, Intel Ice Lake Xeon and Intel Knights Landing. A performance improvement of about two times was obtained on both types of systems.
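
The general idea of replacing one contended lock with several finer-grained ones can be sketched in user-space Python as follows. This is not the kernel code or the paper's implementation: the class name `ShardedLRU`, the hashing of keys to shards, and the per-shard capacity are all assumptions made for illustration. The abstract states only that the number of LRU locks is increased up to the number of cores.

```python
import threading
from collections import OrderedDict


class ShardedLRU:
    """LRU bookkeeping split across several independently locked shards."""

    def __init__(self, num_shards, capacity_per_shard):
        self.capacity = capacity_per_shard
        self.shards = [OrderedDict() for _ in range(num_shards)]
        self.locks = [threading.Lock() for _ in range(num_shards)]

    def _shard(self, key):
        # Hypothetical mapping: hash the key to pick a shard.
        return hash(key) % len(self.shards)

    def touch(self, key, value):
        """Insert or refresh a key; only that key's shard lock is held."""
        idx = self._shard(key)
        with self.locks[idx]:
            shard = self.shards[idx]
            shard[key] = value
            shard.move_to_end(key)  # mark as most recently used
            if len(shard) > self.capacity:
                shard.popitem(last=False)  # evict the least recently used entry

    def __len__(self):
        return sum(len(shard) for shard in self.shards)
```

Writers whose keys fall in different shards never wait on each other, which is the effect finer-grained locking aims for. With a single shard the structure degenerates to one global lock.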

ON THE CONVERGENCE OF SERIES OF MARTINGALE DIFFERENCES WITH MULTIDIMENSIONAL INDICES

  • SON, TA CONG;THANG, DANG HUNG
    • Journal of the Korean Mathematical Society
    • /
    • v.52 no.5
    • /
    • pp.1023-1036
    • /
    • 2015
  • Let $\{X_n;\ n \succeq 1\}$ be a field of martingale differences taking values in a p-uniformly smooth Banach space. The paper provides conditions under which the series $\sum_{i \preceq n} X_i$ converges almost surely and the tail series $\{T_n = \sum_{i \gg n} X_i;\ n \succeq 1\}$ satisfies $\sup_{k \succeq n} \|T_k\| = \mathcal{O}_p(b_n)$ and $\frac{\sup_{k \succeq n} \|T_k\|}{B_n} \xrightarrow{p} 0$ for given fields of positive numbers $\{b_n\}$ and $\{B_n\}$. This result generalizes results of A. Rosalsky, J. Rosenblatt [7], [8] and S. H. Sung, A. I. Volodin [11].

THE 3D BOUSSINESQ EQUATIONS WITH REGULARITY IN THE HORIZONTAL COMPONENT OF THE VELOCITY

  • Liu, Qiao
    • Bulletin of the Korean Mathematical Society
    • /
    • v.57 no.3
    • /
    • pp.649-660
    • /
    • 2020
  • This paper proves a new regularity criterion for solutions to the Cauchy problem of the 3D Boussinesq equations via one directional derivative of the horizontal component of the velocity field (i.e., $(\partial_i u_1, \partial_j u_2, 0)$ where $i, j \in \{1, 2, 3\}$) in the framework of the anisotropic Lebesgue spaces. More precisely, for $0 < T < \infty$, if $$\int_0^T \left( \left\| \|\partial_i u_1(t)\|_{L^{\alpha}_{x_i}} \right\|^{\gamma}_{L^{\beta}_{x_{\hat{i}} x_{\tilde{i}}}} + \left\| \|\partial_j u_2(t)\|_{L^{\alpha}_{x_j}} \right\|^{\gamma}_{L^{\beta}_{x_{\hat{j}} x_{\tilde{j}}}} \right) dt < \infty,$$ where $\frac{2}{\gamma} + \frac{1}{\alpha} + \frac{2}{\beta} = m \in [1, \frac{3}{2})$ and $\frac{3}{m} \leq \alpha \leq \beta < \frac{1}{m-1}$, then the corresponding solution $(u, \theta)$ to the 3D Boussinesq equations is regular on $[0, T]$. Here, $(i, \hat{i}, \tilde{i})$ and $(j, \hat{j}, \tilde{j})$ belong to the permutation group on the set $\mathbb{S}_3 := \{1, 2, 3\}$. This result reveals that the horizontal component of the velocity field plays a dominant role in regularity theory of the Boussinesq equations.

Design and Implementation of the Parallel Multimedia File System on Fast Ethernet (Fast Ethernet 환경에서 병렬 멀티미디어 파일 시스템의 설계와 구현)

  • Park, Seong-Ho;Kim, Gwang-Mun;Jeong, Gi-Dong
    • The KIPS Transactions:PartB
    • /
    • v.8B no.1
    • /
    • pp.89-97
    • /
    • 2001
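  • In building large-capacity multimedia servers, a two-tier distributed cluster server architecture consisting of storage servers and a control server is widely used to overcome the I/O bottleneck. A two-tier distributed cluster server is advantageous for load balancing, bandwidth management, and storage server management, but it incurs communication overhead between the storage servers and the control server. To reduce this overhead, media data read at a storage server must be sent directly to the client without passing through the control server. In addition, to allow for expanding storage capacity or replacing damaged disks, a distributed cluster server must support heterogeneous disks of varying performance. Moreover, because I/O devices and operating systems evolve rapidly, a media server must be easy to port to new I/O devices and operating systems, and application developers must be able to adjust the block size, data placement policy, replication policy, and similar settings flexibly according to the system environment. Taking these requirements of multimedia servers into account, this paper designs and implements the Parallel Multimedia File System (PMFS) on Fast Ethernet and compares its performance experimentally with PVFS (Parallel Virtual File System). According to the experimental results, PMFS performed 3% to 15% better than PVFS on multimedia data.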

EPR : Enhanced Parallel R-tree Indexing Method for Geographic Information System (EPR : 지리 정보 시스템을 위한 향상된 병렬 R-tree 색인 기법)

  • Lee, Chun-Geun;Kim, Jeong-Won;Kim, Yeong-Ju;Jeong, Gi-Dong
    • The Transactions of the Korea Information Processing Society
    • /
    • v.6 no.9
    • /
    • pp.2294-2304
    • /
    • 1999
  • The purpose of this research is to improve query processing performance in GIS (Geographic Information System) by enhancing I/O performance through parallel I/O and efficient disk access. Packing adjacent spatial data, which are very likely to be referenced concurrently, into one block or into contiguous disk blocks decreases the number of disk accesses and the disk access overhead of query processing, which in turn decreases I/O time. In this paper, we propose the EPR (Enhanced Parallel R-tree) indexing method, which integrates the parallel I/O method of the previous Parallel R-tree method with a packing-based clustering method. The major characteristics of the EPR method are as follows. First, it arranges spatial data in increasing order of proximity using the Hilbert space-filling curve and builds a packed R-tree in a bottom-up manner. Second, it generates spatial data clusters by packing-based clustering, in which the arranged spatial data are clustered into contiguous disk blocks. Third, it distributes EPR index nodes and spatial data clusters over multiple disks through round-robin striping. Experimental results show that the EPR method achieves gains of up to 30% or more over the PR method in query processing speed. In particular, the larger the disk blocks and the smaller the spatial data objects, the better the query processing performance of the EPR method.
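
The ordering, packing, and striping steps described in the abstract can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' implementation: spatial objects are reduced to integer grid points, the R-tree itself is omitted, and the names `hilbert_index` and `pack_and_stripe` and their parameters are hypothetical. The index computation is the widely used iterative conversion from grid coordinates to a Hilbert curve position.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along a Hilbert curve over an n x n grid (n a power of 2)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented consistently.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def pack_and_stripe(points, n, block_capacity, num_disks):
    """Sort points by Hilbert index, pack them into blocks, stripe blocks round-robin."""
    ordered = sorted(points, key=lambda p: hilbert_index(n, p[0], p[1]))
    blocks = [
        ordered[i:i + block_capacity]
        for i in range(0, len(ordered), block_capacity)
    ]
    # Block b goes to disk b mod num_disks.
    return [(block_id % num_disks, block) for block_id, block in enumerate(blocks)]
```

Because neighbouring points tend to receive nearby Hilbert indices, they tend to land in the same block, while consecutive blocks land on different disks and can be read in parallel.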
