• Title/Summary/Keyword: I/O Performance (I/O 성능)

Search results: 677

Performance Analysis of NVMe SSDs and Design of Direct Access Engine on Virtualized Environment (가상화 환경에서 NVMe SSD 성능 분석 및 직접 접근 엔진 개발)

  • Kim, Sewoog; Choi, Jongmoo
    • KIISE Transactions on Computing Practices, v.24 no.3, pp.129-137, 2018
  • NVMe (Non-Volatile Memory Express) SSD (Solid State Drive) is a high-performance storage device that uses flash memory as its storage cells, PCIe as its interface, and NVMe as the protocol on that interface. It supports multiple I/O queues, which makes it feasible to process parallel I/Os in multi-core environments and to provide higher bandwidth than SATA SSDs. Hence, the NVMe SSD is considered a next-generation storage device for data centers and cloud computing systems. However, in virtualization systems, the performance of the NVMe SSD is not fully utilized due to bottlenecks in the software I/O stack. In particular, when the I/O stack of a hypervisor or host operating system such as Xen or KVM is used, I/O performance degrades seriously due to the doubled I/O stack between host and virtual machine. In this paper, we propose a new I/O engine, called the Direct-AIO (Direct-Asynchronous I/O) engine, that can access the NVMe SSD directly for I/O performance improvement on the QEMU emulator. We implement the proposed I/O engine and analyze the I/O performance differences between the existing I/O engine and the Direct-AIO engine.
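
The paper's Direct-AIO engine itself is not shown in the abstract, but the general technique it points to — submitting asynchronous I/O straight to the NVMe block device with O_DIRECT, bypassing the host page cache and most of the host I/O stack — can be sketched with the Linux AIO interface. The device path, request size, and queue depth below are illustrative assumptions, not values from the paper.

```c
/* Minimal sketch: asynchronous direct I/O on an NVMe block device with the
 * Linux AIO interface (libaio), bypassing the host page cache via O_DIRECT.
 * Device path, request size, and queue depth are illustrative assumptions.
 * Build with: gcc -o direct_aio direct_aio.c -laio   (needs root to run)
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/nvme0n1", O_RDONLY | O_DIRECT);  /* hypothetical device */
    if (fd < 0) { perror("open"); return 1; }

    /* O_DIRECT requires sector-aligned buffers and offsets. */
    void *buf;
    if (posix_memalign(&buf, 4096, 4096) != 0) return 1;

    io_context_t ctx = 0;
    if (io_setup(32, &ctx) < 0) { fprintf(stderr, "io_setup failed\n"); return 1; }

    /* Prepare and submit one 4KB read at offset 0. */
    struct iocb cb;
    struct iocb *cbs[1] = { &cb };
    io_prep_pread(&cb, fd, buf, 4096, 0);
    if (io_submit(ctx, 1, cbs) != 1) { fprintf(stderr, "io_submit failed\n"); return 1; }

    /* Reap the completion; a real engine would keep many requests in flight. */
    struct io_event ev;
    io_getevents(ctx, 1, 1, &ev, NULL);
    printf("read %ld bytes\n", (long)ev.res);

    io_destroy(ctx);
    free(buf);
    close(fd);
    return 0;
}
```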

Supporting CrownFS in MPI-IO (MPI-IO의 CrownFS 지원 방안)

  • 조미옥;강봉직;최경희;정기현
    • Proceedings of the Korean Information Science Society Conference, 2000.04a, pp.636-638, 2000
  • The performance of I/O, the slowest subsystem, determines the performance of the overall computer system, so improving overall system performance requires higher I/O performance. In distributed parallel environments, parallel I/O is used to raise I/O performance. More effective parallel I/O can be achieved by using a parallel file system optimized at the lower level together with an application-level interface that eases the development of parallel applications. In this paper, we add CrownFS, a parallel file system, to MPI-IO so that MPI supports CrownFS, and implement a parallel I/O environment that delivers high performance in parallel environments.
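
MPI-IO hides the underlying parallel file system behind a common interface, so once a file system such as CrownFS is supported by the MPI-IO layer, applications keep using the standard calls unchanged. A minimal sketch of that application-level interface follows; the file name and block size are illustrative assumptions.

```c
/* Minimal sketch of the MPI-IO application interface: each rank writes its
 * own block of a shared file. File name and block size are illustrative.
 * Build with: mpicc -o mpiio_write mpiio_write.c ; run with mpirun.
 */
#include <mpi.h>
#include <stdlib.h>

#define BLOCK 1024  /* number of ints written by each rank (assumption) */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int *buf = malloc(BLOCK * sizeof(int));
    for (int i = 0; i < BLOCK; i++)
        buf[i] = rank;

    /* The same calls work on any file system supported by the MPI-IO layer. */
    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "out.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Each rank writes at a disjoint offset of the shared file. */
    MPI_Offset offset = (MPI_Offset)rank * BLOCK * sizeof(int);
    MPI_File_write_at(fh, offset, buf, BLOCK, MPI_INT, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(buf);
    MPI_Finalize();
    return 0;
}
```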

A performance analysis of Solid State Disk for Linux I/O scheduler (리눅스 I/O 스케줄러에 대한 SSD 성능 분석)

  • Park, Hyun-Chan; Yoo, Chuck
    • Proceedings of the Korean Information Science Society Conference, 2010.06b, pp.460-464, 2010
  • SSDs are rapidly replacing HDDs in the server market thanks to their outstanding performance. Noting that existing SSD performance analyses have only considered a single I/O pattern at a time, we evaluate how performance is affected when I/Os with various patterns run concurrently. To this end, we ran sequential/random read/write operations with block sizes ranging from 4KB to 64MB while simultaneously issuing 4KB read/write I/Os, and measured the impact on performance. We repeated this evaluation for each of the four Linux I/O schedulers to also assess the scheduler's influence. As a result, we discovered new performance characteristics of SSDs, which we expect to serve as a basis for developing new I/O schedulers and SSD FTLs.
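
For reproducing this kind of evaluation, Linux exposes the per-device I/O scheduler through sysfs, and a concurrent 4KB load can be generated with ordinary pread calls. The sketch below assumes a device name and scheduler name purely for illustration; it is not the paper's benchmark harness.

```c
/* Minimal sketch: select a Linux I/O scheduler via sysfs, then issue some
 * 4KB random reads. Device name, scheduler, and counts are illustrative.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    /* Pick the scheduler for the device (needs root). The available names
     * can be listed by reading this same file, e.g. "noop deadline [cfq]". */
    FILE *sched = fopen("/sys/block/sda/queue/scheduler", "w");
    if (sched) {
        fputs("deadline\n", sched);   /* assumed scheduler name */
        fclose(sched);
    }

    /* Issue a batch of 4KB reads at pseudo-random offsets. */
    int fd = open("/dev/sda", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    char buf[4096];
    srand(42);
    for (int i = 0; i < 1000; i++) {
        off_t off = ((off_t)rand() % (1 << 20)) * 4096;  /* within first 4GB */
        if (pread(fd, buf, sizeof(buf), off) < 0)
            perror("pread");
    }

    close(fd);
    return 0;
}
```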

Two-level Prefetching method for I/O bandwidth enhancement in Parallel File System (병렬파일 시스템에서 I/O 대역폭 개선을 위한 이단 선반입 기법)

  • HwangBo, Jun-Hyung; Cho, Jong-Hyun; Lee, Yoon-Young; Seo, Dae-Wha
    • Proceedings of the Korea Information Processing Society Conference, 2000.10a, pp.657-660, 2000
  • Parallel file systems provide parallel I/O to mitigate the performance degradation caused by slow disk I/O. Prefetching, which overlaps computation with disk I/O, can further reduce this degradation. However, in I/O-intensive programs, prefetching can exceed the I/O bandwidth the system provides; in the worst case, the conventional prefetching method is not only suboptimal for improving performance but can itself become an overhead. Considering this situation, this paper proposes a two-level prefetching method for improving I/O bandwidth and shows the resulting performance improvement.
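
The paper's two-level scheme is not reproduced here, but the basic prefetching idea it builds on — overlapping computation on block i with the read of block i+1 — can be sketched with POSIX asynchronous I/O. The file name, block size, and block count are illustrative assumptions.

```c
/* Minimal sketch of prefetching: read block i+1 asynchronously while the
 * CPU computes on block i. File name and block size are assumptions.
 * Build with: gcc -o prefetch prefetch.c -lrt
 */
#include <aio.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLOCK (1 << 20)   /* 1MB blocks (illustrative) */
#define NBLOCKS 8

static long compute(const char *buf, ssize_t n)
{
    long sum = 0;                         /* stand-in for real computation */
    for (ssize_t i = 0; i < n; i++)
        sum += buf[i];
    return sum;
}

int main(void)
{
    int fd = open("input.dat", O_RDONLY); /* hypothetical input file */
    if (fd < 0) { perror("open"); return 1; }

    char *cur = malloc(BLOCK), *next = malloc(BLOCK);
    long total = 0;

    /* Read the first block synchronously, then keep one prefetch in flight. */
    ssize_t got = pread(fd, cur, BLOCK, 0);

    for (int i = 0; i < NBLOCKS && got > 0; i++) {
        struct aiocb cb;
        memset(&cb, 0, sizeof(cb));
        cb.aio_fildes = fd;
        cb.aio_buf    = next;
        cb.aio_nbytes = BLOCK;
        cb.aio_offset = (off_t)(i + 1) * BLOCK;
        aio_read(&cb);                    /* prefetch the next block */

        total += compute(cur, got);       /* overlap: work on current block */

        const struct aiocb *list[1] = { &cb };
        aio_suspend(list, 1, NULL);       /* wait for the prefetch */
        got = aio_return(&cb);

        char *tmp = cur; cur = next; next = tmp;   /* swap buffers */
    }

    printf("checksum = %ld\n", total);
    free(cur); free(next); close(fd);
    return 0;
}
```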

An Analysis of the Performance of Collective I/Os and the Subgroup Method (집합 I/O와 부분군 기법의 성능 분석)

  • Cha, Kwangho; Cho, Hyeyoung; Kim, Sungho
    • Proceedings of the Korea Contents Association Conference, 2007.11a, pp.513-516, 2007
  • Because many scientific applications require large-scale data processing, the importance of parallel I/O is increasingly recognized. Collective I/O is one of the notable features of parallel I/O and enables application programmers to handle large data volumes easily. In this paper, we measure and analyze the performance of the original collective I/Os and of the subgroup method, a way of using MPI collective I/O effectively. From the experimental results, we found that the two kinds of subgroup method showed different performance: for collective write operations the subgroup method caused performance degradation, whereas for collective reads it showed good performance with small data sizes.
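
The subgroup method referred to here splits the ranks into smaller communicators before the collective I/O call is issued. A minimal sketch of that pattern follows; the subgroup size, file naming, and the choice of one file per subgroup are illustrative assumptions rather than the paper's configuration.

```c
/* Minimal sketch: collective write where ranks are split into subgroups and
 * each subgroup issues its own collective I/O. Group size and file names
 * are illustrative assumptions.
 */
#include <mpi.h>
#include <stdio.h>

#define COUNT 1024          /* ints per rank (assumption) */
#define GROUP_SIZE 4        /* ranks per subgroup (assumption) */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Split the world communicator into subgroups of GROUP_SIZE ranks. */
    MPI_Comm sub;
    MPI_Comm_split(MPI_COMM_WORLD, rank / GROUP_SIZE, rank, &sub);

    int subrank;
    MPI_Comm_rank(sub, &subrank);

    int buf[COUNT];
    for (int i = 0; i < COUNT; i++)
        buf[i] = rank;

    /* Each subgroup performs its own collective write to its own file. */
    char fname[64];
    snprintf(fname, sizeof(fname), "out_group%d.dat", rank / GROUP_SIZE);

    MPI_File fh;
    MPI_File_open(sub, fname, MPI_MODE_CREATE | MPI_MODE_WRONLY,
                  MPI_INFO_NULL, &fh);
    MPI_Offset offset = (MPI_Offset)subrank * COUNT * sizeof(int);
    MPI_File_write_at_all(fh, offset, buf, COUNT, MPI_INT, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    MPI_Comm_free(&sub);
    MPI_Finalize();
    return 0;
}
```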

A Study for High Performance of Intelligent I/O Architecture of RAID System (지능형 I/O구조를 갖는 RAID 시스템의 성능 향상을 위한 연구)

  • Choi, Gwi-Yeol; Park, Kye-Won
    • Journal of the Korea Institute of Information and Communication Engineering, v.10 no.11, pp.1989-1995, 2006
  • RAID (Redundant Arrays of Inexpensive Disks) was proposed as a way to use parallelism across multiple disks to improve aggregate I/O performance. The emerging intelligent I/O architecture provides a standard for high-performance I/O subsystems and introduces intelligence at the hardware level. With an embedded processor, intelligent I/O adapters can offload the major I/O processing workload from the CPU and, at the same time, increase I/O performance. This paper addresses the essential issues in the design of disk scheduling for intelligent I/O devices. We compare throughput (MB/s) and maximum I/O response time in RAID.
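
As a generic illustration of what disk scheduling means at this level (not the scheduler designed in the paper), the classic elevator (SCAN) policy reorders queued requests so the head sweeps in one direction before returning:

```c
/* Generic illustration of elevator (SCAN) request ordering, not the paper's
 * scheduler: pending requests are served in ascending block order from the
 * current head position, then the remainder on the return sweep.
 */
#include <stdio.h>
#include <stdlib.h>

static int cmp(const void *a, const void *b)
{
    return (*(const long *)a > *(const long *)b) -
           (*(const long *)a < *(const long *)b);
}

static void scan_order(long *req, int n, long head)
{
    qsort(req, n, sizeof(long), cmp);

    /* Serve requests at or above the head first (upward sweep)... */
    for (int i = 0; i < n; i++)
        if (req[i] >= head)
            printf("dispatch block %ld\n", req[i]);

    /* ...then the remaining requests on the way back down. */
    for (int i = n - 1; i >= 0; i--)
        if (req[i] < head)
            printf("dispatch block %ld\n", req[i]);
}

int main(void)
{
    long pending[] = { 98, 183, 37, 122, 14, 124, 65, 67 };  /* example queue */
    scan_order(pending, 8, 53);   /* head currently at block 53 */
    return 0;
}
```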

Prefetch R-tree: A Disk and Cache Optimized Multidimensional Index Structure (Prefetch R-tree: 디스크와 CPU 캐시에 최적화된 다차원 색인 구조)

  • Park, Myung-Sun
    • The KIPS Transactions: Part D, v.13D no.4 s.107, pp.463-476, 2006
  • R-trees have traditionally been optimized for I/O performance, with the disk page as the tree node. Recently, researchers have proposed cache-conscious variants of R-trees optimized for CPU cache performance in main memory environments, where the node size is several cache lines wide and more entries are packed into a node by compressing MBR keys. However, because there is a big difference between the node sizes of the two types of R-trees, disk-optimized R-trees show poor cache performance while cache-optimized R-trees exhibit poor disk performance. In this paper, we propose a cache- and disk-optimized R-tree, called the PR-tree (Prefetching R-tree). For cache performance, the node size of the PR-tree is wider than a cache line, and the prefetch instruction is used to reduce the number of cache misses. For I/O performance, the nodes of the PR-tree are fitted into one disk page. We present a detailed analysis of cache misses for range queries, and enumerate all the reasonable in-page leaf and nonleaf node sizes and heights of in-page trees to determine the tree parameters for the best cache and I/O performance. The PR-tree achieves better cache performance than the disk-optimized R-tree: a factor of 3.5-15.1 improvement for one-by-one insertions, 6.5-15.1 for deletions, 1.3-1.9 for range queries, and 2.7-9.7 for k-nearest neighbor queries. None of the experimental results show a notable decline in I/O performance.
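
The cache-side optimization relies on issuing software prefetches for a node that spans several cache lines before its entries are scanned. A minimal sketch of that idea using GCC's __builtin_prefetch follows; the node layout is a simplification for illustration, not the PR-tree's actual format.

```c
/* Minimal sketch: issue software prefetches for every cache line of a
 * multi-cache-line tree node before scanning its entries. The node layout
 * is an illustrative simplification.
 */
#include <stdio.h>

#define CACHE_LINE 64
#define MAX_ENTRIES 28

struct entry { float xmin, ymin, xmax, ymax; int child; };

struct node {
    int nentries;
    struct entry entries[MAX_ENTRIES];    /* node spans several cache lines */
};

/* Count entries whose MBR overlaps the query rectangle. */
static int scan_node(const struct node *n,
                     float qxmin, float qymin, float qxmax, float qymax)
{
    /* Prefetch all cache lines of the node up front so the misses overlap
     * instead of being paid one line at a time during the scan. */
    const char *p = (const char *)n;
    for (size_t off = 0; off < sizeof(*n); off += CACHE_LINE)
        __builtin_prefetch(p + off, 0 /* read */, 3 /* high locality */);

    int hits = 0;
    for (int i = 0; i < n->nentries; i++) {
        const struct entry *e = &n->entries[i];
        if (e->xmin <= qxmax && e->xmax >= qxmin &&
            e->ymin <= qymax && e->ymax >= qymin)
            hits++;
    }
    return hits;
}

int main(void)
{
    struct node n = { .nentries = 2,
                      .entries = { { 0, 0, 1, 1, -1 }, { 2, 2, 3, 3, -1 } } };
    printf("%d overlapping entries\n", scan_node(&n, 0.5f, 0.5f, 2.5f, 2.5f));
    return 0;
}
```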

Para-virtualized Library for Bare-metal Network Performance in Virtualized Environment (가상화 환경의 고성능 I/O를 위한 반가상화 라이브러리)

  • Lee, Dongwoo; Cho, Youngjoong; Eom, Young Ik
    • Journal of KIISE, v.41 no.9, pp.605-610, 2014
  • Virtualization is no longer an emerging research area, and its applications can easily be found around us. Nevertheless, I/O workloads are rarely moved into virtual environments, since they still suffer from unacceptable performance degradation caused by virtualization latency. Many previous papers have identified that virtual I/O overhead is mainly caused by VM exits and the redundant I/O stack, and have proposed several techniques to reduce them. However, these techniques still have limitations. In this paper, we introduce a novel I/O virtualization framework that improves I/O performance by exploiting the multicore architecture. We applied our framework to the virtual network, and it improves TCP throughput by up to 169% and decreases UDP latency by up to 38% on a network with a 10Gbps NIC.

Dynamic Bandwidth Distribution Method for High Performance Non-volatile Memory in Cloud Computing Environment (클라우드 환경에서 고성능 저장장치를 위한 동적 대역폭 분배 기법)

  • Kwon, Piljin; Ahn, Sungyong
    • The Journal of the Institute of Internet, Broadcasting and Communication, v.20 no.3, pp.97-103, 2020
  • Linux cgroups play a fundamental role in sharing system resources among multiple containers in container-based cloud computing environments. For the I/O resource in particular, Linux cgroups provide a mechanism for sharing I/O bandwidth in proportion to an I/O weight. However, the current mechanism, which relies on the BFQ I/O scheduler, seriously degrades I/O performance on high-bandwidth storage devices such as NVMe SSDs. In this paper, we propose a new feedback-based I/O bandwidth sharing scheme for Linux cgroups that allocates I/O credits to containers according to their I/O weights and adjusts the amount of credits to the performance fluctuations of the NVMe SSD. The proposed scheme is implemented on Linux kernel 5.3 and evaluated. The evaluation results show that it can share I/O bandwidth among multiple containers in proportion to their I/O weights while achieving more than twice the I/O performance of the existing scheme.
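
The proportional sharing that the scheme builds on is configured through the cgroup-v2 io.weight interface, which assigns each container's cgroup an I/O weight between 1 and 10000. A minimal sketch of setting such weights from C follows; the cgroup paths and weight values are assumptions, and the io controller must already be enabled on the parent cgroup.

```c
/* Minimal sketch: assign an I/O weight to a container's cgroup (cgroup v2).
 * Cgroup paths and weight values are illustrative assumptions; requires the
 * io controller to be enabled on the parent cgroup and root privileges.
 */
#include <stdio.h>

static int set_io_weight(const char *cgroup, int weight)
{
    char path[256];
    snprintf(path, sizeof(path), "/sys/fs/cgroup/%s/io.weight", cgroup);

    FILE *f = fopen(path, "w");
    if (!f) { perror(path); return -1; }

    /* "default" applies to all devices; the valid weight range is 1-10000. */
    fprintf(f, "default %d\n", weight);
    fclose(f);
    return 0;
}

int main(void)
{
    /* Give "container_a" twice the I/O share of "container_b". */
    set_io_weight("container_a", 200);
    set_io_weight("container_b", 100);
    return 0;
}
```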

A Study of HDD Performance Improvement through Filter Driver & NAND FLASH Memory (Filter Driver 와 NAND FLASH Memory를 이용한 HDD 장치의 성능 개선에 관한 연구)

  • Kim, Woo-Gil; Kim, Young-Kil
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2010.10a, pp.58-61, 2010
  • In this paper, we investigate a method for improving HDD I/O performance using a filter driver and NAND flash memory. We analyze the effect of the device driver and NAND flash memory operation and propose a method for improving HDD I/O performance.
