• Title/Summary/Keyword: I/O Performance

Performance Analysis of NVMe SSDs and Design of Direct Access Engine on Virtualized Environment (가상화 환경에서 NVMe SSD 성능 분석 및 직접 접근 엔진 개발)

  • Kim, Sewoog;Choi, Jongmoo
    • KIISE Transactions on Computing Practices / v.24 no.3 / pp.129-137 / 2018
  • An NVMe (Non-Volatile Memory Express) SSD (Solid State Drive) is a high-performance storage device that uses flash memory as the storage medium, PCIe as the interface, and NVMe as the protocol on that interface. It supports multiple I/O queues, which makes it feasible to process I/Os in parallel in multi-core environments and to provide higher bandwidth than SATA SSDs. Hence, the NVMe SSD is considered a next-generation storage device for data centers and cloud computing systems. In a virtualized system, however, the performance of an NVMe SSD is not fully utilized because of the bottleneck in the software I/O stack. In particular, when the I/O stack of a hypervisor or host operating system such as Xen or KVM is used, I/O performance degrades seriously because the I/O stack is duplicated between the host and the virtual machine. In this paper, we propose a new I/O engine, called the Direct-AIO (Direct-Asynchronous I/O) engine, that accesses the NVMe SSD directly to improve I/O performance on the QEMU emulator. We implement the proposed I/O engine and analyze the I/O performance differences between the existing I/O engine and the Direct-AIO engine.

A Study for High Performance of Intelligent I/O Architecture of RAID System (지능형 I/O구조를 갖는 RAID 시스템의 성능 향상을 위한 연구)

  • Choi, Gwi-Yeol;Park, Kye-Won
    • Journal of the Korea Institute of Information and Communication Engineering / v.10 no.11 / pp.1989-1995 / 2006
  • RAID (Redundant Arrays of Inexpensive Disks) was proposed as a way to use parallelism between multiple disks to improve aggregate I/O performance. The emergence of the intelligent I/O architecture provides a standard for high-performance I/O subsystems and introduces intelligence at the hardware level. With an embedded processor, intelligent I/O adaptors can offload the major I/O processing workload from the CPU and, at the same time, increase I/O performance. This paper addresses the essential issue in the design of disk scheduling for intelligent I/O devices. We compare throughput in MB per second and maximum I/O response time in RAID.

Performance Analysis of Flash File System for the Efficient I/O on Smart Device (스마트 기기의 효율적인 I/O를 위한 플래시 파일 시스템 성능 분석)

  • Chung, Kyung-Ho;Kim, Yong-Hwan;Kim, Sang-Jin;Jung, Young-Seok;Kim, Sung-Soo
    • IEMEK Journal of Embedded Systems and Applications / v.10 no.3 / pp.171-178 / 2015
  • Recently, NAND flash memory has been found to be the primary cause of low performance in smart devices. In NAND flash memory, the execution times of the I/O operations that a flash file system requires differ from one another. Therefore, it is necessary to compare and analyze the I/O performance of flash file systems to achieve efficient I/O on smart devices. In this paper, we test and analyze the I/O performance of YAFFS2, JFFS2, and UBIFS. The experimental results show that read I/O performance is generally good, but write I/O performance is not. UBIFS showed better I/O performance than the other flash file systems.

Placement and Performance Analysis of I/O Resources for Torus Multicomputer (토러스 다중컴퓨터를 위한 입출력 자원의 배치와 성능 분석)

  • 안중석
    • Journal of the Korea Society for Simulation / v.6 no.2 / pp.89-104 / 1997
  • The performance bottleneck of parallel computer systems has mostly been the I/O devices, because of the disparity between processor speed and I/O speed. Therefore, an I/O node placement strategy is required that minimizes the number of I/O nodes, the I/O access time, and the I/O traffic in the interconnection network. In this paper, we propose an optimal distance-k embedding algorithm and analyze its effect on system performance when it is applied to an n × n torus architecture. Using software simulation, we show that this algorithm yields an efficient I/O node placement. The I/O node placement produced by the proposed algorithm shows the highest performance among the I/O node placements compared in all cases. This is because the locations of the I/O nodes are uniformly distributed over the whole network, resulting in reduced traffic in the interconnection network.

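The effect of spreading I/O nodes uniformly over a torus can be illustrated with a small sketch. This is not the paper's distance-k embedding algorithm, which the abstract does not describe; it only measures, for a given set of I/O nodes on an n × n torus, the largest hop distance from any node to its nearest I/O node. The function names and the example placement rule `(i + 2*j) % n == 0` are assumptions chosen for illustration.

```python
from itertools import product


def torus_distance(a, b, n):
    """Hop count between nodes a and b on an n x n torus (wraparound links)."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    return min(dx, n - dx) + min(dy, n - dy)


def coverage_radius(io_nodes, n):
    """Largest distance from any node to its nearest I/O node."""
    return max(
        min(torus_distance(node, io, n) for io in io_nodes)
        for node in product(range(n), repeat=2)
    )


# Hypothetical example: a uniformly spread placement on a 5 x 5 torus.
n = 5
io_nodes = [(i, j) for i, j in product(range(n), repeat=2) if (i + 2 * j) % n == 0]
print(len(io_nodes), coverage_radius(io_nodes, n))  # prints "5 1"
```

With these five I/O nodes, every node is at most one hop from an I/O node. Placing the same five I/O nodes in a single row, `[(0, j) for j in range(5)]`, raises the radius to 2, which is one way to see why a uniform distribution reduces traffic in the interconnection network.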

A Performance Analysis of I/O Scheduler for NAND Flash File System (NAND 플래시 파일시스템의 I/O 스케줄러 성능분석)

  • Lee, Yeongseok;Lee, Changhee;Chung, Kyungho;Kim, Yonghwan;Ahn, Kwangseon
    • Journal of Korea Society of Industrial Information Systems / v.18 no.2 / pp.27-34 / 2013
  • NAND flash memory has been used in many devices because of its low cost and high capacity, and the demand for high-capacity NAND flash memory has increased with the growth of multimedia features on mobile devices. The JFFS2, NILFS2, and YAFFS2 file systems are mainly used on NAND flash memory. In this paper, the sequential read/write performance of these three file systems is analyzed for four I/O schedulers: the CFQ (Completely Fair Queuing), NOOP (No Operation), Anticipatory, and Deadline I/O schedulers. For the JFFS2 file system, the Anticipatory I/O scheduler performs best, reducing write time by 8% and read time by 1.5% compared with the other I/O schedulers. For the YAFFS2 file system, read and write performance is similar across the four I/O schedulers. For the NILFS2 file system, the NOOP I/O scheduler is 2% faster in writing and the Deadline I/O scheduler is 6% faster in reading than the other I/O schedulers.
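
The abstract does not say how the schedulers were switched between runs. On Linux, the block-layer scheduler of a device is commonly exposed as a sysfs text file, `/sys/block/<device>/queue/scheduler`, in which the active scheduler appears in square brackets. The sketch below parses that format; the function names are hypothetical, and whether the paper's test setup used this interface is an assumption.

```python
from pathlib import Path


def parse_schedulers(text):
    """Return (active, available) from sysfs text such as 'noop deadline [cfq]'."""
    available = []
    active = None
    for token in text.split():
        # The active scheduler is the one wrapped in square brackets.
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return active, available


def read_schedulers(device):
    """Read the scheduler list for a block device name such as 'sda'."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_schedulers(path.read_text())


print(parse_schedulers("noop anticipatory deadline [cfq]"))
# prints ('cfq', ['noop', 'anticipatory', 'deadline', 'cfq'])
```

A benchmark harness could record the result of `read_schedulers` before each run so that every measurement is labeled with the scheduler that was actually active.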

Performance Evaluation of Disk I/O for Web Proxy Servers (웹 프락시 서버의 디스크 I/O 성능 평가)

  • Shim Jong-Ik
    • The KIPS Transactions:PartC / v.12C no.4 s.100 / pp.603-608 / 2005
  • Disk I/O is a major performance bottleneck of web proxy servers. Most of today's web proxy servers are designed to run on top of a general-purpose file system. However, a general-purpose file system cannot efficiently handle the web cache workload, which consists of small files, and this degrades the performance of the entire web proxy server. In this paper, we evaluate the potential of raw disk to reduce the disk I/O overhead of web proxy servers. To show this potential, we design a storage management system called the Block-structured Storage Management System (BSMS), and we implement a web proxy server that incorporates BSMS into Squid. Comprehensive experimental evaluations show that raw disk can significantly improve disk I/O performance for web proxy servers.

An Analysis of the Performance of Collective I/Os and the Subgroup Method (집합 I/O와 부분군 기법의 성능 분석)

  • Cha, Kwangho;Cho, Hyeyoung;Kim, Sungho
    • Proceedings of the Korea Contents Association Conference / 2007.11a / pp.513-516 / 2007
  • Because many scientific applications require large-scale data processing, the importance of parallel I/O has been increasingly recognized. Collective I/O is one of the notable features of parallel I/O and enables application programmers to handle large data volumes easily. In this paper, we measure and analyze the performance of the original collective I/O and of the subgroup method, a way of using the collective I/O of MPI effectively. The experimental results show that the two kinds of subgroup method perform differently. For collective write operations, the subgroup method degraded performance. For collective read operations, however, the subgroup method performed well with small data sizes.

Design and Implementation of I/O Performance Benchmarking Framework for Linux Container

  • Oh, Gijun;Son, Suho;Yang, Junseok;Ahn, Sungyong
    • International Journal of Internet, Broadcasting and Communication / v.13 no.1 / pp.180-186 / 2021
  • In cloud computing services, it is important to share system resources among multiple instances according to user requirements. In particular, efficiently distributing I/O resources across multiple instances has drawn attention with the rise of data-centric technologies such as big data and deep learning. However, it is difficult to evaluate the I/O resource distribution of Linux containers, one of the core technologies of cloud computing, because conventional I/O benchmarks do not support features related to container management. In this paper, we propose a new I/O performance benchmarking framework that can easily evaluate the resource distribution of Linux containers with existing I/O benchmarks by supporting container-related features and an integrated user interface. According to a performance evaluation with the trace-replay benchmark, the proposed framework induces negligible performance overhead while making it convenient to evaluate the I/O performance of multiple Linux containers.

A Study of HDD Performance Improvement through Filter Driver & NAND FLASH Memory (Filter Driver 와 NAND FLASH Memory를 이용한 HDD 장치의 성능 개선에 관한 연구)

  • Kim, Woo-Gil;Kim, Young-Kil
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2010.10a / pp.58-61 / 2010
  • In this paper, we study a method for improving HDD I/O performance using a filter driver and NAND flash memory. We analyze the effect of the operation of the device driver and NAND flash memory, and propose a method for improving HDD I/O performance.

Prefetch R-tree: A Disk and Cache Optimized Multidimensional Index Structure (Prefetch R-tree: 디스크와 CPU 캐시에 최적화된 다차원 색인 구조)

  • Park Myung-Sun
    • The KIPS Transactions:PartD / v.13D no.4 s.107 / pp.463-476 / 2006
  • R-trees have traditionally been optimized for I/O performance, with the disk page as the tree node. Recently, researchers have proposed cache-conscious variations of R-trees optimized for CPU cache performance in main-memory environments, where the node size is several cache lines wide and more entries are packed into a node by compressing MBR keys. However, because the node sizes of the two types of R-trees differ greatly, disk-optimized R-trees show poor cache performance while cache-optimized R-trees exhibit poor disk performance. In this paper, we propose a cache- and disk-optimized R-tree, called the PR-tree (Prefetching R-tree). For cache performance, the node size of the PR-tree is wider than a cache line, and the prefetch instruction is used to reduce the number of cache misses. For I/O performance, the nodes of the PR-tree are fitted into one disk page. We present a detailed analysis of cache misses for range queries, and enumerate all the reasonable in-page leaf and nonleaf node sizes and heights of in-page trees to determine the tree parameters that give the best cache and I/O performance. The proposed PR-tree achieves better cache performance than the disk-optimized R-tree: an improvement by a factor of 3.5-15.1 for one-by-one insertions, 6.5-15.1 for deletions, 1.3-1.9 for range queries, and 2.7-9.7 for k-nearest neighbor queries. None of the experimental results shows a notable decline in I/O performance.