• Title/Summary/Keyword: multi-queue SSD


Multi-core Scalable Fair I/O Scheduling for Multi-queue SSDs (멀티큐 SSD를 위해 멀티코어 확장성을 제공하는 공정한 입출력 스케줄링)

  • Cho, Minjung; Kang, Hyeongseok; Kim, Kanghee
    • Journal of KIISE, v.44 no.5, pp.469-475, 2017
  • Emerging NVMe-based multi-queue SSDs provide high bandwidth through parallel I/O: each core performs I/O through its own dedicated queue, in parallel with the other cores. To guarantee each application performing I/O its share of the bandwidth, a fair-share scheduler that allocates a bandwidth share to each core is required. In this study, we propose a multi-core scalable fair-queuing algorithm for multi-queue SSDs. The algorithm adopts randomization to minimize inter-core synchronization overhead and provides each core a weight-proportional bandwidth share. Our experiments on a Linux kernel with block-mq indicate that the proposed algorithm achieves accurate bandwidth partitioning and outperforms the existing FlashFQ scheduler regardless of the number of cores.
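
The abstract only sketches the idea; as a rough illustration of weight-proportional, randomization-based dispatch across per-core queues (not the paper's actual algorithm), the C sketch below picks the next core to serve by a lottery draw over per-core weights, so the dispatcher needs no shared virtual-time state that cores would have to synchronize on. The core count, weights, and request count are made-up example values.

```c
/* Illustrative sketch only: lottery-style, weight-proportional selection of
 * which per-core queue to serve next. Weights and counts are example values. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define NCORES 4

static const int weight[NCORES] = { 1, 2, 3, 4 };  /* per-core share weights     */
static long dispatched[NCORES];                    /* requests issued per core   */

/* Pick a core with probability proportional to its weight. */
static int pick_core(int total_weight)
{
    int ticket = rand() % total_weight;
    for (int c = 0; c < NCORES; c++) {
        if (ticket < weight[c])
            return c;
        ticket -= weight[c];
    }
    return NCORES - 1;  /* not reached when weights sum to total_weight */
}

int main(void)
{
    int total_weight = 0;
    for (int c = 0; c < NCORES; c++)
        total_weight += weight[c];

    srand((unsigned)time(NULL));

    /* Simulate dispatching 1,000,000 requests from the per-core queues. */
    for (long i = 0; i < 1000000; i++)
        dispatched[pick_core(total_weight)]++;

    for (int c = 0; c < NCORES; c++)
        printf("core %d (weight %d): %ld requests (%.1f%%)\n",
               c, weight[c], dispatched[c],
               100.0 * dispatched[c] / 1000000);
    return 0;
}
```

Over many requests, each core's expected share of dispatches is proportional to its weight, without any cross-core locking on shared scheduler state.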

Improving Performance of I/O Virtualization Framework based on Multi-queue SSD (다중 큐 SSD 기반 I/O 가상화 프레임워크의 성능 향상 기법)

  • Kim, Tae Yong; Kang, Dong Hyun; Eom, Young Ik
    • Journal of KIISE, v.43 no.1, pp.27-33, 2016
  • Virtualization has become one of the most useful techniques in computing systems and is now prevalent in environments ranging from desktops to data centers and enterprises. However, because the I/O layers are implemented to be oblivious to the I/O behavior of virtual machines (VMs), virtualized systems still suffer from an I/O scalability issue. In particular, when a multi-queue solid-state drive (SSD) is used as secondary storage, a semantic gap arises that degrades the overall performance of the VM. This is due to two key problems: increased lock contention and limited I/O parallelism. In this paper, we propose a novel approach, comprising virtual CPU (vCPU)-dedicated queues and I/O threads, that efficiently distributes the lock contention and addresses the parallelism issue of Virtio-blk-data-plane in virtualized environments. Our approach allocates a dedicated queue and an I/O thread to each vCPU to reduce the semantic gap. Our experimental results with various I/O traces clearly show that our design improves the I/O operations per second (IOPS) of virtualized environments by up to 155% over existing QEMU-based systems.
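
As a rough illustration of the vCPU-dedicated queue and I/O thread idea described above (a toy user-space model, not the paper's QEMU/Virtio-blk-data-plane implementation), the C sketch below pairs each simulated vCPU with its own request queue served by a dedicated I/O thread, so submissions from different vCPUs contend only on per-queue locks rather than one shared lock. vCPU and request counts are example values.

```c
/* Illustrative sketch only: one request queue and one I/O thread per vCPU. */
#include <pthread.h>
#include <stdio.h>

#define NVCPUS 4
#define NREQS  8

struct vq {
    pthread_mutex_t lock;  /* protects this vCPU's queue only           */
    pthread_cond_t  cond;  /* wakes the dedicated I/O thread            */
    int pending;           /* submitted but not yet served requests     */
    int done;              /* set once the vCPU has finished submitting */
};

static struct vq queues[NVCPUS];

/* Dedicated I/O thread: drains only its own vCPU's queue. */
static void *io_thread(void *arg)
{
    struct vq *q = arg;
    int served = 0;

    pthread_mutex_lock(&q->lock);
    while (!q->done || q->pending > 0) {
        while (q->pending == 0 && !q->done)
            pthread_cond_wait(&q->cond, &q->lock);
        served += q->pending;  /* "complete" whatever is pending */
        q->pending = 0;
    }
    pthread_mutex_unlock(&q->lock);
    printf("I/O thread for vCPU %ld served %d requests\n",
           (long)(q - queues), served);
    return NULL;
}

/* vCPU submission path: enqueue into the vCPU's dedicated queue. */
static void *vcpu_thread(void *arg)
{
    struct vq *q = arg;
    for (int i = 0; i < NREQS; i++) {
        pthread_mutex_lock(&q->lock);
        q->pending++;
        pthread_cond_signal(&q->cond);
        pthread_mutex_unlock(&q->lock);
    }
    pthread_mutex_lock(&q->lock);
    q->done = 1;
    pthread_cond_signal(&q->cond);
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

int main(void)
{
    pthread_t io[NVCPUS], vcpu[NVCPUS];

    for (int i = 0; i < NVCPUS; i++) {
        pthread_mutex_init(&queues[i].lock, NULL);
        pthread_cond_init(&queues[i].cond, NULL);
        pthread_create(&io[i], NULL, io_thread, &queues[i]);
        pthread_create(&vcpu[i], NULL, vcpu_thread, &queues[i]);
    }
    for (int i = 0; i < NVCPUS; i++) {
        pthread_join(vcpu[i], NULL);
        pthread_join(io[i], NULL);
    }
    return 0;
}
```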

Optimizing Fsync Performance with Dynamic Queue Depth Adaptation

  • Park, Daejun; Kim, Min Ji; Shin, Dongkun
    • JSTS: Journal of Semiconductor Technology and Science, v.15 no.5, pp.570-576, 2015
  • Existing flash storage devices such as universal flash storage (UFS) and solid-state drives support command queuing to improve storage I/O bandwidth. Command queuing allows multiple read/write requests to be pending in a device queue. Because the multi-channel, multi-way architecture of flash storage devices can handle multiple requests simultaneously, command queuing is indispensable for exploiting this parallelism. However, command queuing can hurt the latency of the fsync system call, which is critical to application responsiveness. We propose a dynamic queue depth adaptation technique, which reduces the queue depth when the user application is expected to issue fsync calls. Experiments show that the proposed technique reduces fsync latency by 79% on average compared to the original scheme.
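
As a rough illustration of dynamic queue depth adaptation, the C sketch below lowers an effective queue-depth value as soon as an fsync is observed and restores it after a run of fsync-free requests. The thresholds, depth values, and detection heuristic are invented for illustration and are not the mechanism the paper implements in the storage stack.

```c
/* Illustrative sketch only: shrink the queue depth in fsync-heavy phases,
 * restore it otherwise. All numbers are made-up example values. */
#include <stdio.h>

#define DEPTH_MAX        32   /* deep queue: maximizes bandwidth            */
#define DEPTH_FSYNC       1   /* shallow queue: minimizes fsync latency     */
#define FSYNC_IDLE_LIMIT  8   /* fsync-free requests before restoring depth */

static int queue_depth = DEPTH_MAX;
static int reqs_since_fsync;

/* Call on every request; is_fsync marks a flush issued by the application. */
static void adapt_queue_depth(int is_fsync)
{
    if (is_fsync) {
        reqs_since_fsync = 0;
        queue_depth = DEPTH_FSYNC;   /* expect more fsyncs soon: keep queue shallow */
    } else if (++reqs_since_fsync >= FSYNC_IDLE_LIMIT) {
        queue_depth = DEPTH_MAX;     /* fsync-free phase: allow deep queuing again  */
    }
}

int main(void)
{
    /* Toy trace: a burst of writes, one fsync, then more writes. */
    int trace[] = { 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0 };

    for (unsigned i = 0; i < sizeof trace / sizeof trace[0]; i++) {
        adapt_queue_depth(trace[i]);
        printf("req %2u (fsync=%d) -> queue depth %d\n", i, trace[i], queue_depth);
    }
    return 0;
}
```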