• Title/Summary/Keyword: page table overhead

Search Result 12, Processing Time 0.022 seconds

Mapping Cache for High-Performance Memory Mapped File I/O in Memory File Systems (메모리 파일 시스템 기반 고성능 메모리 맵 파일 입출력을 위한 매핑 캐시)

  • Kim, Jiwon;Choi, Jungsik;Han, Hwansoo
    • Journal of KIISE
    • /
    • v.43 no.5
    • /
    • pp.524-530
    • /
    • 2016
  • The desire to access data faster and the growth of next-generation memories such as non-volatile memories, contribute to the development of research on memory file systems. It is recommended that memory mapped file I/O, which has less overhead than read-write I/O, is utilized in a high-performance memory file system. Memory mapped file I/O, however, brings a page table overhead, which becomes one of the big overheads that needs to be resolved in the entire file I/O performance. We find that same overheads occur unnecessarily, because a page table of a file is removed whenever a file is opened after being closed. To remove the duplicated overhead, we propose the mapping cache, a technique that does not delete a page table of a file but saves the page table to be reused when the mapping of the file is released. We demonstrate that mapping cache improves the performance of traditional file I/O by 2.8x and web server performance by 12%.

Persistent Page Table and File System Journaling Scheme for NVM Storage (비휘발성 메모리 저장장치를 위한 영속적 페이지 테이블 및 파일시스템 저널링 기법)

  • Ahn, Jae-hyeong;Hyun, Choul-seung;Lee, Dong-hee
    • Journal of IKEEE
    • /
    • v.23 no.1
    • /
    • pp.80-90
    • /
    • 2019
  • Even though Non-Volatile Memory (NVM) is used for data storage, a page table should be built to access data in it. And this observation leads us to the Persistent Page Table (PPT) scheme that keeps the page table in NVM persistently. By the way, processors have different page table structures and really operational page table cannot be built without virtual and physical addresses of NVM. However, those addresses are determined dynamically when NVM storage is attached to the system. Thus, the PPT should have system-independent and also address-independent structure and really working system-dependent page table should be built from the PPT. Moreover, entries of PPT should be updated atomically and, in this paper, we describe the design of PPT that meets those requirements. And we investigate how file systems can decrease the journaling overhead with the swap operation, which is a new operation created by the PPT. We modified the Ext4 file system in Linux and experiments conducted with Filebench workloads show that the swap operation enhances file system performance up to 60%.

STP-FTL: An Efficient Caching Structure for Demand-based Flash Translation Layer

  • Choi, Hwan-Pil;Kim, Yong-Seok
    • Journal of the Korea Society of Computer and Information
    • /
    • v.22 no.7
    • /
    • pp.1-7
    • /
    • 2017
  • As the capacity of NAND flash module increases, the amount of RAM increases for caching and maintaining the FTL mapping information. In order to reduce the amount of mapping information managed in the RAM, a demand-based address mapping method stores the entire mapping information in the flash and some valid mapping information in the form of cache in the RAM so that the RAM can be used efficiently. However, when cache miss occurs, it is necessary to read the mapping information recorded in the flash, so overhead occurs to translate the address. If the RAM space is not enough, the cache hit ratio decreases, resulting in greater overhead. In this paper, we propose a method using two tables called TPMT(Translation Page Mapping Table) and SMT(Segmented Translation Page Mapping Table) to utilize both temporal locality and spatial locality more efficiently. A performance evaluation shows that this method can improve the cache hit ratio by up to 30% and reduces the extra translation operations by up to 72%, compared to the TPM scheme.

A Multi-Level Flash Translation Layer for Large Capacity Solid State Drives

  • Kim, Yong-Seok
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.2
    • /
    • pp.11-18
    • /
    • 2021
  • The flash translation layer(FTL) of SSD maps the logical page number requested from the host to the actual recorded flash memory page number. It is very important to reduce the amount of RAM used to manage the mapping information. In the existing demand-based FTLs, two-level method is applied in which mapping information is also recorded in flash memory pages and only their addresses are managed as a table in RAM. As the capacities of SSDs are growing to tens of terabytes, the amount of RAM for mapping table becomes too large. In this paper, ML-FTL was proposed as a method of managing mapping information in three levels to reduce the amount of RAM required drastically. From an evaluation, the increase in overhead was minimal compared to the conventional two-level method by properly utilizing cache.

Implementation of Memory Efficient Flash Translation Layer for Open-channel SSDs

  • Oh, Gijun;Ahn, Sungyong
    • International journal of advanced smart convergence
    • /
    • v.10 no.1
    • /
    • pp.142-150
    • /
    • 2021
  • Open-channel SSD is a new type of Solid-State Disk (SSD) that improves the garbage collection overhead and write amplification due to physical constraints of NAND flash memory by exposing the internal structure of the SSD to the host. However, the host-level Flash Translation Layer (FTL) provided for open-channel SSDs in the current Linux kernel consumes host memory excessively because it use page-level mapping table to translate logical address to physical address. Therefore, in this paper, we implemente a selective mapping table loading scheme that loads only a currently required part of the mapping table to the mapping table cache from SSD instead of entire mapping table. In addition, to increase the hit ratio of the mapping table cache, filesystem information and mapping table access history are utilized for cache replacement policy. The proposed scheme is implemented in the host-level FTL of the Linux kernel and evaluated using open-channel SSD emulator. According to the evaluation results, we can achieve 80% of I/O performance using the only 32% of memory usage compared to the previous host-level FTL.

An Efficient Address Mapping Table Management Scheme for NAND Flash Memory File System Exploiting Page Address Cache (페이지 주소 캐시를 활용한 NAND 플래시 메모리 파일시스템에서의 효율적 주소 변환 테이블 관리 정책)

  • Kim, Cheong-Ghil
    • Journal of Digital Contents Society
    • /
    • v.11 no.1
    • /
    • pp.91-97
    • /
    • 2010
  • Flash memory has been used by many digital devices for data storage, exploiting the advantages of non-volatility, low power, stability, and so on, with the help of high integrity, large capacity, and low price. As the fast growing popularity of flash memory, the density of it increases so significantly that its entire address mapping table becomes too big to be stored in SRAM. This paper proposes the associated page address cache with an efficient table management scheme for hybrid flash translation layer mapping. For this purpose, all tables are integrated into a map block containing entire physical page tables. Simulation results show that the proposed scheme can save the extra memory areas and decrease the searching time with less 2.5% of miss ratio on PC workload and can decrease the write overhead by performing write operation 33% out of total writes requested.

An Address Translation Technique Large NAND Flash Memory using Page Level Mapping (페이지 단위 매핑 기반 대용량 NAND플래시를 위한 주소변환기법)

  • Seo, Hyun-Min;Kwon, Oh-Hoon;Park, Jun-Seok;Koh, Kern
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.16 no.3
    • /
    • pp.371-375
    • /
    • 2010
  • SSD is a storage medium based on NAND Flash memory. Because of its short latency, low power consumption, and resistance to shock, it's not only used in PC but also in server computers. Most SSDs use FTL to overcome the erase-before-overwrite characteristic of NAND flash. There are several types of FTL, but page mapped FTL shows better performance than others. But its usefulness is limited because of its large memory footprint for the mapping table. For example, 64MB memory space is required only for the mapping table for a 64GB MLC SSD. In this paper, we propose a novel caching scheme for the mapping table. By using the mapping-table-meta-data we construct a fully associative cache, and translate the address within O(1) time. The simulation results show more than 80 hit ratio with 32KB cache and 90% with 512KB cache. The overall memory footprint was only 1.9% of 64MB. The time overhead of cache miss was measured lower than 2% for most workload.

Index Management Method using Page Mapping Log in B+-Tree based on NAND Flash Memory (NAND 플래시 메모리 기반 B+ 트리에서 페이지 매핑 로그를 이용한 색인 관리 기법)

  • Kim, Seon Hwan;Kwak, Jong Wook
    • Journal of the Korea Society of Computer and Information
    • /
    • v.20 no.5
    • /
    • pp.1-12
    • /
    • 2015
  • NAND flash memory has being used for storage systems widely, because it has good features which are low-price, low-power and fast access speed. However, NAND flash memory has an in-place update problem, and therefore it needs FTL(flash translation layer) to run for applications based on hard disk storage. The FTL includes complex functions, such as address mapping, garbage collection, wear leveling and so on. Futhermore, implementation of the FTL on low-power embedded systems is difficult due to its memory requirements and operation overhead. Accordingly, many index data structures for NAND flash memory have being studied for the embedded systems. Overall performances of the index data structures are enhanced by a decreasing of page write counts, whereas it has increased page read counts, as a side effect. Therefore, we propose an index management method using a page mapping log table in $B^+$-Tree based on NAND flash memory to decrease page write counts and not to increase page read counts. The page mapping log table registers page address information of changed index node and then it is exploited when retrieving records. In our experiment, the proposed method reduces the page read counts about 61% at maximum and the page write counts about 31% at maximum, compared to the related studies of index data structures.

Performance Comparison between Hardware & Software Cache Partitioning Techniques (하드웨어 캐시 파티셔닝과 소프트웨어 캐시 파티셔닝의 성능 비교)

  • Park, JiWoong;Yeom, HeonYoung;Eom, Hyeonsang
    • Journal of KIISE
    • /
    • v.42 no.2
    • /
    • pp.177-182
    • /
    • 2015
  • The era of multi-core processors has begun since the limit of the clock speed has been reached. These days, multi-core technology is used not only in desktops, servers, and table PCs, but also in smartphones. In this architecture, there is always interference between processes, because of the sharing of system resources. To address this problem, cache partitioning is used, which can be roughly divided into two types: software and hardware cache partitioning. When it comes to dynamic cache partitioning, hardware cache partitioning is superior to software cache partitioning, because it needs no page copy. In this paper, we compare the effectiveness of hardware and software cache partitioning on the AMD Opteron 6282 SE, which is the only commodity processor providing hardware cache partitioning, to see whether this technique can be effectively deployed in dynamic environments.

Request Distribution for Fairness with a Non-Periodic Load-Update Mechanism for Cyber Foraging Dynamic Applications in Web Server Cluster (웹 서버 클러스터에서 Cyber Foraging 응용을 위한 비주기적 부하 갱신을 통한 부하 분산 기법)

  • Lu, Xiaoyi;Fu, Zhen;Choi, Won-Il;Kang, Jung-Hun;Ok, Min-Hwan;Park, Myong-Soon
    • The KIPS Transactions:PartA
    • /
    • v.14A no.1 s.105
    • /
    • pp.63-72
    • /
    • 2007
  • This paper introduces a load-balancing algorithm focusing on distributing web requests evenly into the web cluster servers. The load-balancing algorithms based on conventional periodic load-information update mechanism are not suitable for dynamic page applications, which are common in Cyber Foraging services, due to the problems caused by periodic synchronized load-information updating and the difficulties of work load estimation caused by embedded executing scripts of dynamic pages. Update-on-Finish algorithm solves this problem by using non-periodic load-update mechanism, and the web switch knows the servers' real load information only after their reporting and then distributes new loads according to the new load-information table, however it results in much communication overhead. Our proposed mechanism improve update-on-finish algorithm by using K-Percents-Finish mechanism and thus largely reduce the communication overhead. Furthermore, we consider the different capabilities of servers with a threshold Ti value and propose a load-balancing algorithm for servers with various capabilities. Simulation results show that the proposed K-Percents-Finish Reporting mechanism can at least reduce 50% communication overhead than update-on-finish approach while sustaining better load balancing performance than periodic mechanisms in related work.