• Title/Summary/Keyword: page cache

Search Result 69, Processing Time 0.023 seconds

An Efficient Cache Management Scheme of Flash Translation Layer for Large Size Flash Memory Drives

  • Choi, Hwan-Pil;Kim, Yong-Seok
    • Journal of the Korea Society of Computer and Information
    • /
    • v.20 no.11
    • /
    • pp.31-38
    • /
    • 2015
  • Nowadays, large size flash memory drives with more than a couple of hundreds of gigabytes are common. This paper presents an efficient cache management scheme of flash translation layer, called TPC-FTL, for large size flash memory drives. Since flash drives of large size usually contain large size RAM, we can enhance the performance of page mapping cache by using more RAM for the cache. But if the size exceeds a threshold, the existing schemes are impractical for real devices, because the time for cache manipulation becomes too long. TPC-FTL manages the cache in translation page unit, not in logical page number unit used in existing schemes. Since a translation page covers a large number of logical page numbers (for example, 512 for 2KB size page), the number of cache elements can be reduced up to a practical level. A performance evaluation shows that average response time, an important performance measure, is better than existing schemes via the effect of utilizing spacial locality in addition to temporal locality.

A Study on an Efficient Solution to the Synonym Problem using Page Alignment (페이지 정렬을 이용한 효과적인 동의어 문제 해결 기법에 관한 연구)

  • 김제성;민상렬;전상훈;안병철;정덕균;김종상
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.33B no.2
    • /
    • pp.37-46
    • /
    • 1996
  • This paper proposes a cost-effective solution to the synonym problem of virtual caches. In the proposed solution, a minimal hardware addition guarantees the correctness whereas the software counterpart helps improve the performance. The key to this proposed solution is an addition of a small physically-indexed cache called U-cache. The U-cache maintains the reverse translation information of the cache blocks that belong to unaligned virtual pages only, where aligned measns that the lower bits of the virtual page number match those of the corresponding physical page number. The page alignment is a simple software optimization to improve the performance of the U-cche hardware. With the combination of both hardware and software, the proposed solution reduces the hardware costs and minimizes software modification and performance degradation. Performance evaluation base on ATUM traces shows that a U-cache, with only a few entries, performs almost as well as fully-configured hardware-based solution when more than 95% of the pages are aligned.

  • PDF

A Comparative Analysis on Page Caching Strategies Affecting Energy Consumption in the NAND Flash Translation Layer (NAND 플래시 변환 계층에서 전력 소모에 영향을 미치는 페이지 캐싱 전략의 비교·분석)

  • Lee, Hyung-Bong;Chung, Tae-Yun
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.13 no.3
    • /
    • pp.109-116
    • /
    • 2018
  • SSDs that are not allowed in-place update within the allocated page cause another allocation of a new page that will replace the previous page at the moment data modification occurs. This intrinsic characteristic of SSDs requires many changes to the existing HDD-based IO theory. In this paper, we conduct a performance comparison of FTL caching strategy in perspective of cache hashing (Global vs. grouped) and caching algorithm (LRU vs. NUR) through a simulation. Experimental results show that in terms of energy consumption for flash operation the grouped management of cache is not suitable and NUR algorithm is superior to LRU algorithm. In particular, we found that the cache hit ratio of LRU algorithm is about 10% point higher than that of NUR algorithm while the energy consumption of LRU algorithm is about 32% high.

An Efficient Encryption/Decryption Approach to Improve the Performance of Cryptographic File System in Embedded System (내장형 시스템에서 암호화 파일 시스템을 위한 효율적인 암복호화 기법)

  • Heo, Jun-Young;Park, Jae-Min;Cho, Yoo-Kun
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.35 no.2
    • /
    • pp.66-74
    • /
    • 2008
  • Since modem embedded systems need to access, manipulate or store sensitive information, it requires being equipped with cryptographic file systems. However, cryptographic file systems result in poor performance so that they have not been widely adapted to embedded systems. Most cryptographic file systems degrade the performance unnecessarily because of system architecture. This paper proposes ISEA (Indexed and Separated Encryption Approach) that supports for encryption/decryption in system architecture and removes redundant performance loss. ISEA carries out encryption and decryption at different layers according to page cache layer. Encryption is carried out at lower layer than page cache layer while decryption at upper layer. ISEA stores the decrypted data in page cache so that it can be reused in followed I/O request without decryption. ISEA provides page-indexing which divides page cache into cipher blocks and manages it by a block. It decrypts pages partially so that it can eliminate unnecessary decryption. In synthesized experiment of read/write with various cache hit rates, it gives results suggesting that ISEA has improved the performance of encryption file system efficiently.

Effect of ASLR on Memory Duplicate Ratio in Cache-based Virtual Machine Live Migration

  • Piao, Guangyong;Oh, Youngsup;Sung, Baegjae;Park, Chanik
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.9 no.4
    • /
    • pp.205-210
    • /
    • 2014
  • Cache based live migration method utilizes a cache, which is accessible to both side (remote and local), to reduce the virtual machine migration time, by transferring only irredundant data. However, address space layout randomization (ASLR) is proved to reduce the memory duplicate ratio between targeted migration memory and the migration cache. In this pager, we analyzed the behavior of ASLR to find out how it changes the physical memory contents of virtual machines. We found that among six virtual memory regions, only the modification to stack influences the page-level memory duplicate ratio. Experiments showed that: (1) the ASLR does not shift the heap region in sub-page level; (2) the stack reduces the duplicate page size among VMs which performed input replay around 40MB, when ASLR was enabled; (3) the size of memory pages, which can be reconstructed from the fresh booted up state, also reduces by about 60MB by ASLR. With those observations, when applying cache-based migration method, we can omit the stack region. While for other five regions, even a coarse page-level redundancy data detecting method can figure out most of the duplicate memory contents.

An Efficient Instruction Prefetching Scheme Based on the Page Access Information (페이지 접근 정보에 기반한 효율적인 명령어 캐쉬 선인출 기법)

  • Shin Soong-Hyun;Kim Cheol-Hong;Jhon Chu-Shik
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.33 no.5
    • /
    • pp.306-315
    • /
    • 2006
  • In general, the hit ratio of the first level cache is one of the most important factors in determining the performance of computer systems. Prefetching from lower level memory structure is one of the most useful techniques for improving the hit ratio of the first level cache. In this paper, we propose a prefetch on continuous same page access (CSPA) scheme which improves the prefetch efficiency of the instruction cache and reduces prefetch cost at the same time. The proposed CSPA scheme traces the page addresses of executed instructions to count how many times the same memory page is accessed continuously. To increase the prefetch efficiency, the CSPA scheme initiates prefetch only if the number of accesses to the same page exceeds the threshold value. Generally, the size of a L1 cache block is smaller than that of a L2 cache block. Therefore, one L2 cache block contains a number of L1 cache blocks. To reduce the number of unnecessary accesses to the L2 cache due to prefetch, the CSPA scheme enables prefetch only when the missed L1 block and the prefetch L1 block are in the same L2 cache block, leading to reduced prefetch cost. According to our simulations, the proposed prefetching scheme improves the performance by up to 6.7%.

Separate Factor Caching Scheme for Mobile Web Service (모바일 웹 서비스를 위한 요소분할 캐싱 기법)

  • Sim, Kun-Jung;Kang, Eui-Sun;Kim, Jong-Keun;Ko, Hee-Ae;Lim, Young-Hwan
    • The KIPS Transactions:PartD
    • /
    • v.14D no.4 s.114
    • /
    • pp.447-458
    • /
    • 2007
  • The objective of this study is to provide faster mobile web service by improving performance of Contents Cache used for mobile web service in the existing Mobile Gate System. It was found that two elements existed in Mark-Up page transcoded by Contents Generator. One of the elements was dependent only on the requested DIDL page and Mark-Up type. The other was dependent on each of the requested DIDL page, Mark-Up type, size of mobile display 모바일 장치 to request service, type of images available and color depth count of the images available. The conventional Contents Cache saved the entire Mark-Up page to hold both of the two elements. This caused the problem where storage space was not effectively used because reusable elements were repetitively saved in cache memory domain due to change in one of the elements even though all the other elements were the same. As a result, a larger number of transcoded Mark-Up pages could not be saved in the same cache memory size. Therefore, in this study, Mark-Up pages transcoded by Contents Generator were divided into two elements and were separately saved. Also, in order to respond to the demand for replacing data in cache with new data, this study applied two algorithms of LFU and LRU. This study proposed the method to implement cache performance of faster speed by enabling to save more number of the transcoded Mark-Up pages in the same cache storage space.

WWW Cache Replacement Algorithm Based on the Network-distance

  • Kamizato, Masaru;Nagata, Tomokazu;Taniguchi, Yuji;Tamaki, Shiro
    • Proceedings of the IEEK Conference
    • /
    • 2002.07a
    • /
    • pp.238-241
    • /
    • 2002
  • With the popularity of utilization of the Internet among people, the amount of data in the network rapidly increased. So that, the fall of response time from WWW server, which is caused by the network traffic and the burden on m server, has become more of an issue. This problem is encouraged the rearch by redundancy of requesting the same pages by many people, even though they browse the same the ones. To reduce these redundancy, WWW cache server is used commonly in order to store m page data and reuse them. However, the technical uses of WWW cache that different from CPU and Disk cache, is known for its difficulty of improving the cache hit rate. Consecuently, it is difficult to choose effective WWW data to be stored from all data flowing through the WWW cache server. On the other hand, there are room for improvement in commonly used cache replacement algorithms by WWW cache server. In our study, we try to realize a WWW cache server that stresses on the improvement of the stresses of response time. To this end, we propose the new cache replacement algorithm by focusing on the utilizable information of network distance from the WWW cache server to WWW server that possessing the page data of the user requesting.

  • PDF

Prefetch R-tree: A Disk and Cache Optimized Multidimensional Index Structure (Prefetch R-tree: 디스크와 CPU 캐시에 최적화된 다차원 색인 구조)

  • Park Myung-Sun
    • The KIPS Transactions:PartD
    • /
    • v.13D no.4 s.107
    • /
    • pp.463-476
    • /
    • 2006
  • R-trees have been traditionally optimized for the I/O performance with the disk page as the tree node. Recently, researchers have proposed cache-conscious variations of R-trees optimized for the CPU cache performance in main memory environments, where the node size is several cache lines wide and more entries are packed in a node by compressing MBR keys. However, because there is a big difference between the node sizes of two types of R-trees, disk-optimized R-trees show poor cache performance while cache-optimized R-trees exhibit poor disk performance. In this paper, we propose a cache and disk optimized R-tree, called the PR-tree (Prefetching R-tree). For the cache performance, the node size of the PR-tree is wider than a cache line, and the prefetch instruction is used to reduce the number of cache misses. For the I/O performance, the nodes of the PR-tree are fitted into one disk page. We represent the detailed analysis of cache misses for range queries, and enumerate all the reasonable in-page leaf and nonleaf node sizes, and heights of in-page trees to figure out tree parameters for best cache and I/O performance. The PR-tree that we propose achieves better cache performance than the disk-optimized R-tree: a factor of 3.5-15.1 improvement for one-by-one insertions, 6.5-15.1 improvement for deletions, 1.3-1.9 improvement for range queries, and 2.7-9.7 improvement for k-nearest neighbor queries. All experimental results do not show notable declines of the I/O performance.

Performance Comparison between Hardware & Software Cache Partitioning Techniques (하드웨어 캐시 파티셔닝과 소프트웨어 캐시 파티셔닝의 성능 비교)

  • Park, JiWoong;Yeom, HeonYoung;Eom, Hyeonsang
    • Journal of KIISE
    • /
    • v.42 no.2
    • /
    • pp.177-182
    • /
    • 2015
  • The era of multi-core processors has begun since the limit of the clock speed has been reached. These days, multi-core technology is used not only in desktops, servers, and table PCs, but also in smartphones. In this architecture, there is always interference between processes, because of the sharing of system resources. To address this problem, cache partitioning is used, which can be roughly divided into two types: software and hardware cache partitioning. When it comes to dynamic cache partitioning, hardware cache partitioning is superior to software cache partitioning, because it needs no page copy. In this paper, we compare the effectiveness of hardware and software cache partitioning on the AMD Opteron 6282 SE, which is the only commodity processor providing hardware cache partitioning, to see whether this technique can be effectively deployed in dynamic environments.