• 제목/요약/키워드: write latency

검색결과 47건 처리시간 0.028초

누적 버퍼를 활용한 효율적인 Latency Hiding기법 (An Efficient Latency Hiding method using accumulation buffer)

  • 이민우;한탁돈
    • 한국컴퓨터정보학회:학술대회논문집
    • /
    • 한국컴퓨터정보학회 2012년도 제46차 하계학술발표논문집 20권2호
    • /
    • pp.297-300
    • /
    • 2012
  • 현재 cache의 성능 향상을 위한 많은 기법들이 제안되고 있으며, Latency Hiding 기법 역시 cache의 효율적인 사용을 위해 많은 연구가 진행 되어 왔다. write buffer를 사용한 write Latency hiding기법이나 multi threading을 사용한 Latency Hiding 방법 등 여러 기법들이 연구되어 왔으며, 지금도 Latency hiding을 위한 많은 연구들이 지속적으로 진행되고 있다. 본 논문 역시 효율적인 Latency Hiding을 위한 누적 버퍼를 제안한다. 본 논문은 누적 버퍼의 활용도를 조사하여 얼마나 효율적으로 Latency를 은폐했는지, 또 버퍼를 사용함으로써 얻는 다른 이점에 대해 집중적으로 연구하였다.

  • PDF

A Locality-Aware Write Filter Cache for Energy Reduction of STTRAM-Based L1 Data Cache

  • Kong, Joonho
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • 제16권1호
    • /
    • pp.80-90
    • /
    • 2016
  • Thanks to superior leakage energy efficiency compared to SRAM cells, STTRAM cells are considered as a promising alternative for a memory element in on-chip caches. However, the main disadvantage of STTRAM cells is high write energy and latency. In this paper, we propose a low-cost write filter (WF) cache which resides between the load/store queue and STTRAM-based L1 data cache. To maximize efficiency of the WF cache, the line allocation and access policies are optimized for reducing energy consumption of STTRAM-based L1 data cache. By efficiently filtering the write operations in the STTRAM-based L1 data cache, our proposed WF cache reduces energy consumption of the STTRAM-based L1 data cache by up to 43.0% compared to the case without the WF cache. In addition, thanks to the fast hit latency of the WF cache, it slightly improves performance by 0.2%.

데이터 망각을 활용한 비휘발성 메모리 기반 파일 캐시 관리 기법 (Forgetting based File Cache Management Scheme for Non-Volatile Memory)

  • 강동우;최종무
    • 정보과학회 논문지
    • /
    • 제42권8호
    • /
    • pp.972-978
    • /
    • 2015
  • 비휘발성 메모리는 바이트 단위 접근과 비휘발성을 지원한다. 이러한 특성들은 비휘발성 메모리를 캐시, 메모리, 디스크와 같은 메모리 계층 구조 가운데 하나의 영역으로 사용을 가능케 한다. 비휘발성 메모리의 흥미로운 특성은 데이터 보존 기간이 실제로는 제한적인 기간을 가지고 있다는 것이다. 게다가 데이터 보존 기간과 쓰기 지연간의 트레이드오프가 존재 한다. 본 논문에서는 이를 활용하여 비휘발성 메모리를 파일 캐시로 사용하는 새로운 관리 기법을 제안한다. 제안하는 기법은 기존의 캐시 관리 기법과는 반대로 짧은 데이터 보존 시간으로 데이터를 저장하고 쓰기 성능을 개선한다. 제안하는 기법은 LRU 대비 평균 접근 지연 시간을 최대 31%, 평균 24.4%로 감소시킴을 보인다.

대용량 SSD를 위한 요구 기반 FTL 캐시 분리 기법 (Demand-based FTL Cache Partitioning for Large Capacity SSDs)

  • 배진욱;김한별;임준수;이성진
    • 대한임베디드공학회논문지
    • /
    • 제14권2호
    • /
    • pp.71-78
    • /
    • 2019
  • As the capacity of SSDs rapidly increases, the amount of DRAM to keep a mapping table size in SSDs becomes very huge. To address a Demand-based FTL (DFTL) scheme that caches part of mapping entries in DRAM is considered to be a feasible alternative. However, owing to its unpredictable behaviors, DFTL fails to provide consistent I/O response times. In this paper, we a) analyze a root cause that results in fluctuation on read latency and b) propose a new demand-based FTL scheme that ensures guaranteed read response time with low write amplification. By preventing mapping evictions while serving reads, the proposed technique guarantees every host read requests to be done in 2 NAND read operations. Moreover, only with 25% of a cache ratio, the proposed scheme improves random write performance and random mixed performance by 1.65x and 1.15x, respectively, over the traditional DFTL.

Buffer Policy based on High-capacity Hybrid Memories for Latency Reduction of Read/Write Operations in High-performance SSD Systems

  • Kim, Sungho;Hwang, Sang-Ho;Lee, Myungsub;Kwak, Jong Wook;Park, Chang-Hyeon
    • 한국컴퓨터정보학회논문지
    • /
    • 제24권7호
    • /
    • pp.1-8
    • /
    • 2019
  • Recently, an SSD with hybrid buffer memories is actively researching to reduce the overall latency in server computing systems. However, existing hybrid buffer policies caused many swapping operations in pages because it did not consider the overall latency such as read/write operations of flash chips in the SSD. This paper proposes the clock with hybrid buffer memories (CLOCK-HBM) for a new hybrid buffer policy in the SSD with server computing systems. The CLOCK-HBM constructs new policies based on unique characteristics in both DRAM buffer and NVMs buffer for reducing the number of swapping operations in the SSD. In experimental results, the CLOCK-HBM reduced the number of swapping operations in the SSD by 43.5% on average, compared with LRU, CLOCK, and CLOCK-DNV.

Technology of the next generation low power memory system

  • Cho, Doosan
    • International Journal of Internet, Broadcasting and Communication
    • /
    • 제10권4호
    • /
    • pp.6-11
    • /
    • 2018
  • As embedded memory technology evolves, the traditional Static Random Access Memory (SRAM) technology has reached the end of development. For deepening the manufacturing process technology, the next generation memory technology is highly required because of the exponentially increasing leakage current of SRAM. Non-volatile memories such as STT-MRAM (Spin Torque Transfer Magnetic Random Access Memory), PCM (Phase Change Memory) are good candidates for replacing SRAM technology in embedded memory systems. They have many advanced characteristics in the perspective of power consumption, leakage power, size (density) and latency. Nonetheless, nonvolatile memories have two major problems that hinder their use it the next-generation memory. First, the lifetime of the nonvolatile memory cell is limited by the number of write operations. Next, the write operation consumes more latency and power than the same size of the read operation.These disadvantages can be solved using the compiler. The disadvantage of non-volatile memory is in write operations. Therefore, when the compiler decides the layout of the data, it is solved by optimizing the write operation to allocate a lot of data to the SRAM. This study provides insights into how these compiler and architectural designs can be developed.

메모리 지연을 감추는 기법들 (Memory Latency Hiding Techniques)

  • 기안도
    • 전자통신동향분석
    • /
    • 제13권3호통권51호
    • /
    • pp.61-70
    • /
    • 1998
  • The obvious way to make a computer system more powerful is to make the processor as fast as possible. Furthermore, adopting a large number of such fast processors would be the next step. This multiprocessor system could be useful only if it distributes workload uniformly and if its processors are fully utilized. To achieve a higher processor utilization, memory access latency must be reduced as much as possible and even more the remaining latency must be hidden. The actual latency can be reduced by using fast logic and the effective latency can be reduced by using cache. This article discusses what the memory latency problem is, how serious it is by presenting analytical and simulation results, and existing techniques for coping with it; such as write-buffer, relaxed consistency model, multi-threading, data locality optimization, data forwarding, and data prefetching.

비대칭적 성능의 고용량 비휘발성 메모리를 위한 계층적 구조의 이진 탐색 트리 (A Hierarchical Binary-search Tree for the High-Capacity and Asymmetric Performance of NVM)

  • 정민성;이미정;이은지
    • 대한임베디드공학회논문지
    • /
    • 제14권2호
    • /
    • pp.79-86
    • /
    • 2019
  • For decades, in-memory data structures have been designed for DRAM-based main memory that provides symmetric read/write performances and has no limited write endurance. However, such data structures provide sub-optimal performance for NVM as it has different characteristics to DRAM. With this motivation, we rethink a conventional red-black tree in terms of its efficacy under NVM settings. The original red-black tree constantly rebalances sub-trees so as to export fast access time over dataset, but it inevitably increases the write traffic, adversely affecting the performance for NVM with a long write latency and limited endurance. To resolve this problem, we present a variant of the red-black tree called a hierarchical balanced binary search tree. The proposed structure maintains multiple keys in a single node so as to amortize the rebalancing cost. The performance study reveals that the proposed hierarchical binary search tree effectively reduces the write traffic by effectively reaping the high capacity of NVM.

A Novel Memory Hierarchy for Flash Memory Based Storage Systems

  • Yim, Keno-Soo
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • 제5권4호
    • /
    • pp.262-269
    • /
    • 2005
  • Semiconductor scientists and engineers ideally desire the faster but the cheaper non-volatile memory devices. In practice, no single device satisfies this desire because a faster device is expensive and a cheaper is slow. Therefore, in this paper, we use heterogeneous non-volatile memories and construct an efficient hierarchy for them. First, a small RAM device (e.g., MRAM, FRAM, and PRAM) is used as a write buffer of flash memory devices. Since the buffer is faster and does not have an erase operation, write can be done quickly in the buffer, making the write latency short. Also, if a write is requested to a data stored in the buffer, the write is directly processed in the buffer, reducing one write operation to flash storages. Second, we use many types of flash memories (e.g., SLC and MLC flash memories) in order to reduce the overall storage cost. Specifically, write requests are classified into two types, hot and cold, where hot data is vulnerable to be modified in the near future. Only hot data is stored in the faster SLC flash, while the cold is kept in slower MLC flash or NOR flash. The evaluation results show that the proposed hierarchy is effective at improving the access time of flash memory storages in a cost-effective manner thanks to the locality in memory accesses.

비휘발성 메모리 시스템을 위한 저전력 연쇄 캐시 구조 및 최적화된 캐시 교체 정책에 대한 연구 (A Study on Design and Cache Replacement Policy for Cascaded Cache Based on Non-Volatile Memories)

  • 최주희
    • 반도체디스플레이기술학회지
    • /
    • 제22권3호
    • /
    • pp.106-111
    • /
    • 2023
  • The importance of load-to-use latency has been highlighted as state-of-the-art computing cores adopt deep pipelines and high clock frequencies. The cascaded cache was recently proposed to reduce the access cycle of the L1 cache by utilizing differences in latencies among banks of the cache structure. However, this study assumes the cache is comprised of SRAM, making it unsuitable for direct application to non-volatile memory-based systems. This paper proposes a novel mechanism and structure for lowering dynamic energy consumption. It inserts monitoring logic to keep track of swap operations and write counts. If the ratio of swap operations to total write counts surpasses a set threshold, the cache controller skips the swap of cache blocks, which leads to reducing write operations. To validate this approach, experiments are conducted on the non-volatile memory-based cascaded cache. The results show a reduction in write operations by an average of 16.7% with a negligible increase in latencies.

  • PDF