• Title/Abstract/Keywords: write latency

Search results: 47 items (processing time: 0.018 s)

High Performance PCM&DRAM Hybrid Memory System

  • 정보성;이정훈
    • 대한임베디드공학회논문지 / Vol. 11, No. 2 / pp.117-123 / 2016
  • In general, PCM (Phase Change Memory) is unsuitable as main memory because of two limitations: high read/write latency and low endurance. However, a hybrid memory that places DRAM and PCM at the same level is an effective structure for next-generation main memory because it can exploit the advantages of both. Such a memory needs an effective page management method that exploits the characteristics of each memory dynamically and adaptively. We therefore aim to reduce the access time and write count of PCM through effective page replacement. According to our simulation, the proposed algorithm for the DRAM&PCM hybrid reduces the PCM access count by around 60% and the PCM write count by 42% for the same PCM size, compared with the Clock-DWF algorithm.

A Study on Direct Cache-to-Cache Transfer for Hybrid Cache Architecture to Reduce Write Operations

  • 최주희
    • 반도체디스플레이기술학회지 / Vol. 23, No. 1 / pp.65-70 / 2024
  • Direct cache-to-cache transfer has been studied as a way to reduce the latency and bandwidth consumption associated with shared data in multiprocessor systems. Although these studies have produced meaningful results, they assume that caches consist of SRAM. If the system employs non-volatile memory, however, one of the most important considerations is reducing the number of write operations. This paper proposes a hybrid write avoidance cache coherence protocol designed for the hybrid cache architecture. A new state is added to finely control what is stored in the non-volatile memory area, and experimental results show that the number of writes is reduced by about 36% compared to existing schemes.

Optimizing Fsync Performance with Dynamic Queue Depth Adaptation

  • Park, Daejun;Kim, Min Ji;Shin, Dongkun
    • JSTS: Journal of Semiconductor Technology and Science / Vol. 15, No. 5 / pp.570-576 / 2015
  • Existing flash storage devices such as universal flash storage and solid state disks support command queuing to improve storage I/O bandwidth. Command queuing allows multiple read/write requests to be pending in a device queue. Because the multi-channel, multi-way architecture of flash storage devices can handle multiple requests simultaneously, command queuing is an indispensable technique for exploiting this parallelism. However, command queuing can harm the latency of the fsync system call, which is critical to application responsiveness. We propose a dynamic queue depth adaptation technique that reduces the queue depth when a user application is expected to send fsync calls. Experiments show that the proposed technique reduces fsync latency by 79% on average compared to the original scheme.
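
The idea of lowering the queue depth when fsync calls are expected can be illustrated with a minimal Python sketch. The abstract does not say how fsync calls are predicted, so the sliding-window heuristic below, the class name `QueueDepthAdapter`, and all parameter values are assumptions made only for illustration, not the authors' method.

```python
from collections import deque


class QueueDepthAdapter:
    """Toy model: shrink the device queue depth when an fsync looks likely.

    The prediction rule (count recent fsyncs in a sliding window) is a
    hypothetical stand-in; the paper's actual predictor is not described
    in the abstract.
    """

    def __init__(self, max_depth=32, min_depth=1, window=8, threshold=2):
        self.max_depth = max_depth    # depth used for throughput-oriented I/O
        self.min_depth = min_depth    # depth used while fsync calls are expected
        self.threshold = threshold    # fsyncs within the window that trigger shrinking
        self.history = deque(maxlen=window)  # recent requests, True if fsync

    def record(self, is_fsync):
        """Remember whether the latest request was an fsync."""
        self.history.append(bool(is_fsync))

    def fsync_expected(self):
        """Assume more fsyncs are coming if enough were seen recently."""
        return sum(self.history) >= self.threshold

    def queue_depth(self):
        """Return the queue depth to use for the next batch of requests."""
        return self.min_depth if self.fsync_expected() else self.max_depth
```

With these assumed defaults, a stream of plain writes keeps the depth at 32, two fsyncs within the last eight requests drop it to 1, and the depth recovers once the fsyncs age out of the window.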

Optimizing Garbage Collection Overhead of Host-level Flash Translation Layer for Journaling Filesystems

  • Son, Sehee;Ahn, Sungyong
    • International Journal of Internet, Broadcasting and Communication / Vol. 13, No. 2 / pp.27-35 / 2021
  • A NAND flash memory-based SSD needs internal software, the Flash Translation Layer (FTL), to provide a traditional block device interface to the host because of physical constraints such as erase-before-write and large erase blocks. However, because useful host-side information cannot be delivered to the FTL through the narrow block device interface, SSDs suffer from a variety of problems such as increased garbage collection overhead, large tail latency, and unpredictable I/O latency. In contrast, a new type of SSD, the open-channel SSD, exposes its internal structure to the host so that the underlying NAND flash memory can be managed directly by a host-level FTL. In particular, classifying I/O data using host-side information can reduce garbage collection overhead. In this paper, we propose a new scheme that reduces the garbage collection overhead of open-channel SSDs by separating the journal from other file data in a journaling filesystem. Because the journal has a different lifespan from other file data, the Write Amplification Factor (WAF) caused by garbage collection can be reduced. The proposed scheme is implemented by modifying the host-level FTL of Linux and is evaluated with both Fio and Filebench. According to the experimental results, the proposed scheme improves I/O performance by 46% to 50% while reducing the WAF of open-channel SSDs by more than 33% compared to the previous scheme.
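
To make the WAF metric in this abstract concrete, here is a small Python sketch. It uses the standard definition of WAF as total pages written to flash divided by pages written by the host. The function and stream names are hypothetical, and the page counts in the usage note are illustrative rather than taken from the paper.

```python
def write_amplification_factor(host_pages, gc_copied_pages):
    """WAF = (host writes + valid pages copied by garbage collection) / host writes."""
    if host_pages <= 0:
        raise ValueError("host_pages must be positive")
    return (host_pages + gc_copied_pages) / host_pages


def route_write(is_journal):
    """Pick a (hypothetical) write stream so that short-lived journal pages
    do not share erase blocks with longer-lived file data."""
    return "journal_stream" if is_journal else "data_stream"
```

For example, if the host writes 1000 pages and garbage collection has to copy 500 still-valid pages, `write_amplification_factor(1000, 500)` gives 1.5. If separating the journal meant that erase blocks held no valid pages when collected, the same workload would give 1.0. Real workloads fall somewhere in between.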

An Efficient Variable Rearrangement Technique for STT-RAM Based Hybrid Caches

  • 윤종희;조두산
    • 대한임베디드공학회논문지 / Vol. 11, No. 2 / pp.67-78 / 2016
  • The emerging Spin-Transfer Torque RAM (STT-RAM) is a promising component that can improve efficiency thanks to its high storage density and low leakage power. However, state-of-the-art STT-RAM is not ready to replace SRAM because of the cost of its write operations, which require longer latency and more power than the same operations in SRAM. A hybrid cache combining SRAM and STT-RAM has therefore been proposed to obtain the benefits of STT-RAM while using SRAM to minimize its drawbacks. To use the hybrid cache efficiently, it is important to control where write-intensive data are placed in the cache: such data should be placed in SRAM to minimize the write penalty. Thus, we propose a technique that optimizes the placement of data in main memory. It derives a suitable combination of the advantages and disadvantages of SRAM and STT-RAM in the hybrid cache. With the proposed technique, write-intensive data are loaded into SRAM and read-intensive data are loaded into STT-RAM. In addition, our technique optimizes temporal locality to minimize conflict misses. It therefore improves the performance and energy consumption of the hybrid cache architecture to a certain extent.
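
The placement rule stated in this abstract (write-intensive data to SRAM, read-intensive data to STT-RAM) can be sketched as a simple classifier in Python. This covers only the classification step. The paper's actual technique rearranges variables in main memory and also optimizes temporal locality, which the sketch does not attempt. The function name, input format, and 0.5 threshold are assumptions.

```python
def place_variables(access_counts, write_ratio_threshold=0.5):
    """Assign each variable to 'SRAM' or 'STT-RAM' by its share of writes.

    access_counts maps a variable name to a (reads, writes) tuple, e.g.
    from a profiling run. The threshold is a hypothetical tuning knob.
    """
    placement = {}
    for name, (reads, writes) in access_counts.items():
        total = reads + writes
        write_ratio = writes / total if total else 0.0
        # Write-intensive data goes to SRAM to avoid costly STT-RAM writes.
        placement[name] = "SRAM" if write_ratio > write_ratio_threshold else "STT-RAM"
    return placement
```

For instance, `place_variables({"buf": (10, 90), "table": (95, 5)})` maps `buf` to SRAM and `table` to STT-RAM.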

Analysis of read speed latency in 6T-SRAM cell using multi-layered graphene nanoribbon and cu based nano-interconnects for high performance memory circuit design

  • Sandip, Bhattacharya;Mohammed Imran Hussain;John Ajayan;Shubham Tayal;Louis Maria Irudaya Leo Joseph;Sreedhar Kollem;Usha Desai;Syed Musthak Ahmed;Ravichander Janapati
    • ETRI Journal / Vol. 45, No. 5 / pp.910-921 / 2023
  • In this study, we designed a 6T-SRAM cell using a 16-nm CMOS process and analyzed its performance in terms of read-speed latency. Temperature-dependent Cu and multilayered graphene nanoribbon (MLGNR)-based nano-interconnect materials are used throughout the circuit (primarily bit/bit-bars [red lines] and word lines [write lines]). The read speed analysis is performed at four chip operating temperatures (150 K, 250 K, 350 K, and 450 K) using both Cu and graphene nanoribbon (GNR) nano-interconnects with interconnect lengths from 10 µm to 100 µm, for reading-0 and reading-1 operations. To execute the reading operation, the CMOS technology (the 16-nm PTM-HPC model) and the 16-nm interconnect technology (ITRS-13) are used. The complete design is simulated using TSPICE simulation tools (by Mentor Graphics). The read speed latency increases rapidly as interconnect length increases for both Cu and GNR interconnects. However, the Cu interconnect has three to six times more latency than the GNR interconnect. In addition, we observe that the reading speed latency for the GNR interconnect is ~10.29 ns across wide temperature variations (150 K to 450 K), whereas the reading speed latency for the Cu interconnect varies between ~32 ns and ~65 ns over the same temperature range. This analysis is useful for the design of next-generation, high-speed memories using different nano-interconnect materials.
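
As a quick arithmetic check, the "three to six times" claim can be compared against the latencies quoted in the abstract. The short Python sketch below uses only those approximate figures, so the ratios are rough as well. The variable names are illustrative.

```python
# Approximate read latencies quoted in the abstract (nanoseconds).
gnr_ns = 10.29                      # GNR interconnect, 150 K to 450 K
cu_low_ns, cu_high_ns = 32.0, 65.0  # Cu interconnect, same temperature range

ratio_low = cu_low_ns / gnr_ns      # Cu vs. GNR at the low end
ratio_high = cu_high_ns / gnr_ns    # Cu vs. GNR at the high end

print(f"{ratio_low:.2f}x to {ratio_high:.2f}x")  # prints "3.11x to 6.32x"
```

The resulting range of about 3.1x to 6.3x is consistent with the stated three to six times.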

Gen-Z memory pool system implementation and performance measurement

  • Kwon, Won-ok;Sok, Song-Woo;Park, Chan-ho;Oh, Myeong-Hoon;Hong, Seokbin
    • ETRI Journal / Vol. 44, No. 3 / pp.450-461 / 2022
  • The Gen-Z protocol is a memory semantic protocol between the memory and CPU used in computer architectures with large memory pools. This study presents the implementation of a Gen-Z hardware system configured using Gen-Z specification 1.0 and reports its performance. A hardware prototype of a DDR4 Gen-Z memory pool with an optimized character, a block device driver, and a file system for the Gen-Z hardware was designed. The Gen-Z IP was targeted to an FPGA, and a 512 GB Gen-Z memory pool was configured on an x86 server. In the experiments, the latency and throughput of the Gen-Z memory were measured and compared with those of local memory, SATA SSD, and NVMe using character or block device interfaces. The Gen-Z hardware exhibited superior throughput and latency compared with SATA SSD and NVMe at block sizes under 4 kB. The MySQL and File IO benchmarks of Gen-Z showed good write performance across all block sizes and thread counts. It also showed low latency in RocksDB's fillseq dbbench using the ext4 direct access filesystem.

A Design of High Performance SDRAM Controller for SoC Design

  • 권오현;양훈모;이문기
    • 대한전자공학회: Conference Proceedings / Proceedings of the 대한전자공학회 2003 Summer Conference, Vol. II / pp.1209-1212 / 2003
  • In this paper, we propose an SDRAM controller. SDRAM is widely used as the mainstream memory in embedded systems because of its short latency, burst access, and pipelining features. The proposed controller provides the essential functions for SDRAM initialization, read/write access, memory refresh, and burst access. Furthermore, the controller is implemented as a soft IP, which greatly reduces the designer's effort.

CPWL : Clock and Page Weight based Disk Buffer Management Policy for Flash Memory Systems

  • Kang, Byung Kook;Kwak, Jong Wook
    • 한국컴퓨터정보학회논문지 / Vol. 25, No. 2 / pp.21-29 / 2020
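  • Growing demand for mobile data in the IT industry is driving a steady increase in the use of NAND flash memory. However, the erase operation of flash memory requires long latency and high power consumption, which limits the lifetime of each cell. Frequent write and erase operations therefore degrade the performance of flash memory and shorten its lifetime. To address this problem, techniques are being studied that use a disk buffer to reduce the write and erase operations issued to flash memory and thereby improve its performance. In this paper, we propose the CPWL policy to minimize the number of writes. CPWL manages read pages and write pages separately according to the buffer memory access pattern. The separated pages are then sorted to reduce the number of writes, which in turn extends the lifetime of the flash memory and reduces energy consumption.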

Page Replacement Algorithm for Improving Performance of Hybrid Main Memory

  • 이민호;강동현;김정훈;엄영익
    • 정보과학회 컴퓨팅의 실제 논문지 / Vol. 21, No. 1 / pp.88-93 / 2015
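  • DRAM is the dominant main memory in computer systems thanks to its fast read/write speed and unlimited write endurance, but it needs a continuous power supply to retain stored data. PCM, by contrast, is a non-volatile memory that retains data without power and, like DRAM, supports byte-level access and in-place overwrites, so it has attracted attention as a possible replacement for DRAM. However, PCM's slow read/write speed and limited write endurance make it difficult to use as main memory. For this reason, hybrid main memory has been proposed to exploit the advantages of both DRAM and PCM, and it is an active area of research. This paper proposes a new page replacement algorithm for a hybrid main memory composed of DRAM and PCM. To compensate for the weaknesses of PCM, the proposed algorithm aims to reduce the number of PCM writes. Experimental results show that it reduces PCM writes by 80.5% compared with other page replacement algorithms.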