Integrated Search | Korea Science

A Locality-Aware Write Filter Cache for Energy Reduction of STTRAM-Based L1 Data Cache

Kong, Joonho
- JSTS:Journal of Semiconductor Technology and Science / Vol. 16, No. 1 / pp.80-90 / 2016
Thanks to superior leakage energy efficiency compared to SRAM cells, STTRAM cells are considered as a promising alternative for a memory element in on-chip caches. However, the main disadvantage of STTRAM cells is high write energy and latency. In this paper, we propose a low-cost write filter (WF) cache which resides between the load/store queue and STTRAM-based L1 data cache. To maximize efficiency of the WF cache, the line allocation and access policies are optimized for reducing energy consumption of STTRAM-based L1 data cache. By efficiently filtering the write operations in the STTRAM-based L1 data cache, our proposed WF cache reduces energy consumption of the STTRAM-based L1 data cache by up to 43.0% compared to the case without the WF cache. In addition, thanks to the fast hit latency of the WF cache, it slightly improves performance by 0.2%.
https://doi.org/10.5573/JSTS.2016.16.1.080
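The write-filtering idea in the abstract above can be illustrated with a minimal sketch. It assumes a tiny fully associative buffer with plain LRU replacement, which is not the paper's locality-aware allocation policy (the abstract does not spell that out). The class name `WriteFilterCache`, the parameters `num_lines` and `line_bytes`, and the counter `l1_writes` are all hypothetical.

```python
from collections import OrderedDict


class WriteFilterCache:
    """Toy write filter in front of an L1 data cache.

    Assumption: fully associative, LRU replacement. The paper's
    locality-aware allocation policy is NOT modeled here.
    """

    def __init__(self, num_lines: int = 4, line_bytes: int = 32) -> None:
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.lines: "OrderedDict[int, bool]" = OrderedDict()  # line address, LRU order
        self.l1_writes = 0  # writes that actually reach the L1 cache

    def store(self, addr: int) -> None:
        line = addr // self.line_bytes
        if line in self.lines:
            # Hit in the filter: absorb the store, no L1 write.
            self.lines.move_to_end(line)
            return
        if len(self.lines) >= self.num_lines:
            # Evict the least recently used line; it is written to L1 once.
            self.lines.popitem(last=False)
            self.l1_writes += 1
        self.lines[line] = True

    def flush(self) -> int:
        """Write every buffered line to L1 and return how many were flushed."""
        flushed = len(self.lines)
        self.l1_writes += flushed
        self.lines.clear()
        return flushed
```

In this toy model, repeated stores to the same line cost one L1 write at eviction time instead of one per store, which is the effect the abstract attributes to filtering writes before they reach the STTRAM-based cache.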

Energy-Performance Efficient 2-Level Data Cache Architecture for Embedded System

이종민;김순태
- 한국정보과학회논문지:시스템및이론 / Vol. 37, No. 5 / pp.292-303 / 2010
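On-chip caches play an important role in the performance and energy consumption of embedded systems because they reduce accesses to external memory and are themselves accessed frequently. This paper proposes a 2-level data cache memory architecture designed for embedded systems. The level-1 (L1) cache is small, direct-mapped, and write-through. In contrast, the level-2 (L2) cache has an ordinary cache size, is set-associative, and uses a write-back policy. As a result, the L1 cache has a fast access time (within one cycle), and the L2 cache is effective at lowering the global miss rate. To reduce the higher miss rate caused by the small L1 data cache, we propose the ECP (Early Cache hit Predictor) technique. The proposed ECP technique predicts whether the requested data is in the L1 cache and, in addition, computes the effective address quickly without requiring an ALU. Furthermore, to reduce the energy consumed by the frequent L2 cache accesses that result from the write-through policy between the two cache levels, we propose a one-way write technique. With the proposed one-way write technique, a write access from the L1 cache to the L2 cache under the write-through policy can directly access a single designated way without going through tag comparison. Experiments using a cycle-accurate simulator and embedded benchmarks show that the proposed 2-level data cache memory architecture improves performance by 3.6% and reduces data cache energy consumption by 50% on average.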

A Study on Design and Cache Replacement Policy for Cascaded Cache Based on Non-Volatile Memories

최주희
- 반도체디스플레이기술학회지 / Vol. 22, No. 3 / pp.106-111 / 2023
The importance of load-to-use latency has been highlighted as state-of-the-art computing cores adopt deep pipelines and high clock frequencies. The cascaded cache was recently proposed to reduce the access cycle of the L1 cache by utilizing differences in latencies among banks of the cache structure. However, this study assumes the cache is comprised of SRAM, making it unsuitable for direct application to non-volatile memory-based systems. This paper proposes a novel mechanism and structure for lowering dynamic energy consumption. It inserts monitoring logic to keep track of swap operations and write counts. If the ratio of swap operations to total write counts surpasses a set threshold, the cache controller skips the swap of cache blocks, which leads to reducing write operations. To validate this approach, experiments are conducted on the non-volatile memory-based cascaded cache. The results show a reduction in write operations by an average of 16.7% with a negligible increase in latencies.
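The threshold rule described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the class name `SwapMonitor`, its methods, and the default threshold of 0.5 are hypothetical, and swaps and writes are kept as independent counters because the abstract does not say how a swap contributes to the write count.

```python
from dataclasses import dataclass


@dataclass
class SwapMonitor:
    """Tracks swap operations and write counts for a cascaded cache.

    Assumption: names and the default threshold are illustrative,
    not taken from the paper.
    """

    threshold: float = 0.5
    swaps: int = 0
    writes: int = 0

    def record_write(self) -> None:
        self.writes += 1

    def record_swap(self) -> None:
        self.swaps += 1

    def should_swap(self) -> bool:
        """Return False once the swap-to-write ratio surpasses the threshold."""
        if self.writes == 0:
            return True  # no history yet, allow the swap
        return self.swaps / self.writes <= self.threshold
```

A controller built on this sketch would call `should_swap()` before moving a block between banks and skip the swap when it returns `False`, which is how the abstract describes cutting write operations.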

Design of an Asynchronous Data Cache with FIFO Buffer for Write Back Mode

박종민;김석만;오명훈;조경록
- 한국콘텐츠학회논문지 / Vol. 10, No. 6 / pp.72-79 / 2010
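This paper proposes a data cache architecture with a write buffer for a 32-bit asynchronous embedded processor and verifies its performance. The data cache is intended to speed up data transfer between the main memory and the processor in an asynchronous system. The proposed data cache is 8 KB in size, has a line size of 4 words (16 bytes), uses 4-way set-associative mapping and a pseudo-LRU replacement algorithm, and applies a dirty register and a write buffer for its write policy. The designed data cache was synthesized in a 0.13-µm CMOS process, and verification with the MI benchmark showed an average hit rate of 94% and a 46% improvement in processing speed.
https://doi.org/10.5392/JKCA.2010.10.6.072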
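The cache parameters quoted in the abstract above (8 KB, 16-byte lines, 4 ways) determine how an address splits into tag, index, and offset. The following sketch works through that arithmetic. The 32-bit address width is an assumption based on the 32-bit processor, and `split_address` is a hypothetical helper, not part of the paper.

```python
CACHE_BYTES = 8 * 1024  # 8 KB, from the abstract
LINE_BYTES = 16         # 4 words of 4 bytes each
WAYS = 4                # 4-way set associative
ADDR_BITS = 32          # assumption: 32-bit addresses

NUM_SETS = CACHE_BYTES // (LINE_BYTES * WAYS)    # 128 sets
OFFSET_BITS = LINE_BYTES.bit_length() - 1        # 4 bits
INDEX_BITS = NUM_SETS.bit_length() - 1           # 7 bits
TAG_BITS = ADDR_BITS - INDEX_BITS - OFFSET_BITS  # 21 bits


def split_address(addr: int) -> tuple[int, int, int]:
    """Split an address into (tag, set index, byte offset)."""
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset
```

With these parameters, each of the 128 sets holds four 16-byte lines, and a lookup compares the 21-bit tag against the four ways of the indexed set.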

The Early Write Back Scheme For Write-Back Cache

정영진;이길환;이용석
- 대한전자공학회논문지SD / Vol. 46, No. 11 / pp.101-109 / 2009
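In general, the depth cache and pixel cache used in 3D graphics are designed as write-back caches so that memory bandwidth is used efficiently. In addition, because of the nature of 3D graphics, it is common that an address on which a cache read access was attempted sees no further cache write or read access, and that only cache write accesses occur. When a cache miss occurs while every block of the cache memory is in use, the replacement algorithm selects one block, writes it back, and stores other data in that block. Such a cache miss simultaneously triggers an external memory write access to store the evicted block's data and an external memory access to service the miss. Consequently, when cache misses occur back to back, a large number of memory read and write accesses are issued at the same time, causing a memory bottleneck that in turn lengthens memory access time. Consecutive cache misses therefore degrade performance and increase power consumption in the processor or IP that uses the cache. To minimize the memory bottleneck that arises when a cache is used, this paper applies a new method called early write back. The method controls the point at which valid data held in a cache block is evicted, so that external memory accesses do not arrive in bursts. In other words, it can increase the performance of a cache with the same memory capacity and hit rate. This alleviates the memory bottleneck and also reduces the average memory access time on a cache miss. The new cache structure was evaluated by applying it to the depth cache and pixel cache of a 3D graphics accelerator in an SoC environment that contains an ARM11, a 3D graphics accelerator, and various IPs; measurements with a range of test vectors showed that performance can be improved.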

A Study on Direct Cache-to-Cache Transfer for Hybrid Cache Architecture to Reduce Write Operations

최주희
- 반도체디스플레이기술학회지 / Vol. 23, No. 1 / pp.65-70 / 2024
Direct cache-to-cache transfer has been studied to reduce the latency and bandwidth consumption related to shared data in multiprocessor systems. Even though these studies lead to meaningful results, they assume that caches consist of SRAM. If the system employs non-volatile memory, however, one of the most important considerations is decreasing the number of write operations. This paper proposes a hybrid write avoidance cache coherence protocol that considers the hybrid cache architecture. A new state is added to finely control what is stored in the non-volatile memory area, and experimental results showed that the number of writes was reduced by about 36% compared to the existing schemes.

Improving Energy Efficiency and Lifetime of Phase Change Memory using Delta Value Indicator

Choi, Ju Hee;Kwak, Jong Wook
- JSTS:Journal of Semiconductor Technology and Science / Vol. 16, No. 3 / pp.330-338 / 2016
Phase change memory (PCM) has been studied as an emerging memory technology for last-level cache (LLC) due to its extremely low leakage. However, it consumes high levels of energy in updating cells and its write endurance is limited. To relieve the write pressure of LLC, we propose a delta value indicator (DVI) by employing a small cache which stores the difference between the value currently stored and the value newly loaded. Since the write energy consumption of the small cache is less than the LLC, the energy consumption is reduced by access to the small cache instead of the LLC. In addition, the lifetime of the LLC is further extended because the number of write accesses to the LLC is decreased. To this end, a delta value indicator and controlling circuits are inserted into the LLC. The simulation results show a 26.8% saving of dynamic energy consumption and a 31.7% lifetime extension compared to a state-of-the-art scheme for PCM.
https://doi.org/10.5573/JSTS.2016.16.3.330

MI-MESI Write-invalidate Snooping Cache Coherence Protocol

장성태
- 한국정보처리학회논문지 / Vol. 2, No. 5 / pp.757-767 / 1995
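This paper presents the MI-MESI write-invalidate snooping cache coherence protocol, which addresses the problems of the MESI and I-MESI cache coherence protocols in a multiprocessor environment based on a split-transaction bus. In this protocol, each cache block is kept in one of six cache states: Modified-shared, Invalid-by-other, Modified, Exclusive, Shared, and Invalid. This greatly reduces the unnecessary memory module updates and the access conflicts at memory modules that occur under the existing MESI and I-MESI cache coherence protocols, and so provides faster memory access times.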

A Proposal for Hit Ratio Improvement of a Microprocessor's Cache Memory

조용훈;김정선
- 한국통신학회논문지 / Vol. 25, No. 4B / pp.783-787 / 2000
The microprocessors that dominate as the central processing units of today's personal computers use a 256 KB or 512 KB L2 (second-level) cache with direct mapping, a 32 B line size, and no write allocation. We confirmed that changing the mapping of this L2 cache to an 8-way set-associative mapping procedure, increasing the line size to 128 B or more, and adopting write allocation improves the hit ratio by about 2.5% at only a small additional hardware cost.

An Efficient Variable Rearrangement Technique for STT-RAM Based Hybrid Caches

윤종희;조두산
- 대한임베디드공학회논문지 / Vol. 11, No. 2 / pp.67-78 / 2016
The emerging Spin-Transfer Torque RAM (STT-RAM) is a promising component that can be used to improve efficiency as a result of its high storage density and low leakage power. However, state-of-the-art STT-RAM is not ready to replace SRAM technology due to the negative effect of its write operations. The write operations require longer latency and more power than the same operations in SRAM. Therefore, a hybrid cache with SRAM and STT-RAM technologies is proposed to obtain the benefits of STT-RAM while minimizing its negative effects by using SRAM. To use the hybrid cache efficiently, it is important where write-intensive data are placed in the cache. Such data should be placed in SRAM to minimize the negative effect. Thus, we propose a technique that optimizes placement of data in main memory. It drives the proper combination of advantages and disadvantages for SRAM and STT-RAM in the hybrid cache. As a result of the proposed technique, write-intensive data are loaded to SRAM and read-intensive data are loaded to STT-RAM. In addition, our technique also optimizes temporal locality to minimize conflict misses. Therefore, it improves performance and energy consumption of the hybrid cache architecture in a certain range.
https://doi.org/10.14372/IEMEK.2016.11.2.67

89 search results, processing time 0.028 s
