• Title/Abstract/Keywords: Hybrid cache


Low Power Scheme Using Bypassing Technique for Hybrid Cache Architecture

  • Choi, Juhee
    • 반도체디스플레이기술학회지 / Vol. 20, No. 4 / pp.10-15 / 2021
  • Cache bypassing schemes have been studied to avoid unnecessary updates to the data in cache blocks. Among them, a statistics-based cache bypassing method for asymmetric-access caches is one of the most efficient approaches for non-volatile memories and shows the lowest cache access latency. However, it was proposed for a conventional cache system, so further study is required for the hybrid cache architecture. This paper proposes a novel cache bypassing scheme, called the hybrid bypassing block selector. In the proposal, a new model is established that considers the SRAM region and the non-volatile memory region separately. Based on the model, a hybrid bypassing decision block is implemented. Experiments show that the hybrid bypassing decision block reduces overall energy consumption by 21.5%.
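The region-aware decision described above can be illustrated with a small sketch. The following Python code is a hypothetical, simplified model rather than the paper's implementation: it keeps separate reuse statistics for the SRAM and non-volatile regions and bypasses a fill when the expected reuse benefit does not cover that region's write cost. All cost and threshold values are invented for illustration.

```python
# Hypothetical sketch of a region-aware bypass decision for a hybrid
# (SRAM + non-volatile memory) cache. Costs and thresholds are illustrative.

from collections import defaultdict

WRITE_COST = {"SRAM": 1.0, "NVM": 5.0}   # assumed energy cost per fill
HIT_BENEFIT = 2.0                        # assumed saving per cache hit vs. DRAM access

class RegionStats:
    """Tracks, per region, how often inserted blocks were reused."""
    def __init__(self):
        self.inserted = defaultdict(int)   # region -> blocks inserted
        self.reused = defaultdict(int)     # region -> blocks reused at least once

    def reuse_rate(self, region):
        ins = self.inserted[region]
        return (self.reused[region] / ins) if ins else 1.0   # optimistic start

    def should_bypass(self, region):
        # Bypass when the expected hit benefit does not cover the fill's write cost.
        expected_benefit = self.reuse_rate(region) * HIT_BENEFIT
        return expected_benefit < WRITE_COST[region]

stats = RegionStats()
stats.inserted["NVM"] = 100
stats.reused["NVM"] = 20        # only 20% of NVM fills were ever reused
stats.inserted["SRAM"] = 100
stats.reused["SRAM"] = 70       # 70% of SRAM fills were reused

for region in ("SRAM", "NVM"):
    print(region, "bypass incoming fills?", stats.should_bypass(region))
```

With these made-up statistics the model bypasses fills into the expensive NVM region while still allocating in SRAM, which is the kind of region-separated decision the abstract describes.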

고성능 저전력 하이브리드 L2 캐시 메모리를 위한 연관사상 집합 관리 (Way-set Associative Management for Low Power Hybrid L2 Cache Memory)

  • 정보성;이정훈
    • 대한임베디드공학회논문지 / Vol. 13, No. 3 / pp.125-131 / 2018
  • STT-RAM is attracting attention as a next-generation non-volatile memory for replacing cache memory, owing to its low leakage energy, high density, and memory access performance similar to SRAM. However, like other non-volatile memories, it has a problem with write operations. A hybrid cache memory using SRAM and STT-RAM is therefore attracting attention as a cache structure with low power consumption. Even so, reducing leakage energy with STT-RAM does not address the dynamic energy of its write operations. In this paper, we propose an energy management method consisting of a way-selection approach for a hybrid SRAM/STT-RAM L2 cache and a memory-selection method for write/read operations. According to the simulation results, the proposed hybrid cache memory reduces average energy consumption by 40% on SPEC CPU 2006, compared with an SRAM cache memory.
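A minimal sketch of the way-selection idea is shown below, assuming a set with a few SRAM ways and several STT-RAM ways; the way counts, LRU policy, and steering rule are illustrative assumptions, not the paper's design.

```python
# Hypothetical sketch of way selection in one set of a hybrid L2 cache:
# a few SRAM ways absorb writes, the remaining STT-RAM ways hold read data.

from collections import OrderedDict

class HybridSet:
    def __init__(self, sram_ways=2, sttram_ways=6):
        # OrderedDict as a tiny LRU: the oldest entry is evicted first.
        self.regions = {
            "SRAM": (sram_ways, OrderedDict()),
            "STT-RAM": (sttram_ways, OrderedDict()),
        }

    def _insert(self, region, tag):
        capacity, lines = self.regions[region]
        if tag in lines:
            lines.move_to_end(tag)                   # refresh LRU position
            return None
        victim = None
        if len(lines) >= capacity:
            victim, _ = lines.popitem(last=False)    # evict the LRU line
        lines[tag] = True
        return victim

    def access(self, tag, is_write):
        # Memory-selection rule: writes are steered to SRAM ways so that costly
        # STT-RAM writes are avoided; read fills go to the STT-RAM ways.
        region = "SRAM" if is_write else "STT-RAM"
        victim = self._insert(region, tag)
        return region, victim

s = HybridSet()
print(s.access(tag=0x10, is_write=True))    # placed in the SRAM ways
print(s.access(tag=0x20, is_write=False))   # placed in the STT-RAM ways
```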

쓰기 횟수 감소를 위한 하이브리드 캐시 구조에서의 캐시간 직접 전송 기법에 대한 연구 (A Study on Direct Cache-to-Cache Transfer for Hybrid Cache Architecture to Reduce Write Operations)

  • 최주희
    • 반도체디스플레이기술학회지 / Vol. 23, No. 1 / pp.65-70 / 2024
  • Direct cache-to-cache transfer has been studied to reduce the latency and bandwidth consumption associated with shared data in multiprocessor systems. Even though these studies lead to meaningful results, they assume that caches consist only of SRAM. If the system employs non-volatile memory, however, one of the most important considerations is reducing the number of write operations. This paper proposes a hybrid write-avoidance cache coherence protocol that takes the hybrid cache architecture into account. A new state is added to finely control what is stored in the non-volatile memory area, and experimental results show that the number of writes is reduced by about 36% compared to existing schemes.

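A toy sketch of the write-avoidance idea behind such a protocol is given below. It is not the paper's coherence protocol (the added coherence state is not modelled): it only shows that when a peer cache already holds a shared block, forwarding it directly and keeping the copy in the requester's SRAM region avoids a write into the non-volatile region.

```python
# Hypothetical sketch: direct cache-to-cache transfer in a hybrid cache, where
# forwarded shared blocks are kept in SRAM instead of being written into NVM.

class Core:
    def __init__(self, name):
        self.name = name
        self.sram = set()    # tags cached in the SRAM region
        self.nvm = set()     # tags cached in the NVM region

    def holds(self, tag):
        return tag in self.sram or tag in self.nvm

nvm_writes_avoided = 0

def read(requester, peers, tag):
    global nvm_writes_avoided
    if requester.holds(tag):
        return "hit"
    for peer in peers:
        if peer.holds(tag):
            # Direct cache-to-cache transfer: keep the forwarded copy in SRAM,
            # avoiding a write into the requester's NVM region.
            requester.sram.add(tag)
            nvm_writes_avoided += 1
            return f"forwarded from {peer.name}"
    # Miss everywhere: fetch from memory and allocate in the NVM region.
    requester.nvm.add(tag)
    return "filled from memory"

c0, c1 = Core("core0"), Core("core1")
print(read(c1, [c0], 0xA))   # miss everywhere -> NVM fill in core1
print(read(c0, [c1], 0xA))   # forwarded from core1 -> SRAM, NVM write avoided
print("NVM writes avoided:", nvm_writes_avoided)
```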

Enhancing GPU Performance by Efficient Hardware-Based and Hybrid L1 Data Cache Bypassing

  • Huangfu, Yijie;Zhang, Wei
    • Journal of Computing Science and Engineering / Vol. 11, No. 2 / pp.69-77 / 2017
  • Recent GPUs have adopted cache memory to benefit general-purpose GPU (GPGPU) programs. However, unlike CPU programs, GPGPU programs typically have considerably less temporal/spatial locality. Moreover, the L1 data cache is used by many threads that access a data size typically considerably larger than the L1 cache, making it critical to bypass the L1 data cache intelligently to enhance GPU cache performance. In this paper, we examine GPU cache access behavior and propose a simple hardware-based GPU cache bypassing method that can be applied to GPU applications without recompiling programs. Moreover, we introduce a hybrid method that integrates static profiling information and hardware-based bypassing to further enhance performance. Our experimental results reveal that hardware-based cache bypassing can boost performance for most benchmarks, and the hybrid method can achieve performance comparable to state-of-the-art compiler-based bypassing with considerably less profiling cost.
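One common hardware flavor of such bypassing is a small table of saturating counters indexed by the load instruction's PC; the sketch below illustrates that general idea under invented table sizes and thresholds, and is not the specific mechanism of this paper.

```python
# Hypothetical sketch of hardware-style L1 bypassing for GPU loads: a table of
# saturating counters indexed by the load's PC. Loads whose past accesses
# rarely hit in L1 are bypassed. Thresholds and counter widths are invented.

class BypassPredictor:
    def __init__(self, threshold=2, max_count=7):
        self.counters = {}            # PC -> saturating "usefulness" counter
        self.threshold = threshold
        self.max_count = max_count

    def should_bypass(self, pc):
        # Bypass once the counter for this load has decayed below the threshold.
        return self.counters.get(pc, self.max_count) < self.threshold

    def update(self, pc, l1_hit):
        c = self.counters.get(pc, self.max_count)
        c = min(self.max_count, c + 1) if l1_hit else max(0, c - 1)
        self.counters[pc] = c

pred = BypassPredictor()
# Toy trace: the load at PC 0x40 streams through data and never hits in L1.
for _ in range(8):
    pred.update(pc=0x40, l1_hit=False)
print("bypass PC 0x40:", pred.should_bypass(0x40))   # True  -> skip L1
print("bypass PC 0x80:", pred.should_bypass(0x80))   # False -> use L1
```

The hybrid method in the paper additionally folds in static profiling information; in a sketch like this, that would amount to pre-seeding the counters before execution.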

Delay Reduction by Providing Location Based Services using Hybrid Cache in peer to peer Networks

  • Krishnan, C. Gopala;Rengarajan, A.;Manikandan, R.
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 9, No. 6 / pp.2078-2094 / 2015
  • Nowadays, efficient processing of broadcast queries (BQs) is of critical importance with the ever-increasing deployment and use of mobile technologies. BQs have certain unique characteristics that traditional spatial query processing in centralized databases does not address. The novel query processing technique reduces latency considerably in answering BQs while maintaining high scalability and accuracy. The approach is based on peer-to-peer sharing, which enables queries to be processed without delay at a mobile host by using query results cached in its neighboring mobile peers. We design and evaluate cooperative caching techniques to efficiently support data access in ad hoc networks. We first propose two schemes: Cache Data, which caches the data, and Cache Path, which caches the data path. After analyzing the performance of these two schemes, we propose a hybrid approach (Hybrid Cache), which further improves performance by taking advantage of Cache Data and Cache Path while avoiding their weaknesses. Cache replacement policies are also studied to further improve performance. Simulation results show that the proposed schemes can significantly reduce query delay and message complexity compared to other caching schemes.
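The Hybrid Cache decision can be summarized by a simple rule of thumb: cache the data itself when the object is small, and cache only the path to a nearby holder when the object is large but the holder is close. The sketch below illustrates that rule with made-up size and hop-count thresholds; it is not the paper's exact policy.

```python
# Hypothetical sketch of the Hybrid Cache decision in cooperative caching for
# ad hoc networks: CacheData for small items, CachePath for nearby holders.

SIZE_LIMIT = 4 * 1024     # bytes: small items are worth storing locally
HOP_LIMIT = 3             # caching a path only pays off if the holder is near

def hybrid_cache_decision(item_size, hops_to_holder):
    """Return which form of caching a forwarding node should apply."""
    if item_size <= SIZE_LIMIT:
        return "CacheData"          # store the object itself
    if hops_to_holder <= HOP_LIMIT:
        return "CachePath"          # remember who has it, not the bytes
    return "NoCache"                # neither form is expected to pay off

print(hybrid_cache_decision(item_size=2048, hops_to_holder=5))       # CacheData
print(hybrid_cache_decision(item_size=64 * 1024, hops_to_holder=2))  # CachePath
print(hybrid_cache_decision(item_size=64 * 1024, hops_to_holder=7))  # NoCache
```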

An Efficient Variable Rearrangement Technique for STT-RAM Based Hybrid Caches

  • 윤종희;조두산
    • 대한임베디드공학회논문지 / Vol. 11, No. 2 / pp.67-78 / 2016
  • The emerging Spin-Transfer Torque RAM (STT-RAM) is a promising component that can improve efficiency thanks to its high storage density and low leakage power. However, state-of-the-art STT-RAM is not ready to replace SRAM technology due to the negative effect of its write operations, which require longer latency and more power than the same operations in SRAM. Therefore, a hybrid cache with SRAM and STT-RAM technologies has been proposed to obtain the benefits of STT-RAM while minimizing its negative effects by using SRAM. To use the hybrid cache efficiently, it matters where write-intensive data are placed: such data should be placed in SRAM to minimize the negative effect. Thus, we propose a technique that optimizes the placement of data in main memory, deriving the proper trade-off between the advantages and disadvantages of SRAM and STT-RAM in the hybrid cache. As a result of the proposed technique, write-intensive data are loaded into SRAM and read-intensive data are loaded into STT-RAM. In addition, our technique also optimizes temporal locality to minimize conflict misses. Therefore, it improves the performance and energy consumption of the hybrid cache architecture to a certain extent.
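A crude sketch of profile-guided variable rearrangement is shown below: variables with a high profiled write ratio are packed into one contiguous region (assumed to be the part of the address space steered to SRAM), read-mostly variables into another (for STT-RAM). The address-to-region model, the cutoff, and the profile format are all assumptions for illustration, not the paper's algorithm.

```python
# Hypothetical sketch of profile-guided variable rearrangement for a hybrid cache.

def rearrange(variables, write_ratio_cutoff=0.5):
    """variables: list of (name, size_bytes, reads, writes) profile entries."""
    def write_ratio(v):
        _, _, reads, writes = v
        total = reads + writes
        return writes / total if total else 0.0

    hot_writes = [v for v in variables if write_ratio(v) >= write_ratio_cutoff]
    read_mostly = [v for v in variables if write_ratio(v) < write_ratio_cutoff]

    layout, offset = [], 0
    # Place write-intensive variables first (SRAM-mapped region), then the rest.
    for group, region in ((hot_writes, "SRAM"), (read_mostly, "STT-RAM")):
        for name, size, _, _ in sorted(group, key=lambda v: -v[1]):
            layout.append((name, offset, region))
            offset += size
    return layout

profile = [("counter", 8, 10, 900), ("table", 4096, 5000, 20),
           ("log_buf", 256, 50, 400), ("coeffs", 1024, 8000, 0)]
for name, offset, region in rearrange(profile):
    print(f"{name:8s} @ offset {offset:5d} -> {region} region")
```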

고성능 내장형 프로세서의 에너지 소비 감소를 위한 데이타 캐쉬 통합 설계 방법 (Hybrid Scheme of Data Cache Design for Reducing Energy Consumption in High Performance Embedded Processor)

  • 심성훈;김철홍;장성태;전주식
    • 한국정보과학회논문지:시스템및이론 / Vol. 33, No. 3 / pp.166-177 / 2006
  • In today's embedded processors, cache sizes keep increasing, driven by higher transistor density and lower supply voltage. As caches grow, however, they consume more energy, so the share of total processor energy spent in the cache keeps rising. Many techniques have been proposed to reduce cache energy, but each of them targets only one of the two components of cache energy consumption: static energy or dynamic energy. This paper proposes a hybrid scheme for high-performance embedded processors that reduces static and dynamic cache energy consumption at the same time. The scheme combines two previously proposed techniques: way prediction, which reduces dynamic energy, and the drowsy cache technique, which reduces static energy. In addition, we propose an "early wake-up of drowsy data cache lines using the program counter" technique to reduce the extra execution cycles introduced by the drowsy cache. These techniques are applied to the level-1 data cache. With the proposed hybrid scheme, static and dynamic data cache energy are reduced simultaneously, and the proposed early wake-up technique reduces the additional execution cycles caused by the combined scheme.
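The early wake-up idea can be illustrated with a tiny model: a PC-indexed table remembers which cache line a load touched last time, and that line is woken before the access so the drowsy wake-up penalty is hidden. The penalty value, table organization, and timing below are invented for illustration and do not reproduce the paper's mechanism.

```python
# Hypothetical sketch of PC-based early wake-up of drowsy data-cache lines.

WAKEUP_PENALTY = 1   # assumed extra cycles to access a line that is still drowsy

class DrowsyCacheModel:
    def __init__(self):
        self.drowsy = {}       # line -> True if in low-power (drowsy) mode
        self.predictor = {}    # load PC -> line it touched last time

    def pre_wake(self, pc):
        # Early wake-up: speculatively wake the line this PC used previously.
        line = self.predictor.get(pc)
        if line is not None:
            self.drowsy[line] = False

    def access(self, pc, line):
        penalty = WAKEUP_PENALTY if self.drowsy.get(line, True) else 0
        self.drowsy[line] = False       # the line is now awake
        self.predictor[pc] = line       # remember the mapping for next time
        return penalty

cache = DrowsyCacheModel()
print("first access penalty :", cache.access(pc=0x400, line=12))  # drowsy -> 1
cache.drowsy[12] = True                 # the line drifts back into drowsy mode
cache.pre_wake(pc=0x400)                # the predictor wakes it ahead of time
print("second access penalty:", cache.access(pc=0x400, line=12))  # hidden -> 0
```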

A New Hybrid Architecture for Cooperative Web Caching

  • Baek, Jin-Suk;Kaur, Gurpreet;Yang, Jung-Hoon
    • Journal of Ubiquitous Convergence Technology / Vol. 2, No. 1 / pp.1-11 / 2008
  • An effective solution to the problems caused by the explosive growth of the World Wide Web is web caching, which employs an additional server, called a proxy cache, between the clients and the main server to cache popular web objects near the clients. However, a single proxy cache can easily become a bottleneck. Deploying groups of cooperative caches provides scalability and robustness by eliminating the limitations of a single proxy cache. Two common architectures for implementing cooperative caching are hierarchical and distributed caching systems. Unfortunately, both architectures suffer from performance limitations. We propose an efficient hybrid caching architecture that eliminates these limitations by using both hierarchical and same-level caches. Our performance evaluation with our simulator shows that the proposed architecture offers the best of both existing architectures in terms of cache hit rate, the number of query messages from clients, and response time.

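The lookup order in such a hybrid architecture is easy to sketch: on a local miss, same-level siblings are queried first (the distributed part); only if they also miss is the request passed up to the parent (the hierarchical part). The topology, cache contents, and fetch-and-store behavior below are assumptions for illustration.

```python
# Hypothetical sketch of a hybrid cooperative web-cache lookup path.

class ProxyCache:
    def __init__(self, name, objects=(), siblings=(), parent=None):
        self.name = name
        self.store = set(objects)
        self.siblings = list(siblings)
        self.parent = parent

    def lookup(self, url):
        if url in self.store:
            return f"{self.name}: local hit"
        for sib in self.siblings:                 # distributed step: ask peers
            if url in sib.store:
                self.store.add(url)               # keep a copy of the fetched object
                return f"{self.name}: sibling hit at {sib.name}"
        if self.parent is not None:               # hierarchical step: ask the parent
            result = self.parent.lookup(url)
            self.store.add(url)
            return f"{self.name}: via parent ({result})"
        self.store.add(url)
        return f"{self.name}: fetched from origin server"

root = ProxyCache("root", objects={"/news"})
leaf_b = ProxyCache("leaf-b", objects={"/video"}, parent=root)
leaf_a = ProxyCache("leaf-a", siblings=[leaf_b], parent=root)

print(leaf_a.lookup("/video"))   # served by the sibling, not the parent
print(leaf_a.lookup("/news"))    # siblings miss, the parent supplies it
```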

A Hybrid Prefix Caching Scheme for Efficient IP Address Lookup

  • Kim, Jinsoo;Kim, Junghwan
    • 한국컴퓨터정보학회논문지 / Vol. 20, No. 12 / pp.45-52 / 2015
  • We propose a hybrid prefix caching scheme to enable high-speed IP address lookup. For correct IP lookup, the prefixes loaded in a prefix cache must not overlap in address range, so every non-leaf prefix needs to be expanded so that it does not overlap with longer prefixes. A shorter expanded prefix is preferable because it covers a wider address range with a single cache entry. We exploit the advantages of two dynamic prefix expansion techniques: bounded prefix expansion and bitmap-based prefix expansion. The proposed scheme uses dual bound values, whereas only one bound value is used in bounded prefix expansion. Our technique associates the dual bound values flexibly with several subtries using bitmap information, rather than with fixed subtries. We evaluate the performance of the proposed scheme in terms of the average length of the expanded prefixes and the cache miss ratio. The experimental results show that the proposed scheme has a lower cache miss ratio than previous schemes, including both bounded prefix expansion and bitmap-based expansion, irrespective of the cache size.
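The sketch below shows only the basic requirement the abstract starts from: a non-leaf prefix cannot be cached as-is, because the cache could then answer an address that actually matches a longer, more specific prefix, so it must be expanded into disjoint sub-prefixes. It uses a single bound value and does not model the paper's dual-bound/bitmap machinery; the bit-string prefix representation is an assumption for readability.

```python
# Hypothetical sketch of prefix expansion for a prefix cache: expand a non-leaf
# prefix so that no cached entry overlaps a longer prefix in the table.

from itertools import product

def expand(prefix, table, bound):
    """prefix: bit string like '1'; table: all prefixes in the routing table."""
    longer = [p for p in table if p != prefix and p.startswith(prefix)]
    if not longer:
        return [prefix]                      # leaf prefix: cacheable as-is
    width = max(bound, max(len(p) for p in longer)) - len(prefix)
    expanded = []
    for bits in product("01", repeat=width):
        candidate = prefix + "".join(bits)
        # Keep only expansions not owned by a more specific table prefix.
        if not any(candidate.startswith(p) for p in longer):
            expanded.append(candidate)
    return expanded

table = ["1", "101", "1011"]
print(expand("1", table, bound=3))
# ['1000', '1001', '1100', '1101', '1110', '1111'] -- disjoint from '101'/'1011'
```

Every expanded entry inherits the original prefix's forwarding decision, and because none of them covers a more specific prefix, a cache hit can never return a stale shorter match.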

초마디 멀티프런탈 방법의 효율적인 구현 (An Efficient Implementation of the Supernodal Multifrontal Method)

  • 박찬규;박순달
    • 경영과학 / Vol. 19, No. 2 / pp.155-168 / 2002
  • In this paper, some efficient implementation techniques for the multifrontal method, which can be used to compute the Cholesky factor of a symmetric positive definite matrix, are presented. In order to exploit the cache effect of cache-based computer architectures, a hybrid method for factorizing a frontal matrix is considered. This hybrid method uses the column Cholesky method and the submatrix Cholesky method alternately. Experiments show that the hybrid method speeds up the supernodal multifrontal method by 5%~10%, and that it is superior to the plain Cholesky method in some problems with dense columns or large frontal matrices.
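The general flavor of such a hybrid can be shown on a dense matrix: each panel is factored column by column (column Cholesky), and the trailing submatrix then receives one blocked rank-k update (submatrix Cholesky), which is the cache-friendly step. The sketch below illustrates this standard blocked pattern under an arbitrary block size; it is not the paper's frontal-matrix implementation.

```python
# Hypothetical sketch: alternating column Cholesky (within a panel) with a
# submatrix (right-looking) update of the trailing block.

import numpy as np

def hybrid_cholesky(A, block=2):
    """Return lower-triangular L with A = L @ L.T for an SPD matrix A."""
    A = A.copy()
    n = A.shape[0]
    for k in range(0, n, block):
        b = min(block, n - k)
        # Column Cholesky inside the panel: columns k .. k+b-1.
        for j in range(k, k + b):
            A[j, j] = np.sqrt(A[j, j] - A[j, k:j] @ A[j, k:j])
            A[j + 1:, j] = (A[j + 1:, j] - A[j + 1:, k:j] @ A[j, k:j]) / A[j, j]
        # Submatrix step: one blocked rank-b update of the trailing matrix.
        P = A[k + b:, k:k + b]
        A[k + b:, k + b:] -= P @ P.T
    return np.tril(A)

A = np.array([[10.0, 2.0, 1.0, 0.5],
              [ 2.0, 8.0, 0.3, 0.2],
              [ 1.0, 0.3, 6.0, 0.1],
              [ 0.5, 0.2, 0.1, 5.0]])
L = hybrid_cholesky(A)
print(np.allclose(L @ L.T, A))   # True: the factorization is consistent
```

The cache benefit comes from the rank-b update, which touches the trailing submatrix once per panel with matrix-matrix operations instead of once per column with matrix-vector operations.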