• Title/Summary/Keyword: 캐쉬 (cache)

Search Results: 594

L2A Cache Replacement Scheme for Label Switching Network (레이블 스위칭 네트웍 상에서 L2A 캐쉬 대체기법)

  • 김남기;황인철;윤현수
    • Proceedings of the Korea Multimedia Society Conference / 2000.04a / pp.386-389 / 2000
  • As the Internet has grown rapidly, traffic has increased explosively, placing a heavy burden on today's routers. Switching, on the other hand, can forward data faster than routing. As a result, label switching networks, which graft switching technology onto IP routing, have emerged to relieve the router bottleneck. In data-driven label switching, cache table management is critical: the cache table stores the information needed for flow classification and for label switching, and since its size is constrained by router resources, a cache replacement scheme is required. Efficient cache table management therefore calls for replacement schemes that reflect the characteristics of Internet traffic. This paper proposes the L2A cache replacement scheme, which takes Internet traffic characteristics into account and compensates for the weaknesses of the LFC and LRU schemes. L2A outperforms the basic FIFO, LFC, and LRU schemes, and maintains a clear advantage over the other schemes especially when the cache is small.

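The abstract names L2A only as a scheme that combines the strengths of LFC and LRU, without giving its scoring rule. The Python sketch below is therefore an assumed illustration of that general idea (an LRFU-style blended score over a flow table), not the paper's actual L2A algorithm; the weight `alpha` and the table layout are invented for the example.

```python
import time

class FlowCache:
    """Fixed-size flow table whose victim is chosen by a combined
    recency/frequency score (illustrative only; not the paper's L2A)."""

    def __init__(self, capacity, alpha=0.5):
        self.capacity = capacity
        self.alpha = alpha      # blend: 1.0 = pure recency, 0.0 = pure frequency
        self.table = {}         # flow_id -> [hit_count, last_access_time]

    def access(self, flow_id):
        now = time.monotonic()
        if flow_id in self.table:
            entry = self.table[flow_id]
            entry[0] += 1
            entry[1] = now
            return True         # label hit: the packet is switched
        if len(self.table) >= self.capacity:
            del self.table[self._victim(now)]
        self.table[flow_id] = [1, now]
        return False            # miss: the packet is routed, a label is set up

    def _victim(self, now):
        # Evict the flow with the lowest combined score, i.e. the one that is
        # both least recently and least frequently used.
        def score(fid):
            hits, last = self.table[fid]
            recency = 1.0 / (1.0 + (now - last))
            return self.alpha * recency + (1.0 - self.alpha) * hits
        return min(self.table, key=score)
```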

Design and Implementation of a Home-based Cooperative Cache for PVFS (PVFS를 위한 홈 기반 상호 협력 캐쉬의 설계 및 구현)

  • 황인철;정한조;맹승렬;조정완
    • Proceedings of the Korean Information Science Society Conference / 2004.04a / pp.58-60 / 2004
  • With the recent surge of research on cluster computing, which ties inexpensive PCs together over a fast network to obtain high performance, distributed file systems have been developed to serve files efficiently from disks that are much slower than the CPU, memory, and network. Among them, PVFS provides fast file service through parallel I/O on Linux, the operating system most widely used for cluster computing. Because the original PVFS provides no caching, a cooperative cache for PVFS was designed and implemented to improve read performance. That earlier cooperative cache is hint-based, and its reliance on imprecise reads and writes inflates the read/write overhead. This paper therefore designs and implements a home-based cooperative cache for PVFS to improve on the read/write performance of the existing design, and compares and analyzes the performance of plain PVFS, the existing hint-based cooperative cache for PVFS, and the proposed home-based cooperative cache for PVFS.

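The defining property of a home-based cooperative cache is that every block has one fixed home node, so a client always knows where to look for a cached copy without exchanging hints. The sketch below illustrates that lookup order under assumed details: blocks are hashed to homes, `peers` maps node ids to node caches (including this node's own), and `read_from_server` stands in for a PVFS I/O server read; none of these names come from the paper.

```python
def home_of(block_no, num_nodes):
    """Every block has one fixed home node, found by hashing its number."""
    return hash(block_no) % num_nodes

class HomeBasedCache:
    def __init__(self, node_id, num_nodes, peers):
        self.node_id = node_id
        self.num_nodes = num_nodes
        self.peers = peers          # node_id -> HomeBasedCache, self included
        self.local = {}             # block_no -> data

    def read(self, block_no, read_from_server):
        if block_no in self.local:              # 1) local hit
            return self.local[block_no]
        home = home_of(block_no, self.num_nodes)
        home_cache = self.peers[home].local     # 2) ask the block's home node
        if block_no in home_cache:
            data = home_cache[block_no]
        else:                                   # 3) fall back to the I/O server
            data = read_from_server(block_no)
            home_cache[block_no] = data         # home keeps the shared copy
        self.local[block_no] = data
        return data
```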

Specific-Way Cache System: An Efficient Location Cache System (로케이션 캐쉬 시스템의 효율을 개선한 스피시픽-웨이 캐쉬 시스템)

  • Yun, Sang-Ho;Lee, In-Hwan
    • Proceedings of the Korean Information Science Society Conference / 2007.10b / pp.243-246 / 2007
  • A set-associative cache achieves a higher hit rate than a direct-mapped cache, but at the cost of higher power consumption. Techniques such as the way-predicting set-associative cache and the location cache system have been studied to mitigate this cost. This paper examines the issues that arise in a location cache system and proposes the specific-way cache system, which overcomes them efficiently. We measured the performance of the specific-way cache system using SimpleScalar and MiBench, and observed a prediction hit rate of 39.6%.

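A location cache system saves set-associative power by remembering which way a requested block lives in, so only that one way needs to be probed on a correct prediction. The sketch below shows that lookup path for a 4-way cache, with `ways_probed` as a crude dynamic-energy proxy; the prediction table is a simplification, the fill path on a miss is omitted, and the paper's specific-way refinement is not reconstructed from the abstract.

```python
class WayPredictedCache:
    """4-way set-associative cache with a small way-prediction table.
    A correct prediction probes 1 way; a misprediction probes all 4."""

    WAYS = 4

    def __init__(self, num_sets):
        self.sets = [[None] * self.WAYS for _ in range(num_sets)]  # stored tags
        self.predicted_way = {}     # set_index -> last way that hit
        self.ways_probed = 0        # proxy for dynamic energy spent on probes

    def lookup(self, set_index, tag):
        ways = self.sets[set_index]
        guess = self.predicted_way.get(set_index)
        if guess is not None:
            self.ways_probed += 1               # probe only the predicted way
            if ways[guess] == tag:
                return True                     # predicted hit: minimal energy
        # Misprediction or no prediction: probe the remaining ways.
        for w, t in enumerate(ways):
            if w == guess:
                continue
            self.ways_probed += 1
            if t == tag:
                self.predicted_way[set_index] = w
                return True
        return False                            # miss handling (fill) omitted
```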

An Analysis on The Optimal Partitioning Configuration of Cache for Meeting Deadlines of Real-Time Tasks (실시간 태스크의 마감시간 만족을 위한 캐쉬 최적 분할 형태의 분석)

  • Kim, Myung-Hee;Joo, Su-Chong
    • The Transactions of the Korea Information Processing Society / v.4 no.11 / pp.2891-2902 / 1997
  • This paper presents an analysis of the optimal cache (memory) partitioning configuration for meeting the deadlines of a set of periodic and aperiodic real-time tasks. Our goal is not only to decrease each task's deadline miss ratio by minimizing task utilization, but also to allocate additional tasks to idle cache space. To this end, we suggest an algorithm for allocating tasks to cache segments. Here, the set of cache segments allocated to tasks is called a cache partitioning configuration. Depending on how tasks are allocated to cache segments, various cache partitioning configurations are possible. From these configurations, we obtain the bound on task utilization within which the tasks remain schedulable, and analyze the optimal cache partitioning configuration, the one that minimizes task utilization.

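The schedulability test underlying such an analysis is a utilization bound: each task's worst-case execution time (WCET) shrinks as it receives more cache segments, and a partitioning configuration is feasible only while total utilization stays under the bound. The sketch below uses a hypothetical WCET model; `slowdown_per_miss`, the task parameters, and the utilization bound are all invented for illustration and do not come from the paper.

```python
def wcet(base_wcet, segments, slowdown_per_miss=0.2, max_segments=8):
    """Hypothetical model: more cache segments -> fewer misses -> smaller WCET.
    Real WCETs must be measured per partitioning configuration."""
    miss_factor = 1.0 + slowdown_per_miss * (max_segments - segments) / max_segments
    return base_wcet * miss_factor

def schedulable(tasks, allocation, bound=1.0):
    """tasks: list of (base_wcet, period); allocation: segments per task.
    Utilization test: the configuration meets all deadlines if U <= bound."""
    u = sum(wcet(c, s) / t for (c, t), s in zip(tasks, allocation))
    return u <= bound, u

# Example: compare two ways to split 8 cache segments among 3 tasks.
tasks = [(2.0, 10.0), (3.0, 20.0), (1.0, 5.0)]
for alloc in ([4, 2, 2], [2, 2, 4]):
    ok, u = schedulable(tasks, alloc)
    print(alloc, "U = %.3f" % u, "schedulable" if ok else "infeasible")
```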

Data Cache System based on the Selective Bank Algorithm for Embedded System (내장형 시스템을 위한 선택적 뱅크 알고리즘을 이용한 데이터 캐쉬 시스템)

  • Jung, Bo-Sung;Lee, Jung-Hoon
    • The KIPS Transactions:PartA / v.16A no.2 / pp.69-78 / 2009
  • One of the most effective ways to improve cache performance is to exploit both the temporal and spatial locality exhibited by a program's execution characteristics. In this paper we present a high-performance, low-power cache structure with a bank selection mechanism that enhances exploitation of spatial and temporal locality. The proposed cache system consists of two parts: a main direct-mapped cache with a small block size, and a fully associative buffer with a large block size that is a multiple of the small block size. The main direct-mapped cache is constructed as two banks for low power consumption and stores small blocks selected from the fully associative buffer by the proposed bank selection algorithm. Using the bank selection algorithm and three state bits, we selectively extend the lifetime of small blocks with high temporal locality by storing them in the main direct-mapped cache. This approach effectively reduces conflict misses and cache pollution at the same time. According to the simulation results, the average miss ratio, compared with victim and STAS caches of the same size, improves by about 23% and 32%, respectively, for MiBench applications. The average memory access time is reduced by about 14% and 18% compared with the victim and STAS caches, respectively. Energy consumption of the proposed cache is also around 10% lower than that of the other cache systems we examine.

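As a rough picture of the access path described above, the sketch below models a two-bank direct-mapped cache of small blocks backed by a fully associative buffer of large blocks. The bank hash (one address bit) and the promote-on-buffer-hit rule are assumptions standing in for the paper's three-state-bit selection algorithm, which the abstract does not fully specify.

```python
from collections import OrderedDict

class BankedCache:
    """Two direct-mapped banks of small blocks plus a fully associative
    buffer of large blocks (assumed: 4 small blocks per large block)."""

    def __init__(self, sets_per_bank=64, buffer_entries=16):
        self.banks = [dict(), dict()]           # set_index -> small-block tag
        self.sets = sets_per_bank
        self.buffer = OrderedDict()             # large-block tags in LRU order
        self.buffer_entries = buffer_entries

    def access(self, addr):
        bank_id = addr & 1                      # assumed bank hash: one address bit
        idx = addr % self.sets
        tag = addr // self.sets
        if self.banks[bank_id].get(idx) == tag:
            return "bank hit"                   # cheapest case: one bank probed
        large_tag = addr // 4
        if large_tag in self.buffer:
            self.buffer.move_to_end(large_tag)  # refresh LRU position
            self.banks[bank_id][idx] = tag      # promote: temporal locality expected
            return "buffer hit"
        if len(self.buffer) >= self.buffer_entries:
            self.buffer.popitem(last=False)     # evict the LRU large block
        self.buffer[large_tag] = True           # fetch large block into the buffer
        return "miss"
```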
A New Cache Replacement Policy for Improving Last Level Cache Performance (라스트 레벨 캐쉬 성능 향상을 위한 캐쉬 교체 기법 연구)

  • Do, Cong Thuan;Son, Dong Oh;Kim, Jong Myon;Kim, Cheol Hong
    • Journal of KIISE / v.41 no.11 / pp.871-877 / 2014
  • Cache replacement algorithms have been developed to reduce miss counts. In modern processors, the performance gap between the processor and main memory has been increasing, giving cache replacement policies a more important role. The Least Recently Used (LRU) policy is one of the most common policies used in modern processors. However, recent research has shown that the performance gap between LRU and the theoretical optimal replacement algorithm (OPT) is large. Although LRU replacement has repeatedly been proven adequate, the OPT/LRU performance gap keeps widening as cache associativity grows. In this study, we observed a potential opportunity to improve cache performance within the existing LRU mechanism. We propose a method that enhances the LRU replacement algorithm based on the proportion of accesses among the lines in a cache set during the period between two successive replacements, which then determines the final replacement decision. Our experimental results reveal that the proposed method reduces the average miss rate of the baseline 512KB L2 cache by 15 percent compared to conventional LRU. In addition, the performance of a processor using the proposed cache replacement policy improves by 4.7 percent over LRU, on average.

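The abstract describes the idea but not the exact bookkeeping, so the following is a hedged reconstruction: each line counts its accesses since the set's last replacement, and the LRU victim is spared once (the next-LRU line is evicted instead) when it absorbed a clearly dominant share of those accesses. The `share_threshold` value and the per-period counter reset are assumptions, not the paper's parameters.

```python
from collections import OrderedDict

class ProportionAwareLRUSet:
    """One cache set. Victim = LRU line, unless that line absorbed a large
    share of the accesses since the last replacement (then spare it once)."""

    def __init__(self, ways, share_threshold=0.5):
        self.ways = ways
        self.threshold = share_threshold
        self.lines = OrderedDict()   # tag -> accesses since last replacement

    def access(self, tag):
        if tag in self.lines:
            self.lines[tag] += 1
            self.lines.move_to_end(tag)       # most recently used at the end
            return True
        if len(self.lines) == self.ways:
            self._replace()
        self.lines[tag] = 1
        return False

    def _replace(self):
        total = sum(self.lines.values()) or 1
        order = iter(self.lines)              # iterates from LRU to MRU
        victim = next(order)
        if self.lines[victim] / total > self.threshold and self.ways > 1:
            victim = next(order)              # spare a heavily accessed LRU line
        del self.lines[victim]
        for t in self.lines:                  # start a new measurement period
            self.lines[t] = 0
```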
An Energy-Delay Efficient System with Adaptive Victim Caches (선택적 희생 캐쉬를 이용한 저전력 고성능 시스템 설계 방안)

  • Kim Cheol Hong;Shim Sunghoon;Jhon Chu Shik;Jhang Seong Tae
    • Journal of KIISE:Computer Systems and Theory / v.32 no.11_12 / pp.663-674 / 2005
  • We propose a system aimed at achieving high energy-delay efficiency by using adaptive victim caches. In particular, we investigate methods to improve the hit rate in the first level of the memory hierarchy, which reduces the number of accesses to more power-hungry memory structures such as the L2 cache. A victim cache is a memory element for reducing conflict misses in a direct-mapped L1 cache. We present two techniques to fill the victim cache with the blocks that have a higher probability of being re-requested by the processor. The hit-based victim cache is filled with blocks that were referenced frequently by the processor. The replacement-based victim cache is filled with blocks evicted from sets where block replacements happened frequently. According to our simulations, the replacement-based victim cache scheme outperforms the conventional victim cache scheme by about 2% on average and reduces power consumption by up to 8%.

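A minimal sketch of the replacement-based fill policy: a saturating counter per L1 set tracks how often that set replaces blocks, and an evicted block is admitted to the victim cache only when it comes from a set that replaces frequently, since those conflict-heavy sets produce the victims most likely to be re-requested. The counter width and threshold are assumed values.

```python
from collections import OrderedDict

class ReplacementFilteredVictimCache:
    """Fill the victim cache only from L1 sets with frequent replacements."""

    def __init__(self, entries=8, threshold=4, counter_max=15):
        self.entries = entries
        self.threshold = threshold
        self.counter_max = counter_max
        self.repl_count = {}          # L1 set index -> saturating counter
        self.victims = OrderedDict()  # block tag -> data, in LRU order

    def on_l1_eviction(self, set_index, tag, data):
        c = min(self.repl_count.get(set_index, 0) + 1, self.counter_max)
        self.repl_count[set_index] = c
        if c < self.threshold:
            return                    # quiet set: its victim is unlikely to return
        if len(self.victims) >= self.entries:
            self.victims.popitem(last=False)   # evict the LRU victim entry
        self.victims[tag] = data

    def probe(self, tag):
        if tag in self.victims:
            return self.victims.pop(tag)       # hit: block moves back to L1
        return None
```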
Advanced Victim Cache with Processor Reuse Information (프로세서의 재사용 정보를 이용하는 개선된 고성능 희생 캐쉬)

  • Kwak Jong Wook;Lee Hyunbae;Jhang Seong Tae;Jhon Chu Shik
    • Journal of KIISE:Computer Systems and Theory / v.31 no.12 / pp.704-715 / 2004
  • Recently, single- and multiprocessor systems have used a hierarchical memory structure to bridge the gap between processor clock rate and memory access time; in particular, a cache memory system includes two or three levels of caches to reduce this gap. One of the most important factors in such a hierarchy is the hit rate of the level 1 cache, because it interfaces directly with the processor, so a high level 1 hit rate is critical for system performance. A victim cache, another high-level cache, is also important in assisting the level 1 cache by reducing its conflict misses. In this paper, we propose an advanced high-level cache management scheme based on processor reuse information. The technique is a cache replacement policy that uses the frequency of the processor's memory accesses and keeps more frequently accessed addresses resident in the cache longer than less frequently accessed ones. We simulate this policy using Augmint, an event-driven simulator, and analyze the results. The simulations show that the modified processor reuse information scheme (LIVMR) outperforms a level 1 cache with a simple victim cache (LIV) by 6.7% at maximum and 0.5% on average, and the benefit grows as the number of processors increases.

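The core mechanism, reduced to a sketch under assumptions (the exact LIVMR bookkeeping is in the paper, not the abstract): the victim cache records how often each block was accessed while it lived in L1, and evicts its least-reused entry first, so frequently reused blocks stay resident longer.

```python
class ReuseAwareVictimCache:
    """Victim cache whose own victim is the entry with the lowest recorded
    reuse frequency, so frequently reused blocks stay resident longer."""

    def __init__(self, entries=8):
        self.entries = entries
        self.blocks = {}              # tag -> access count while the block was in L1

    def insert(self, tag, l1_access_count):
        if len(self.blocks) >= self.entries:
            coldest = min(self.blocks, key=self.blocks.get)
            del self.blocks[coldest]  # evict the least-reused block first
        self.blocks[tag] = l1_access_count

    def probe(self, tag):
        if tag in self.blocks:
            return self.blocks.pop(tag)   # hit: block is promoted back to L1
        return None
```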
Determination of a Grain Size for Reducing Cache Miss Rate of Direct-Mapped Caches (직접 사상 캐쉬의 캐쉬 실패율을 감소시키기 위한 성김도 정책)

  • Jung, In-Bum;Kong, Ki-Sok;Lee, Joon-Won
    • Journal of KIISE:Computer Systems and Theory / v.27 no.7 / pp.665-674 / 2000
  • In data-parallel programs with high cache locality, the choice of grain size affects cache performance. Even when the chosen grain sizes provide fair load balance among processors, grain sizes that ignore the underlying caching effect cause address interference between the grains allocated to a processor. This interference hurts cache locality because it produces cache conflict misses. To address this problem, we propose a best grain size, derived from the cache size and the number of processors, based on the characteristics of direct-mapped caches. Since the proposed method does not map grains to the same locations in the cache, cache conflict misses are reduced. Simulation results show that the proposed best grain size substantially improves the performance of the tested data-parallel programs by reducing cache misses on direct-mapped caches.

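One plausible reading of that rule, sketched below purely as an assumption (the function and its parameters are invented, not the paper's formula): with grains handed out round-robin, a grain of cache_size / num_processors bytes makes the grains one processor receives tile a direct-mapped cache without overlapping, so they cannot conflict with each other.

```python
def best_grain_size(cache_size, num_processors, element_size):
    """Assumed reading of the rule: size the grain from the cache size and the
    processor count so that, under round-robin grain assignment, a processor's
    consecutive grains map to disjoint regions of a direct-mapped cache."""
    grain_bytes = cache_size // num_processors
    return max(1, grain_bytes // element_size)   # grain size in array elements

# Example: a 16 KB direct-mapped cache, 4 processors, 8-byte elements.
print(best_grain_size(16 * 1024, 4, 8))          # -> 512 elements per grain
```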

Design and Performance Evaluation of Expansion Buffer Cache (확장 버퍼 캐쉬의 설계 및 성능 평가)

  • Hong Won-Kee
    • The KIPS Transactions:PartA / v.11A no.7 s.91 / pp.489-498 / 2004
  • The VLIW processor is considered an appropriate processor for embedded systems, providing high performance and low power consumption thanks to its simple hardware structure. Unfortunately, the VLIW processor often suffers from high memory access latency due to the variable length of I-packets, which consist of independent instructions to be issued in parallel. Because of this variable length, some I-packets, called straddle I-packets, must span two cache blocks, so fetching them requires two cache accesses. In this paper, an expansion buffer cache is proposed to improve both the instruction fetch bandwidth and the power consumption of the I-cache at moderate hardware cost. The expansion buffer cache pairs the main cache with a small expansion buffer containing a fraction of each straddle packet, reducing the additional cache accesses caused by straddle I-packets. With this large reduction in straddle-induced cache accesses, the expansion buffer cache achieves a 5~9% improvement over conventional I-caches in the Delay·Power·Area metric.
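To make the fetch-path idea concrete, the sketch below counts cache accesses for an instruction stream: a straddle I-packet normally costs two block accesses, but when its tail is found in the small expansion buffer only the first access is needed. The buffer size, the indexing by packet start address, and the fill-on-miss policy are all assumptions, not the paper's design.

```python
class ExpansionBufferICache:
    """I-cache with a small side buffer caching the tails of straddle
    I-packets, turning their usual two accesses into one."""

    BLOCK = 8   # instruction slots per cache block (assumed)

    def __init__(self, buffer_entries=4):
        self.accesses = 0           # total cache-block accesses performed
        self.buffer = {}            # packet start address -> cached tail length
        self.buffer_entries = buffer_entries

    def fetch(self, start, length):
        end = start + length - 1
        straddles = start // self.BLOCK != end // self.BLOCK
        self.accesses += 1          # first (or only) cache-block access
        if not straddles:
            return
        if start in self.buffer:
            return                  # tail supplied by the expansion buffer
        self.accesses += 1          # second block access: the straddle penalty
        if len(self.buffer) >= self.buffer_entries:
            self.buffer.pop(next(iter(self.buffer)))   # FIFO eviction (assumed)
        self.buffer[start] = (end % self.BLOCK) + 1    # remember the tail
```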