Search | Korea Science

Filter Cache Predictor Using Mode Selection Bit (모드 선택 비트를 사용한 필터 캐시 예측기)

Kwak, Jong-Wook
- Journal of the Institute of Electronics Engineers of Korea CI
- /
- v.46 no.5
- /
- pp.1-13
- /
- 2009
Filter cache has been introduced as one solution of reducing cache power consumption. More than 50% of the power reduction results from the filter cache, whereas more than 20% of the performance is compromised. To minimize the performance degradation of the filter cache, the predictive filter cache has been proposed. In this paper, we review the previous filter cache predictors and analyze the problems of the solutions. As a result, we found main problems that cause prediction misses in previous filter cache schemes and, to resolve the problems, this paper proposes a new prediction policy. In our scheme, some reference bit entries, called MSBs, are inserted into filter cache and BTB, to adaptively control the filter cache access. In simulation parts, we use a modified SimpleScalar simulator with MiBench benchmark programs to verify the proposed filter cache. The simulation result shows in average 5% performance improvement, compared to previous ones.
PDF KSCI

Filter Cache Predictor using Mode Selection Bit (모드 선택 비트를 활용한 필터 캐시 예측 모델)

Kwak, Jong-Wook;Choi, Ju-Hee;Jhang, Seong-Tae;Jhon, Chu-Shik
- Proceedings of the Korea Information Processing Society Conference
- /
- 2008.05a
- /
- pp.493-495
- /
- 2008
캐시 에너지의 소비 전력을 줄이기 위해 필터 캐시가 제안되었다. 필터 캐시의 사용으로 인해 많은 전력 사용 감소 효과를 가져왔으나, 상대적으로 시스템 성능도 더불어 감소하게 되었다. 필터 캐시의 사용으로 인한 성능 감소를 최소화하기 위해서, 본 논문에서는 기존에 제안된 주요 필터 캐시 예측 모델들을 소개하며, 각각의 방식에 있어서의 핵심 특징 및 해당 방식의 문제점을 분석한다. 이를 바탕으로 본 논문에서는 모드 선택 비트를 활용하는 개선된 형태의 새로운 필터 캐시 예측기 모델을 제안한다. 제안된 방식은 MSB라 불리는 참조 비트를 고안하여, 이를 기존의 필터캐시와 BTB에 새롭게 활용한다. 실험 결과, 제안된 방식은 기존 방식 대비, 전력 소모량 시간 지연면에서 평균 5%의 성능 향상을 가져 왔다.
https://doi.org/10.3745/PKIPS.y2008m05a.493 인용 PDF

An Active Prefetch Filtering Schemes using Exclusive Prefetch Cache (선인출 전용 캐시를 이용한 적극적 선인출 필터링 기법)

Chon Young-Suk;Kim Suk-il;Jeon Joong-nam
- The KIPS Transactions:PartA
- /
- v.12A no.1 s.91
- /
- pp.41-52
- /
- 2005
Memory reference instruction caused by cache miss is the critical factor that limits the processing power of processor. Cache prefetching technique is an effective way to reduce the latency due to memory access. However, excessively aggressive prefetch leads to cache pollution and finally to cancel out the advantage of prefetch. In this study, an active prefetch filtering scheme is introduced which dynamically decides whether to commence prefetching after referring a filtering table to reduce the cache pollution due to unnecessary prefetches. For the precision filtering, an evicted address referencing scheme has been proposed where the filter directly compares the current prefetch address with previous unnecessary prefetch addresses stored in filtering table. Moreover, a small sized exclusive prefetch cache has been introduced to increase the amount of eviction of unnecessarily prefetched addresses to enhance the accuracy of dynamic filtering. The exclusive prefetch cache also prevents useful demand data from being pushed out by prefetched data, while the evicted address direct referencing scheme enables the prefetch cache to keep most of useful prefetch data within its small size. Experimental results from commonly used general and multimedia benchmarks show that the average cache miss ratio has been decreased by $13.3{\%}$ by virtue of enhanced filtering accuracy compared with conventional schemes.
https://doi.org/10.3745/KIPSTA.2005.12A.1.041 인용 PDF KSCI

Dynamic Prefetch Filtering Schemes to Enhance Utilization of Data Cache (데이터 캐시의 활용도를 높이는 동적 선인출 필터링 기법)

전영숙;이병권;김석일;전중남
- Proceedings of the Korean Information Science Society Conference
- /
- 2004.10a
- /
- pp.562-564
- /
- 2004
캐시 선인출 기법은 메모리 참조에 따른 지연시간을 줄이는 효과적인 방법이다. 그러나 너무 적극적인 선인출은 캐시 오염을 유발시켜 선인출에 의한 장점을 상쇄시킨다. 본 연구에서는 캐시의 오염을 줄이기 위해 동적으로 필터 테이블을 참조하여 선인출 명령을 수행할 지의 여부를 결정하는 4가지 필터링 방법들을 비교 평가한다. 비교 연구를 위한 이상적인 필터링 구조를 제안하였으며, 기존 연구에서의 잠김 현상을 개선하기 위한 이진 상태 구조를 제안하였다. 또한, 정교한 필터링을 위한 블록주소 참조 방식을 제안하였다. 일반적으로 많이 사용되는 일반 벤치마크 프로그램과 멀티미디어 벤치마크 프로그램들에 대하여 실험한 결과, 캐시 미스율이 이진 상태 구조는 평균 5.6%, 블록주소 참조 구조는 7.9% 각각 감소하였다.
PDF

An L1 Cache Prefetching Scheme using Excessively Aggressive Prefetchering and a Small Direct-mapped Filtering Cache (공격적인 선인출 및 직접 사상 필터링을 이용한 L1 캐시 선인출 기법)

Chon, Young-Suk
- Journal of KIISE:Computer Systems and Theory
- /
- v.33 no.11
- /
- pp.836-852
- /
- 2006
This paper proposes an L1 cache prefetch scheme using an excessively aggressive hardware prefetcher and a hardware prefetch filter having a small direct-mapped filtering cache. A quantitative analysis method has been introduced and applied to analyze nonideal effects of aggressive cache prefetching. From those analysis results, the structure and algorithm of a prefetch filter has been derived and simulated, and the overall system performance has been measured using a cycle-by-cycle cache simulator. Experimental results show that the proposed scheme improves the overall system performance by 18% on the average over several benchmarks
PDF KSCI

Dynamic Prefetch Filtering Schemes to enhance Utilization of Data Cache (데이타 캐시의 활용도를 높이는 동적 선인출 필터링 기법)

Chon, Young-Suk;Kim, Suk-Il;Jeon, Joong-Nam
- Journal of KIISE:Computer Systems and Theory
- /
- v.35 no.1
- /
- pp.30-43
- /
- 2008
Memory reference instructions such as loads or stores are critical factors that limit the processing power of processor. The prefetching technique is an effective way to reduce the latency caused from memory access. However, excessively aggressive prefetch leads to cache pollution so as to cancel out the advantage of prefetch. In this study, four filtering schemes have been compared and evaluated which dynamically decide whether to begin prefetch after referring a filtering table to decrease cache pollution. First, A bi-states scheme has been shown to analyze the lock problem of the conventional scheme, this scheme such as conventional scheme used to be N:1 mapping, but it has the two state to 1bit value of each entries. A complete state scheme has been introduced to be used as a reference for the comparative study. A block address lookup scheme has been proposed as the main idea of this paper which exhibits the most exact filtering performance. This scheme has a length of the table the same as the bi-states scheme, the contents of each entry have the fields the same as the complete state scheme recently, never referenced data block address has been 1:1 mapping a entry of the filter table. Experimental results from commonly used general benchmarks and multimedia programs show that average cache miss ratio have been decreased by 10.5% for the block address lookup scheme(BAL) compare to conventional dynamic filter scheme(2-bitSC).
PDF KSCI

Improving Search Performance of Tries Data Structures for Network Filtering by Using Cache (네트워크 필터링에서 캐시를 적용한 트라이 구조의 탐색 성능 개선)

Kim, Hoyeon;Chung, Kyusik
- KIPS Transactions on Computer and Communication Systems
- /
- v.3 no.6
- /
- pp.179-188
- /
- 2014
Due to the tremendous amount and its rapid increase of network traffic, the performance of network equipments are becoming an important issue. Network filtering is one of primary functions affecting the performance of the network equipment such as a firewall or a load balancer to process the packet. In this paper, we propose a cache based tri method to improve the performance of the existing tri method of searching for network filtering. When several packets are exchanged at a time between a server and a client, the tri method repeats the same search procedure for network filtering. However, the proposed method can avoid unnecessary repetition of search procedure by exploiting cache so that the performance of network filtering can be improved. We performed network filtering experiments for the existing method and the proposed method. Experimental results showed that the proposed method could process more packets up to 790,000 per second than the existing method. When the size of cache list is 11, the proposed method showed the most outstanding performance improvement (18.08%) with respect to memory usage increase (7.75%).
https://doi.org/10.3745/KTCCS.2014.3.6.179 인용 PDF KSCI

A Dynamic Prefetch Filtering Schemes to Enhance Usefulness Of Cache Memory (캐시 메모리의 유용성을 높이는 동적 선인출 필터링 기법)

Chon Young-Suk;Lee Byung-Kwon;Lee Chun-Hee;Kim Suk-Il;Jeon Joong-Nam
- The KIPS Transactions:PartA
- /
- v.13A no.2 s.99
- /
- pp.123-136
- /
- 2006
The prefetching technique is an effective way to reduce the latency caused memory access. However, excessively aggressive prefetch not only leads to cache pollution so as to cancel out the benefits of prefetch but also increase bus traffic leading to overall performance degradation. In this thesis, a prefetch filtering scheme is proposed which dynamically decides whether to commence prefetching by referring a filtering table to reduce the cache pollution due to unnecessary prefetches In this thesis, First, prefetch hashing table 1bitSC filtering scheme(PHT1bSC) has been shown to analyze the lock problem of the conventional scheme, this scheme such as conventional scheme used to be N:1 mapping, but it has the two state to 1bit value of each entries. A complete block address table filtering scheme(CBAT) has been introduced to be used as a reference for the comparative study. A prefetch block address lookup table scheme(PBALT) has been proposed as the main idea of this paper which exhibits the most exact filtering performance. This scheme has a length of the table the same as the PHT1bSC scheme, the contents of each entry have the fields the same as CBAT scheme recently, never referenced data block address has been 1:1 mapping a entry of the filter table. On commonly used prefetch schemes and general benchmarks and multimedia programs simulates change cache parameters. The PBALT scheme compared with no filtering has shown enhanced the greatest 22%, the cache miss ratio has been decreased by 7.9% by virtue of enhanced filtering accuracy compared with conventional PHT2bSC. The MADT of the proposed PBALT scheme has been decreased by 6.1% compared with conventional schemes to reduce the total execution time.
https://doi.org/10.3745/KIPSTA.2006.13A.2.123 인용 PDF KSCI

Performance of the Finite Difference Method Using Cache and Shared Memory for Massively Parallel Systems (대규모 병렬 시스템에서 캐시와 공유메모리를 이용한 유한 차분법 성능)

Kim, Hyun Kyu;Lee, Hyo Jong
- Journal of the Institute of Electronics and Information Engineers
- /
- v.50 no.4
- /
- pp.108-116
- /
- 2013
Many algorithms have been introduced to improve performance by using massively parallel systems, which consist of several hundreds of processors. A typical example is a GPU system of many processors which uses shared memory. In the case of image filtering algorithms, which make references to neighboring points, the shared memory helps improve performance by frequently accessing adjacent pixels. However, using shared memory requires rewriting the existing codes and consequently results in complexity of the codes. Recent GPU systems support both L1 and L2 cache along with shared memory. Since the L1 cache memory is located in the same area as the shared memory, the improvement of performance is predictable by using the cache memory. In this paper, the performance of cache and shared memory were compared. In conclusion, the performance of cache-based algorithm is very similar to the one of shared memory. The complexity of the code appearing in a shared memory system, however, is resolved with the cache-based algorithm.
https://doi.org/10.5573/ieek.2013.50.4.108 인용 PDF KSCI

Memory Management based Hybrid Transactional Memory Scheme for Efficiently Processing Transactions in Multi-core Environment (멀티코어 환경에서 효율적인 트랜잭션 처리를 위한 메모리 관리 기반 하이브리드 트랜잭셔널 메모리 기법)

Jang, Yeon-Woo;Kang, Moon-Hwan;Chang, Jae-Woo
- Proceedings of the Korea Information Processing Society Conference
- /
- 2017.04a
- /
- pp.795-798
- /
- 2017
최근 멀티코어 프로세서가 개발됨에 따라 병렬 프로그래밍은 멀티코어를 효과적으로 활용하기 위한 기법으로 그 중요성이 높아지고 있다. 트랜잭셔널 메모리는 처리 방식에 따라 HTM, STM, HyTM으로 구분되며, 최근 HTM 및 STM 결합한 HyTM 이 활발히 연구되고 있다. 그러나 기존의 HyTM 는 HTM과 STM의 동시성 제어를 위해 블룸필터를 사용하는 반면, 블룸필터의 자체적인 긍정 오류를 해결하지 못한다. 아울러, 트랜잭션 처리를 위한 메모리 할당/해제를 기존의 락 메커니즘을 사용하여 관리한다. 따라서 멀티코어 환경에서 스레드 수가 증가할수록 트랜잭션 처리 효율이 떨어진다. 본 논문에서는 멀티코어 환경에서 효율적인 트랜잭션 처리를 위한 메모리 관리 기반 하이브리드 트랜잭셔널 메모리 기법을 제안한다. 제안하는 기법은 트랜잭션 처리에 최적화된 블룸필터를 제공함으로써, 병렬적으로 동시에 수행되는 서로 다른 환경의 트랜잭션에 대해 일관성 있는 처리를 지원한다. 아울러, CPU 캐시라인에 최적화된 메모리 기법을 통해, 메모리 할당량이 적은 트랜잭션은 로컬 캐시에 할당함으로써 트랜잭션의 빠른 처리를 지원한다.
https://doi.org/10.3745/PKIPS.y2017m04a.795 인용 PDF

Search Result 19, Processing Time 0.023 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)