Search | Korea Science

Low Power Scheme Using Bypassing Technique for Hybrid Cache Architecture

Choi, Juhee
- Journal of the Semiconductor & Display Technology
- /
- v.20 no.4
- /
- pp.10-15
- /
- 2021
Cache bypassing schemes have been studied to remove unnecessary updating the data in cache blocks. Among them, a statistics-based cache bypassing method for asymmetric-access caches is one of the most efficient approach for non-voliatile memories and shows the lowest cache access latency. However, it is proposed under the condition of the normal cache system, so further study is required for the hybrid cache architecture. This paper proposes a novel cache bypassing scheme, called hybrid bypassing block selector. In the proposal, the new model is established considering the SRAM region and the non-volatile memory region separately. Based on the model, hybrid bypassing decision block is implemented. Experiments show that the hybrid bypassing decision block saves overall energy consumption by 21.5%.
PDF KSCI

Making Cache-Conscious CCMR-trees for Main Memory Indexing (주기억 데이타베이스 인덱싱을 위한 CCMR-트리)

윤석우;김경창
- Journal of KIISE:Databases
- /
- v.30 no.6
- /
- pp.651-665
- /
- 2003
To reduce cache misses emerges as the most important issue in today's situation of main memory databases, in which CPU speeds have been increasing at 60% per year, and memory speeds at 10% per year. Recent researches have demonstrated that cache-conscious index structure such as the CR-tree outperforms the R-tree variants. Its search performance can be poor than the original R-tree, however, since it uses a lossy compression scheme. In this paper, we propose alternatively a cache-conscious version of the R-tree, which we call MR-tree. The MR-tree propagates node splits upward only if one of the internal nodes on the insertion path has empty room. Thus, the internal nodes of the MR-tree are almost 100% full. In case there is no empty room on the insertion path, a newly-created leaf simply becomes a child of the split leaf. The height of the MR-tree increases according to the sequence of inserting objects. Thus, the HeightBalance algorithm is executed when unbalanced heights of child nodes are detected. Additionally, we also propose the CCMR-tree in order to build a more cache-conscious MR-tree. Our experimental and analytical study shows that the two-dimensional MR-tree performs search up to 2.4times faster than the ordinary R-tree while maintaining slightly better update performance and using similar memory space.
PDF KSCI

IpCSB+ - tree : An Enhanced Main Memory Index Structure Employing the Level Prefetching Technique (레벨 프리페칭 기법을 이용한 향상된 주기억장치 상주형 색인구조)

Hong Hyun-Taek;Kang Tae-Ho;Yoo Jae-Soo
- Journal of Internet Computing and Services
- /
- v.4 no.6
- /
- pp.75-86
- /
- 2003
In main-memory resident index structures, secondary cache misses considerably have an effect on the performance of index structures. Recently, several main-memory resident index structures that consider cache have been proposed to reduce the impact of secondary cache misses. However they still suffer from full secondary cache misses whenever visiting each level of a index tree, In this paper, we propose a new index structure that eliminates cache misses even when visiting each level of index tree. The proposed index structure prefetches the grandchildren of a current node. The basic structure of the proposed index structure is from CSB+-tree that uses the concepts of the node group to increase fan-out. However the insert algorithm of the proposed index structure reduces the cost of a split significantly, Also, we show the superiority of our algorithm through various performance evaluation.
PDF

Radiation-Induced Soft Error Detection Method for High Speed SRAM Instruction Cache (고속 정적 RAM 명령어 캐시를 위한 방사선 소프트오류 검출 기법)

Kwon, Soon-Gyu;Choi, Hyun-Suk;Park, Jong-Kang;Kim, Jong-Tae
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.35 no.6B
- /
- pp.948-953
- /
- 2010
In this paper, we propose multi-bit soft error detection method which can use an instruction cache of superscalar CPU architecture. Proposed method is applied to high-speed static RAM for instruction cache. Using 1D parity and interleaving, it has less memory overhead and detects more multi-bit errors comparing with other methods. It only detects occurrence of soft errors in static RAM. Error correction is treated like a cache miss situation. When soft errors are occurred, it is detected by 1D parity. Instruction cache just fetch the words from lower-level memory to correct errors. This method can detect multi-bit errors in maximum 4$\times$4 window.
PDF KSCI

Buffer Cache Management for Low Power Consumption (저전력을 위한 버퍼 캐쉬 관리 기법)

Lee, Min;Seo, Eui-Seong;Lee, Joon-Won
- Journal of KIISE:Computer Systems and Theory
- /
- v.35 no.6
- /
- pp.293-303
- /
- 2008
As the computing environment moves to the wireless and handheld system, the power efficiency is getting more important. That is the case especially in the embedded hand-held system and the power consumed by the memory system takes the second largest portion in overall. To save energy consumed in the memory system we can utilize low power mode of SDRAM. In the case of RDRAM, nap mode consumes less than 5% of the power consumed in active or standby mode. However hardware controller itself can't use this facility efficiently unless the operating system cooperates. In this paper we focus on how to minimize the number of active units of SDRAM. The operating system allocates its physical pages so that only a few units of SDRAM need to be activated and the unnecessary SDRAM can be put into nap mode. This work can be considered as a generalized and system-wide version of PAVM(Power-Aware Virtual Memory) research. We take all the physical memory into account, especially buffer cache, which takes an half of total memory usage on average. Because of the portion of buffer cache and its importance, PAVM approach cannot be robust without taking the buffer cache into account. In this paper, we analyze the RAM usage and propose power-aware page allocation policy. Especially the pages mapped into the process' address space and the buffer cache pages are considered. The relationship and interactions of these two kinds of pages are analyzed and exploited for energy saving.
PDF KSCI

Bus Splitting Techniques for MPSoC to Reduce Bus Energy (MPSoC 플랫폼의 버스 에너지 절감을 위한 버스 분할 기법)

Chung Chun-Mok;Kim Jin-Hyo;Kim Ji-Hong
- Journal of KIISE:Computer Systems and Theory
- /
- v.33 no.9
- /
- pp.699-708
- /
- 2006
Bus splitting technique reduces bus energy by placing modules with frequent communications closely and using necessary bus segments in communications. But, previous bus splitting techniques can not be used in MPSoC platform, because it uses cache coherency protocol and all processors should be able to see the bus transactions. In this paper, we propose a bus splitting technique for MPSoC platform to reduce bus energy. The proposed technique divides a bus into several bus segments, some for private memory and others for shared memory. So, it minimizes the bus energy consumed in private memory accesses without producing cache coherency problem. We also propose a task allocation technique considering cache coherency protocol. It allocates tasks into processors according to the numbers of bus transactions and cache coherence protocol, and reduces the bus energy consumption during shared memory references. The experimental results from simulations say the bus splitting technique reduces maximal 83% of the bus energy consumption by private memory accesses. Also they show the task allocation technique reduces maximal 30% of bus energy consumed in shared memory references. We can expect the bus splitting technique and the task allocation technique can be used in multiprocessor platforms to reduce bus energy without interference with cache coherency protocol.
PDF KSCI

Efficient Implementation of SVM-Based Speech/Music Classifier by Utilizing Temporal Locality (시간적 근접성 향상을 통한 효율적인 SVM 기반 음성/음악 분류기의 구현 방법)

Lim, Chung-Soo;Chang, Joon-Hyuk
- Journal of the Institute of Electronics Engineers of Korea SP
- /
- v.49 no.2
- /
- pp.149-156
- /
- 2012
Support vector machines (SVMs) are well known for their pattern recognition capability, but proper care should be taken to alleviate their inherent implementation cost resulting from high computational intensity and memory requirement, especially in embedded systems where only limited resources are available. Since the memory requirement determined by the dimensionality and the number of support vectors is generally too high for a cache in embedded systems to accomodate, frequent accesses to the main memory occur inevitably whenever the cache is not able to provide requested data to the processor. These frequent accesses to the main memory result in overall performance degradation and increased energy consumption because a memory access typically takes longer and consumes more energy than a cache access or a register access. In this paper, we propose a technique that reduces the number of main memory accesses by optimizing the data access pattern of the SVM-based classifier in such a way that the temporal locality of the accesses increases, fully utilizing data loaded into the processor chip. With experiments, we confirm the enhancement made by the proposed technique in terms of the number of memory accesses, overall execution time, and energy consumption.
PDF KSCI

An Address Translation Technique Large NAND Flash Memory using Page Level Mapping (페이지 단위 매핑 기반 대용량 NAND플래시를 위한 주소변환기법)

Seo, Hyun-Min;Kwon, Oh-Hoon;Park, Jun-Seok;Koh, Kern
- Journal of KIISE:Computing Practices and Letters
- /
- v.16 no.3
- /
- pp.371-375
- /
- 2010
SSD is a storage medium based on NAND Flash memory. Because of its short latency, low power consumption, and resistance to shock, it's not only used in PC but also in server computers. Most SSDs use FTL to overcome the erase-before-overwrite characteristic of NAND flash. There are several types of FTL, but page mapped FTL shows better performance than others. But its usefulness is limited because of its large memory footprint for the mapping table. For example, 64MB memory space is required only for the mapping table for a 64GB MLC SSD. In this paper, we propose a novel caching scheme for the mapping table. By using the mapping-table-meta-data we construct a fully associative cache, and translate the address within O(1) time. The simulation results show more than 80 hit ratio with 32KB cache and 90% with 512KB cache. The overall memory footprint was only 1.9% of 64MB. The time overhead of cache miss was measured lower than 2% for most workload.
PDF KSCI

Dynamic Limited Directory Scheme for Distributed Shared Memory Systems (분산공유 메모리 시스템을 위한 동적 제한 디렉터리 기법)

Lee, Dong-Gwang;Gwon, Hyeok-Seong;Choe, Seong-Min;An, Byeong-Cheol
- The Transactions of the Korea Information Processing Society
- /
- v.6 no.4
- /
- pp.1098-1105
- /
- 1999
The caches in distributed shared memory systems enhance the performance by reducing memory access latency and communication overhead, but they must solve the cache coherence problem. This paper proposes a new directory protocol to solve the cache coherence problem and to improve the system performance in distributed shared memory systems. To maintain the cache coherence of shared data, processors within a limited distance reduce the communication overhead by using a bit-vector like the full directory scheme. Processors over a limited distance store pointers in a directory pool. Since the bit-vector and the directory pool remove the unnecessary cache invalidations, the proposed scheme reduces the communication traffic and improves the system performance. The dynamic limited directory scheme reduces the communication traffic up to 66 percents compared with the limited directory scheme and the number of directory access up to 27 percents compared with the dynamic pointer allocation scheme.
PDF

An Active Prefetch Filtering Schemes using Exclusive Prefetch Cache (선인출 전용 캐시를 이용한 적극적 선인출 필터링 기법)

Chon Young-Suk;Kim Suk-il;Jeon Joong-nam
- The KIPS Transactions:PartA
- /
- v.12A no.1 s.91
- /
- pp.41-52
- /
- 2005
Memory reference instruction caused by cache miss is the critical factor that limits the processing power of processor. Cache prefetching technique is an effective way to reduce the latency due to memory access. However, excessively aggressive prefetch leads to cache pollution and finally to cancel out the advantage of prefetch. In this study, an active prefetch filtering scheme is introduced which dynamically decides whether to commence prefetching after referring a filtering table to reduce the cache pollution due to unnecessary prefetches. For the precision filtering, an evicted address referencing scheme has been proposed where the filter directly compares the current prefetch address with previous unnecessary prefetch addresses stored in filtering table. Moreover, a small sized exclusive prefetch cache has been introduced to increase the amount of eviction of unnecessarily prefetched addresses to enhance the accuracy of dynamic filtering. The exclusive prefetch cache also prevents useful demand data from being pushed out by prefetched data, while the evicted address direct referencing scheme enables the prefetch cache to keep most of useful prefetch data within its small size. Experimental results from commonly used general and multimedia benchmarks show that the average cache miss ratio has been decreased by $13.3{\%}$ by virtue of enhanced filtering accuracy compared with conventional schemes.
https://doi.org/10.3745/KIPSTA.2005.12A.1.041 인용 PDF KSCI

Search Result 412, Processing Time 0.023 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)