Search | Korea Science

An efficient pipelined architecture for 3D graphics accelerator (3차원 그래픽 가속기의 효율적인 파이프라인 설계)

우현재;정종철;이문기
- Proceedings of the IEEK Conference
- /
- 2002.06b
- /
- pp.357-360
- /
- 2002
This paper is proposed about an efficient pipelined architecture for 3D graphics accelerator to reduce Cache miss ratio. Because cache miss takes a considerable time, about 20∼30 cycle, we reduce cache miss ratio to use pre-fetch. As a result of simulation, we figure out that the miss ratio of cache depends on the size of tile, cache memory and auxiliary cache memory. We can save 6.6% cache miss ratio maximumly.
PDF

Divided Disk Cache and SSD FTL for Improving Performance in Storage

Park, Jung Kyu;Lee, Jun-yong;Noh, Sam H.
- JSTS:Journal of Semiconductor Technology and Science
- /
- v.17 no.1
- /
- pp.15-22
- /
- 2017
Although there are many efficient techniques to minimize the speed gap between processor and the memory, it remains a bottleneck for various commercial implementations. Since secondary memory technologies are much slower than main memory, it is challenging to match memory speed to the processor. Usually, hard disk drives include semiconductor caches to improve their performance. A hit in the disk cache eliminates the mechanical seek time and rotational latency. To further improve performance a divided disk cache, subdivided between metadata and data, has been proposed previously. We propose a new algorithm to apply the SSD that is flash memory-based solid state drive by applying FTL. First, this paper evaluates the performance of such a disk cache via simulations using DiskSim. Then, we perform an experiment to evaluate the performance of the proposed algorithm.
https://doi.org/10.5573/JSTS.2017.17.1.015 인용 PDF KSCI

The Effects of Cache Memory on the System Bus Traffic (캐쉬 메모리가 버스 트래픽에 끼치는 영향)

조용훈;김정선
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.21 no.1
- /
- pp.224-240
- /
- 1996
It is common sense for at least one or more levels of cache memory to be used in these day's computer systems. In this paper, the impact of the internal cache memory organization on the performance of the computer is investigated by using a simulator program, which is wirtten by authors and run on SUN SPARC workstation, with several real execution, with several real execution trace files. 280 cache organizations have been simulated using n-way set associative mapping and LRU(Least Recently Used) replacement algorithm with write allocation policy. As a result, 16-way setassociative cache is the best configuration, and when we select 256KB cache memory and 64 byte line size, the bus traffic ratio was decreased compared to that of the noncache system so that a single bus could support almost 7 processors without any delay and degradationof high ratio(hit ratio was 99.21%). The smaller the line size we choose, the little lower hit ratio we can get, but the more processors can be supported by a single bus(maximum 18 processors). Therefore, using a proper cache memory organization can make a single bus structure be able to support multiple processors without any performance degradation.
PDF

Cache Simulator Design for Optimizing Write Operations of Nonvolatile Memory Based Caches (비휘발성 메모리 기반 캐시의 쓰기 작업 최적화를 위한 캐시 시뮬레이터 설계)

Joo, Yongsoo;Kim, Myeung-Heo;Han, In-Kyu;Lim, Sung-Soo
- IEMEK Journal of Embedded Systems and Applications
- /
- v.11 no.2
- /
- pp.87-95
- /
- 2016
Nonvolatile memory (NVM) is being considered as an alternative of traditional memory devices such as SRAM and DRAM, which suffer from various limitations due to the technology scaling of modern integrated circuits. Although NVMs have advantages including nonvolatility, low leakage current, and high density, their inferior write performance in terms of energy and endurance becomes a major challenge to the successful design of NVM-based memory systems. In order to overcome the aforementioned drawback of the NVM, extensive research is required to develop energy- and endurance-aware optimization techniques for NVM-based memory systems. However, researchers have experienced difficulty in finding a suitable simulation tool to prototype and evaluate new NVM optimization schemes because existing simulation tools do not consider the feature of NVM devices. In this article, we introduce a NVM-based cache simulator to support rapid prototyping and evaluation of NVM-based caches, as well as energy- and endurance-aware NVM cache optimization schemes. We demonstrate that the proposed NVM cache simulator can easily prototype PRAM cache and PRAM+STT-RAM hybrid cache as well as evaluate various write traffic reduction schemes and wear leveling schemes.
https://doi.org/10.14372/IEMEK.2016.11.2.87 인용 PDF KSCI

Adaptive Writeback-aware Cache Management Policy for Lifetime Extension of Non-volatile Memory

Hwang, Sang-Ho;Choi, Ju Hee;Kwak, Jong Wook
- JSTS:Journal of Semiconductor Technology and Science
- /
- v.17 no.4
- /
- pp.514-523
- /
- 2017
In this paper, we propose Adaptive Writeback-aware Cache management (AWC) to prolong the lifetime of non-volatile main memory systems by reducing the number of writebacks. The last-level cache in AWC is partitioned into Least Recently Used (LRU) segment and LRU using Dirty block Precedence (DP-LRU) segment. The DP-LRU segment evicts clean blocks first for giving reuse opportunity to dirty blocks. AWC can also determine the efficient size of DP-LRU segment for reducing the number of writebacks according to memory access patterns of programs. In the performance evaluation, we showed that AWC reduced the number of writebacks up to 29% and 46%, and saved the energy of a main memory system up to 23% and 49% in a single-core and multi-core, respectively. AWC also reduced the runtime by 1.5% and 3.2% on average compared to previous cache managements for non-volatile main memory systems, in a single-core and a multi-core, respectively.
https://doi.org/10.5573/JSTS.2017.17.4.514 인용 PDF KSCI

A Remote Cache Coherence Protocol for Single Shared Memory in Multiprocessor System (단일 공유 메모리를 가지는 다중 프로세서 시스템의 원격 캐시 일관성 유지 프로토콜)

Kim, Seong-Woon;Kim, Bo-Gwan
- Journal of the Institute of Electronics Engineers of Korea CI
- /
- v.42 no.6
- /
- pp.19-28
- /
- 2005
The multiprocessor architecture is a good method to improve the computer system performance. The CC-NUMA provides a single shared space with the physically distributed memories is used widely in the multiprocessor computer system. A CC-NUMA has the full-mapped directory for the shared memory md uses a remote cache memory for tile fast memory access. In this paper, we propose a processing node architecture for a CC-NUMA system and a cache coherency protocol on the physically distributed but logically shared system. We show an implementation result of the system which is adopted the cache coherency protocol.
PDF KSCI

Forgetting based File Cache Management Scheme for Non-Volatile Memory (데이터 망각을 활용한 비휘발성 메모리 기반 파일 캐시 관리 기법)

Kang, Dongwoo;Choi, Jongmoo
- Journal of KIISE
- /
- v.42 no.8
- /
- pp.972-978
- /
- 2015
Non-volatile memory (NVM) supports both byte addressability and non-volatility. These characteristics make it feasible for NVM to be employed at any layer of the memory hierarchy such as cache, memory and disk. An interesting characteristic of NVM is that, even though it supports non-volatility, its retention capability is limited. Furthermore NVM has tradeoff between its retention capability and write latency. In this paper, we propose a novel NVM-based file cache management scheme that makes use of the limited retention capability to improve the cache performance. Experimental results with real-workloads show that our scheme can reduce access latency by up to 31% (24.4% average) compared with the conventional LRU based cache management scheme.
https://doi.org/10.5626/JOK.2015.42.8.972 인용 KSCI

Designing a low-power L1 cache system using aggressive data of frequent reference patterns

Jung, Bo-Sung;Lee, Jung-Hoon
- Journal of the Korea Society of Computer and Information
- /
- v.27 no.7
- /
- pp.9-16
- /
- 2022
Today, with the advent of the 4th industrial revolution, IoT (Internet of Things) systems are advancing rapidly. For this reason, a various application with high-performance and large-capacity are emerging. Therefore, there is a need for low-power and high-performance memory for computing systems with these applications. In this paper, we propose an effective structure for the L1 cache memory, which consumes the most energy in the computing system. The proposed cache system is largely composed of two parts, the L1 main cache and the buffer cache. The main cache is 2 banks, and each bank consists of a 2-way set association. When the L1 cache hits, the data is copied into buffer cache according to the proposed algorithm. According to simulation, the proposed L1 cache system improved the performance of energy delay products by about 65% compared to the existing 4-way set associative cache memory.
https://doi.org/10.9708/jksci.2022.27.07.009 인용 PDF KSCI HTML

Advanced Victim Cache with Processor Reuse Information (프로세서의 재사용 정보를 이용하는 개선된 고성능 희생 캐쉬)

Kwak Jong Wook;Lee Hyunbae;Jhang Seong Tae;Jhon Chu Shik
- Journal of KIISE:Computer Systems and Theory
- /
- v.31 no.12
- /
- pp.704-715
- /
- 2004
Recently, a single or multi processor system uses the hierarchical memory structure to reduce the time gap between processor clock rate and memory access time. A cache memory system includes especially two or three levels of caches to reduce this time gap. Moreover, one of the most important things In the hierarchical memory system is the hit rate in level 1 cache, because level 1 cache interfaces directly with the processor. Therefore, the high hit rate in level 1 cache is critical for system performance. A victim cache, another high level cache, is also important to assist level 1 cache by reducing the conflict miss in high level cache. In this paper, we propose the advanced high level cache management scheme based on the processor reuse information. This technique is a kind of cache replacement policy which uses the frequency of processor's memory accesses and makes the higher frequency address of the cache location reside longer in cache than the lower one. With this scheme, we simulate our policy using Augmint, the event-driven simulator, and analyze the simulation results. The simulation results show that the modified processor reuse information scheme(LIVMR) outperforms the level 1 with the simple victim cache(LIV), 6.7% in maximum and 0.5% in average, and performance benefits become larger as the number of processors increases.
PDF KSCI

An Efficient Cache Management Scheme of Flash Translation Layer for Large Size Flash Memory Drives

Choi, Hwan-Pil;Kim, Yong-Seok
- Journal of the Korea Society of Computer and Information
- /
- v.20 no.11
- /
- pp.31-38
- /
- 2015
Nowadays, large size flash memory drives with more than a couple of hundreds of gigabytes are common. This paper presents an efficient cache management scheme of flash translation layer, called TPC-FTL, for large size flash memory drives. Since flash drives of large size usually contain large size RAM, we can enhance the performance of page mapping cache by using more RAM for the cache. But if the size exceeds a threshold, the existing schemes are impractical for real devices, because the time for cache manipulation becomes too long. TPC-FTL manages the cache in translation page unit, not in logical page number unit used in existing schemes. Since a translation page covers a large number of logical page numbers (for example, 512 for 2KB size page), the number of cache elements can be reduced up to a practical level. A performance evaluation shows that average response time, an important performance measure, is better than existing schemes via the effect of utilizing spacial locality in addition to temporal locality.
https://doi.org/10.9708/jksci.2015.20.11.031 인용 PDF KSCI

Search Result 412, Processing Time 0.024 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)