• Title/Summary/Keyword: cache memory (캐쉬메모리)

Search Result 176, Processing Time 0.025 seconds

Content-Aware Main Memory Web Caching (내용 기반의 메인 메모리 웹 캐쉬 할당 정책)

  • 염미령;노삼혁
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2001.10c
    • /
    • pp.244-246
    • /
    • 2001
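  • As web usage grows, client requests are surging while the processing capacity of web servers is reaching its limit. To keep up with the rising request rate, a web server must avoid unnecessary overhead. For a web server that serves static documents, the largest overhead is disk access. To avoid unnecessary disk accesses, this paper presents a main memory caching policy that takes into account the order in which requests are serviced. Experiments on an event-driven web server show that the policy improves web server performance.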

Performance Analysis of n-way Associative Cache and Fully Associative Cache (n-way Set Associative Cache와 Fully Associative Cache성능 분석)

  • Jo, Yong-Hun;Kim, Jeong-Seon
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.3
    • /
    • pp.802-810
    • /
    • 1997
  • In this paper, the performance of direct-mapped caches, 2-, 4-, 8-, ..., 4096-way set-associative caches, and fully associative caches is analyzed by trace simulation to verify their effectiveness. It is generally believed that as n, the number of main memory lines that can be stored at one cache line number of a direct-mapped cache, increases, the performance of the cache memory improves linearly. According to our analysis, however, this does not hold for all cache organizations. Miss ratios decrease as n increases only for small caches (less than 256K) with a large line size. Likewise, fully associative mapping achieves high performance only for small caches with a large line size.
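
The kind of trace simulation the abstract above describes can be illustrated with a minimal sketch. It assumes LRU replacement and a byte-address trace; the abstract specifies neither, and the function name `miss_ratio`, the cache sizes, and the trace are hypothetical.

```python
from collections import OrderedDict

def miss_ratio(trace, cache_bytes, line_bytes, ways):
    """Return the miss ratio of an LRU set-associative cache on a byte-address trace."""
    num_lines = cache_bytes // line_bytes
    num_sets = num_lines // ways          # ways == num_lines gives a fully associative cache
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in trace:
        line = addr // line_bytes         # main memory line number
        s = sets[line % num_sets]         # set selected by the low-order line bits
        if line in s:
            s.move_to_end(line)           # hit: mark as most recently used
        else:
            misses += 1
            if len(s) == ways:
                s.popitem(last=False)     # evict the least recently used line (assumed policy)
            s[line] = True
    return misses / len(trace)

# Hypothetical trace: two addresses that map to the same line of a direct-mapped cache.
trace = [0, 1024] * 8
direct = miss_ratio(trace, cache_bytes=1024, line_bytes=32, ways=1)
two_way = miss_ratio(trace, cache_bytes=1024, line_bytes=32, ways=2)
fully = miss_ratio(trace, cache_bytes=1024, line_bytes=32, ways=32)
```

On this tiny trace the direct-mapped cache misses on every access, while the 2-way and fully associative caches miss only on the first touch of each line. Real traces, as the abstract notes, do not always reward higher associativity.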

An Optimized Cache Coherence Protocol in Multiprocessor System Connected by Slotted Ring (슬롯링으로 연결된 다중처리기 시스템에서 최적화된 캐쉬일관성 프로토콜)

  • Min, Jun-Sik;Chang, Tae-Mu
    • The Transactions of the Korea Information Processing Society
    • /
    • v.7 no.12
    • /
    • pp.3964-3975
    • /
    • 2000
  • There are two policies for maintaining consistency among the processor caches in a multiprocessor system: write invalidate and write update. Under the write-invalidate policy, whenever a processor attempts to write to a cached block, it must invalidate all other copies of that block in the system. These frequent invalidations lead to a high cache miss ratio. The write-update policy, on the other hand, updates the other copies instead of invalidating them. It must transfer the updated contents over the interconnection network whether the updated block is private or not, so the network suffers from heavy traffic. In this paper we present an efficient cache coherence protocol for shared-memory multiprocessor systems connected by a slotted ring. The protocol is based on the write-update policy, but the updated contents are transferred only when a shared block is updated; if the updated block is private, they are not transferred. We analyze the proposed protocol and run simulations to compare it with the previous version.
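
The difference in network traffic between plain write update and the shared-only variant described above can be sketched as a simple count of update transfers. This is an illustration, not the paper's protocol: how a block is known to be shared or private on the slotted ring is not stated in the abstract, so the `sharers` map, the function name, and the write sequence are all hypothetical.

```python
def count_update_transfers(writes, sharers, only_if_shared):
    """Count update messages placed on the network for a sequence of writes.

    writes: list of (processor, block) pairs.
    sharers: dict mapping block -> set of processors currently caching it (assumed known).
    only_if_shared: False models plain write update; True models the variant
    in which updates to private blocks are not transferred.
    """
    transfers = 0
    for proc, block in writes:
        others = sharers.get(block, set()) - {proc}   # other caches holding a copy
        if others or not only_if_shared:
            transfers += 1
    return transfers

# Hypothetical state: block A is private to processor 0, B is shared, C is private to 2.
sharers = {"A": {0}, "B": {0, 1}, "C": {2}}
writes = [(0, "A"), (0, "A"), (1, "B"), (2, "C"), (0, "B")]
plain = count_update_transfers(writes, sharers, only_if_shared=False)
filtered = count_update_transfers(writes, sharers, only_if_shared=True)
```

Here plain write update transfers all five writes, while the shared-only variant transfers just the two writes to block B.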

Development of Hardware Trace Generating System (하드웨어 트레이스 생성 시스템의 개발)

  • Yun, Hyeong-Min;Park, Gi-Ho;Lee, Gil-Hwan;Han, Tak-Don;Kim, Sin-Deok;Yang, Seong-Bong;Lee, Yong-Seok
    • The Transactions of the Korea Information Processing Society
    • /
    • v.5 no.3
    • /
    • pp.811-823
    • /
    • 1998
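  • Trace-driven simulation has long been the most widely used method for evaluating the performance of cache memory systems. Its accuracy is strongly affected by the size of the trace used and the kinds of information it contains. Many methods have therefore been proposed to generate more accurate traces. Among them, traces obtained by hardware monitoring have the advantage of containing not only information on the memory references of application programs but also information on context switches, the memory references of system programs, and the times at which memory references occurred. However, a hardware monitoring system has the drawback that its design must be changed for each system on which traces are to be generated. To mitigate this drawback, this paper designs a trace generation system built with an EPLD (Erasable Programmable Logic Device), a hardware system that can generate traces on various systems with simpler modifications. The implemented trace generation system can also operate on a high-speed 66 MHz bus system.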

A Study on the Performance Analysis of Cache Coherence Protocols in a Multiprocessor System Using HiPi Bus (HiPi 버스를 사용한 멀티프로세서 시스템에서 캐쉬 코히어런스 프로토콜의 성능 평가에 관한 연구)

  • 김영천;강인곤;황승욱;최진규
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.18 no.1
    • /
    • pp.57-68
    • /
    • 1993
  • In this paper, we describe a multiprocessor system that uses the HiPi bus with a pended protocol and multiple cache memories, and evaluate its performance in terms of processor utilization for various cache coherence protocols. The HiPi bus was developed as the shared bus of TICOM II, the main computer system for establishing a nationwide computing network at ETRI. The HiPi bus has a high data transfer rate, but it does not allow cache-to-cache transfer. To evaluate the effect of cache-to-cache transfer on system performance and to choose the best-performing protocol for the HiPi bus, we simulate as follows. First, we analyze the performance of a multiprocessor system with the HiPi bus in terms of processor utilization through simulation. Each cache coherence protocol is described by a state transition diagram, and the probability of each state is then calculated from the Markov steady state. The calculated state probabilities are used as input parameters of the simulation, and the modeling and simulation are carried out using SLAM II graphic symbols and language. Second, we propose a HiPi bus that supports cache-to-cache transfer and analyze, through simulation, the performance of a multiprocessor system with the proposed bus in terms of processor utilization. The cache coherence protocols considered in the simulation are Write-through, Write-once, Berkeley, Synapse, Illinois, Firefly, and Dragon.
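
The step of deriving state probabilities from a Markov steady state can be sketched as follows. This is a generic power-iteration example, not the paper's model: the three states and every transition probability below are made up for illustration, and `steady_state` is a hypothetical helper.

```python
def steady_state(P, iterations=1000):
    """Approximate the stationary distribution pi satisfying pi = pi P by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n                    # start from the uniform distribution
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# Hypothetical 3-state chain (for example Invalid, Shared, Dirty); each row sums to 1.
P = [
    [0.5, 0.4, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
]
pi = steady_state(P)
```

The resulting vector `pi` gives the long-run fraction of time spent in each state, which is the kind of quantity the abstract says is fed into the simulation as input parameters.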

Improving Reliability of the Last Level Cache with Low Energy and Low Area Overhead (낮은 에너지 소모와 공간 오버헤드의 Last Level Cache 신뢰성 향상 기법)

  • Kim, Young-Ung
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.12 no.2
    • /
    • pp.35-41
    • /
    • 2012
  • Due to technology scaling, more transistors can be placed in the cache memories of a processor. However, processors become more vulnerable to soft errors because of the highly integrated transistors, and consequently the reliability of the cache memory must be considered seriously at the design space level. In this paper, we propose a reliability improvement technique that incurs low energy and low area overheads. Simulation experiments show that the proposed scheme achieves a protection rate of over 95.4% against soft errors with only 0.26% performance degradation. It also requires only 2.96% extra energy consumption.

Design of Pipeline Bus and the Performance Evaluation in Multiprocessor System (다중프로세서 시스템에서 파이프라인 전송 버스의 설계 및 성능 평가)

  • 윤용호;임인칠
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.18 no.2
    • /
    • pp.288-299
    • /
    • 1993
  • This paper proposes a new bus protocol for tightly coupled multiprocessor systems. The bus protocol uses pipelined data transfer and a block transfer scheme to increase the bus bandwidth. The bus also has independent transfer lines for address and data, and it can transfer data at up to 264 Mbytes/sec. This paper also models a multiprocessor system in which each processor board has a private cache. Simulation evaluates the bus and system performance according to the hit ratio of the referenced data in cache memory. With this bus, the bus is found not to saturate when up to 10 processor boards are connected to it. With memory interleaving of up to 4 ways, the performance increases linearly.

Cache Coherency Schemes for Database Sharing Systems with Primary Copy Authority (주사본 권한을 지원하는 공유 데이터베이스 시스템을 위한 캐쉬 일관성 기법)

  • Kim, Shin-Hee;Cho, Haeng-Rae;Kim, Byeong-Uk
    • The Transactions of the Korea Information Processing Society
    • /
    • v.5 no.6
    • /
    • pp.1390-1403
    • /
    • 1998
  • A database sharing system (DSS) is a system for high-performance transaction processing. In a DSS, the processing nodes are locally coupled via a high-speed network and share a common database at the disk level. Each node has a local memory, a separate copy of the operating system, and a DBMS. To reduce the number of disk accesses, each node caches database pages in its local memory buffer. However, since multiple nodes may cache the same page simultaneously, cache consistency must be ensured so that every node can always access the latest version of each page. In this paper, we propose efficient cache consistency schemes for a DSS in which the database is logically partitioned using primary copy authority to reduce locking overhead. The proposed schemes can improve performance by reducing the disk access overhead and the message overhead of maintaining cache consistency. Furthermore, they perform well when database workloads vary dynamically.

Implementation of MPEG/Audio Decoder based on RISC Processor With Minimized DSP Accelerator (DSP 가속기가 내장된 RISC 프로세서 기반 MPEG/Audio 복호화기의 구현)

  • Bang Kyoung Ho;Lee Ken Sup;Park Young Cheol;Youn Dae Hee
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.12C
    • /
    • pp.1617-1622
    • /
    • 2004
  • An MPEG/Audio decoder for mobile multimedia systems requires low power consumption. Implementations of an AV decoder on a single RISC processor often consume a lot of power owing to cache misses when cache memory is insufficient. In this paper, we present an MPEG/Audio decoder for mobile handset applications and implement it on a RISC processor with an embedded minimized DSP accelerator. The audio decoding algorithm is split into two parts: a computation-intensive part and a control-intensive part. These parts are allocated to the DSP and the RISC core, respectively, which are designed to run in parallel to increase processing efficiency. The proposed system implements MP3 and AAC decoders at 17 MHz and 24 MHz clocks, which are reductions of 48% and 40% in complexity compared with implementations on a single RISC processor. The proposed method is suitable for mobile multimedia applications with insufficient cache memory.

The Efficient Merge Operation in Log Buffer-Based Flash Translation Layer for Enhanced Random Writing (임의쓰기 성능향상을 위한 로그블록 기반 FTL의 효율적인 합병연산)

  • Lee, Jun-Hyuk;Roh, Hong-Chan;Park, Sang-Hyun
    • The KIPS Transactions:PartD
    • /
    • v.19D no.2
    • /
    • pp.161-186
    • /
    • 2012
  • Recently, the storage capacity of flash memory has been increasing steadily while its price has been falling, which has made mass-storage SSDs (Solid State Drives) popular. Flash memory, however, has many drawbacks. To compensate for them, a special layer called the FTL (Flash Translation Layer) is needed. To handle the hardware restrictions efficiently, the FTL translates the logical sector numbers of the file system into the physical sector numbers of the flash memory. Poor performance is attributed in particular to the erase-before-write restriction, and although there have been many studies based on log blocks, some problems remain in operating mass-storage flash memory. When FAST, a log block-based FTL, is subjected to random writes with wide locality, merge operations occur while sectors in the data block remain unused. In other words, ineffective block thrashing occurs and flash memory performance degrades. When overwrites go to the log block, the log block acts like a cache, which improves flash memory performance. To improve random write performance, this study shows that the log block can be operated not only like a cache but also like the entire flash memory, so that merge and erase operations are reduced, by means of a separate mapping table called the offset mapping table. The new FTL is named XAST (extensively-Associative Sector Translation). XAST manages the offset mapping table efficiently on the basis of spatial and temporal locality.
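
The logical-to-physical translation and the erase-before-write restriction mentioned above can be illustrated with a toy sketch. It is a plain sector-level mapping with out-of-place writes, not FAST or XAST, and it has no log blocks, merge operations, or offset mapping table; the class `TinyFTL` and all of its fields are hypothetical.

```python
class TinyFTL:
    """Toy sector-level mapping with out-of-place writes (not FAST or XAST)."""

    def __init__(self, num_physical):
        self.mapping = {}                      # logical sector number -> physical sector number
        self.flash = [None] * num_physical     # simulated physical sectors
        self.valid = [False] * num_physical    # which physical sectors hold live data
        self.next_free = 0

    def write(self, lsn, data):
        if self.next_free >= len(self.flash):
            raise RuntimeError("no free sectors; a real FTL would merge and erase here")
        old = self.mapping.get(lsn)
        if old is not None:
            self.valid[old] = False            # old copy becomes garbage; no in-place overwrite
        psn = self.next_free
        self.next_free += 1
        self.flash[psn] = data
        self.valid[psn] = True
        self.mapping[lsn] = psn                # redirect the logical sector to the new copy

    def read(self, lsn):
        return self.flash[self.mapping[lsn]]

ftl = TinyFTL(num_physical=8)
ftl.write(5, "v1")
ftl.write(5, "v2")                             # overwrite goes to a fresh physical sector
```

After the second write, logical sector 5 maps to physical sector 1 and physical sector 0 is marked invalid. Reclaiming such invalid sectors is where the merge and erase costs discussed in the abstract arise.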