• Title/Summary/Keyword: Level 2 Cache

Search Result 68, Processing Time 0.032 seconds

2-Level Adaptive Branch Prediction Based on Set-Associative Cache (세트 연관 캐쉬를 사용한 2단계 적응적 분기 예측)

  • Shim, Won
    • The KIPS Transactions:PartA
    • /
    • v.9A no.4
    • /
    • pp.497-502
    • /
    • 2002
  • Conditional branches can severely limit the performance of instruction level parallelism by causing branch penalties. 2-level adaptive branch predictors were developed to get accurate branch prediction in high performance superscalar processors. Although 2 level adaptive branch predictors achieve very high prediction accuracy, they tend to be very costly. In this paper, set-associative cached correlated 2-level branch predictors are proposed to overcome the cost problem in conventional 2-level adaptive branch predictors. According to simulation results, cached correlated predictors deliver higher prediction accuracy than conventional predictors at a significantly lower cost. The best misprediction rates of global and local cached correlated predictors using set-associative caches are 5.99% and 6.28% respectively. They achieve 54% and 17% improvements over those of the conventional 2-level adaptive branch predictors.

Probability-based Pre-fetching Method for Multi-level Abstracted Data in Web GIS (웹 지리정보시스템에서 다단계 추상화 데이터의 확률기반 프리페칭 기법)

  • 황병연;박연원;김유성
    • Spatial Information Research
    • /
    • v.11 no.3
    • /
    • pp.261-274
    • /
    • 2003
  • The effective probability-based tile pre-fetching algorithm and the collaborative cache replacement algorithm are able to reduce the response time for user's requests by transferring tiles which will be used in advance and determining tiles which should be removed from the restrictive cache space of a client based on the future access probabilities in Web GISs(Geographical Information Systems). The Web GISs have multi-level abstracted data for the quick response time when zoom-in and zoom-out queries are requested. But, the previous pre-fetching algorithm is applied on only two-dimensional pre-fetching space, and doesn't consider expanded pre-fetching space for multi-level abstracted data in Web GISs. In this thesis, a probability-based pre-fetching algorithm for multi-level abstracted in Web GISs was proposed. This algorithm expanded the previous two-dimensional pre-fetching space into three-dimensional one for pre-fetching tiles of the upper levels or lower levels. Moreover, we evaluated the effect of the proposed pre-fetching algorithm by using a simulation method. Through the experimental results, the response time for user requests was improved 1.8%∼21.6% on the average. Consequently, in Web GISs with multi-level abstracted data, the proposed pre-fetching algorithm and the collaborative cache replacement algorithm can reduce the response time for user requests substantially.

  • PDF

Improving Hit Ratio and Hybrid Branch Prediction Performance with Victim BTB (Victim BTB를 활용한 히트율 개선과 효율적인 통합 분기 예측)

  • Joo, Young-Sang;Cho, Kyung-San
    • The Transactions of the Korea Information Processing Society
    • /
    • v.5 no.10
    • /
    • pp.2676-2685
    • /
    • 1998
  • In order to improve the branch prediction accuracy and to reduce the BTB miss rate, this paper proposes a two-level BTB structure that adds small-sized victim BTB to the convetional BTB. With small cost, two-level BTB can reduce the BTB miss rate as well as improve the prediction accuracy of the hybrid branch prediction strategy which combines dynamic prediction and static prediction. Through the trace-driven simulation of four bechmark programs, the performance improvement by the proposed two-level BTB structure is analysed and validated. Our proposed BTB structure can improve the BTB miss rate by 26.5% and the misprediction rate by 26.75%

  • PDF

A Recovery Scheme of Single Node Failure using Version Caching in Database Sharing Systems (데이타베이스 공유 시스템에서 버전 캐싱을 이용한 단일 노드 고장 회복 기법)

  • 조행래;정용석;이상호
    • Journal of KIISE:Databases
    • /
    • v.31 no.4
    • /
    • pp.409-421
    • /
    • 2004
  • A database sharing system (DSS) couples a number of computing nodes for high performance transaction processing, and each node in DSS shares database at the disk level. In case of node failures in DSS, database recovery algorithms are required to recover the database in a consistent state. A database recovery process in DSS takes rather longer time compared with single database systems, since it should include merging of discrete log records in several nodes and perform REDO tasks using the merged lo9 records. In this paper, we propose a two version caching (2VC) algorithm that improves the cache fusion algorithm introduced in Oracle 9i Real Application Cluster (ORAC). The 2VC algorithm can achieve faster database recovery by eliminating the use of merged log records in case of single node failure. Furthermore, it can improve the performance of normal transaction processing by reducing the amount of unnecessary disk force overhead that occurs in ORAC.

M-ARC : ARC based high performance multi-level buffer cache algorithm (M-ARC: ARC 기반 고성능 멀티레벨 버퍼캐시 알고리즘)

  • Park, Se-Jin;Park, Chan-Ik
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2012.06a
    • /
    • pp.143-145
    • /
    • 2012
  • 멀티레벨 스토리지 접근은 클라우드 시스템, 가상화 환경, 네트워크 기반 스토리지 등 많은 컴퓨팅 환경에서 널리 사용되고 있다. 이러한 멀티레벨 스토리지의 접근성능을 향상시키려면, 되도록 하위 레벨의 스토리지로 요청이 일어나지 않게 하는 것이 중요하며, 이는 각 레벨의 버퍼캐시 성능이 큰 영향을 미친다. 다양한 버퍼캐시 알고리즘들 중 ARC 알고리즘은 동작의 간결성과 고성능으로 인해, 많은 워크로드에서 가장 좋은 성능을 보이는 캐시 알고리즘으로 알려져 있다. 그러나, ARC 알고리즘은 2차 레벨 버퍼캐시에서는 좋은 성능을 보이지 않는데, 이는 ARC 알고리즘이 멀티레벨 캐시의 특성을 반영하지 못하고 있기 때문이다. 본 논문에서는 멀티레벨 캐시의 특성과 이를 반영한 M-ARC 라는 멀티레벨 버퍼캐시 알고리즘을 제안한다. 제안하는 알로리즘은 기존 ARC에 비해 약 2배 이상 향상된 성능을 보여주고 있다.

Optimization of LU-SGS Code for the Acceleration on the Modern Microprocessors

  • Jang, Keun-Jin;Kim, Jong-Kwan;Cho, Deok-Rae;Choi, Jeong-Yeol
    • International Journal of Aeronautical and Space Sciences
    • /
    • v.14 no.2
    • /
    • pp.112-121
    • /
    • 2013
  • An approach for composing a performance optimized computational code is suggested for the latest microprocessors. The concept of the code optimization, termed localization, is maximizing the utilization of the second level cache that is common to all the latest computer systems, and minimizing the access to system main memory. In this study, the localized optimization of the LU-SGS (Lower-Upper Symmetric Gauss-Seidel) code for the solution of fluid dynamic equations was carried out in three different levels and tested for several different microprocessor architectures widely used these days. The test results of localized optimization showed a remarkable performance gain of more than two times faster solution than the baseline algorithm for producing exactly the same solution on the same computer system.

Design of an efficient hardware architecture supporting Direct3D texture mapping in mobile environment (Mobile 환경에서의 Direct3D 텍스쳐 매핑을 지원하는 효율적인 하드웨어 구조 설계)

  • 김상덕;이승기;박우찬;한탁돈
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2002.10c
    • /
    • pp.712-714
    • /
    • 2002
  • 현재 3차원 컴퓨터 그래픽 가속기에서 텍스쳐 매핑과 같은 실감기법을 처리해 주기 위해서는 넓은 대역폭과 많은 메모리를 필요로 한다. 또한 PDA와 같은 차세대 mobile 응용분야에서는 점차적으로 3차원 그래픽의 지원이 요구되고 있는 추세이다. 이를 mobile 환경에서 지원하기 위해서는 낮은 소비 전력 및 적은 메모리, 그리고 하드웨어 비용 등의 제약 요건이 따른다. 그러나 이러한 제약 조건에도 불구하고, mobile 환경에 적합한 3차원 그래픽 하드웨어의 연구는 필수적이다. 본 논문에서는 Windows CE 기반의 mobile 환경에서 Direct3D의 압축 텍스쳐 데이터를 효율적으로 처리하는 하드웨어를 제시한다. 이는 1 cycle에 2개 texel을 처리할 수 있으며, 작은 2-level cache를 사용하여 대역폭을 효과적으로 줄였다.

  • PDF

Technique for Estimating the Number of Active Flows in High-Speed Networks

  • Yi, Sung-Won;Deng, Xidong;Kesidis, George;Das, Chita R.
    • ETRI Journal
    • /
    • v.30 no.2
    • /
    • pp.194-204
    • /
    • 2008
  • The online collection of coarse-grained traffic information, such as the total number of flows, is gaining in importance due to a wide range of applications, such as congestion control and network security. In this paper, we focus on an active queue management scheme called SRED since it estimates the number of active flows and uses the quantity to indicate the level of congestion. However, SRED has several limitations, such as instability in estimating the number of active flows and underestimation of active flows in the presence of non-responsive traffic. We present a Markov model to examine the capability of SRED in estimating the number of flows. We show how the SRED cache hit rate can be used to quantify the number of active flows. We then propose a modified SRED scheme, called hash-based two-level caching (HaTCh), which uses hashing and a two-level caching mechanism to accurately estimate the number of active flows under various workloads. Simulation results indicate that the proposed scheme provides a more accurate estimation of the number of active flows than SRED, stabilizes the estimation with respect to workload fluctuations, and prevents performance degradation by efficiently isolating non-responsive flows.

  • PDF

Bit-Map Based Hybrid Fast IP Lookup Technique (비트-맵 기반의 혼합형 고속 IP 검색 기법)

  • Oh Seung-Hyun
    • Journal of Korea Multimedia Society
    • /
    • v.9 no.2
    • /
    • pp.244-254
    • /
    • 2006
  • This paper presents an efficient hybrid technique to compact the trie indexing the huge forward table small enough to be stored into cache for speeding up IP lookup. It combines two techniques, an encoding scheme called bit-map and a controlled-prefix expanding scheme to replace slow memory search with few fast-memory accesses and computations. For compaction, the bit-map represents each index and child pointer with one bit respectively. For example, when one node denotes n bits, the bit-map gives a high compression rate by consumes $2^{n-1}$ bits for $2^n$ index and child link pointers branched out of the node. The controlled-prefix expanding scheme determines the number of address bits represented by all root node of each trie's level. At this time, controlled-prefix scheme use a dynamic programming technique to get a smallest trie memory size with given number of trie's level. This paper proposes standard that can choose suitable trie structure depending on memory size of system and the required IP lookup speed presenting optimal memory size and the lookup speed according to trie level number.

  • PDF

Low-Power 2-level Cache Architectures for Embedded System (내장형 시스템을 위한 저전력 2-레벨 캐쉬 메모리의 설계)

  • Jong-Min Lee;Soon-Tae Kim;Kyung-Ah Kim;Su-Ho Park;Yong-Ho Kim
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2008.11a
    • /
    • pp.806-809
    • /
    • 2008
  • 온칩(on-chip) 캐쉬는 외부 메모리로의 접근을 감소시키는 중요한 역할을 한다. 본 연구에서는 내장형 시스템에 맞추어 설계된 2-레벨 캐쉬 메모리 구조를 제안하고자 한다. 레벨1(L1) 캐쉬의 구성으로 작은 크기, 직접사상(direct-mapped) 그리고 바로쓰기(write-through)를 채용한다. 대조적으로 레벨2(L2) 캐쉬는 일반적인 캐쉬 크기와 집합연관(Set-associativity) 그리고 나중쓰기(write-back) 정책을 채용한다. 결과적으로 L1캐쉬는 한 사이클 이내에 접근될 수 있고 L2캐쉬는 전체 캐쉬의 미스율(global miss rate)을 낮추는데 효과적이다. 두 캐쉬 계층간 바로쓰기(write-thorough) 정책에서 오는 빈번한 L2 캐쉬 접근으로 인한 에너지 소비를 줄이기 위해 본 연구에서는 One-way 접근 기법을 제안하였다. 본 연구에서 제안한 2-레벨 캐쉬 메모리 구조는 평균적으로 26%의 성능향상과 43%의 에너지 소비 그리고 77%의 에너지-지연 곱에서 이득을 보여주었다.