• Title/Summary/Keyword: cache performance model


Client Cache Management Scheme For Data Broadcasting Environments (LRU-CFP: 데이터 방송 환경을 위한 클라이언트 캐쉬 관리 기법)

  • Kwon, Hyeok-Min
    • The KIPS Transactions:PartD / v.10D no.6 / pp.961-970 / 2003
  • In data broadcasting environments, the server periodically broadcasts data items on the broadcast channel. When a client wants to access a data item, it must monitor the broadcast channel and wait for the desired item to arrive. Client data caching is a very effective technique for reducing the time spent waiting for the desired item to be broadcast. This paper proposes a new client cache management scheme, named LRU-CFP, to reduce this waiting time and evaluates its performance using a simulation model. The performance results indicate that the LRU-CFP scheme outperforms LRU, GRAY, and CF in average response time.

A Theoretical Superscalar Microprocessor Performance Model with Limited Functional Units Using Instruction Dependencies (한정된 연산유닛에서 명령어 종속성을 이용하는 수퍼스칼라 프로세서의 이론적 성능 모델)

  • Lee, Jong-Bok
    • The Transactions of The Korean Institute of Electrical Engineers / v.59 no.2 / pp.423-428 / 2010
  • In the initial design phase of superscalar microprocessors, a performance model is necessary. A theoretical performance model is very useful because performance for various architecture parameters can be obtained by simply evaluating equations, without repeating simulations. Previous studies established theoretical performance models using the relation between the instruction window size and the issue width, together with the penalties due to branch mispredictions and cache misses. However, those studies assumed an unlimited number of functional units, which is insufficient for real applications. This paper proposes a theoretical performance model for superscalar microprocessors that also works with a limited number of functional units. To enhance the accuracy of the limited functional unit model, instruction dependency rates are employed. Using trace-driven data from SPEC 2000 integer programs as input, this paper shows that the theoretically computed performance of a superscalar microprocessor with a limited number of functional units is quite similar to the measured performance.

Effective Reference Probability Incorporating the Effect of Expiration Time in Web Cache (웹 캐쉬에서 만기시간의 영향을 고려한 유효참조확률)

  • Lee, Jeong-Joon;Moon, Yang-Se;Whang, Kyu-Young;Hong, Eui-Kyung
    • Journal of KIISE:Databases / v.28 no.4 / pp.688-701 / 2001
  • Web caching has become an important problem in addressing the performance issues of web applications. In this paper, we propose a method that enhances the performance of web caching by incorporating the expiration time of web data. We introduce the notion of the effective reference probability, which incorporates the effect of expiration time into the reference probability used in existing cache replacement algorithms. We formally define the effective reference probability and derive it theoretically using a probabilistic model. By simply replacing probabilities with the effective reference probability in existing cache replacement algorithms, we can take the effect of expiration time into account. The results of performance evaluation through experiments show that the replacement algorithms using the effective reference probability always outperform the existing ones. The reason is that the proposed method precisely reflects the theoretical probability of getting the cache effect and thus incorporates the influence of the expiration time more effectively. In particular, when the cache fraction is 0.05 and data update is comparatively frequent (i.e., the update frequency is more than 1/0 of the reference frequency), the performance enhancement is more than 30% in LRU-2 and 13% in Aggarwal's method (PSS integrating a refresh overhead factor). The results show that the effective reference probability contributes significantly to the performance enhancement of the web cache in the presence of expiration time.


Counter-Based Approaches for Efficient WCET Analysis of Multicore Processors with Shared Caches

  • Ding, Yiqiang;Zhang, Wei
    • Journal of Computing Science and Engineering / v.7 no.4 / pp.285-299 / 2013
  • To enable hard real-time systems to take advantage of multicore processors, it is crucial to obtain the worst-case execution time (WCET) for programs running on multicore processors. However, this is challenging and complicated due to the inter-thread interferences from the shared resources in a multicore processor. Recent research used the combined cache conflict graph (CCCG) to model and compute the worst-case inter-thread interferences on a shared L2 cache in a multicore processor, which is called the CCCG-based approach in this paper. Although it can compute the WCET safely and accurately, its computational complexity is exponential and prohibitive for a large number of cores. In this paper, we propose three counter-based approaches to significantly reduce the complexity of the multicore WCET analysis, while achieving absolute safety with tightness close to the CCCG-based approach. The basic counter-based approach simply counts the worst-case number of cache line blocks mapped to a cache set of a shared L2 cache from all the concurrent threads, and compares it with the associativity of the cache set to compute the worst-case cache behavior. The enhanced counter-based approach uses techniques to enhance the accuracy of calculating the counters. The hybrid counter-based approach combines the enhanced counter-based approach and the CCCG-based approach to further improve the tightness of analysis without significantly increasing the complexity. Our experiments on a 4-core processor indicate that the enhanced counter-based approach overestimates the WCET by 14% on average compared to the CCCG-based approach, while its average running time is less than 1/380 that of the CCCG-based approach. The hybrid approach reduces the overestimation to only 2.65%, while its running time is less than 1/150 that of the CCCG-based approach on average.
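
The counting step described for the basic counter-based approach can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sets_with_possible_interference`, the input layout (block numbers per thread), and the modulo set mapping are all assumptions made for illustration, and the real analysis derives the block sets from program control flow.

```python
from collections import defaultdict


def sets_with_possible_interference(thread_blocks, num_sets, associativity):
    """Flag each touched cache set whose worst-case block count exceeds its associativity.

    thread_blocks: hypothetical mapping of thread id -> set of memory block
    numbers that the thread may load into the shared cache.
    Assumption: a block maps to set (block number mod num_sets).
    """
    blocks_per_set = defaultdict(set)
    for thread_id, blocks in thread_blocks.items():
        for block in blocks:
            # Tag blocks with the thread id so that blocks from different
            # threads are always counted separately (conservative choice).
            blocks_per_set[block % num_sets].add((thread_id, block))
    # If the counter fits within the associativity, all blocks can coexist
    # in the set; otherwise evictions by other threads must be assumed.
    return {
        set_index: len(blocks) > associativity
        for set_index, blocks in sorted(blocks_per_set.items())
    }


result = sets_with_possible_interference(
    {"t0": {0, 4, 8}, "t1": {1, 12}}, num_sets=4, associativity=2
)
print(result)
```

In this toy input, set 0 receives four blocks (three from `t0`, one from `t1`) against an associativity of 2, so it is flagged, while set 1 holds a single block and is not. The cost is linear in the number of blocks, which is the intuition behind the complexity reduction the abstract reports.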

Web Proxy Cache Replacement Algorithms using Object Type Partition (개체 타입별 분할공간을 이용한 웹 프락시 캐시의 대체 알고리즘)

  • Lee, Soo-haeng;Choi, Sang-bang
    • The Journal of Korean Institute of Communications and Information Sciences / v.27 no.5C / pp.399-410 / 2002
  • A web cache, which is functionally another term for a proxy server, is located between the client and the server. A web cache has limited storage even though the bandwidth between the client and the proxy server, which are usually connected through a LAN, is large. Because of this limited storage capacity, existing objects in the web cache are deleted to make room for new objects according to rules called a replacement algorithm. Hit rate and byte-hit rate are the general metrics used to evaluate replacement algorithms. Most replacement algorithms satisfy only one of these metrics, and sometimes neither. In this paper, we propose two replacement algorithms that achieve both a high hit rate and a high byte-hit rate. In the first algorithm, which serves as the basic model, the cache is partitioned according to file types. In the second algorithm, the cache is composed of two levels: the upper-level cache is managed by the basic algorithm, while the lower level is used collectively by all file types as a shared area. To show the performance of the proposed algorithms, we evaluate their hit rate and byte-hit rate using trace-driven simulation.
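
To make the two metrics and the idea of type partitioning concrete, here is a minimal trace-replay sketch. It is only an illustration: the abstract does not state which replacement rule is used inside each partition, so LRU is assumed here, and the names `PartitionedLRUCache` and `replay` as well as the trace layout `(url, object type, size)` are hypothetical.

```python
from collections import OrderedDict


class PartitionedLRUCache:
    """Cache split into one byte-limited partition per object type.

    Assumption: each partition evicts in LRU order.
    """

    def __init__(self, capacity_by_type):
        self.capacity = dict(capacity_by_type)
        self.parts = {t: OrderedDict() for t in self.capacity}
        self.used = {t: 0 for t in self.capacity}

    def access(self, url, obj_type, size):
        """Return True on a hit; on a miss, insert the object if it can fit."""
        part = self.parts[obj_type]
        if url in part:
            part.move_to_end(url)  # mark as most recently used
            return True
        if size <= self.capacity[obj_type]:
            # Evict least recently used objects of the same type until it fits.
            while self.used[obj_type] + size > self.capacity[obj_type]:
                _, evicted_size = part.popitem(last=False)
                self.used[obj_type] -= evicted_size
            part[url] = size
            self.used[obj_type] += size
        return False


def replay(cache, trace):
    """Return (hit rate, byte-hit rate) for a list of (url, type, size) requests."""
    hits = hit_bytes = total_bytes = 0
    for url, obj_type, size in trace:
        total_bytes += size
        if cache.access(url, obj_type, size):
            hits += 1
            hit_bytes += size
    return hits / len(trace), hit_bytes / total_bytes


trace = [("a.html", "html", 10), ("b.jpg", "image", 90), ("a.html", "html", 10)]
hit_rate, byte_hit_rate = replay(PartitionedLRUCache({"html": 50, "image": 200}), trace)
print(hit_rate, byte_hit_rate)
```

The toy trace shows why the two metrics can diverge: one of three requests hits (hit rate 1/3), but the hit is on a small object, so only 10 of 110 requested bytes are served from the cache.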

Scalable Graphics Algorithms (스케일러블 그래픽스 알고리즘)

  • Yoon, Sung-Eui
    • 한국HCI학회:학술대회논문집 / 2008.02c / pp.224-224 / 2008
  • Recent advances in model acquisition, computer-aided design, and simulation technologies have resulted in massive databases of complex geometric data occupying multiple gigabytes and even terabytes. In various graphics/geometric applications, the major performance bottleneck is typically in accessing these massive geometric data sets because of their high complexity. However, the growth rate of data access speed has been consistently lower than that of computational processing speed. Moreover, recent multi-core architectures aggravate this phenomenon. Therefore, current architecture improvements are not expected to solve the problem of dealing with ever-growing massive geometric data, especially on commodity hardware. In this tutorial, I will focus on two orthogonal approaches--multi-resolution and cache-coherent layout techniques--to design scalable graphics/geometric algorithms. First, I will discuss multi-resolution techniques that reduce the amount of data necessary for performing geometric methods within an error bound. Second, I will explain cache-coherent layouts that improve the cache utilization of runtime geometric applications. I have applied these two techniques to rendering, collision detection, and iso-surface extraction and have thereby achieved significant performance improvements. During the talk, I will show live demonstrations, on a laptop, of view-dependent rendering and collision detection between massive models consisting of tens of millions of triangles.


An Analysis of Multi-processor System Performance Depending on the Input/Output Types (입출력 형태에 따른 다중처리기 시스템의 성능 분석)

  • Moon, Wonsik
    • Journal of Korea Society of Digital Industry and Information Management / v.12 no.4 / pp.71-79 / 2016
  • This study proposes a performance model of a shared-bus multi-processor system and analyzes the effect of input/output types on system performance and on the overload of shared resources. The system performance model reflects the memory reference time in relation to the effect of input/output types on shared resources, and the input/output processing time in relation to the input/output processor, disk buffer, and device standby places. In addition, it demonstrates the contribution of input/output types to system performance for a comprehensive analysis of system performance. Using the concept of workload from probability theory together with the presented model, the study reports the results of running and analyzing the model under various conditions of processor capability, cache miss ratio, page fault ratio, disk buffer hit ratio (input/output processor and controller), memory access time, and input/output block size. A simulation is conducted to verify the analysis results.

Performance Evaluation of Deferred Locking With Shadow Transaction (그림자 트랜잭션을 이용한 지연 로킹 기법의 성능 평가)

  • 권혁민
    • The Journal of Information Technology / v.3 no.3 / pp.117-134 / 2000
  • A client-server DBMS based on a data-shipping model can exploit client resources effectively by allowing inter-transaction caching. However, inter-transaction caching raises the need for a transactional cache consistency maintenance (TCCM) protocol, since each client is able to cache a portion of the database dynamically. Detection-based TCCM schemes can reduce the message overhead required for cache consistency if they validate clients' replicas asynchronously, and thus they can show high throughput rates. However, they tend to show high transaction abort ratios, since transactions can access invalid replicas. To cope with this drawback, this paper develops a new notion of a shadow transaction, which is a backup transaction kept ready to replace an aborted transaction. This paper proposes a new detection-based TCCM scheme named DL-ST on the basis of the notion of the shadow transaction. Using a simulation model, this paper evaluates the effect of the shadow transaction in terms of transaction throughput rate and abort ratio.


Recognition Time Reduction Technique for the Time-synchronous Viterbi Beam Search (시간 동기 비터비 빔 탐색을 위한 인식 시간 감축법)

  • 이강성
    • The Journal of the Acoustical Society of Korea / v.20 no.6 / pp.46-50 / 2001
  • This paper proposes a new recognition time reduction algorithm, the Score-Cache technique, which is applicable to HMM-based speech recognition systems. Score-Cache is a unique technique that causes no performance degradation while still substantially reducing search time, whereas other search reduction techniques trade off against the recognition rate. The technique can be applied to continuous speech recognition systems as well as isolated-word speech recognition systems. We can obtain a high degree of recognition time reduction by replacing only the score calculation function, without changing any architecture of the system. The technique can also be used together with other recognition time reduction algorithms to obtain further time reduction. We obtained a time reduction of up to 54%.


Forecasting Load Balancing Method by Prediction Hot Spots in the Shared Web Caching System

  • Jung, Sung-C.;Chong, Kil-T.
    • 제어로봇시스템학회:학술대회논문집 / 2003.10a / pp.2137-2142 / 2003
  • One of the important performance metrics of the World Wide Web is how quickly and accurately a request from users is serviced. Shared Web Caching (SWC) is one of the techniques for improving the performance of the network system. In Shared Web Caching systems, the key issues are deciding when and where an item is cached and how to transfer correct and reliable information to users quickly. SWC distributes items to the proxies that have sufficient capacity in terms of processing time and cache size. In this study, the Hot Spot Prediction Algorithm (HSPA) is suggested to improve the consistent hashing algorithm in terms of load balancing and hit rate, with a shorter response time. The method predicts popular hot spots using a prediction model. The hot spots are dispatched to the proper proxies according to the load-balancing algorithm. A simulator implementing the suggested algorithm was also developed in Perl. The computer simulation results demonstrate the performance of the suggested algorithm. The suggested algorithm is tested using consistent hashing in terms of load balancing and hit rate.
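
For context, the baseline that HSPA is said to improve, consistent hashing, can be sketched as follows. This shows only the standard technique, not HSPA itself; the class name `ConsistentHashRing`, the use of MD5, and the number of virtual nodes per proxy are assumptions chosen for illustration.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a point on the ring (MD5 is an illustrative choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    """Assign URLs to proxies by walking clockwise to the next proxy point."""

    def __init__(self, proxies, replicas=100):
        # Each proxy is placed at several points (virtual nodes) to even out load.
        self._ring = sorted(
            (_hash(f"{proxy}#{i}"), proxy)
            for proxy in proxies
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def proxy_for(self, url):
        # First ring point after the URL's hash, wrapping around at the end.
        index = bisect.bisect(self._points, _hash(url)) % len(self._ring)
        return self._ring[index][1]


ring = ConsistentHashRing(["proxy1", "proxy2", "proxy3"])
smaller = ConsistentHashRing(["proxy1", "proxy2"])
urls = [f"http://example.com/item{i}" for i in range(200)]
moved = sum(ring.proxy_for(u) != smaller.proxy_for(u) for u in urls)
print(moved, "of", len(urls), "URLs change proxy when proxy3 is removed")
```

The property on display is that removing a proxy remaps only the URLs that were assigned to it. Plain consistent hashing does not account for item popularity, which is the gap a hot-spot prediction step such as the one in the abstract would address.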
