• Title/Summary/Keyword: dual-cache

Search Result 25, Processing Time 0.02 seconds

A Dual Slotted Ring Organization for Reducing Memory Access Latency in Distributed Shared Memory System (분산 공유 메모리 시스템에서 메모리 접근지연을 줄이기 위한 이중 슬롯링 구조)

  • Min, Jun-Sik;Chang, Tae-Mu
    • The KIPS Transactions:PartA
    • /
    • v.8A no.4
    • /
    • pp.419-428
    • /
    • 2001
  • Advances in circuit and integration technology are continuously boosting the speed of processors. One of the main challenges presented by such developments is the effective use of powerful processors in shared memory multiprocessor system. We believe that the interconnection problem is not solved even for small scale shared memory multiprocessor, since the speed of shared buses is unlikely to keep up with the bandwidth requirements of new powerful processors. In the past few years, point-to-point unidirectional connection have emerged as a very promising interconnection technology. The single slotted ring is the simplest form point-to-point interconnection. The main limitation of the single slotted ring architecture is that latency of access increase linearly with the number of the processors in the ring. Because of this, we proposed the dual slotted ring as an alternative to single slotted ring for cache-based multiprocessor system. In this paper, we analyze the proposed dual slotted ring architecture using new snooping protocol and enforce simulation to compare it with single slotted ring.

  • PDF

A Dual Mode Buffer Cache Management Policy for a Continuous Media Server (연속 미디어 서버를 위한 이중 모드 버퍼 캐쉬 관리 기법)

  • Seo, Won-Il;Park, Yong-Woon;Chung, Ki-Dong
    • The Transactions of the Korea Information Processing Society
    • /
    • v.6 no.12
    • /
    • pp.3642-3651
    • /
    • 1999
  • In this paper, we propose a new caching scheme for continuous media data where the buffer allocation unit is divided into two modes : interval and object. All of objects' access patterns are monitored and based on the results of monitoring, a request for an object is decided to cache its data with either interval mode or object mode. The results of our simulation show that our proposed caching scheme is better than the existing caching algorithms such as interval caching where the access patterns of the objects are changed with time.

  • PDF

Dual-Cache Scheme in Parallel File System (병렬 파일 시스템에서 이중 캐쉬 구조)

  • Jang, Won-Young;Kim, Chei-Yol;Seo, Dae-Wha
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2001.10a
    • /
    • pp.271-274
    • /
    • 2001
  • 프로세스와 디스크 입출력 속도를 비교해보면, 디스크 입출력의 속도가 휠씬 더 느리다. 따라서 디스크 입출력은 현재의 컴퓨팅 환경에서 병목현상이 되고있다. PFSL(Parallel File System for Linux)은 이런 문제를 해결하기 위한 클러스터링 환경의 병렬 파일 시스템이다. PFSL은 리눅스 머신 상에서 POSIX 스레드 라이브러리를 이용하여 멀티 스레드로 수행된다. 이 논문에서는 PFSL의 성능을 개선하기 위해 클러스터 환경의 작업 부하에 적합하도록 설계한 이중 캐쉬 구조를 소개하고자 한다.

  • PDF

Implementation and Performance Analysis of Efficient Packet Processing Method For DPI (Deep Packet Inspection) System using Dual-Processors (듀얼 프로세서 기반 DPI (Deep Packet Inspection) 엔진을 위한 효율적 패킷 프로세싱 방안 구현 및 성능 분석)

  • Yang, Joon-Ho;Han, Seung-Jae
    • The KIPS Transactions:PartC
    • /
    • v.16C no.4
    • /
    • pp.417-422
    • /
    • 2009
  • Implementation of DPI(Deep Packet Inspection) system on a general purpose multiprocessor platform is an attractive option from the implementation cost point of view, since it does not require high-cost customized hardware. Load balancing has been considered as a primary means to achieve high performance in multi processor systems. We claim, however, that in case of DPI system design simply balancing the load of each processor does not necessarily yield the highest system performance. Instead, we propose a method in which tasks are allocated to processors based on their functions. We implemented the proposed method in dual processor Linux system and compare its performance with the existing load balancing methods. Under the proposed method, one processor is dedicated to deal with interrupt handling and generic packet processing, while another processor is dedicated to DPI processing. According to experimental results, the proposed scheme outperforms the existing schemes by 60%, mainly because of the reduction of cache miss and spin lock occurrences.

Performance Enhancement of Handover in mSCTP using Pre-acquisition RA in WLAN (WLAN에서 RA 선수신을 이용한 mSCTP 핸드오버 성능 향상)

  • Choi, Soon-Won;Kim, Kwang-Ryoul;Min, Sung-Gi
    • Journal of KIISE:Information Networking
    • /
    • v.33 no.2
    • /
    • pp.156-164
    • /
    • 2006
  • The SCTP (Stream Control Transmission Protocol) implementation with the DAR (Dynamic Address Reconfiguration) extension is called the mSCTP (Mobile SCTP) that is proposed recently for mobility support in transport layer. The mSCTP does not satisfy short handover latency for real-time applications and it has no specific handover decision mechanisms. In this paper, we propose fast handover schemes for mobile nodes that are moving into different subnet using pre-acquisition RA (Router Advertisement) and L3 trigger for improving handover performance. Furthermore, we introduce three specific methods which are RA cache, FMIPv6 (Fast Handovers for Mobile IPv6) and dual interface and how proposed scheme can be interoperated with handover process respectively. Finally, we show two experimental results which are the mSCTP and the mSCTP using FMIPv6 on Linux platforms. Experimental results show that handover performance is improved with reducing the time of receiving RA which takes most of total handover latency.

Dual Cache System Based on the Locality Decision Mechanism (지역성 결정 메커니즘을 기반으로 한 이중 캐쉬 시스템)

  • Lee, Jeong-Hun;Lee, Jang-Su;Kim, Sin-Deok
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.27 no.11
    • /
    • pp.908-918
    • /
    • 2000
  • 캐쉬의 성능을 향상시키는 가장 효과적인 방법은 프로그램 수행 특성에 내재되어 있는 시간적 (temporal locality) -공간적 지역성 (spatial locality)을 활용하는 것이다. 본 논문에서는 추가적인 장치나 컴파일러의 도움 없이 단지 캐쉬의 구조적인 특징과 간단한 메커니즘만을 이용하여 두 가지 타입의 지역성을 효과적으로 반영할 수 있는 새로운 캐쉬 시스템이 제안된다. 제안하는 새로운 캐쉬 시스템은 다른 블록 크기와 다른 연관도를 가지는 두 개의 캐쉬로써 구성되어 진다. 즉 작은 블록 크기를 지원하는 직접사상 캐쉬 (direct-mapped cache)와 큰 블록을 지원하는 완전 연관 버퍼 (fully-associative buffer)로 구성되어 진다. 큰 블록은 여러 개의 작은 블록으로 구성되어지며 두 캐쉬에서 접근 실패가 발생할 경우 직접사상 캐쉬의 접근 실패가 발생한 작은 블록과 그 이웃 작은 블록을 완전 연관 버퍼에 저장시킴으로써 한번 참조가 일어난 블록의 이웃 블록이 참조될 확률이 높다는 공간적 지역성의 특성을 효과적으로 반영할 수 있다. 또한 참조가 일어난 블록은 제어 비트를 사용하여 선택적으로 작은 블록을 직접사상 캐쉬에 저장함으로써 시간적 지역성을 보다 효과적으로 사용할 수 있다 시뮬레이션 결과에 따르면 기존의 직접사상 캐쉬의 4배 크기보다도 좋은 성능 향상을 보이고 있으며, 동일한 크기의 victim 캐쉬보다 우수한 성능을 보이고 소비 전력 면에서는 5% 정도의 전력 감소를 보이고 있다.

  • PDF

Multicore-Aware Code Co-Positioning to Reduce WCET on Dual-Core Processors with Shared Instruction Caches

  • Ding, Yiqiang;Zhang, Wei
    • Journal of Computing Science and Engineering
    • /
    • v.6 no.1
    • /
    • pp.12-25
    • /
    • 2012
  • For real-time systems it is important to obtain the accurate worst-case execution time (WCET). Furthermore, how to improve the WCET of applications that run on multicore processors is both significant and challenging as the WCET can be largely affected by the possible inter-core interferences in shared resources such as the shared L2 cache. In order to solve this problem, we propose an innovative approach that adopts a code positioning method to reduce the inter-core L2 cache interferences between the different real-time threads that adaptively run in a multi-core processor by using different strategies. The worst-case-oriented strategy is designed to decrease the worst-case WCET among these threads to as low as possible. The other two strategies aim at reducing the WCET of each thread to almost equal percentage or amount. Our experiments indicate that the proposed multicore-aware code positioning approaches, not only improve the worst-case performance of the real-time threads but also make good tradeoffs between efficiency and fairness for threads that run on multicore platforms.

Trend of Intel Nonvolatile Memory Technology (인텔 비휘발성 메모리 기술 동향)

  • Lee, Y.S.;Woo, Y.J.;Jung, S.I.
    • Electronics and Telecommunications Trends
    • /
    • v.35 no.3
    • /
    • pp.55-65
    • /
    • 2020
  • With the development of nonvolatile memory technology, Intel has released the Optane datacenter persistent memory module (DCPMM) that can be deployed in the dual in-line memory module. The results of research and experiments on Optane DCPMMs are significantly different from the anticipated results in previous studies through emulation. The DCPMM can be used in two different modes, namely, memory mode (similar to volatile DRAM: Dynamic Random Access Memory) and app direct mode (similar to file storage). It has buffers in 256-byte granularity; this is four times the CPU (Central Processing Unit) cache line (i.e., 64 bytes). However, these properties are not easy to use correctly, and the incorrect use of these properties may result in performance degradation. Optane has the same characteristics of DRAM and storage devices. To take advantage of the performance characteristics of this device, operating systems and applications require new approaches. However, this change in computing environments will require a significant number of researches in the future.

Static Timing Analysis of Shared Caches for Multicore Processors

  • Zhang, Wei;Yan, Jun
    • Journal of Computing Science and Engineering
    • /
    • v.6 no.4
    • /
    • pp.267-278
    • /
    • 2012
  • The state-of-the-art techniques in multicore timing analysis are limited to analyze multicores with shared instruction caches only. This paper proposes a uniform framework to analyze the worst-case performance for both shared instruction caches and data caches in a multicore platform. Our approach is based on a new concept called address flow graph, which can be used to model both instruction and data accesses for timing analysis. Our experiments, as a proof-of-concept study, indicate that the proposed approach can accurately compute the worst-case performance for real-time threads running on a dual-core processor with a shared L2 cache (either to store instructions or data).

Performance Evaluation of the New DRAM Architectures in Multiprogramming Environment (멀티프로그래밍 환경에서의 새로운 DRAM 구조의 성능 분석)

  • 안태원;정덕균;민상렬;최윤호
    • Journal of the Korean Institute of Telematics and Electronics A
    • /
    • v.31A no.6
    • /
    • pp.177-187
    • /
    • 1994
  • In the design of modern computer systems, the speed gap between the CPUs and DRAMs has been a major concern. To relieve this problem at a low cost, several new DRAM architectures have been proposed. This study is aimed at evaluating quantitatively the impact of the new DRAM architectures (synchronous DRAM. dual-RAS synchronous DRAM, and enhanced DRAM) on the memory system performance. We developed a cache and memory simulator and performed various experiments using the traces generated from four benchmark programs. The simulation results show that the new DRAM architectures offer a better performance than a conventional one by 5~30% in a low cost system and their improvement in a high performance system is less than 1%. However, for resonable multiprogramming workoads, additional performance improvement of about 10~28% is expected in a high performance system while 1~3% in a low cost system.

  • PDF