• Title/Summary/Keyword: cache performance


A Framework for an Advanced Learning Mechanism in Context-aware Systems using Improved Back-Propagation Algorithm (상황 인지 시스템에서 개선된 역전파 알고리즘을 사용하는 진보된 학습 메커니즘을 위한 프레임워크)

  • Zha, Wei;Eo, Sang-Hun;Kim, Gyoung-Bae;Cho, Sook-Kyoung;Bae, Hae-Young
    • The KIPS Transactions:PartD / v.14D no.1 s.111 / pp.139-144 / 2007
  • To improve the workload efficiency and inference capability of context-aware systems, we propose a new framework for an advanced learning mechanism that uses an improved back-propagation (BP) algorithm. Although the learning mechanism is one of the most important parts of a context-aware system, few existing algorithms elaborate the learning mechanism with the user's context information. BP is the most adaptable algorithm for the learning mechanism of context-aware systems. By using the improved BP algorithm, the proposed framework drastically improves inference capability, so that overall performance is far better than that of other systems. In addition, the framework uses a special system cache to manage the workload efficiently. Experiments show an obvious improvement in the overall performance of context-aware systems that use the proposed framework.

Playback Quantity-based Proxy Caching Scheme for Continuous Media Data (재생량에 기반한 연속미디어 데이터 프록시 캐슁 기법)

  • Hong, Hyeon-Ok;Im, Eun-Ji;Jeong, Gi-Dong
    • The KIPS Transactions:PartB / v.9B no.3 / pp.303-310 / 2002
  • In this paper, we propose a proxy caching scheme that stores either a portion of a continuous media object or the entire object on the Internet. The proxy stores the initial fraction of a continuous media object and determines the optimal size of that fraction based on the object's popularity. Under the proposed scheme, the initial latency experienced by most clients and the amount of data transferred from the remote server are reduced, and limited cache storage space is used efficiently. Considering the characteristics of continuous media, we also propose a novel popularity measure for continuous media objects based on the amount of each object's data played back by clients. Finally, we performed trace-driven simulations to evaluate our caching scheme and the popularity measure. These simulations verify that our caching scheme, PPC, outperforms other well-known caching schemes in terms of BHR, DSR, and replacement, and that a popularity measure based on the amount of playback data can enhance the performance of a caching scheme.
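The idea of sizing each cached prefix by a playback-based popularity can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give PPC's actual formulas, so the popularity definition (an object's share of all played-back bytes), the proportional split of cache space, and the names `MediaObject`, `playback_popularity`, and `prefix_sizes` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MediaObject:
    name: str
    size_bytes: int        # full size of the media object
    played_bytes: int = 0  # total bytes of this object played back by all clients


def playback_popularity(obj, total_played):
    """Assumed definition: the object's share of all played-back bytes."""
    return obj.played_bytes / total_played if total_played else 0.0


def prefix_sizes(objects, cache_bytes):
    """Split cache space in proportion to popularity, capped at each object's size."""
    total_played = sum(o.played_bytes for o in objects)
    sizes = {}
    for o in objects:
        share = playback_popularity(o, total_played)
        sizes[o.name] = min(o.size_bytes, int(share * cache_bytes))
    return sizes


catalog = [MediaObject("a", 1000, 300), MediaObject("b", 1000, 100)]
print(prefix_sizes(catalog, cache_bytes=400))
```

An object that clients watch for longer receives a larger cached prefix, while an object that is requested often but abandoned quickly does not, which is the distinction a playback-quantity measure is meant to capture.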

Data Communication Prediction Model in Multiprocessors based on Robust Estimation (로버스트 추정을 이용한 다중 프로세서에서의 데이터 통신 예측 모델)

  • Jun Janghwan;Lee Kangwoo
    • The KIPS Transactions:PartA / v.12A no.3 s.93 / pp.243-252 / 2005
  • This paper introduces a novel modeling technique for building data communication prediction models in multiprocessors, using least-squares and robust estimation methods. A set of sample communication rates is collected by feeding a few small input data sets into workload programs. By applying the estimation methods to these samples, we can build analytic models that precisely estimate communication rates for huge input data sets. The primary advantage is that, since the models depend only on data set size and not on the specifications of target systems or workloads, they can be applied to various systems and applications. In addition, because the algorithmic behavioral characteristics of the workloads are reflected in the models, the models can also be used for diverse other performance metrics. In this paper, we built models for cache miss rates, which are the main cause of data communication in shared-memory multiprocessor systems. The results show excellent prediction error rates: below $1\%$ for five of the 12 cases, and about $3\%$ for the remaining cases.
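The difference between a least-squares fit and a robust fit can be illustrated with a small sketch. The abstract does not say which robust estimator the authors used, so the Theil-Sen estimator below is a stand-in chosen for brevity. The straight-line model and the sample values are also assumptions made only for illustration.

```python
from itertools import combinations
from statistics import mean, median


def least_squares_fit(xs, ys):
    """Ordinary least-squares line y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b


def theil_sen_fit(xs, ys):
    """Robust line: median of pairwise slopes, then median intercept; returns (a, b)."""
    slopes = [(y2 - y1) / (x2 - x1)
              for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
              if x2 != x1]
    b = median(slopes)
    a = median(y - b * x for x, y in zip(xs, ys))
    return a, b


# Hypothetical samples: data set size vs. measured rate, with one outlier.
sizes = [1, 2, 3, 4, 5]
rates = [2, 4, 6, 8, 30]
print(least_squares_fit(sizes, rates))
print(theil_sen_fit(sizes, rates))
```

On these made-up samples the single outlier pulls the least-squares slope well away from the trend of the other four points, while the median-based estimate stays on it. That sensitivity is the usual reason to prefer a robust estimator when only a few small sample runs are available.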

A Data Prefetching Scheme Exploiting the Grain Size in Parallel Programs using Data Arrays (데이타 배열을 사용하는 병렬 프로그램에서 그레인 크기를 이용한 데이타 선인출 기법)

  • Jung, In-Bum;Lee, Joon-Won
    • Journal of KIISE:Computer Systems and Theory / v.27 no.1 / pp.101-108 / 2000
  • Data prefetching is an effective technique for reducing main memory access latency by overlapping processor computation with data accesses. However, if prefetched data replace useful data already in the cache memory and are then not used in computations, program performance degrades. This phenomenon results from the lack of correct predictions of which data will be used in the future. When parallel programs use data arrays for computation, the grain size is useful information for a data prefetching scheme because it implies the range of data used in the computation. Based on this information, we suggest a new data prefetching scheme that exploits the grain size of the parallel program. Simulation results show that the suggested prefetching scheme improves the performance of the simulated parallel programs owing to a reduction in bus transactions as well as useful prefetching operations.


Dynamic Scheduling of Network Processes for Multi-Core Systems (멀티 코어 시스템에서 통신 프로세스의 동적 스케줄링)

  • Jang, Hye-Churn;Jin, Hyun-Wook;Kim, Hag-Young
    • Journal of KIISE:Computing Practices and Letters / v.15 no.12 / pp.968-972 / 2009
  • Multi-core processors are widely used in many high-end systems. With significant advances in processor architecture, the network bandwidth required by high-end systems is increasing drastically. It is therefore highly desirable to manage multiple cores efficiently to achieve high network bandwidth with minimum resource requirements. Modern operating systems, however, still have significant room for design and optimization to improve network performance on multi-core systems. In this paper, we suggest a novel networking process scheduling scheme, which decides the best processor affinity for networking processes based on the processor cache layout, communication intensiveness, and processor loads. Experimental results show that the scheduling scheme, implemented in the Linux kernel, can improve network bandwidth and the effectiveness of processor utilization by 20% and 59%, respectively.
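A user-space sketch of how the three inputs named above (cache layout, communication intensiveness, processor load) might be combined into an affinity decision is given below. It is not the paper's kernel implementation. The function `pick_core`, its parameters, the 0.5 threshold, and the rule itself are assumptions made only for illustration.

```python
def pick_core(core_loads, cache_groups, irq_core, comm_intensity, threshold=0.5):
    """Choose a core for a networking process (hypothetical policy).

    core_loads:     dict mapping core id -> load in [0, 1]
    cache_groups:   list of lists of core ids that share a cache
    irq_core:       core that handles the network interrupts
    comm_intensity: fraction of the process's time spent communicating
    """
    if comm_intensity >= threshold:
        # Communication-heavy process: stay within the cache domain of the
        # core that services network interrupts, so packet data stays warm.
        candidates = next(g for g in cache_groups if irq_core in g)
    else:
        # Compute-heavy process: any core is acceptable.
        candidates = list(core_loads)
    # Among the candidates, take the least-loaded core.
    return min(candidates, key=lambda c: core_loads[c])


loads = {0: 0.9, 1: 0.2, 2: 0.1, 3: 0.5}
groups = [[0, 1], [2, 3]]
print(pick_core(loads, groups, irq_core=0, comm_intensity=0.8))
print(pick_core(loads, groups, irq_core=0, comm_intensity=0.1))
```

On Linux, the chosen core could then be applied from user space with Python's `os.sched_setaffinity(pid, {core})`. The scheme described in the abstract makes this decision inside the kernel scheduler instead.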

Page Replacement Algorithm for Improving Performance of Hybrid Main Memory (하이브리드 메인 메모리의 성능 향상을 위한 페이지 교체 기법)

  • Lee, Minhoe;Kang, Dong Hyun;Kim, Junghoon;Eom, Young Ik
    • KIISE Transactions on Computing Practices / v.21 no.1 / pp.88-93 / 2015
  • In modern computer systems, DRAM is commonly used as main memory because of its low read/write latency and high endurance. However, DRAM is volatile memory that requires a periodic power supply (i.e., memory refresh) to retain the data stored in it. PCM, on the other hand, is a promising candidate to replace DRAM because it is non-volatile memory that can retain stored data without memory refresh. PCM also supports byte-addressable access and in-place updates. However, PCM is unsuitable for use as the main memory of a computer system because it has two limitations: high read/write latency and low endurance. To take advantage of both DRAM and PCM, a hybrid main memory consisting of DRAM and PCM has been suggested and actively studied. In this paper, we propose a novel page replacement algorithm for hybrid main memory. To cope with the weaknesses of PCM, our scheme focuses on reducing the number of PCM writes in the hybrid main memory. Experimental results show that the proposed page replacement algorithm reduces the number of PCM writes by up to 80.5% compared with other page replacement algorithms.
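The goal of reducing PCM writes can be illustrated with a toy simulator. The abstract does not describe the proposed algorithm, so the policy below is a hypothetical stand-in and not the authors' method: written pages are kept in DRAM, read-only pages are placed in PCM, and both tiers use LRU. The class `HybridMemory` and its page-granularity write counter are likewise assumptions.

```python
from collections import OrderedDict


class HybridMemory:
    """Toy DRAM+PCM page cache that counts page-level writes to PCM."""

    def __init__(self, dram_frames, pcm_frames):
        self.dram = OrderedDict()  # page -> None, oldest first (LRU order)
        self.pcm = OrderedDict()
        self.dram_frames = dram_frames
        self.pcm_frames = pcm_frames
        self.pcm_writes = 0

    def access(self, page, is_write):
        if page in self.dram:
            self.dram.move_to_end(page)      # DRAM hit: refresh LRU position
        elif page in self.pcm:
            if is_write:
                del self.pcm[page]           # avoid writing PCM in place:
                self._insert_dram(page)      # migrate the page to DRAM
            else:
                self.pcm.move_to_end(page)   # PCM read hit
        elif is_write:
            self._insert_dram(page)          # write fault -> DRAM
        else:
            self._insert_pcm(page)           # read fault -> PCM

    def _insert_dram(self, page):
        if len(self.dram) >= self.dram_frames:
            victim, _ = self.dram.popitem(last=False)
            self._insert_pcm(victim)         # demotion costs one PCM write
        self.dram[page] = None

    def _insert_pcm(self, page):
        if len(self.pcm) >= self.pcm_frames:
            self.pcm.popitem(last=False)     # evict the LRU page to storage
        self.pcm[page] = None
        self.pcm_writes += 1                 # loading a page writes PCM once


mem = HybridMemory(dram_frames=1, pcm_frames=2)
for page, is_write in [(1, True), (1, True), (2, False), (2, True)]:
    mem.access(page, is_write)
print(mem.pcm_writes)
```

Repeated writes to a page that stays in DRAM never touch PCM. Under this policy PCM is written only when a page is first loaded into it or demoted from DRAM.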

Weighted Binary Prefix Tree for IP Address Lookup (IP 주소 검색을 위한 가중 이진 프리픽스 트리)

  • Yim Changhoon;Lim Hyesook;Lee Bomi
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.11B / pp.911-919 / 2004
  • IP address lookup is one of the essential functions of Internet routers, and it determines overall router performance. The most important evaluation factor for software-based IP address lookup is the worst-case number of memory accesses. The binary prefix tree (BPT) scheme requires a small number of worst-case memory accesses compared with previous software-based schemes. However, the tree structure of the BPT is normally unbalanced. In this paper, we propose a weighted binary prefix tree (WBPT) scheme, which generates a nearly balanced tree by incorporating the concept of weight into the BPT generation process. The proposed WBPT requires a very small number of worst-case memory accesses compared with previous software-based schemes. Moreover, the WBPT requires a comparatively small amount of memory, which fits within an L2 cache for about 30,000 prefixes, and prefix addition and deletion are rather simple. Hence, the proposed WBPT can be used for software-based IP address lookup in practical routers.
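To make the lookup problem and the "memory accesses" metric concrete, the sketch below implements longest-prefix matching on a plain binary trie and counts the nodes visited per lookup. It is a generic baseline structure, not the BPT or WBPT described in the abstract, whose construction rules are not given here. The names `TrieNode`, `insert`, and `lookup` and the bit-string inputs are assumptions.

```python
class TrieNode:
    __slots__ = ("children", "next_hop")

    def __init__(self):
        self.children = [None, None]  # child for bit 0 and child for bit 1
        self.next_hop = None          # set only where a prefix ends


def insert(root, prefix_bits, next_hop):
    """Add a prefix such as "1011" with its next hop."""
    node = root
    for bit in prefix_bits:
        i = int(bit)
        if node.children[i] is None:
            node.children[i] = TrieNode()
        node = node.children[i]
    node.next_hop = next_hop


def lookup(root, addr_bits):
    """Return (next hop of the longest matching prefix, nodes visited)."""
    node, best, visited = root, root.next_hop, 1
    for bit in addr_bits:
        node = node.children[int(bit)]
        if node is None:
            break
        visited += 1
        if node.next_hop is not None:
            best = node.next_hop
    return best, visited


root = TrieNode()
insert(root, "10", "A")
insert(root, "1011", "B")
print(lookup(root, "10110000"))
print(lookup(root, "10010000"))
```

In a plain trie the number of nodes visited grows with prefix length, up to 32 levels for IPv4. That worst-case depth is what balanced structures such as the one proposed here aim to reduce.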

Effective resource selection and mobility management scheme in mobile grid computing (모바일 그리드 컴퓨팅에서 효율적인 자원 확보와 이동성 관리 기법)

  • Lee, Dae-Won
    • The Journal of Korean Association of Computer Education / v.13 no.1 / pp.53-64 / 2010
  • In this paper, we aim to enable a mobile device to join a mobile grid network as a resource. With advances in Internet technology, the use of mobile devices has increased rapidly. Some research in mobile grid computing has tried to combine grid computing with mobile devices. However, owing to the intrinsic properties of mobile environments, mobile devices raise many issues, such as mobility management, disconnected operation, device heterogeneity, service discovery, resource sharing, and security. To address these problems, there are two trends in mobile grid computing: a proxy-based mobile grid architecture and an agent-based mobile grid architecture. We focus on a proxy-based mobile grid architecture with IP paging, which can easily manage idle mobile devices and grid resource status information. We also use SIP (Session Initiation Protocol) to support mobility management and mobile grid services. We manage changes in mobile device state and power using a paging cache. Finally, using the candidate set and the reservation set of resources, we perform task migration. A performance evaluation by simulation shows improved efficiency and stability during execution.


Efficient Method to Support Mobile Virtualization-based Cloud Resource Management (모바일 가상화기반 클라우드 자원관리를 지원하는 효율적 방법)

  • Kang, Yongho;Jang, Changbok;Lee, Wanjik;Heo, Seokyeol;Kim, Jooman
    • Journal of Digital Convergence / v.12 no.2 / pp.277-283 / 2014
  • Recently, various cloud services have been provided on mobile devices as well as on desktop PCs and server computers. The number of smartphone users is also increasing very rapidly, and they use their devices for various services (cloud services, games, banking, mobile office, etc.). Research on utilizing the resources of mobile devices has therefore been conducted. In this paper, we suggest an efficient method of cloud resource management that uses information about the physical resources (CPU, memory, storage, etc.) available across mobile devices and information about the physical resources within a mobile device. The suggested technology can guarantee real-time processing and manage resources efficiently.

Real-time Implementation of MPEG-4 HVXC Encoder and Decoder on Floating Point DSP (부동 소수점 DSP를 이용한 MPEG-4 HVXC 인코더 및 디코더의 실시간 구현)

  • Kang, Kyeong-ok;Na, Hoon;Hong, Jin-Woo;Jeong, Dae-Gwon
    • The Journal of the Acoustical Society of Korea / v.19 no.4 / pp.37-44 / 2000
  • In this paper, we describe the real-time implementation of the MPEG-4 audio HVXC (Harmonic Vector eXcitation Coding) algorithm for very low bitrates, whose target applications range from mobile communications to Internet telephony, on the high-performance floating-point TMS320C6701 DSP. We adopted a hardware structure for real-time operation. For software optimization, we applied C-level and assembly-level optimizations to time-critical functions. By using the internal program memory of the DSP as a program cache, the internal data memory overlap technique, and DMA functionality, we achieved real-time operation of the HVXC codec at both 2 kbit/s and 4 kbit/s. For the encoder at 2 kbit/s, the optimization ratio relative to the original code is about 96%. Finally, we obtained a subjective quality of MOS 2.45 at 2 kbit/s in an informal quality test.
