• Title/Summary/Keyword: Temporal locality

Efficient Implementation of SVM-Based Speech/Music Classifier by Utilizing Temporal Locality (시간적 근접성 향상을 통한 효율적인 SVM 기반 음성/음악 분류기의 구현 방법)

  • Lim, Chung-Soo;Chang, Joon-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea SP / v.49 no.2 / pp.149-156 / 2012
  • Support vector machines (SVMs) are well known for their pattern recognition capability, but care must be taken to alleviate their inherent implementation cost, which results from high computational intensity and memory requirements, especially in embedded systems where only limited resources are available. Since the memory requirement, determined by the dimensionality and the number of support vectors, is generally too high for a cache in embedded systems to accommodate, main memory is accessed whenever the cache cannot provide the requested data to the processor. These frequent main memory accesses degrade overall performance and increase energy consumption, because a main memory access typically takes longer and consumes more energy than a cache access or a register access. In this paper, we propose a technique that reduces the number of main memory accesses by optimizing the data access pattern of the SVM-based classifier so that the temporal locality of the accesses increases, fully utilizing data loaded into the processor chip. Through experiments, we confirm the improvement achieved by the proposed technique in terms of the number of memory accesses, overall execution time, and energy consumption.
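
One common way to raise temporal locality in an SVM decision function is to reorder the loops so that each support vector, once loaded, is reused across several input frames before the next one is fetched. The sketch below illustrates that general idea only; the abstract does not spell out the authors' actual access pattern, and the function names, the RBF kernel, and the `batch` parameter are assumptions made for illustration.

```python
import math


def rbf(x, sv, gamma):
    """RBF kernel between an input frame x and a support vector sv."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, sv)))


def classify_frame_major(frames, svs, alphas, bias, gamma):
    """Baseline: for every frame, stream through all support vectors.

    Each support vector is touched once per frame, so a small cache
    has to refetch the whole support vector set for every frame.
    """
    scores = []
    for x in frames:
        s = bias
        for sv, a in zip(svs, alphas):
            s += a * rbf(x, sv, gamma)
        scores.append(s)
    return scores


def classify_sv_major(frames, svs, alphas, bias, gamma, batch=4):
    """Reordered: reuse each support vector across a batch of frames.

    Within a batch, a support vector is fetched once and applied to
    every frame in the batch, so accesses to it are close in time.
    """
    scores = [bias] * len(frames)
    for start in range(0, len(frames), batch):
        block = range(start, min(start + batch, len(frames)))
        for sv, a in zip(svs, alphas):
            for i in block:
                scores[i] += a * rbf(frames[i], sv, gamma)
    return scores
```

Both functions accumulate the same terms in the same order for each frame, so they return the same scores; only the order in which memory is touched differs. The trade-off in this sketch is that decisions for a batch become available only after the whole batch has been processed.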

The buffer Management system for reducing write/erase operations in NAND flash memory (NAND 플래시 메모리에서 쓰기/지우기 연산을 줄이기위한 버퍼 관리 시스템)

  • Jung, Bo-Sung;Lee, Jung-Hoon
    • Journal of the Korea Society of Computer and Information / v.16 no.10 / pp.1-10 / 2011
  • NAND flash memory offers low power consumption, low cost, and large storage capacity, but its block erase and page write operations carry a large overhead. Due to the physical characteristics of NAND flash memory, data cannot be overwritten in place, so a rewrite requires an erase operation first, which degrades performance. Adding an SRAM buffer to a conventional NAND flash memory can both reduce the effective number of write operations and guarantee fast memory access time. In this paper, we propose a small SRAM buffer management system that reduces the overhead of NAND flash memory, namely erase and write operations. The proposed buffer system consists of two parts: a fully associative temporal buffer with a small fetching block size and a fully associative spatial buffer with a large fetching block size. The temporal buffer holds small fetching blocks that were referenced in the spatial buffer. When a write or erase operation occurs in NAND flash memory, the related fetching blocks in the temporal buffer that belong to the same page or block are written to NAND flash memory at the same time, which reduces the write and erase counts. According to the simulation results, despite high miss ratios, write and erase operations are reduced by approximately 58% and 83%, respectively. The average memory access time is also improved by about 84% compared with the fully associative buffer with two sizes.

An Architecture of One-Dimensional Systolic Array for Full-Search Block Matching Algorithm (완전탐색 블럭정합 알고리즘을 위한 일차원 시스톨릭 어레이의 구조)

  • Lee, Su-Jin;Woo, Chong-Ho
    • Journal of the Institute of Electronics Engineers of Korea SC / v.39 no.5 / pp.34-42 / 2002
  • In this paper, we design a VLSI array architecture for high-speed motion estimation based on the block matching algorithm. We derive a one-dimensional systolic array from the full-search block matching algorithm. The data and control signals of the proposed systolic array are passed between adjacent processing elements, so the proposed architecture has temporal and spatial locality. I/O ports exist only in the first and last processing elements of the array. The architecture has a low pin count and is modularly expandable, so the proposed array can be cascaded for different block sizes and search ranges.
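
For orientation, the full-search block matching algorithm that such an array accelerates can be written as a plain software reference. The sketch below is not the systolic array from the paper; it only shows the computation (a sum of absolute differences evaluated at every displacement in the search range). The function names and the parameters `n` (block size) and `p` (search range) are assumptions made for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur`
    at (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, p):
    """Try every displacement in [-p, p] x [-p, p] and return
    (best_sad, dx, dy); candidates outside the frame are skipped."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            inside = (0 <= by + dy and by + dy + n <= height
                      and 0 <= bx + dx and bx + dx + n <= width)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

Neighbouring candidate blocks overlap heavily, so the same reference pixels are needed again for adjacent displacements. That reuse is the locality a systolic design can exploit by passing data between adjacent processing elements instead of refetching it.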

An Efficient Variable Rearrangement Technique for STT-RAM Based Hybrid Caches

  • Youn, Jonghee M.;Cho, Doosan
    • IEMEK Journal of Embedded Systems and Applications / v.11 no.2 / pp.67-78 / 2016
  • The emerging Spin-Transfer Torque RAM (STT-RAM) is a promising component that can improve efficiency thanks to its high storage density and low leakage power. However, state-of-the-art STT-RAM is not ready to replace SRAM technology because of the cost of its write operations, which require longer latency and more power than the same operations in SRAM. Therefore, a hybrid cache with SRAM and STT-RAM technologies has been proposed to obtain the benefits of STT-RAM while minimizing its negative effects by using SRAM. To use the hybrid cache efficiently, it is important to place write-intensive data appropriately in the cache: such data should be placed in SRAM to minimize the negative effect. Thus, we propose a technique that optimizes the placement of data in main memory. It derives the proper combination of the advantages and disadvantages of SRAM and STT-RAM in the hybrid cache. As a result of the proposed technique, write-intensive data are loaded into SRAM and read-intensive data are loaded into STT-RAM. In addition, our technique optimizes temporal locality to minimize conflict misses. Therefore, it improves the performance and energy consumption of the hybrid cache architecture in a certain range.

Hierarchical Location Caching Scheme for Mobile Object Tracking in the Internet of Things

  • Han, Youn-Hee;Lim, Hyun-Kyo;Gil, Joon-Min
    • Journal of Information Processing Systems / v.13 no.5 / pp.1410-1429 / 2017
  • Mobility arises naturally in Internet of Things networks, since the location of mobile objects, e.g., mobile agents, mobile software, mobile things, or users with wireless hardware, changes as they move. Tracking their current location is essential to mobile computing. To overcome the scalability problem, hierarchical architectures of location databases have been proposed. These architectures are effective when location updates and lookups for mobile objects are localized. However, the network signaling costs and the number of database operations executed increase, particularly when the scale of the architectures and the number of databases become large to accommodate a great number of objects. This disadvantage can be alleviated by a location caching scheme that exploits the spatial and temporal locality in location lookup. In this paper, we propose a hierarchical location caching scheme, which adapts the existing location caching scheme to a hierarchical architecture of location databases. The performance analysis indicates that the adjustment of such thresholds has an impact on cost reduction in the proposed scheme.

Gated Recurrent Unit based Prefetching for Graph Processing (그래프 프로세싱을 위한 GRU 기반 프리페칭)

  • Shivani Jadhav;Farman Ullah;Jeong Eun Nah;Su-Kyung Yoon
    • Journal of the Semiconductor & Display Technology / v.22 no.2 / pp.6-10 / 2023
  • Data that are likely to be accessed can be predicted and stored in the cache in advance to prevent cache misses, thus reducing the processor's request and wait times. As a result, the processor can work without stalling, hiding memory latency. By utilizing the temporal and spatial locality of memory accesses, a prefetcher predicts the memory address that will be accessed next. We propose a prefetcher that applies the Gated Recurrent Unit (GRU) model, which is well suited to handling time series data. The currently accessed address is represented in binary and used as training data to train the GRU model based on the difference (delta) between consecutive memory accesses. Finally, using a GRU model that has learned the memory access patterns, the proposed data prefetcher predicts the memory address to be accessed next. We compared the model with a multi-layer perceptron, and our prefetcher showed better results.
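
The data preparation described here (binary address representation plus deltas between consecutive accesses) can be sketched without the model itself. The snippet below is only an illustration under stated assumptions: the 16-bit width, the pairing of each address with the delta to its successor, and all function names are hypothetical and not taken from the paper, and no GRU is trained.

```python
def to_bits(value, width=16):
    """Encode a non-negative address as a list of bits, MSB first."""
    return [(value >> i) & 1 for i in reversed(range(width))]


def make_training_pairs(addresses, width=16):
    """Pair each address (as bits) with the delta to the next access.

    A sequence model would be trained to map the input bits, in
    sequence, to the delta that follows.
    """
    pairs = []
    for cur, nxt in zip(addresses, addresses[1:]):
        pairs.append((to_bits(cur, width), nxt - cur))
    return pairs


def predict_next(address, predicted_delta):
    """Turn a predicted delta back into the address to prefetch."""
    return address + predicted_delta
```

For a trace such as `[0x10, 0x18, 0x20, 0x1C]`, the deltas are `8, 8, -4`. Learning deltas rather than raw addresses keeps the target space small when accesses follow regular strides.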

A Node Relocation Strategy of Trajectory Indexes for Efficient Processing of Spatiotemporal Range Queries (효율적인 시공간 영역 질의 처리를 위한 궤적 색인의 노드 재배치 전략)

  • Lim Duksung;Cho Daesoo;Hong Bonghee
    • Journal of KIISE:Databases / v.31 no.6 / pp.664-674 / 2004
  • The trajectory preservation property, which stores only one trajectory in a leaf node, is the most important feature of an index structure such as the TB-tree for retrieving objects' moving paths in spatio-temporal space. It performs well in trajectory-related queries such as navigational queries and combined queries. However, the MBRs of non-leaf nodes in the TB-tree contain large amounts of dead space, because trajectory preservation is achieved at the expense of the spatial locality of trajectories. As dead space increases, the overlap between nodes also increases, and thus the cost of classical range queries increases. To improve the performance of range queries, we present a new split policy and entry relocation policies that do not degrade the performance of trajectory-related queries. To maximally reduce the dead space of a non-leaf node's MBR, the Maximal Area Reduction (MAR) policy is used as the split policy for non-leaf nodes. The entry relocation policy exchanges entries among non-leaf nodes to reduce the dead space in these nodes. We propose two algorithms for the entry relocation policy and evaluate the performance of the new algorithms against the TB-tree under a varying set of spatio-temporal queries.

Preference-Based Segment Buffer Replacement in Cluster VOD Servers (클러스터 VOD서버에서 선호도 기반 세그먼트 버퍼 대체 기법)

  • Seo, Dong-Mahn;Lee, Joa-Hyoung;Bang, Cheol-Seok;Lim, Dong-Sun;Jung, In-Bum;Kim, Yoon
    • Journal of KIISE:Computer Systems and Theory / v.33 no.11 / pp.797-809 / 2006
  • To support QoS streams for large numbers of clients, the internal resources of VOD servers should be utilized based on the characteristics of the streaming media service. Among the various resources in the server, the main memory is used as buffer space for the media data loaded from the disks, and the buffer hit ratio has a great impact on server performance. However, if buffer data with a high hit ratio are replaced by new media data as the number of clients and the number of requested movie titles increase, the scalability of server performance suffers. To address this problem, the buffer replacement policy should consider the intrinsic characteristics of streaming media, such as sequential access to large volumes of data and the highly disproportionate preference for specific movies. In this paper, a preference-based segment buffer replacement policy is proposed for the cluster-based VOD server to exploit these characteristics. Since the proposed method reflects both the temporal locality arising from the clients' preferences and the spatial locality arising from sequential access to media data, the buffer hit ratio improves compared with the existing buffer replacement policy. With the enhanced buffer hit ratio, the performance of the cluster-based VOD server scales linearly as the number of cluster nodes increases.

RSSI based Proximity User Detection System using Exponential Moving Average (지수이동평균을 이용한 RSSI 기반 근거리 사용자 탐지 시스템)

  • Yun, Gi-Hun;Kim, Keon-Wook;Choi, Jae-Hun;Park, Soo-Jun
    • Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.4 / pp.105-111 / 2010
  • This paper proposes a recursive algorithm for a passive proximity detection system based on signal strength. The system is designed to be used in a smart medicine chest in order to provide a location-based service for seniors. Given the system profile, a single receiver and uni-directional communication are applied over the signal attenuation model to determine whether a user is present within a certain proximity. The performance of conventional methods is subject to the sight between the transmitter and receiver unless the direction of the target is known. To take advantage of the temporal and spatial locality of human subjects, the authors apply an exponential moving average (EMA) to compensate for unexpected position errors caused by the direction and/or environment. Using the optimal parameter, the experiments with the EMA algorithm demonstrate a 32.26% (maximum 40.80%) reduction in the average error probability with 50% of consecutive sight in time.
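
The recursive smoothing step can be illustrated with the standard exponential moving average, `s[t] = alpha * x[t] + (1 - alpha) * s[t-1]`. The sketch below is a generic EMA applied to RSSI samples, not the paper's tuned algorithm; the function names, the `alpha` value, and the detection threshold are assumptions made for illustration.

```python
def ema(samples, alpha):
    """Exponential moving average, seeded with the first sample.

    alpha in (0, 1]: larger values follow new samples more closely,
    smaller values smooth out short-lived fluctuations.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for x in samples:
        if not smoothed:
            smoothed.append(x)
        else:
            smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


def is_near(smoothed_rssi, threshold_dbm):
    """Declare proximity when the smoothed RSSI reaches the threshold."""
    return smoothed_rssi >= threshold_dbm
```

With `alpha = 0.5`, the samples `[-60, -80, -60]` smooth to `[-60, -70.0, -65.0]`: the single weak reading is damped rather than treated as the user leaving. Because each output depends only on the previous output and the new sample, the filter needs constant memory.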

Cache memory system for high performance CPU with 4GHz (4Ghz 고성능 CPU 위한 캐시 메모리 시스템)

  • Jung, Bo-Sung;Lee, Jung-Hoon
    • Journal of the Korea Society of Computer and Information / v.18 no.2 / pp.1-8 / 2013
  • In this paper, we propose a high-performance L1 cache structure for a high-clock 4GHz CPU. The proposed cache memory consists of three parts: a direct-mapped cache to support fast access time, a two-way set associative buffer to exploit temporal locality, and a buffer-select table. The most recently accessed data are stored in the direct-mapped cache. If data with a high probability of repeated reference are replaced from the direct-mapped cache, they are selectively stored in the two-way set associative buffer. For high performance and low power consumption, only one of the two ways of the set associative buffer is selectively accessed, based on the buffer-select table (BST). According to simulation results, the energy-delay product improves by about 45%, 70% and 75% compared with a direct-mapped cache, a four-way set associative cache, and a victim cache with two times more space, respectively.