• Title/Summary/Keyword: Scratchpad Memory

A Review of Data Management Techniques for Scratchpad Memory (스크래치패드 메모리를 위한 데이터 관리 기법 리뷰)

  • DOOSAN CHO
    • The Journal of the Convergence on Culture Technology / v.9 no.1 / pp.771-776 / 2023
  • Scratchpad memory is a software-controlled on-chip memory designed to mitigate the disadvantages of conventional cache memories. Caches rely on tag-related hardware control logic, so users cannot directly control cache misses, and they occupy more area and consume relatively more energy. Scratchpad memory eliminates this hardware overhead, which gives it an advantage in size and energy consumption, but it shifts the burden of data management to software. This study classifies and examines data management techniques for scratchpad memory and discusses ways to maximize its advantages.

Two-Level Scratchpad Memory Architectures to Achieve Time Predictability and High Performance

  • Liu, Yu;Zhang, Wei
    • Journal of Computing Science and Engineering / v.8 no.4 / pp.215-227 / 2014
  • In modern computer architectures, caches are widely used to narrow the gap between processor speed and memory access time. However, caches are not time-predictable and can therefore significantly increase the complexity of worst-case execution time (WCET) analysis, which is crucial for real-time systems. This paper proposes a time-predictable two-level scratchpad-based architecture and an ILP-based static memory-object assignment algorithm to support real-time computing. Moreover, to exploit the load/store latencies that are known statically in this architecture, we study a Scratch-pad Sensitive Scheduling method to further improve performance. Our experimental results indicate that, for most of the benchmarks we studied, the two-level scratchpad-based architecture outperforms a comparable cache-based architecture in both performance and energy consumption.
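
    The static assignment idea in this abstract can be illustrated with a small sketch. This is not the authors' ILP formulation: it swaps in a 0/1 knapsack solved by dynamic programming, and assumes a single SPM level, integer object sizes, and access counts as the only benefit measure. The function name `assign_to_spm` and the example objects are hypothetical.

```python
from typing import List, Tuple

# Hypothetical memory object: (name, size in arbitrary integer units, access count)
MemObject = Tuple[str, int, int]


def assign_to_spm(objects: List[MemObject], capacity: int) -> Tuple[int, List[str]]:
    """Pick the objects to place in the SPM so that the number of accesses
    served from the SPM is maximized without exceeding its capacity.

    Illustrative 0/1 knapsack, not the ILP model from the paper.
    """
    # best[c] = (accesses served from SPM, chosen object names) within capacity c
    best: List[Tuple[int, List[str]]] = [(0, [])] * (capacity + 1)
    for name, size, accesses in objects:
        # Iterate capacities downwards so each object is placed at most once.
        for c in range(capacity, size - 1, -1):
            candidate = best[c - size][0] + accesses
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - size][1] + [name])
    return best[capacity]


# Assumed example data: four objects competing for an SPM of 8 units.
example = [("buf", 4, 100), ("lut", 3, 90), ("stack", 2, 40), ("coef", 5, 120)]
served, chosen = assign_to_spm(example, capacity=8)
print(served, chosen)
```

    Because the assignment is fixed before execution, every access to a chosen object has a statically known latency, which is the property WCET analysis relies on.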

Scratchpad-Memory Management Using NUMA Infrastructure on Linux (Linux 상에서 NUMA 지원을 응용한 스크래치 패드 메모리 관리방법)

  • Park, Byung-Hun;Seo, Dae-Wha
    • Proceedings of the Korea Information Processing Society Conference / 2009.11a / pp.41-42 / 2009
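  • Many embedded SoCs (systems-on-chip) now include on-chip SRAM, that is, scratchpad memory (SPM), to compensate for the shortcomings of cache memory. Unlike cache memory, SPM must by its nature be managed directly by software. This paper proposes an SPM management method for Linux with NUMA support that is highly portable and simple to implement.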

Scratchpad Memory Architectures and Allocation Algorithms for Hard Real-Time Multicore Processors

  • Liu, Yu;Zhang, Wei
    • Journal of Computing Science and Engineering / v.9 no.2 / pp.51-72 / 2015
  • Time predictability is crucial in hard real-time and safety-critical systems. Cache memories, while useful for improving average-case memory performance, are not time-predictable, especially when they are shared in multicore processors. To achieve time predictability while minimizing the impact on performance, this paper explores several time-predictable scratchpad memory (SPM)-based architectures for multicore processors. To support these architectures, we propose three strategies: a partition based on dynamic memory-object allocation, a partition based on static allocation, and a static-allocation-based priority L2 SPM. These strategies retain time predictability while attempting to maximize performance and energy efficiency. The SPM-based multicore architectural design and the related allocation methods thus form a comprehensive solution for hard real-time multicore computing. Our experimental results indicate the strengths and weaknesses of each proposed architecture and allocation method, offering interesting on-chip memory design options for enabling multicore platforms in hard real-time systems.

Memory Hierarchy Optimization in Embedded Systems using On-Chip SRAM (On-Chip SRAM을 이용한 임베디드 시스템 메모리 계층 최적화)

  • Kim, Jung-Won;Kim, Seung-Kyun;Lee, Jae-Jin;Jung, Chang-Hee;Woo, Duk-Kyun
    • Journal of KIISE:Computer Systems and Theory / v.36 no.2 / pp.102-110 / 2009
  • The memory wall is the growing speed disparity between the CPU and memory outside the CPU chip. An economical solution is a memory hierarchy organized into several levels, such as processor registers, cache, main memory, and disk storage. We introduce, for the first time, a memory hierarchy optimization technique for Linux-based embedded systems that uses on-chip SRAM. The technique uses the virtual memory system to allocate on-chip SRAM to the code and data selected by programmers. Experiments with nine applications indicate that runtime improves by up to 35%, with an average of 14%, and that energy consumption is reduced by up to 40%, with an average of 15%.

Research on the Main Memory Access Count According to the On-Chip Memory Size of an Artificial Neural Network (인공 신경망 가속기 온칩 메모리 크기에 따른 주메모리 접근 횟수 추정에 대한 연구)

  • Cho, Seok-Jae;Park, Sungkyung;Park, Chester Sungchung
    • Journal of IKEEE / v.25 no.1 / pp.180-192 / 2021
  • One widely used algorithm for image recognition and pattern detection is the convolutional neural network (CNN). Hardware accelerators are used to handle convolution operations efficiently, since these account for the majority of computations in a CNN, and thereby improve the performance of CNN applications. When these accelerators are used, the CNN fetches data from off-chip DRAM, because the sheer volume of data makes it difficult to obtain performance improvements from the accelerator's internal memory alone. In other words, data communication between off-chip DRAM and the memory inside the accelerator has a significant impact on the performance of CNN applications. In this paper, a CNN simulator is developed to analyze main memory (DRAM) accesses with respect to the size of the on-chip memory, or global buffer, inside the CNN accelerator. For AlexNet, one such CNN architecture, simulations with increasing global buffer size show that a global buffer larger than 100 kB yields a DRAM access count about 0.8x that of a global buffer smaller than 100 kB.

A Study of Scratchpad memory size exploration of System-on-a Chip (시스템 온칩에서 스크래치 패드 메모리의 크기 탐색연구)

  • Cho, Jungseok;Cho, Doosan;Kim, Yongjoo
    • Proceedings of the Korea Information Processing Society Conference / 2014.04a / pp.15-17 / 2014
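  • In many streaming applications, including multimedia, a substantial portion of energy consumption is due to the instructions that perform data accesses. In such applications, energy consumption can be reduced by exploiting data reuse: frequently used data is kept resident in a fast, upper-level memory to reduce the number of main memory accesses, which in turn reduces energy consumption in the memory subsystem. This study proposes a technique that analyzes an application's data reuse to explore a scratchpad memory subsystem configuration specialized for that application. With the proposed technique, energy consumption can be reduced by about 49% compared with a hardware-controlled cache memory.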