• Title/Summary/Keyword: Heterogeneous Memory

Search Result 71, Processing Time 0.029 seconds

Low-power heterogeneous uncore architecture for future 3D chip-multiprocessors

  • Dorostkar, Aniseh;Asad, Arghavan;Fathy, Mahmood;Jahed-Motlagh, Mohammad Reza;Mohammadi, Farah
    • ETRI Journal
    • /
    • v.40 no.6
    • /
    • pp.759-773
    • /
    • 2018
  • Uncore components such as on-chip memory systems and on-chip interconnects consume a large amount of energy in emerging embedded applications. Few studies have focused on next-generation analytical models for future chip-multiprocessors (CMPs) that simultaneously consider the impacts of the power consumption of core and uncore components. In this paper, we propose a convex-optimization approach to design heterogeneous uncore architectures for embedded CMPs. Our convex approach optimizes the number and placement of memory banks with different technologies on the memory layer. In parallel with hybrid memory architecting, optimizing the number and placement of through silicon vias as a viable solution in building three-dimensional (3D) CMPs is another important target of the proposed approach. Experimental results show that the proposed method outperforms 3D CMP designs with hybrid and traditional memory architectures in terms of both energy delay products (EDPs) and performance parameters. The proposed method improves the EDPs by an average of about 43% compared with SRAM design. In addition, it improves the throughput by about 7% compared with dynamic RAM (DRAM) design.

Memory Allocation in Mobile Multitasking Environments with Real-time Constraints

  • Hyokyung, Bahn
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.15 no.1
    • /
    • pp.79-84
    • /
    • 2023
  • Due to the rapid performance improvement of smartphones, multitasking on mobile platforms has become an essential feature. Unlike traditional desktop or server environments, mobile applications are mostly interactive jobs where response time is important, and some applications are classified as real-time jobs with deadlines. When interactive and real-time jobs run concurrently, memory allocation between multitasking applications is a challenging issue as they have different time requirements. In this paper, we study how to allocate memory space when real-time and interactive jobs are simultaneously executed in a smartphone to meet the multitasking requirements between heterogeneous jobs. Specifically, we analyze the memory size required to satisfy the constraints of real-time jobs and present a new model for allocating memory space between heterogeneous multitasking jobs. Trace-driven simulations show that the proposed model provides reasonable performance for interactive jobs while guaranteeing the requirement of real-time jobs.

Evolution of Nonvolatile Resistive Switching Memory Technologies: The Related Influence on Hetrogeneous Nanoarchitectures

  • Eshraghian, Kamran
    • Transactions on Electrical and Electronic Materials
    • /
    • v.11 no.6
    • /
    • pp.243-248
    • /
    • 2010
  • The emergence of different and disparate materials together with the convergence of both the 'old' and 'emerging' technologies is paving the way for integration of heterogeneous technologies that are likely to extend the limitations of silicon technology beyond the roadmap envisaged for complementary metal-oxide semiconductor. Formulation of new information processing concepts based on novel aspects of nano-scale based materials is the catalyst for new nanoarchitectures driven by a different perspective in realization of novel logic devices. The memory technology has been the pace setter for silicon scaling and thus far has pave the way for new architectures. This paper provides an overview of the inevitability of heterogeneous integration of technologies that are in their infancy through initiatives of material physicists, computational chemists, and bioengineers and explores the options in the spectrum of novel non-volatile memory technologies considered as forerunner of new logic devices.

Energy and Performance-Efficient Dynamic Load Distribution for Mobile Heterogeneous Storage Devices (에너지 및 성능 효율적인 이종 모바일 저장 장치용 동적 부하 분산)

  • Kim, Young-Jin;Kim, Ji-Hong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.14 no.4
    • /
    • pp.9-17
    • /
    • 2009
  • In this paper, we propose a dynamic load distribution technique at the operating system level in mobile storage systems with a heterogeneous storage pair of a small form-factor and disk and a flash memory, which aims at saving energy consumption as well as enhancing I/O performance. Our proposed technique takes a combinatory approach of file placement and buffer cache management techniques to find how the load can be distributed in an energy and performance-aware way for a heterogeneous mobile storage air of a hard disk and a flash memory. We demonstrate that the proposed technique provides better experimental results with heterogeneous mobile storage devices compared with the existing techniques through extensive simulations.

Stationary bootstrapping for structural break tests for a heterogeneous autoregressive model

  • Hwang, Eunju;Shin, Dong Wan
    • Communications for Statistical Applications and Methods
    • /
    • v.24 no.4
    • /
    • pp.367-382
    • /
    • 2017
  • We consider an infinite-order long-memory heterogeneous autoregressive (HAR) model, which is motivated by a long-memory property of realized volatilities (RVs), as an extension of the finite order HAR-RV model. We develop bootstrap tests for structural mean or variance changes in the infinite-order HAR model via stationary bootstrapping. A functional central limit theorem is proved for stationary bootstrap sample, which enables us to develop stationary bootstrap cumulative sum (CUSUM) tests: a bootstrap test for mean break and a bootstrap test for variance break. Consistencies of the bootstrap null distributions of the CUSUM tests are proved. Consistencies of the bootstrap CUSUM tests are also proved under alternative hypotheses of mean or variance changes. A Monte-Carlo simulation shows that stationary bootstrapping improves the sizes of existing tests.

Real-time Task Aware Memory Allocation Techniques for Heterogeneous Mobile Multitasking Environments (이종 모바일 멀티태스킹 환경을 위한 실시간 작업 인지형 메모리 할당 기술 연구)

  • Bahn, Hyokyung
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.22 no.3
    • /
    • pp.43-48
    • /
    • 2022
  • Recently, due to the rapid performance improvement of smartphones and the increase in background executions of mobile apps, multitasking has become common on mobile platforms. Unlike traditional desktop and server apps, response time is important in most mobile apps as they are interactive tasks, and some apps are classified as real-time tasks with deadlines. In this paper, we discuss how to meet the requirements of heterogeneous multitasking in managing memory of real-time and interactive tasks when they are executed together on a smartphone. To do so, we analyze the memory requirement of real-time tasks, and propose a model that has the ability of allocating memory to multitasking tasks on a smartphone. Trace-driven simulations with real-world storage access traces captured by heterogeneous apps show that the proposed model provides reasonable performance for interactive tasks while guaranteeing the requirement of real-time tasks.

3-D Hetero-Integration Technologies for Multifunctional Convergence Systems

  • Lee, Kang-Wook
    • Journal of the Microelectronics and Packaging Society
    • /
    • v.22 no.2
    • /
    • pp.11-19
    • /
    • 2015
  • Since CMOS device scaling has stalled, three-dimensional (3-D) integration allows extending Moore's law to ever high density, higher functionality, higher performance, and more diversed materials and devices to be integrated with lower cost. 3-D integration has many benefits such as increased multi-functionality, increased performance, increased data bandwidth, reduced power, small form factor, reduced packaging volume, because it vertically stacks multiple materials, technologies, and functional components such as processor, memory, sensors, logic, analog, and power ICs into one stacked chip. Anticipated applications start with memory, handheld devices, and high-performance computers and especially extend to multifunctional convengence systems such as cloud networking for internet of things, exascale computing for big data server, electrical vehicle system for future automotive, radioactivity safety system, energy harvesting system and, wireless implantable medical system by flexible heterogeneous integrations involving CMOS, MEMS, sensors and photonic circuits. However, heterogeneous integration of different functional devices has many technical challenges owing to various types of size, thickness, and substrate of different functional devices, because they were fabricated by different technologies. This paper describes new 3-D heterogeneous integration technologies of chip self-assembling stacking and 3-D heterogeneous opto-electronics integration, backside TSV fabrication developed by Tohoku University for multifunctional convergence systems. The paper introduce a high speed sensing, highly parallel processing image sensor system comprising a 3-D stacked image sensor with extremely fast signal sensing and processing speed and a 3-D stacked microprocessor with a self-test and self-repair function for autonomous driving assist fabricated by 3-D heterogeneous integration technologies.

Implementation of Integrated CPU-GPU for Efficient Uniform Memory Access Method and Verification System (CPU-GPU간 긴밀성을 위한 효율적인 공유메모리 접근 방법과 검증 시스템 구현)

  • Park, Hyun-moon;Kwon, Jinsan;Hwang, Tae-ho;Kim, Dong-Sun
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.11 no.2
    • /
    • pp.57-65
    • /
    • 2016
  • In this paper, we propose a system for efficient use of shared memory between CPU and GPU. The system, called Fusion Architecture, assures consistency of the shared memory and minimizes cache misses that frequently occurs on Heterogeneous System Architecture or Unified Virtual Memory based systems. It also maximizes the performance for memory intensive jobs by efficient allocation of GPU cores. To test between architectures on various scenarios, we introduce the Fusion Architecture Analyzer, which compares OpenMP, OpenCL, CUDA, and the proposed architecture in terms of memory overhead and process time. As a result, Proposed fusion architectures show that the Fusion Architecture runs benchmarks 55% faster and reduces memory overheads by 220% in average.

Autoencoder factor augmented heterogeneous autoregressive model (오토인코더를 이용한 요인 강화 HAR 모형)

  • Park, Minsu;Baek, Changryong
    • The Korean Journal of Applied Statistics
    • /
    • v.35 no.1
    • /
    • pp.49-62
    • /
    • 2022
  • Realized volatility is well known to have long memory, strong association with other global financial markets and interdependences among macroeconomic indices such as exchange rate, oil price and interest rates. This paper proposes autoencoder factor-augmented heterogeneous autoregressive (AE-FAHAR) model for realized volatility forecasting. AE-FAHAR incorporates long memory using HAR structure, and exogenous variables into few factors summarized by autoencoder. Autoencoder requires intensive calculation due to its nonlinear structure, however, it is more suitable to summarize complex, possibly nonstationary high-dimensional time series. Our AE-FAHAR model is shown to have smaller out-of-sample forecasting error in empirical analysis. We also discuss pre-training, ensemble in autoencoder to reduce computational cost and estimation errors.

TP-Sim: A Trace-driven Processing-in-Memory Simulator (TP-Sim: 트레이스 기반의 프로세싱 인 메모리 시뮬레이터)

  • Jeonggeun Kim
    • Journal of the Semiconductor & Display Technology
    • /
    • v.22 no.3
    • /
    • pp.78-83
    • /
    • 2023
  • This paper proposes a lightweight trace-driven Processing-In-Memory (PIM) simulator, TP-Sim. TP-Sim is a General Purpose PIM (GP-PIM) simulator that evaluates various PIM system performance-related metrics. Based on instruction and memory traces extracted from the Intel Pin tool, TP-Sim can replay trace files for multiple models of PIM architectures to compare its performance. To verify the availability of TP-Sim, we estimated three different system configurations on the STREAM benchmark. Compared to the traditional Host CPU-only systems with conventional memory hierarchy, simple GP-PIM architecture achieved better performance; even the Host CPU has the same number of in-order cores. For further study, we also extend TP-Sim as a part of a heterogeneous system simulator that contains CPU, GPGPU, and PIM as its primary and co-processors.

  • PDF