• Title/Summary/Keyword: Embedded memory

Implementation of Integrated CPU-GPU for Efficient Uniform Memory Access Method and Verification System (CPU-GPU간 긴밀성을 위한 효율적인 공유메모리 접근 방법과 검증 시스템 구현)

  • Park, Hyun-moon;Kwon, Jinsan;Hwang, Tae-ho;Kim, Dong-Sun
    • IEMEK Journal of Embedded Systems and Applications / v.11 no.2 / pp.57-65 / 2016
  • In this paper, we propose a system for efficient use of shared memory between the CPU and GPU. The system, called Fusion Architecture, ensures consistency of the shared memory and minimizes the cache misses that frequently occur on systems based on Heterogeneous System Architecture or Unified Virtual Memory. It also maximizes performance for memory-intensive jobs through efficient allocation of GPU cores. To compare architectures across various scenarios, we introduce the Fusion Architecture Analyzer, which compares OpenMP, OpenCL, CUDA, and the proposed architecture in terms of memory overhead and processing time. The results show that the Fusion Architecture runs benchmarks 55% faster and reduces memory overhead by 220% on average.

Design of Efficient Memory Architecture for Coeff_Token Encoding in H.264/AVC Video Coding Standard (H.264/AVC 동영상 압축 표준에서 Coeff_token 부호화를 위한 효율적임 메모리 구조 설계)

  • Moon, Yong Ho;Park, Kyoung Choon;Ha, Seok Wun
    • IEMEK Journal of Embedded Systems and Applications / v.5 no.2 / pp.77-83 / 2010
  • In this paper, we propose an efficient memory architecture for coeff_token encoding in the H.264/AVC standard. The VLCTs used to encode the coeff_token syntax element are implemented in memory. In general, the memory size should be kept small because it affects the cost and operating speed of the system. Based on an analysis of the codewords in the VLCTs, a new memory architecture is designed. The proposed architecture saves about 24% of memory compared with the conventional memory architecture.

A Performance Analysis of Embedded Systems adapting Data Prefetching (데이터 선인출을 채용한 임베디드 시스템의 성능 분석)

  • Moon, Hyun-Ju;Yoo, Hyun-Bae
    • Journal of the Korea Institute of Information and Communication Engineering / v.10 no.1 / pp.148-155 / 2006
  • Portable embedded systems that mainly handle multimedia applications suffer from increased running time caused by frequent memory accesses to fetch data. To cope with this problem, embedded processors have adopted data prefetching schemes. This paper investigates how data prefetching schemes affect system performance from the viewpoint of power, which is a main performance indicator of embedded systems. We propose a power-consumption analysis model of a memory system with a data prefetching scheme and measure the power dissipated while running application programs. The results show that data prefetching schemes reduce the running time of application programs but increase system power. We also propose a performance analysis model that considers both execution time and power consumption for embedded systems with data prefetching schemes.

Delayed Write Scheme to Enhance Write Performance of Flash Memory Based Embedded Database Systems (플래시 메모리 기반 임베디드 데이터베이스 시스템의 쓰기 성능 향상을 위한 지연쓰기 기법)

  • Song, Ha-Joo;Kwon, Oh-Heum
    • Journal of Korea Multimedia Society / v.12 no.2 / pp.165-177 / 2009
  • Embedded database systems (EDBMS) based on NAND flash memory are widely adopted for logging data on sensor nodes. Because write and erase operations on flash memory are time-consuming compared with read operations and wear out the memory cells, it is important to reduce these operations to enhance EDBMS performance and extend the memory life. In this paper, we propose a delayed write scheme to achieve this goal. The proposed scheme stores the updated parts of database pages in delayed write records to reduce database page writes, which in turn decreases write and erase operations on the flash memory. The proposed scheme therefore enhances the logging performance of a write-intensive EDBMS on a sensor node and extends the flash memory life.

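The delayed write idea in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the class name `DelayedWriteStore`, the page size, the record capacity, and the rule of flushing when the record buffer fills are all hypothetical.

```python
class DelayedWriteStore:
    """Toy page store that buffers partial updates as delayed write records.

    Hypothetical sketch: names, sizes, and the flush rule are assumptions.
    """

    def __init__(self, page_size=16, max_records=4):
        self.page_size = page_size
        self.max_records = max_records  # records buffered before a flush
        self.pages = {}                 # page_id -> bytearray (stands in for flash pages)
        self.records = []               # pending (page_id, offset, data) records
        self.page_writes = 0            # number of full-page writes performed

    def update(self, page_id, offset, data):
        """Record only the changed bytes instead of rewriting the whole page."""
        if offset < 0 or offset + len(data) > self.page_size:
            raise ValueError("update does not fit in the page")
        self.records.append((page_id, offset, bytes(data)))
        if len(self.records) >= self.max_records:
            self.flush()

    def read(self, page_id):
        """Return the stored page with pending records applied on top."""
        page = bytearray(self.pages.get(page_id, bytes(self.page_size)))
        for pid, offset, data in self.records:
            if pid == page_id:
                page[offset:offset + len(data)] = data
        return bytes(page)

    def flush(self):
        """Merge pending records and write each dirty page exactly once."""
        dirty = {pid for pid, _, _ in self.records}
        merged = {pid: bytearray(self.read(pid)) for pid in dirty}
        for pid, image in merged.items():
            self.pages[pid] = image
            self.page_writes += 1
        self.records.clear()
```

In this sketch, several small updates to the same page cost a single page write at flush time rather than one page write per update, which is the effect the abstract attributes to delayed write records.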

Performance Modeling of Flash Memory Storage Systems Using Simulink (시뮬링크를 이용한 플래시메모리 저장장치 성능 모델링)

  • Min, Hang Jun;Park, Jeong Su;Lee, Joo Il;Min, Sang Lyul;Kim, Kanghee
    • IEMEK Journal of Embedded Systems and Applications / v.6 no.5 / pp.263-272 / 2011
  • The complexity of flash memory based storage systems is high due to diverse host interfaces and other design choices such as mapping granularity, flash memory controller execution models and so on. Thus, it is possible that the actual performance after implementation is not consistent with the target performance. This paper demonstrates that the performance prediction of flash memory based storage systems is possible through performance modeling that takes into account various design parameters. In the performance modeling, the FTL, which is the core element of flash memory based storage systems, is modeled as a set of (copy-on-write) logs and their interactions. Also, the flash memory controller is modeled based on the classification proposed in the design of the Ozone flash controller. In this study, the performance model has been implemented using Simulink and experimental results are presented and analyzed.

Low-power heterogeneous uncore architecture for future 3D chip-multiprocessors

  • Dorostkar, Aniseh;Asad, Arghavan;Fathy, Mahmood;Jahed-Motlagh, Mohammad Reza;Mohammadi, Farah
    • ETRI Journal / v.40 no.6 / pp.759-773 / 2018
  • Uncore components such as on-chip memory systems and on-chip interconnects consume a large amount of energy in emerging embedded applications. Few studies have focused on next-generation analytical models for future chip-multiprocessors (CMPs) that simultaneously consider the impacts of the power consumption of core and uncore components. In this paper, we propose a convex-optimization approach to design heterogeneous uncore architectures for embedded CMPs. Our convex approach optimizes the number and placement of memory banks with different technologies on the memory layer. In parallel with hybrid memory architecting, optimizing the number and placement of through silicon vias as a viable solution in building three-dimensional (3D) CMPs is another important target of the proposed approach. Experimental results show that the proposed method outperforms 3D CMP designs with hybrid and traditional memory architectures in terms of both energy delay products (EDPs) and performance parameters. The proposed method improves the EDPs by an average of about 43% compared with SRAM design. In addition, it improves the throughput by about 7% compared with dynamic RAM (DRAM) design.

A Ranking Cleaning Policy for Embedded Flash File Systems (임베디드 플래시 파일시스템을 위한 순위별 지움 정책)

  • Kim, Jeong-Ki;Park, Sung-Min;Kim, Chae-Kyu
    • The KIPS Transactions:PartA / v.9A no.4 / pp.399-404 / 2002
  • With the evolution of information and communication technologies, manufacturing embedded systems such as PDAs (personal digital assistants), HPCs (hand-held PCs), set-top boxes, and information appliances has become realistic. RTOSs (real-time operating systems) and file systems have also played essential roles within these embedded systems. For embedded file systems, flash memory has been used extensively instead of traditional hard disk drives because of embedded-system requirements such as portability, fast access time, and low power consumption. Beyond these requirements, the nonvolatile storage characteristic of flash memory is another reason for its wide adoption in industry. However, using flash memory as an indispensable component of embedded systems poses technical challenges, namely its relatively slow cleaning time and the limited number of write-and-clean cycles. In this paper, a new cleaning policy is proposed to overcome these problems, and performance comparison results are provided. The ranking cleaning policy (RCP) decides when and where to clean within the flash memory by considering the cost of cleaning and the number of cleaning operations. This method aims to maximize not only the lifetime of flash memory but also access-time performance and manageability. In the performance comparison, RCP showed about 10% to 50% improvement in write throughput over the traditional Greedy and Cost-benefit policies.
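
The abstract above does not give RCP's ranking rule, so the following sketch covers only the two baseline policies it is compared against, in forms commonly used for log-structured and flash storage: Greedy selects the segment with the least valid data, and Cost-benefit selects the segment with the highest `age * (1 - u) / (2u)` ratio, where `u` is the fraction of valid data. The names `Segment`, `pick_greedy`, and `pick_cost_benefit` are hypothetical, and the exact formulas used in the paper's experiments may differ.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    utilization: float  # fraction of the segment still holding valid data (0..1)
    age: float          # time since the segment was last modified


def pick_greedy(segments):
    """Greedy: clean the segment with the least valid data to copy out."""
    return min(segments, key=lambda s: s.utilization)


def cost_benefit_score(seg):
    """Benefit/cost ratio age * (1 - u) / (2u); an empty segment is free to clean."""
    if seg.utilization == 0:
        return float("inf")
    return seg.age * (1 - seg.utilization) / (2 * seg.utilization)


def pick_cost_benefit(segments):
    """Cost-benefit: clean the segment with the highest benefit/cost ratio."""
    return max(segments, key=cost_benefit_score)
```

Given a recently written segment at 20% utilization and a much older segment at 40% utilization, Greedy picks the former while Cost-benefit can prefer the latter, because old data is less likely to be invalidated soon. A policy that also weighs erase counts, as the abstract describes for RCP, would add a wear term on top of such a score.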

Analysis on the Effectiveness of the Filter Buffer for Low Power NAND Flash Memory (저전력 NAND 플래시 메모리를 위한 필터 버퍼의 효율성 분석)

  • Jung, Bo-Sung;Lee, Jung-Hoon
    • IEMEK Journal of Embedded Systems and Applications / v.7 no.4 / pp.201-207 / 2012
  • NAND flash memory is widely used in consumer storage devices due to its non-volatility, stability, economic feasibility, low power usage, durability, and high density. However, high-capacity NAND flash memory suffers from high power consumption and low performance. In conventional memory research, a hierarchical filter mechanism can achieve an effective improvement in power consumption. To find the best filter structure for NAND flash memory, we compared a direct-mapped filter, a victim filter, a fully associative filter, and a 4-way set-associative filter in a performance analysis. According to the simulation results, the fully associative filter buffer with a 128-byte fetch size achieves the best performance among the filter structures, and it reduces the energy-delay product (EDP) by about 93% compared with conventional NAND flash memory.
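
As a rough illustration of the best-performing configuration named in the abstract above, the sketch below models a small fully associative buffer that fetches 128-byte blocks and counts hits and misses. The replacement policy (LRU), the number of entries, and the names `FullyAssociativeFilter` and `energy_delay_product` are assumptions made for the example; the abstract specifies only the associativity and the fetch size.

```python
from collections import OrderedDict


class FullyAssociativeFilter:
    """Toy fully associative filter buffer with LRU replacement (assumed policy)."""

    def __init__(self, num_entries=8, fetch_size=128):
        self.num_entries = num_entries
        self.fetch_size = fetch_size  # bytes brought in per miss
        self.blocks = OrderedDict()   # block number -> True, least recently used first
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Return True on a hit; on a miss, fetch the block containing the address."""
        block = address // self.fetch_size
        if block in self.blocks:
            self.blocks.move_to_end(block)   # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(self.blocks) >= self.num_entries:
            self.blocks.popitem(last=False)  # evict the least recently used block
        self.blocks[block] = True
        return False


def energy_delay_product(energy, delay):
    """EDP is the product of the energy consumed and the execution delay."""
    return energy * delay
```

Because any block can occupy any entry, the buffer has no conflict misses; every hit avoids an access to the slower, more power-hungry NAND array, which is how a filter of this kind can lower both the energy and the delay terms of the EDP.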

High Performance PCM&DRAM Hybrid Memory System (고성능 PCM&DRAM 하이브리드 메모리 시스템)

  • Jung, Bo-Sung;Lee, Jung-Hoon
    • IEMEK Journal of Embedded Systems and Applications / v.11 no.2 / pp.117-123 / 2016
  • In general, PCM (phase-change memory) is unsuitable as main memory because of its high read/write latency and low endurance. However, a hybrid memory that places DRAM and PCM at the same level is one of the effective structures for next-generation main memory because it can exploit the advantages of both DRAM and PCM. Such a structure needs an effective page management method that exploits the characteristics of each memory dynamically and adaptively. We therefore aim to reduce the access time and write count of PCM through effective page replacement. According to our simulation, the proposed algorithm for the DRAM&PCM hybrid reduces the PCM access count by around 60% and the PCM write count by 42% for the same PCM size, compared with the Clock-DWF algorithm.

Multi-mode Embedded Compression Algorithm and Architecture for Code-block Memory Size and Bandwidth Reduction in JPEG2000 System (JPEG2000 시스템의 코드블록 메모리 크기 및 대역폭 감소를 위한 Multi-mode Embedded Compression 알고리즘 및 구조)

  • Son, Chang-Hoon;Park, Seong-Mo;Kim, Young-Min
    • Journal of the Institute of Electronics Engineers of Korea SD / v.46 no.8 / pp.41-52 / 2009
  • In Motion JPEG2000 encoding, the huge bandwidth required for data memory access is the bottleneck for the required system performance. To alleviate this bandwidth requirement, we devise a new embedded compression (EC) algorithm that incurs only a small drop in image quality. For both random accessibility and low latency, a very simple and efficient entropy coding algorithm is proposed. Through the proposed multi-mode algorithms, we achieve significant memory bandwidth reductions (about 53% to 81%) and reduce the code-block memory to about half its size, without requiring any modification to the JPEG2000 standard algorithm.