Search | Korea Science

Considering Read and Write Characteristics of Page Access Separately for Efficient Memory Management

Hyokyung Bahn
- International journal of advanced smart convergence
- /
- v.12 no.1
- /
- pp.70-75
- /
- 2023
With the recent proliferation of memory-intensive workloads such as deep learning, analyzing memory access characteristics for efficient memory management is becoming increasingly important. Since read and write operations in memory access have different characteristics, an efficient memory management policy should take into accountthe characteristics of thesetwo operationsseparately. Although some previous studies have considered the different characteristics of reads and writes, they require a modified hardware architecture supporting read bits and write bits. Unlike previous approaches, we propose a software-based management policy under the existing memory architecture for considering read/write characteristics. The proposed policy logically partitions memory space into the read/write area and the write area by making use of reference bits and dirty bits provided in modern paging systems. Simulation experiments with memory access traces show that our approach performs better than the CLOCK algorithm by 23% on average, and the effect is similar to the previous policy with hardware support.
https://doi.org/10.7236/IJASC.2023.12.1.70 인용 PDF

Efficient Hybrid Transactional Memory Scheme using Near-optimal Retry Computation and Sophisticated Memory Management in Multi-core Environment

Jang, Yeon-Woo;Kang, Moon-Hwan;Chang, Jae-Woo
- Journal of Information Processing Systems
- /
- v.14 no.2
- /
- pp.499-509
- /
- 2018
Recently, hybrid transactional memory (HyTM) has gained much interest from researchers because it combines the advantages of hardware transactional memory (HTM) and software transactional memory (STM). To provide the concurrency control of transactions, the existing HyTM-based studies use a bloom filter. However, they fail to overcome the typical false positive errors of a bloom filter. Though the existing studies use a global lock, the efficiency of global lock-based memory allocation is significantly low in multi-core environment. In this paper, we propose an efficient hybrid transactional memory scheme using near-optimal retry computation and sophisticated memory management in order to efficiently process transactions in multi-core environment. First, we propose a near-optimal retry computation algorithm that provides an efficient HTM configuration using machine learning algorithms, according to the characteristic of a given workload. Second, we provide an efficient concurrency control for transactions in different environments by using a sophisticated bloom filter. Third, we propose a memory management scheme being optimized for the CPU cache line, in order to provide a fast transaction processing. Finally, it is shown from our performance evaluation that our HyTM scheme achieves up to 2.5 times better performance by using the Stanford transactional applications for multi-processing (STAMP) benchmarks than the state-of-the-art algorithms.
https://doi.org/10.3745/JIPS.01.0026 인용 PDF KSCI

Memory-Efficient Hypercube Key Establishment Scheme for Micro-Sensor Networks

Lhee, Kyung-Suk
- ETRI Journal
- /
- v.30 no.3
- /
- pp.483-485
- /
- 2008
A micro-sensor network is comprised of a large number of small sensors with limited memory capacity. Current key-establishment schemes for symmetric encryption require too much memory for micro-sensor networks on a large scale. In this paper, we propose a memory-efficient hypercube key establishment scheme that only requires logarithmic memory overhead.
PDF

Implementation of Integrated CPU-GPU for Efficient Uniform Memory Access Method and Verification System (CPU-GPU간 긴밀성을 위한 효율적인 공유메모리 접근 방법과 검증 시스템 구현)

Park, Hyun-moon;Kwon, Jinsan;Hwang, Tae-ho;Kim, Dong-Sun
- IEMEK Journal of Embedded Systems and Applications
- /
- v.11 no.2
- /
- pp.57-65
- /
- 2016
In this paper, we propose a system for efficient use of shared memory between CPU and GPU. The system, called Fusion Architecture, assures consistency of the shared memory and minimizes cache misses that frequently occurs on Heterogeneous System Architecture or Unified Virtual Memory based systems. It also maximizes the performance for memory intensive jobs by efficient allocation of GPU cores. To test between architectures on various scenarios, we introduce the Fusion Architecture Analyzer, which compares OpenMP, OpenCL, CUDA, and the proposed architecture in terms of memory overhead and process time. As a result, Proposed fusion architectures show that the Fusion Architecture runs benchmarks 55% faster and reduces memory overheads by 220% in average.
https://doi.org/10.14372/IEMEK.2016.11.2.57 인용 PDF KSCI

A Memory-efficient Hand Segmentation Architecture for Hand Gesture Recognition in Low-power Mobile Devices

Choi, Sungpill;Park, Seongwook;Yoo, Hoi-Jun
- JSTS:Journal of Semiconductor Technology and Science
- /
- v.17 no.3
- /
- pp.473-482
- /
- 2017
Hand gesture recognition is regarded as new Human Computer Interaction (HCI) technologies for the next generation of mobile devices. Previous hand gesture implementation requires a large memory and computation power for hand segmentation, which fails to give real-time interaction with mobile devices to users. Therefore, in this paper, we presents a low latency and memory-efficient hand segmentation architecture for natural hand gesture recognition. To obtain both high memory-efficiency and low latency, we propose a streaming hand contour tracing unit and a fast contour filling unit. As a result, it achieves 7.14 ms latency with only 34.8 KB on-chip memory, which are 1.65 times less latency and 1.68 times less on-chip memory, respectively, compare to the best-in-class.
https://doi.org/10.5573/JSTS.2017.17.3.473 인용 PDF KSCI

A Memory-Efficient VLC Decoder Architecture for MPEG-2 Application

Lee, Seung-Joon;Suh, Ki-bum;Chong, Jong-wha
- Proceedings of the IEEK Conference
- /
- 1999.11a
- /
- pp.360-363
- /
- 1999
Video data compression is a major key technology in the field of multimedia applications. Variable-length coding is the most popular data compression technique which has been used in many data compression standards, such as JPEG, MPEG and image data compression standards, etc. In this paper, we present memory efficient VLC decoder architecture for MPEG-2 application which can achieve small memory space and higher throughput. To reduce the memory size, we propose a new grouping, remainder generation method and merged lookup table (LUT) for variable length decoders (VLD's). In the MPEG-2, the discrete cosine transform (DCT) coefficient table zero and one are mapped onto one memory whose space requirement has been minimized by using efficient memory mapping strategy The proposed memory size is only 256 words in spite of mapping two DCT coefficient tables.
PDF

Characterizing Memory References for Smartphone Applications and Its Implications

Lee, Soyoon;Bahn, Hyokyung
- JSTS:Journal of Semiconductor Technology and Science
- /
- v.15 no.2
- /
- pp.223-231
- /
- 2015
As smartphones support a variety of applications and their memory demand keeps increasing, the design of an efficient memory management policy is becoming increasingly important. Meanwhile, as nonvolatile memory (NVM) technologies such as PCM and STT-MRAM have emerged as new memory media of smartphones, characterizing memory references for NVM-based smartphone memory systems is needed. For the deep understanding of memory access features in smartphones, this paper performs comprehensive analysis of memory references for various smartphone applications. We first analyze the temporal locality and frequency of memory reference behaviors to quantify the effects of the two properties with respect to the re-reference likelihood of pages. We also analyze the skewed popularity of memory references and model it as a Zipf-like distribution. We expect that the result of this study will be a good guidance to design an efficient memory management policy for future smartphones.
https://doi.org/10.5573/JSTS.2015.15.2.223 인용 PDF KSCI

Automated optimization for memory-efficient high-performance deep neural network accelerators

Kim, HyunMi;Lyuh, Chun-Gi;Kwon, Youngsu
- ETRI Journal
- /
- v.42 no.4
- /
- pp.505-517
- /
- 2020
The increasing size and complexity of deep neural networks (DNNs) necessitate the development of efficient high-performance accelerators. An efficient memory structure and operating scheme provide an intuitive solution for high-performance accelerators along with dataflow control. Furthermore, the processing of various neural networks (NNs) requires a flexible memory architecture, programmable control scheme, and automated optimizations. We first propose an efficient architecture with flexibility while operating at a high frequency despite the large memory and PE-array sizes. We then improve the efficiency and usability of our architecture by automating the optimization algorithm. The experimental results show that the architecture increases the data reuse; a diagonal write path improves the performance by 1.44× on average across a wide range of NNs. The automated optimizations significantly enhance the performance from 3.8× to 14.79× and further provide usability. Therefore, automating the optimization as well as designing an efficient architecture is critical to realizing high-performance DNN accelerators.
https://doi.org/10.4218/etrij.2020-0125 인용 PDF KSCI

An efficient Storage Reclamation Algorithm for RISC Parallel Processing (RISC 병렬 처리를 위한 기억공간의 효율적인 활용 알고리즘)

이철원;임인칠
- Journal of the Korean Institute of Telematics and Electronics B
- /
- v.28B no.9
- /
- pp.703-711
- /
- 1991
In this paper, an efficient storage reclamation algorithm for RISC parallel processing in the object orented programming environments is presented. The memory management for the dynamic memory allocation and the frequent memory access in object oriented programming is the main factor that decreases RISC parallel processing performance. The proposed algorithm can be efficiently allocated the memory space of RISCy computer which is required the frequent memory access, so it can be increased RISC parallel processing performance. The proposed algorithm is verified the efficiency by implementing C language on SUN SPARC(4.3 BSD UNIX).
PDF

A Study on Efficient Use of Dual Data Memory Banks in Flight Control Computers

Cho, Doosan
- International Journal of Internet, Broadcasting and Communication
- /
- v.9 no.1
- /
- pp.29-34
- /
- 2017
Over the past several decades, embedded system and flight control computer technologies have been evolved to meet the diverse needs of the mobile device market. Current embedded systems are at the heart of technologies that can take advantage of small-sized specialized hardware while still providing high-efficiency performance at low cost. One of these key technologies is multiple memory banks. For example, a dual memory bank can provide two times more memory bandwidth in the same memory space. This benefit take lower cost to provide the same bandwidth. However, there is still few software technologies to support the efficient use of multiple memory banks. In this study, we present a technique to efficiently exploit multiple memory banks by software support. Specifically, our technique use an interference graph to optimally allocate data to different memory banks by an optimizing compiler. As a result, the execution time can be improved upto 7% with the proposed technique.
https://doi.org/10.7236/IJIBC.2017.9.1.29 인용 PDF

Search Result 1,330, Processing Time 0.026 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)