• Title/Abstract/Keywords: Memory Latency

Search results: 361 items, processing time: 0.022 seconds

MBS-LVM: A High-Performance Logical Volume Manager for Memory Bus-Connected Storages over NUMA Servers

  • Lee, Yongseob;Park, Sungyong
    • Journal of Information Processing Systems
    • /
    • Vol. 15, No. 1
    • /
    • pp.151-158
    • /
    • 2019
  • With the recent advances of memory technologies, high-performance non-volatile memories such as non-volatile dual in-line memory module (NVDIMM) have begun to be used as an addition or an alternative to server-side storages. When these memory bus-connected storages (MBSs) are installed over non-uniform memory access (NUMA) servers, the distance between NUMA nodes and MBSs is one of the crucial factors that influence file processing performance, because the access latency of a NUMA system varies depending on its distance from the NUMA nodes. This paper presents the design and implementation of a high-performance logical volume manager for MBSs, called MBS-LVM, when multiple MBSs are scattered over a NUMA server. The MBS-LVM consolidates the address space of each MBS into a single global address space and dynamically utilizes storage spaces such that each thread can access an MBS with the lowest latency possible. We implemented the MBS-LVM in the Linux kernel and evaluated its performance by porting it over the tmpfs, a memory-based file system widely used in Linux. The results of the benchmarking show that the write performance of the tmpfs using MBS-LVM has been improved by up to twenty times against the original tmpfs over a NUMA server with four nodes.
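
The placement idea in this abstract, steering each thread to the MBS with the lowest access latency, can be illustrated with a minimal sketch. It is not the MBS-LVM implementation: the `Mbs` class, `pick_mbs`, and the distance table are hypothetical, and NUMA distance is assumed to stand in for access latency.

```python
from dataclasses import dataclass


@dataclass
class Mbs:
    """Hypothetical memory bus-connected storage attached to one NUMA node."""
    name: str
    node: int
    free_pages: int


def pick_mbs(thread_node, devices, distance, pages_needed=1):
    """Return the device with enough free space that is closest to thread_node.

    distance[a][b] is an assumed NUMA distance table (lower means lower latency).
    Returns None when no device can satisfy the request.
    """
    candidates = [d for d in devices if d.free_pages >= pages_needed]
    if not candidates:
        return None
    return min(candidates, key=lambda d: distance[thread_node][d.node])


# Assumed 4-node distance table, in the style printed by `numactl --hardware`.
DISTANCE = [
    [10, 21, 31, 41],
    [21, 10, 21, 31],
    [31, 21, 10, 21],
    [41, 31, 21, 10],
]

# mbs0 is local to node 0 but full, so a thread on node 0 falls back to mbs1.
devices = [Mbs("mbs0", 0, 0), Mbs("mbs1", 1, 128), Mbs("mbs3", 3, 128)]
chosen = pick_mbs(0, devices, DISTANCE)
print(chosen.name)
```

The fallback to the next-nearest device when the local one is full is one plausible reading of "dynamically utilizes storage spaces"; the paper may use a different policy.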

Efficient Utilization of Burst Data Transfers of DMA

  • 이종원;조두산;백윤흥
    • 대한임베디드공학회논문지
    • /
    • Vol. 8, No. 5
    • /
    • pp.255-264
    • /
    • 2013
  • Reducing memory access latency is one of the most important problems in modern embedded system design. Many recent studies have proposed ways to reduce or hide this latency. Burst/page data transfer modes are representative hardware techniques for this purpose. The burst data transfer capability reduces average access time by more than 65 percent for an eight-word sequential transfer. However, solutions that exploit burst data transfers to improve memory performance have not yet reached a commercial level. This paper therefore presents a new technique that maximizes the use of burst transfers for memory accesses to local variables in code by reorganizing variable placement.

Predictive Memory Allocation over Skewed Streams

  • Yun, Hong-Won
    • Journal of information and communication convergence engineering
    • /
    • Vol. 7, No. 2
    • /
    • pp.199-202
    • /
    • 2009
  • Adaptive memory management is a serious issue in data stream management. Data streams differ from the traditional stored relational model in several respects: they arrive online, are high in volume, and have skewed data distributions. Data skew is a common property of massive data streams. We propose a predictive allocation strategy, which uses predictive processing to cope with time-varying data skew. This processing includes memory usage estimation and timestamp-based indexing. Our experimental study shows that the predictive strategy reduces both the required memory space and the latency for data whose skew varies over time.

Dual-Port SDRAM Optimization with Semaphore Authority Management Controller

  • Kim, Jae-Hwan;Chong, Jong-Wha
    • ETRI Journal
    • /
    • Vol. 32, No. 1
    • /
    • pp.84-92
    • /
    • 2010
  • This paper proposes the semaphore authority management (SAM) controller to optimize the dual-port SDRAM (DPSDRAM) in the mobile multimedia systems. Recently, the DPSDRAM with a shared bank enabling the exchange of data between two processors at high speed has been developed for mobile multimedia systems based on dual-processors. However, the latency of DPSDRAM caused by the semaphore for preventing the access contention at the shared bank slows down the data transfer rate and reduces the memory bandwidth. The methodology of SAM increases the data transfer rate by minimizing the semaphore latency. The SAM prevents the latency of reading the semaphore register of DPSDRAM, and reduces the latency of waiting for the authority of the shared bank to be changed. It also reduces the number of authority requests and the number of times authority changes. The experimental results using a 1 Gb DPSDRAM (OneDRAM) with the SAM controllers at 66 MHz show 1.6 times improvement of the data transfer rate between two processors compared with the traditional controller. In addition, the SAM shows bandwidth enhancement of up to 38% for port A and 31% for port B compared with the traditional controller.

A New Embedded Compression Algorithm for Memory Size and Bandwidth Reduction in Wavelet Transform Appliable to JPEG2000

  • 손창훈;송성근;김지원;박성모;김명민
    • 한국멀티미디어학회논문지
    • /
    • Vol. 14, No. 1
    • /
    • pp.94-102
    • /
    • 2011
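  • To reduce the memory size and bandwidth required by a JPEG2000 system, this paper proposes a new embedded compression (EC) algorithm that incurs a slight loss of image quality. In addition, to guarantee random accessibility and low latency for the compressed data in memory, it proposes a very simple yet efficient coding scheme based on the Hadamard transform. Without changing the algorithms of the JPEG2000 standard, the proposed EC algorithm reduces the size of the LL temporary memory and the code-block memory by about a factor of 2 and reduces memory bandwidth by about 52~73%.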
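
As background for the Hadamard transform-based coding mentioned in this abstract, here is a minimal sketch of a 4-point Hadamard transform and its inverse. The block size of four and the function names are assumptions for illustration; the paper's actual coding scheme (quantization, bit allocation) is not described in the abstract and is not reproduced here.

```python
def hadamard4(x):
    """Forward 4-point Hadamard transform using only additions and subtractions."""
    a, b, c, d = x
    s0, s1 = a + b, a - b
    s2, s3 = c + d, c - d
    return [s0 + s2, s1 + s3, s0 - s2, s1 - s3]


def inverse_hadamard4(y):
    """Inverse transform: the 4-point Hadamard matrix is its own inverse up to a factor of 4."""
    return [v // 4 for v in hadamard4(y)]


pixels = [12, 14, 13, 15]  # assumed sample values
coeffs = hadamard4(pixels)
print(coeffs)                     # [54, -4, -2, 0]
print(inverse_hadamard4(coeffs))  # [12, 14, 13, 15]
```

Because each fixed-size block is transformed independently and needs no multiplications, a scheme of this kind can decode any block without touching its neighbours, which fits the random-access and low-latency goals stated in the abstract.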

Activation of Adenosine A2A Receptor Impairs Memory Acquisition but not Consolidation or Retrieval Phases

  • Kim, Dong-Hyun;Ryu, Jong-Hoon
    • Biomolecules & Therapeutics
    • /
    • Vol. 16, No. 4
    • /
    • pp.320-327
    • /
    • 2008
  • Several lines of evidence indicate that adenosine $A_{2A}$ agonists disrupt spatial working memory. However, it is unclear which stages of learning and memory are affected by stimulation of the adenosine $A_{2A}$ receptor. To clarify this point, we employed CV-1808 as an adenosine $A_{2A}$ agonist and investigated its effects on the acquisition, consolidation, and retrieval phases of learning and memory using the passive avoidance and Morris water maze tasks. During the acquisition phase, CV-1808 (2-phenylaminoadenosine, 1 and 2 mg/kg, i.p.) decreased the latency time in the passive avoidance task and the mean savings in the Morris water maze task, respectively. During the consolidation and retrieval phase tests, CV-1808 did not exhibit any effect on latency time in the passive avoidance task or on the mean savings in the Morris water maze task. These results suggest that CV-1808, as an adenosine $A_{2A}$ agonist, impairs memory acquisition but not consolidation or retrieval.

Page Replacement Algorithm for Improving Performance of Hybrid Main Memory

  • 이민호;강동현;김정훈;엄영익
    • 정보과학회 컴퓨팅의 실제 논문지
    • /
    • Vol. 21, No. 1
    • /
    • pp.88-93
    • /
    • 2015
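  • DRAM is mainly used as main memory in computer systems because of its fast read/write speed and unlimited write endurance, but it needs a continuous power supply to retain stored data. PCM, in contrast, is a non-volatile memory that retains stored data without power and, like DRAM, supports byte-level access and in-place overwriting, so it is drawing attention as a memory that could replace DRAM. However, PCM is difficult to use as main memory because of its slow read/write speed and limited write endurance. For this reason, hybrid main memory that exploits the advantages of both DRAM and PCM has been proposed and is being actively studied. This paper proposes a new page replacement algorithm for hybrid main memory composed of DRAM and PCM. To compensate for the weaknesses of PCM, the proposed algorithm aims to reduce the number of PCM writes; the experimental results show that it reduces the number of PCM writes by 80.5% compared with other page replacement algorithms.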

Experimental Study on the Influence of the Function of Spleen on Learning and Memory

  • 박찬원;이진우;채한;홍무창;신민규
    • 대한한의학회지
    • /
    • Vol. 20, No. 4
    • /
    • pp.39-49
    • /
    • 2000
  • This study was conducted to examine whether a relation exists between the Spleen and learning and memory, as Oriental medicine holds. To promote the function of the Spleen, Guibitang was administered to rats. The rats were 250~300 g Sprague-Dawley rats divided into three groups. The first was the normal group, which received no pretreatment. The second was the control group, which received normal saline and an abdominal injection of L-NAME before the learning and memory tests. The third was the sample group, which received Guibitang extract and an abdominal injection of L-NAME before the learning and memory tests. Each group consisted of 12 rats. The Morris water maze and radial arm maze tasks were used in the learning test, and the Morris water maze task in the memory test. To evaluate learning ability in the Morris water maze, 16 trials were carried out over 2 days and the first latency (the time taken to find the escape platform for the first time) was measured. The next day, to evaluate memory, the escape platform was removed from the maze, and total path, target entry number, first latency, and memory score were measured. Bait was withheld from each group for 48 hours before the radial arm maze task and was permitted after the learning test, so that 85% of body weight was maintained over the 6 days of the test. Each of the eight arms was baited; the correct choice number and errors were counted; each trial ended when the rat had entered each of the eight arms or more than 10 minutes had elapsed.
The results were as follows. In the learning test, the first latency of the sample group in the Morris water maze showed evident improvement in learning compared with the control group at the 11th, 12th, and 13th of the 16 trials, and the correct choice number in the radial arm maze showed noticeable improvement compared with the control group at the 3rd, 4th, and 5th. In the memory test, the memory score of the sample group showed evident improvement compared with the control group. These results indicate that administering Guibitang, which tonifies the function of the Spleen, could enhance learning and memory, suggesting that the Spleen is related to learning and memory.

Garbage Collection Synchronization Technique for Improving Tail Latency of Cloud Databases

  • 한승욱;한상욱;김지홍
    • 정보과학회 논문지
    • /
    • Vol. 44, No. 8
    • /
    • pp.767-773
    • /
    • 2017
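  • In distributed system environments such as cloud databases, keeping tail latency short is important for guaranteeing uniform quality of service. Targeting the Cassandra database, this paper shows that the delays behind long tail latency are caused by a shortage of memory space, and that they occur because Cassandra does not accept user requests until the data held in the buffer has been completely flushed to the storage device to free memory. Because the time needed to flush the buffered data is determined by storage performance, we observed that performance degradation caused by SSD garbage collection makes tail latency even longer. We propose SyncGC, a technique that hides the cost of SSD garbage collection by performing garbage collection in the Java virtual machine and garbage collection in the SSD together. In experiments, SyncGC reduced tail latency at the $99.9^{th}$$99.9^{th}-percentile$ by 31% and 36%, respectively.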

Analysis of read speed latency in 6T-SRAM cell using multi-layered graphene nanoribbon and cu based nano-interconnects for high performance memory circuit design

  • Sandip, Bhattacharya;Mohammed Imran Hussain;John Ajayan;Shubham Tayal;Louis Maria Irudaya Leo Joseph;Sreedhar Kollem;Usha Desai;Syed Musthak Ahmed;Ravichander Janapati
    • ETRI Journal
    • /
    • Vol. 45, No. 5
    • /
    • pp.910-921
    • /
    • 2023
  • In this study, we designed a 6T-SRAM cell using a 16-nm CMOS process and analyzed its performance in terms of read-speed latency. Temperature-dependent Cu and multilayered graphene nanoribbon (MLGNR)-based nano-interconnect materials are used throughout the circuit (primarily bit/bit-bars [red lines] and word lines [write lines]). The read speed analysis is performed at four chip operating temperatures (150K, 250K, 350K, and 450K) using both Cu and graphene nanoribbon (GNR) nano-interconnects with different interconnect lengths (from 10 ㎛ to 100 ㎛), for reading-0 and reading-1 operations. To execute the reading operation, the CMOS technology, that is, the 16-nm PTM-HPC model, and the 16-nm interconnect technology, that is, ITRS-13, are used. The complete design is simulated using TSPICE simulation tools (by Mentor Graphics). The read speed latency increases rapidly as interconnect length increases for both Cu and GNR interconnects. However, the Cu interconnect has three to six times more latency than the GNR. In addition, we observe that the reading speed latency for the GNR interconnect is ~10.29 ns across wide temperature variations (150K to 450K), whereas the reading speed latency for the Cu interconnect varies between ~32 ns and 65 ns over the same temperature range. This analysis is useful for the design of next-generation, high-speed memories using different nano-interconnect materials.