• Title/Summary/Keyword: Uniform Memory Access

Search Result 31, Processing Time 0.03 seconds

A Block Allocation Policy to Enhance Wear-leveling in a Flash File System (플래시 파일시스템에서 wear-leveling 개선을 위한 블록 할당 정책)

  • Jang, Si-Woong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2007.10a / pp.574-577 / 2007
  • Whereas a disk can overwrite data in place, flash memory cannot, so updated data are written to a new area. If data are updated frequently, garbage collection, which works by erasing blocks, must be performed to reclaim space. Because the number of erase operations is limited by the characteristics of flash memory, every block should be written and erased evenly. However, when data with access locality are processed by a cost-benefit algorithm that separates hot blocks from cold blocks, processing performance is high but wear is uneven. In this paper, we propose the CB-MB (Cost Benefit between Multi Bank) algorithm, in which hot data are allocated to one bank and cold data to another, and the roles of the hot bank and the cold bank are exchanged every period. CB-MB performed similarly to the other methods on a uniform workload and much better than the others on a workload with access locality.

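The bank role exchange described in the abstract above can be illustrated with a toy model. This is a minimal sketch, not the authors' CB-MB implementation: the class `TwoBankAllocator`, the `period` parameter (counted in writes), and the synthetic workload in `simulate` are all hypothetical, and the cost-benefit victim selection itself is not modeled. Per-bank write counts stand in for wear.

```python
class TwoBankAllocator:
    """Toy model: send hot and cold writes to separate banks and swap roles periodically."""

    def __init__(self, period):
        self.period = period        # writes between role swaps (hypothetical parameter)
        self.hot_bank = 0           # index of the bank currently receiving hot data
        self.writes = 0             # total writes seen so far
        self.bank_writes = [0, 0]   # per-bank write counters, used as a proxy for wear

    def allocate(self, is_hot):
        """Return the bank index that receives this write."""
        bank = self.hot_bank if is_hot else 1 - self.hot_bank
        self.bank_writes[bank] += 1
        self.writes += 1
        if self.writes % self.period == 0:
            self.hot_bank = 1 - self.hot_bank   # exchange the hot and cold roles
        return bank


def simulate(period, total_writes, cold_every=10):
    """Issue a skewed stream: every `cold_every`-th write is cold, the rest are hot."""
    alloc = TwoBankAllocator(period)
    for i in range(total_writes):
        alloc.allocate(is_hot=(i % cold_every != 0))
    return alloc.bank_writes


if __name__ == "__main__":
    print("with role swap   :", simulate(period=100, total_writes=1000))
    print("without role swap:", simulate(period=10**9, total_writes=1000))
```

In this toy workload, swapping roles spreads the hot writes over both banks, whereas a fixed assignment concentrates them in one bank.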

A Development of Fusion Processor Architecture for Efficient Main Memory Access in CPU-GPU Environment (CPU-GPU환경에서 효율적인 메인메모리 접근을 위한 융합 프로세서 구조 개발)

  • Park, Hyun-Moon;Kwon, Jin-San;Hwang, Tae-Ho;Kim, Dong-Sun
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.11 no.2 / pp.151-158 / 2016
  • The HSA resolves a long-standing problem with existing CPU and GPU architectures by allowing both units to directly access each other's memory pools via unified virtual memory. In a physically realized system, however, frequent data exchanges between the CPU and GPU for a virtual memory block result in bottlenecks and coherence request overheads. In this paper, we propose a Fusion Processor Architecture for efficient main memory access from both the CPU and GPU. It consists of a Job Manager, a Re-mapper, and a Pre-fetcher, which control, organize, and distribute workloads and working areas for the GPU cores. These components help reduce memory exchanges between the two processors and improve overall efficiency by eliminating faulty page table requests. To verify the proposed architecture, we develop an emulator based on QEMU and compare it with several architectures such as CUDA (Compute Unified Device Architecture), OpenMP, and OpenCL. As a result, the proposed fusion processor architecture is 198% faster than the others because it removes unnecessary memory copies and cache-miss overheads.

Metastable Vortex State of Perpendicular Magnetic Anisotropy Free Layer in Spin Transfer Torque Magnetic Tunneling Junctions

  • You, Chun-Yeol;Kim, Hyungsuk
    • Journal of Magnetics / v.18 no.4 / pp.380-385 / 2013
  • We find a metastable vortex state of the perpendicular magnetic anisotropy free layer in spin transfer torque magnetic tunneling junctions by using micromagnetic simulations. The metastable vortex state does not exist in a single layer, and it is only found in the trilayer structure with the perpendicular magnetic anisotropy polarizer layer. It is revealed that the physical origin is the non-uniform stray field from the polarizer layer.

Random-Oriented (Bi,La)4Ti3O12 Thin Film Deposited by Pulsed-DC Sputtering Method on Ferroelectric Random Access Memory Device

  • Lee, Youn-Ki;Ryu, Sung-Lim;Kweon, Soon-Yong;Yeom, Seung-Jin;Kang, Hee-Bok
    • Transactions on Electrical and Electronic Materials / v.12 no.6 / pp.258-261 / 2011
  • A ferroelectric $(Bi,La)_4Ti_3O_{12}$ (BLT) thin film fabricated by the pulsed-DC sputtering method was evaluated on a cell structure to check its compatibility with high-density ferroelectric random access memory (FeRAM) devices. The BLT composition of the sputtering target was $Bi_{4.8}La_{1.0}Ti_{3.0}O_{12}$. First, a BLT film was deposited on a buried Pt/$IrO_x$/Ir bottom electrode stack with a W-plug connected to the transistor underneath. The film was then crystallized at $700^{\circ}C$ for 30 seconds in an oxygen ambient. The annealed BLT layer was found to have randomly oriented, small ellipsoidal grains (long direction: ~100 nm, short direction: ~20 nm). The small, uniformly sized grains with random orientations were considered suitable for high-density FeRAM devices.

Memory Affinity based Load Balancing Model for NUMA System (NUMA 환경에서 메모리 친화력을 고려한 부하 균등 모델)

  • Youn, Dae-Seok;Park, Hee-Kwon;Choi, Jong-Moo
    • Proceedings of the Korean Information Science Society Conference / 2008.06b / pp.346-350 / 2008
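  • Interest in NUMA (Non-Uniform Memory Access) environments has grown recently as multiprocessors based on the HyperTransport technology used by AMD have shown good performance. In this paper, we propose a load balancing model for NUMA systems. In a multiprocessor system, the operating system has load balancing techniques that move work from heavily loaded processors to lightly loaded ones. Much of the research on such techniques depends on the number of tasks each processor holds. This study presents a model of a load balancing technique that reflects the fact that memory access cost in a NUMA system differs by location. To this end, we build a simulation environment and validate the model through experiments on specific scenarios.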

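The abstract above contrasts balancing by task count with balancing that accounts for location-dependent memory cost. The following is a minimal sketch of that idea, not the model from the paper: the function names, the `(name, pages_per_node)` task representation, and the distance matrix are all assumptions made for illustration. It still triggers on task-count imbalance, but chooses which task to move by how little its memory access cost would rise on the destination node.

```python
def memory_cost(pages_per_node, run_node, distance):
    """Cost of running on `run_node`: pages held on each node, weighted by node distance."""
    return sum(n_pages * distance[run_node][node]
               for node, n_pages in enumerate(pages_per_node))


def pick_migration(run_queues, distance):
    """Pick (task_name, src, dst) to move from the busiest node to the idlest node.

    Each task is a (name, pages_per_node) tuple (hypothetical representation).
    Returns None when the queue lengths differ by less than two tasks.
    """
    nodes = range(len(run_queues))
    src = max(nodes, key=lambda n: len(run_queues[n]))
    dst = min(nodes, key=lambda n: len(run_queues[n]))
    if len(run_queues[src]) - len(run_queues[dst]) < 2:
        return None

    def penalty(task):
        _, pages = task
        # Increase in memory cost if the task runs on dst instead of src.
        return memory_cost(pages, dst, distance) - memory_cost(pages, src, distance)

    name, _ = min(run_queues[src], key=penalty)
    return name, src, dst


if __name__ == "__main__":
    distance = [[10, 20], [20, 10]]   # assumed relative access costs between two nodes
    queues = [[("a", [100, 0]), ("b", [10, 90]), ("c", [50, 50])], []]
    print(pick_migration(queues, distance))
```

Here task `b` is selected because most of its pages already reside on the destination node, so moving it lowers its memory cost rather than raising it.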

Cache Coherence Protocols in NUMA Multiprocessors (NUMA 다중 프로세서에서의 캐쉬 일관성 프로토콜)

  • Moh, Sang-Man;Hahn, Woo-Jong;Yoon, Suk-Han
    • Electronics and Telecommunications Trends / v.13 no.5 s.53 / pp.11-22 / 1998
  • Scalable multiprocessor systems based on the distributed shared memory (DSM) architecture are being actively developed for general-purpose computing to improve both programmability and scalability. In this paper, we survey and analyze cache coherence protocols in non-uniform memory access (NUMA) multiprocessor systems. In particular, the analysis suggests that specialized hardware suited to NUMA multiprocessor systems built from commodity symmetric multiprocessors (SMPs) is strongly needed. A cache coherence protocol combined with specialized hardware can significantly improve the performance and scalability of NUMA multiprocessor systems while providing better programmability.

A Global IPv6 Unicast Address Lookup Scheme Using Variable Multiple Hashing (가변적인 복수 해슁을 이용한 글로벌 IPv6 유니캐스트 주소 검색 구조)

  • Park Hyun-Tae;Moon Byung-In;Kang Sung-Ho
    • The Journal of Korean Institute of Communications and Information Sciences / v.31 no.5B / pp.378-389 / 2006
  • IP address lookup has become an increasingly critical issue for high-speed networking because of the advent of IPv6, which uses 128-bit addresses. In this paper, a novel global IPv6 unicast address lookup scheme is proposed for next-generation Internet routers. The proposed scheme performs variable multiple hashing based on prefix grouping. Accordingly, it not only minimizes overflows with a proper number of memory modules but also reduces the memory size required to organize forwarding tables. It builds and searches forwarding tables quickly, completing a search in only a single memory access. In addition, forwarding tables are easy to update incrementally. In a simulation using CERNET routing data from the 6bone test phase, we compared the proposed scheme with a similar scheme that uses uniform multiple hashing. As a result, we verified that with 8 tables the number of overflows is reduced by 50% and the memory size for forwarding tables is reduced by 15%.
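For readers unfamiliar with multiple hashing, the following sketch shows the general technique that the abstract above builds on: each key has one candidate bucket per table, is stored in the least loaded candidate, and spills to an overflow store when all candidates are full. It is an illustration under stated assumptions, not the proposed scheme: the class `MultiHashTable`, the SHA-256 based hash, and the fixed equal table sizes are hypothetical, the variable per-group sizing from the paper is not reproduced, and only exact-match lookup of a given prefix string is covered (each prefix is assumed to be inserted once).

```python
import hashlib


class MultiHashTable:
    """Toy d-table hash structure with fixed-size buckets and an overflow store."""

    def __init__(self, num_tables, buckets_per_table, bucket_size):
        self.tables = [[[] for _ in range(buckets_per_table)]
                       for _ in range(num_tables)]
        self.bucket_size = bucket_size
        self.overflow = {}   # entries that fit in none of their candidate buckets

    def _bucket(self, table_index, prefix):
        """Return the candidate bucket for `prefix` in the given table."""
        digest = hashlib.sha256(bytes([table_index]) + prefix.encode()).digest()
        slot = int.from_bytes(digest[:8], "big") % len(self.tables[table_index])
        return self.tables[table_index][slot]

    def insert(self, prefix, next_hop):
        """Store the entry in the least loaded candidate bucket, else in overflow."""
        candidates = [self._bucket(t, prefix) for t in range(len(self.tables))]
        best = min(candidates, key=len)
        if len(best) < self.bucket_size:
            best.append((prefix, next_hop))
        else:
            self.overflow[prefix] = next_hop

    def lookup(self, prefix):
        """Check every candidate bucket, then the overflow store."""
        for t in range(len(self.tables)):
            for key, hop in self._bucket(t, prefix):
                if key == prefix:
                    return hop
        return self.overflow.get(prefix)


if __name__ == "__main__":
    table = MultiHashTable(num_tables=8, buckets_per_table=64, bucket_size=2)
    table.insert("2001:db8:1::/48", "port-3")
    print(table.lookup("2001:db8:1::/48"), len(table.overflow))
```

In hardware, the candidate buckets would typically reside in separate memory modules and be read in parallel, which is how a lookup can finish in a single memory access time; the sequential loop in `lookup` is only a software stand-in for that.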

Design Error Corrector of Binary Data in Holographic Digital Data Storage System Using Subclustering (Subclustering을 이용한 홀로그래픽 디지털 정보저장 시스템의 이진 데이터 에러 보정기 구현)

  • Kim, Sang-Hoon;Kim, Jang-Hyun;Yang, Hyun-Seok;Park, Young-Pil
    • Proceedings of the KIEE Conference / 2004.11c / pp.617-619 / 2004
  • Data storage, which involves writing and retrieving data, requires high storage capacity, a fast transfer rate, and a short access time. No data storage system today satisfies all of these conditions, but a holographic data storage system can achieve a faster data transfer rate because it is a page-oriented memory system that uses volume holograms to write and retrieve data. A system architecture without mechanical actuating parts is possible, so a fast data transfer rate and a high storage capacity of about $1Tb/cm^3$ can be realized. In this paper, to correct errors in binary data stored in a holographic digital data storage system, we find cluster centers using subtractive clustering and reduce the intensities of pixels around the centers, so that the intensity profile of the data page becomes uniform and a better data storage system can be realized.


Measurement of Barium Ion Displacement Near Surface in a Barium Titanate Nanoparticle by Scanning Transmission Electron Microscopy

  • Aoki, Mai;Sato, Yukio;Teranishi, Ryo;Kaneko, Kenji
    • Applied Microscopy / v.48 no.1 / pp.27-32 / 2018
  • Barium titanate ($BaTiO_3$) nanoparticles are among the most promising materials for future multi-layer ceramic capacitors and ferroelectric random access memory. It is well known that the electrical properties of nanoparticles depend on their atomistic structure. Although the surface may affect the atomistic structure, the reconstructed structure at the surface has not been widely investigated. In the present study, Ba-ion positions near the surface of a $BaTiO_3$ nanoparticle were quantitatively characterized by scanning transmission electron microscopy. It was found that some Ba ions at the surface were greatly displaced in non-uniform directions.

Design and Comparison of Error Correctors Using Clustering in Holographic Data Storage System

  • Kim, Sang-Hoon;Kim, Jang-Hyun;Yang, Hyun-Seok;Park, Young-Pil
    • Proceedings of the Institute of Control, Robotics and Systems Conference (제어로봇시스템학회:학술대회논문집) / 2005.06a / pp.1076-1079 / 2005
  • Data storage, which involves writing and retrieving data, requires high storage capacity, a fast transfer rate, and a short access time. No data storage system today satisfies all of these conditions, but a holographic data storage system can achieve a faster data transfer rate because it is a page-oriented memory system that uses volume holograms to write and retrieve data. A system architecture without mechanical actuating parts is possible, so a fast data transfer rate and a high storage capacity of about $1Tb/cm^3$ can be realized. In this paper, to correct errors in binary data stored in a holographic digital data storage system, we find cluster centers using a clustering algorithm and reduce the intensities of pixels around the centers. We achieve this with two algorithms, C-means and subtractive clustering, and compare their results. With a proper clustering algorithm, the intensity profile of the data page becomes uniform and a better data storage system can be realized.
