• Title/Summary/Keyword: Memory access

Search Result 1,138, Processing Time 0.025 seconds

A study of workload consolidation considering NUMA affinity (NUMA affinity를 고려한 Workload Consolidation 연구)

  • Seo, Dongyou;Kim, Shin-gye;Choi, Chanho;Eom, Hyeonsang;Yeom, Heon Y.
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2012.11a
    • /
    • pp.204-206
    • /
    • 2012
  • SMP(Symmetric Multi-Processing)는 Shared memory bus 를 사용함으로써 scalability 가 제한적이었다. 이런 SMP의 scalability 제한을 극복하기 위해 제안 된 것이 NUMA(Non Uniform Memory Access)이다. NUMA는 memory bus 를 CPU 별 local 하게 가지고 있어 자신이 가지는 memory 영역에 대해서는 다른 영역을 접근하는 것 보다 더 빠른 latency 를 가지는 구조이다. Local 한 memory 영역의 존재는 scalability를 높여 주었지만 서버 가상화 환경에서 VM을 동적으로 scheduling 을 하였을 때 VM의 page 가 실행되는 core 의 local 한 메모리 영역에 존재하지 않게 되면 remote access로 인해 local access보다 성능이 떨어진다. 이 논문에서는 서버 가상화 환경에서 최신 architecture인 AMD bulldozer에서 NUMA affinity가 위반되었을 때 발생하는 성능 저하와 어떤 상황에서 이런 NUMA affinity가 위반되어도 성능저하가 없는지 연구하였다.

Efficient Use of On-chip Memory through Profile-Driven Array Reorganization

  • Cho, Doosan;Youn, Jonghee
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.6 no.6
    • /
    • pp.345-359
    • /
    • 2011
  • In high performance embedded systems, the use of multiple on-chip memories is an essential architectural feature for exploiting inherent parallelism in multimedia applications. This feature allows multiple data accesses to be executed in parallel. However, it remains difficult to effectively exploit of multiple on-chip memories. The successful use of this architecture strongly depends on how to efficiently detect and exploit memory parallelism in target applications. In this paper, we propose a technique based on a linear array access descriptor [1], which is generated from profiled data, to detect and exploit memory parallelism. The proposed technique tackles an array reorganization problem to maximize memory parallelism in multimedia applications. We present preliminary experiments applying the proposed technique onto a representative coarse grained reconfigurable array processor (CGRA) with multimedia kernel codes. Our experimental results demonstrate that our technique optimizes data placement by putting independent data on separate storage. The results exhibit 9.8% higher performance on average compared to the existing method.

A method for preventing online games hacking using memory monitoring

  • Lee, Chang Seon;Kim, Huy Kang;Won, Hey Rin;Kim, Kyounggon
    • ETRI Journal
    • /
    • v.43 no.1
    • /
    • pp.141-151
    • /
    • 2021
  • Several methods exist for detecting hacking programs operating within online games. However, a significant amount of computational power is required to detect the illegal access of a hacking program in game clients. In this study, we propose a novel detection method that analyzes the protected memory area and the hacking program's process in real time. Our proposed method is composed of a three-step process: the collection of information from each PC, separation of the collected information according to OS and version, and analysis of the separated memory information. As a result, we successfully detect malicious injected dynamic link libraries in the normal memory space.

A Study on the Analysis and Mitigation of Temporal Access Vulnerability in Processing-In Memory (Processing-In Memory 시간적 접근 취약점 분석 및 완화에 대한 연구)

  • Tae-Wook Kim;Yeongpil Cho
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2024.05a
    • /
    • pp.199-201
    • /
    • 2024
  • 많은 양의 데이터 처리를 요구하는 오늘날, 메모리 입/출력 없이 데이터를 처리할 수 있는 Processing-In Memory가 많은 관심을 받고 있다. Processing-In Memory는 소프트웨어 라이브러리를 통해 접근할 수 있는데, 적절히 구현되지 않은 라이브러리는 공격 대상이 된다. 본 논문에서는 Processing-In Memory 소프트웨어 라이브러리에 존재하는 시간적 접근 취약점을 분석하고 그에 대한 완화기법을 제시한다.

An Empirical Evaluation Analysis of the Performance of In-memory Bigdata Processing Platform (메모리 기반 빅데이터 처리 프레임워크의 성능개선 연구)

  • Lee, Jae hwan;Choi, Jun;Koo, Dong hun
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.21 no.3
    • /
    • pp.13-19
    • /
    • 2016
  • Spark, an in-memory big-data processing framework is popular to use for real-time processing workload. Spark can store all intermediate data in the cluster memory so that Spark can minimize I/O access. However, when the resident memory of workload is larger that the physical memory amount of the cluster, the total performance can drop dramatically. In this paper, we analyse the factors of bottleneck on PageRank Application that needs many memory through experiment, and cluster the Spark with Tachyon File System for using memory to solve the factor of bottleneck and then we improve the performance about 18%.

Implementation of the FAT32 File System using PLC and CF Memory (PLC와 CF 메모리를 이용한 FAT32 파일시스템 구현)

  • Kim, Myeong Kyun;Yang, Oh;Chung, Won Sup
    • Journal of the Semiconductor & Display Technology
    • /
    • v.11 no.2
    • /
    • pp.85-91
    • /
    • 2012
  • In this paper, the large data processing and suitable FAT32 file system for industrial system using a PLC and CF memory was implemented. Most of PLC can't save the large data in user data memory. So it's required to the external devices of CF memory or NAND flash memory. The CF memory is used in order to save the large data of PLC system. The file system using the CF memory is NTFS, FAT, and FAT32 system to configure in various ways. Typically, the file system which is widely used in industrial data storage has been implemented as modified FAT32. The conventional FAT 32 file system was not possible for multiple writing and high speed data accessing. The proposed file system was implemented by the large data processing module can be handled that the files are copied at the 40 bytes for 1msec speed logging and creating 8 files at the same time. In a sudden power failure, high reliability was obtained that the problem was solved using a power fail monitor and the non-volatile random-access memory (NVSRAM). The implemented large data processing system was applied the modified file system as FAT32 and the good performance and high reliability was showed.

A New Flash Memory Package Structure with Intelligent Buffer System and Performance Evaluation (버퍼 시스템을 내장한 새로운 플래쉬 메모리 패키지 구조 및 성능 평가)

  • Lee Jung-Hoon;Kim Shin-Dug
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.32 no.2
    • /
    • pp.75-84
    • /
    • 2005
  • This research is to design a high performance NAND-type flash memory package with a smart buffer cache that enhances the exploitation of spatial and temporal locality. The proposed buffer structure in a NAND flash memory package, called as a smart buffer cache, consists of three parts, i.e., a fully-associative victim buffer with a small block size, a fully-associative spatial buffer with a large block size, and a dynamic fetching unit. This new NAND-type flash memory package can achieve dramatically high performance and low power consumption comparing with any conventional NAND-type flash memory. Our results show that the NAND flash memory package with a smart buffer cache can reduce the miss ratio by around 70% and the average memory access time by around 67%, over the conventional NAND flash memory configuration. Also, the average miss ratio and average memory access time of the package module with smart buffer for a given buffer space (e.g., 3KB) can achieve better performance than package modules with a conventional direct-mapped buffer with eight times(e.g., 32KB) as much space and a fully-associative configuration with twice as much space(e.g., 8KB)

Design and Implementation of Automatic Installation System for PDA (휴대 정보터미널을 위한 애플리케이션 자동설치 시스템의 설계 및 구현)

  • 나승원;오세만
    • The Journal of Society for e-Business Studies
    • /
    • v.8 no.3
    • /
    • pp.165-176
    • /
    • 2003
  • Instead of existing cell phones, PDAs are observed as leading wireless Internet devices recently Numerous applications are developed by extended usage of PDAs and it should be installed appropriately according to devices. Furthermore, when battery is discharged, all data stored in RAM(Random Access Memory) becomes obsolete. So it should be recovered or reinstalled from flash memory, backup media or something. In this paper, we present an automatic application installation system(PAIS : PDA Automatic Installation System) to solve problems that users have to install applications by themselves whenever it is necessary. With this system, users feel comfortable by saving time and effort to install each applications and application development companies save cost needed to make materials illustrating installation process. Consequently PAIS may flourish wireless Internet business.

  • PDF

Formal Verification of RACE Protocol Using VIS (VIS를 이용한 RACE 포로토콜의 정형검증)

  • Um, Hyun-Sun;Choi, JIn-Young;Han, Woo-Jong;Ki, An-Do;Shim, Kyu-Hyun
    • The Transactions of the Korea Information Processing Society
    • /
    • v.7 no.7
    • /
    • pp.2219-2228
    • /
    • 2000
  • Caches in a multiprocessing environment introduce the cache coherence problem. When multiple processors maintain locally cached copies of a unique shared-memory location, any local modification of the location can result in a globally inconsistent view of memory. Cache coherence protocols are important to operate a shared-memory multiprocessor system with efficiency and correctness. Since random testing and simulations are not enough to validate correctness of protocols, it is necessary to develop efficient and reliable verification methods. In this appear we present our experience in using VIS (Verification Interacting with Synthesis), a tool of formal method, to analyze a number of property of a cache coherence protocol, RACE (Remote Access Cache coherent Enforcement).

  • PDF

Incremental Design of MIN using Unit Module (단위 모듈을 이용한 MIN의 점증적 설계)

  • Choi, Chang-Hoon;Kim, Sung-Chun
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.27 no.2
    • /
    • pp.149-159
    • /
    • 2000
  • In this paper, we propose a new class of MIN (Multistage Interconnection Network) called SCMIN(ShortCut MIN) which can form a cheap and efficient packet switching interconnection network. SCMIN satisfies full access capability(FAC) and has multiple redundant paths between processor-memory pairs even though SCMIN is constructed with 2.5N-4 SEs which is far fewer SEs than that of MINs. SCMIN can be constructed suitable for localized communication by providing the shortcut path and multiple paths inside the processor-memory cluster which has frequent data communications. Therefore, SCMIN can be used as an attractive interconnection network for parallel applications with a localized communication pattern in shared-memory multiprocessor systems.

  • PDF