• Title/Summary/Keyword: 캐시메모리

Search Result 242, Processing Time 0.029 seconds

Visualization of Anomaly Detection in Hadoop System Information (하둡 시스템 정보의 이상탐지를 위한 시각화)

  • Yang, Seokwoo;Son, Siwoon;Gil, Myeong-Seon;Moon, Yang-Sae;Won, Hee-Sun
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2015.04a
    • /
    • pp.702-705
    • /
    • 2015
  • 본 논문에서는 하듐 환경에서 시스템 정보의 이상탐지를 위한 시각화 기능을 설계 및 구현한다. 제안한 이상탐지 시각화 기능은 크게 세 단계로 구분된다. 먼저, 각 노드로부터 시스템 로그 데이터(캐시 및 메인 메모리)를 수집하여 하이브(Hive) 저장한다. 그리고 저장한 데이터에 3-시그마 규칙을 적용하여 이상탐지를 수행한 후 관계형 데이터베이스에 적합하도록 재가공한다. 마지막으로, 스쿱(Sqoop)을 통해 RDBMS(MariaDB)에 이상탕지 결과를 저장하고, DHTMLX 차트 라이브러리를 사용하여 이를 시각화한다. 시각화 결과, 로그 데이터의 이상탐지와 데이터간의 상관관계를 직관적으로 이해할 수 있게 되었다.

A Study on Write Cache Policy using a Flash Memory (플래시 메모리를 사용한 쓰기 캐시 정책 연구)

  • Kim, Young-Jin;Anggorosesar, Aldhino;Lee, Jeong-Bae;Rim, Kee-Wook
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2009.11a
    • /
    • pp.77-78
    • /
    • 2009
  • In this paper, we study a pattern-aware write cache policy using a NAND flash memory in disk-based mobile storage systems. Our work is designed to face a mix of a number of sequential accesses and fewer non-sequential ones in mobile storage systems by redirecting the latter to a NAND flash memory and the former to a disk. Experimental results show that our policy improves the overall I/O performance by reducing the overhead significantly from a non-volatile cache over a traditional one.

Improving the I/O Performance of Disk-Based Graph Engine by Graph Ordering (디스크 기반 그래프 엔진의 입출력 성능 향상을 위한 그래프 오더링)

  • Lim, Keunhak;Kim, Junghyun;Lee, Eunjae;Seo, Jiwon
    • KIISE Transactions on Computing Practices
    • /
    • v.24 no.1
    • /
    • pp.40-45
    • /
    • 2018
  • With the advent of big data and social networks, large-scale graph processing becomes popular research topic. Recently, an optimization technique called Gorder has been proposed to improve the performance of in-memory graph processing. This technique improves performance by optimizing the graph layout on memory to have better cache locality. However, since it is designed for in-memory graph processing systems, the technique is not suitable for disk-based graph engines; also the cost for applying the technique is significantly high. To solve the problem, we propose a new graph ordering called I/O Order. I/O Order considers the characteristics of I/O accesses for SSDs and HDDs to improve the performance of disk-based graph engine. In addition, the algorithmic complexity of I/O Order is simple compared to Gorder, hence it is cheaper to apply I/O Ordering. I/O order reduces the cost of pre-processing up to 9.6 times compared to that of Gorder's, still its performance is 2 times higher compared to the Random in low-locality graph algorithms.

Web-Based Distributed Visualization System for Large Scale Geographic Data (대용량 지형 데이터를 위한 웹 기반 분산 가시화 시스템)

  • Hwang, Gyu-Hyun;Yun, Seong-Min;Park, Sang-Hun
    • Journal of Korea Multimedia Society
    • /
    • v.14 no.6
    • /
    • pp.835-848
    • /
    • 2011
  • In this paper, we propose a client server based distributed/parallel system to effectively visualize huge geographic data. The system consists of a web-based client GUI program and a distributed/parallel server program which runs on multiple PC clusters. To make the client program run on mobile devices as well as PCs, the graphical user interface has been designed by using JOGL, the java-based OpenGL graphics library, and sending the information about current available memory space and maximum display resolution the server can minimize the amount of tasks. PC clusters used to play the role of the server access requested geographic data from distributed disks, and properly re-sample them, then send the results back to the client. To minimize the latency happened in repeatedly access the distributed stored geography data, cache data structures have been maintained in both every nodes of the server and the client.

File-System-Level SSD Caching for Improving Application Launch Time (응용프로그램의 기동시간 단축을 위한 파일 시스템 수준의 SSD 캐싱 기법)

  • Han, Changhee;Ryu, Junhee;Lee, Dongeun;Kang, Kyungtae;Shin, Heonshik
    • Journal of KIISE
    • /
    • v.42 no.6
    • /
    • pp.691-698
    • /
    • 2015
  • Application launch time is an important performance metric to user experience in desktop and laptop environment, which mostly depends on the performance of secondary storage. Application launch times can be reduced by utilizing solid-state drive (SSD) instead of hard disk drive (HDD). However, considering a cost-performance trade-off, utilizing SSDs as caches for slow HDDs is a practicable alternative in reducing the application launch times. We propose a new SSD caching scheme which migrates data blocks from HDDs to SSDs. Our scheme operates entirely in the file system level and does not require an extra layer for mapping SSD-cached data that is essential in most other schemes. In particular, our scheme does not incur mapping overheads that cause significant burdens on the main memory, CPU, and SSD space for mapping table. Experimental results conducted with 8 popular applications demonstrate our scheme yields 56% of performance gain in application launch, when data blocks along with metadata are migrated.

A Transaction Level Simulator for Performance Analysis of Solid-State Disk (SSD) in PC Environment (PC향 SSD의 성능 분석을 위한 트랜잭션 수준 시뮬레이터)

  • Kim, Dong;Bang, Kwan-Hu;Ha, Seung-Hwan;Chung, Sung-Woo;Chung, Eui-Young
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.45 no.12
    • /
    • pp.57-64
    • /
    • 2008
  • In this paper, we propose a system-level simulator for the performance analysis of a Solid-State Disk (SSD) in PC environment by using TLM (Transaction Level Modeling) method. Our method provides quantitative analysis for a variety of architectural choices of PC system as well as SSD. Also, it drastically reduces the analysis time compared to the conventional RTL (Register Transfer Level) modeling method. To show the effectiveness of the proposed simulator, we performed several explorations of PC architecture as well as SSD. More specifically, we measured the performance impact of the hit rate of a cache buffer which temporarily stores the data from PC. Also, we analyzed the performance variation of SSD for various NAND Flash memories which show different response time with our simulator. These experimental results show that our simulator can be effectively utilized for the architecture exploration of SSD as well as PC.

Performance Analysis of Implementation on Image Processing Algorithm for Multi-Access Memory System Including 16 Processing Elements (16개의 처리기를 가진 다중접근기억장치를 위한 영상처리 알고리즘의 구현에 대한 성능평가)

  • Lee, You-Jin;Kim, Jea-Hee;Park, Jong-Won
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.49 no.3
    • /
    • pp.8-14
    • /
    • 2012
  • Improving the speed of image processing is in great demand according to spread of high quality visual media or massive image applications such as 3D TV or movies, AR(Augmented reality). SIMD computer attached to a host computer can accelerate various image processing and massive data operations. MAMS is a multi-access memory system which is, along with multiple processing elements(PEs), adequate for establishing a high performance pipelined SIMD machine. MAMS supports simultaneous access to pq data elements within a horizontal, a vertical, or a block subarray with a constant interval in an arbitrary position in an $M{\times}N$ array of data elements, where the number of memory modules(MMs), m, is a prime number greater than pq. MAMS-PP4 is the first realization of the MAMS architecture, which consists of four PEs in a single chip and five MMs. This paper presents implementation of image processing algorithms and performance analysis for MAMS-PP16 which consists of 16 PEs with 17 MMs in an extension or the prior work, MAMS-PP4. The newly designed MAMS-PP16 has a 64 bit instruction format and application specific instruction set. The author develops a simulator of the MAMS-PP16 system, which implemented algorithms can be executed on. Performance analysis has done with this simulator executing implemented algorithms of processing images. The result of performance analysis verifies consistent response of MAMS-PP16 through the pyramid operation in image processing algorithms comparing with a Pentium-based serial processor. Executing the pyramid operation in MAMS-PP16 results in consistent response of processing time while randomly response time in a serial processor.

Parallel Cell-Connectivity Information Extraction Algorithm for Ray-casting on Unstructured Grid Data (비정렬 격자에 대한 광선 투사를 위한 셀 사이 연결정보 추출 병렬처리 알고리즘)

  • Lee, Jihun;Kim, Duksu
    • Journal of the Korea Computer Graphics Society
    • /
    • v.26 no.1
    • /
    • pp.17-25
    • /
    • 2020
  • We present a novel multi-core CPU based parallel algorithm for the cell-connectivity information extraction algorithm, which is one of the preprocessing steps for volume rendering of unstructured grid data. We first check the synchronization issues when parallelizing the prior serial algorithm naively. Then, we propose a 3-step parallel algorithm that achieves high parallelization efficiency by removing synchronization in each step. Also, our 3-step algorithm improves the cache utilization efficiency by increasing the spatial locality for the duplicated triangle test process, which is the core operation of building cell-connectivity information. We further improve the efficiency of our parallel algorithm by employing a memory pool for each thread. To check the benefit of our approach, we implemented our method on a system consisting of two octa-core CPUs and measured the performance. As a result, our method shows continuous performance improvement as we add threads. Also, it achieves up to 82.9 times higher performance compared with the prior serial algorithm when we use thirty-two threads (sixteen physical cores). These results demonstrate the high parallelization efficiency and high cache utilization efficiency of our method. Also, it validates the suitability of our algorithm for large-scale unstructured data.

A Performance Improvement of Linux TCP/IP Stack based on Flow-Level Parallelism in a Multi-Core System (멀티코어 시스템에서 흐름 수준 병렬처리에 기반한 리눅스 TCP/IP 스택의 성능 개선)

  • Kwon, Hui-Ung;Jung, Hyung-Jin;Kwak, Hu-Keun;Kim, Young-Jong;Chung, Kyu-Sik
    • The KIPS Transactions:PartA
    • /
    • v.16A no.2
    • /
    • pp.113-124
    • /
    • 2009
  • With increasing multicore system, much effort has been put on the performance improvement of its application. Because multicore system has multiple processing devices in one system, its processing power increases compared to the single core system. However in many cases the advantages of multicore can not be exploited fully because the existing software and hardware were designed to be suitable for single core. When the existing software runs on multicore, its performance improvement is limited by the bottleneck of sharing resources and the inefficient use of cache memory on multicore. Therefore, according as the number of core increases, it doesn't show performance improvement and shows performance drop in the worst case. In this paper we propose a method of performance improvement of multicore system by applying Flow-Level Parallelism to the existing TCP/IP network application and operating system. The proposed method sets up the execution environment so that each core unit operates independently as much as possible in network application, TCP/IP stack on operating system, device driver, and network interface. Moreover it distributes network traffics to each core unit through L2 switch. The proposed method allows to minimize the sharing of application data, data structure, socket, device driver, and network interface between each core. Also it allows to minimize the competition among cores to take resources and increase the hit ratio of cache. We implemented the proposed methods with 8 core system and performed experiment. Experimental results show that network access speed and bandwidth increase linearly according to the number of core.

Implementation of Memory Copy Reduction Scheme for Networked Multimedia Service in Linux (리눅스 커널에서 네트워크 멀티미디어 서비스를 위한 메모리 복사 감소 기법 구현)

  • Kim, Jeong-Won
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.28 no.2B
    • /
    • pp.129-137
    • /
    • 2003
  • Multimedia streams, like MPEG continuously retrieve multimedia data because of their incessant playback. While these streams need an efficient support of kernel, the current buffer cache mechanism of Linux kernel such as Unix operating system was designed apt for small files, which is aperiodically requested as well as time uncritical. But, in case of continuous media, the CPU must enormously copy memory from kernel address space to user address space. This must lead to a large CPU overhead. This overhead both degrades system throughput and cannot guarantee QOS. In this paper, we have designed and implemented two memory copy reduction schemes in Linux kernel, direct I/O and one copy. The direct I/O skips the buffer cache layer of Linux kernel and results in dramatic reduction of CPU memory copy overhead. And, the one copy provides a fast disk-to-network data path without copying to user address space. The experimental results show considerable reduction of CPU overhead and throughput improvements.