• Title/Summary/Keyword: file I/O

Block Allocation Method for Efficiently Managing Temporary Files of Hash Joins on SSDs (SSD상에서 해시조인 임시 파일의 효과적인 관리를 위한 블록 할당 방법)

  • Joontae, Kim;Sangwon, Lee
    • KIPS Transactions on Computer and Communication Systems / v.11 no.12 / pp.429-436 / 2022
  • Temporary files are generated when a hash join is performed on tables larger than memory. During the join, each temporary file is deleted in turn once its I/O operations are complete. This paper shows that the fallocate system call and the trim options related to file deletion significantly affect hash join performance when temporary files are managed on SSDs rather than hard disks. Experiments were conducted on various commercial and research SSDs using PostgreSQL, a representative open-source database. We find that join performance can be improved by a factor of 3 to 5 over the default combination, depending on whether the fallocate and trim options are used for temporary files. In addition, we investigate the write amplification and trim command overhead inside the SSD for each combination of the two options.
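
A minimal sketch of the preallocation half of this idea, in Python rather than in PostgreSQL's C code. The function name `write_temp_file` and the file prefix are hypothetical, and the trim behavior (typically a mount or device setting) is not modeled. `os.posix_fallocate` is a real standard-library call on Unix; the sketch falls back to plain writes where it is unavailable or unsupported.

```python
import os
import tempfile


def write_temp_file(payload: bytes, preallocate: bool) -> int:
    """Write payload to a temporary file, optionally reserving its blocks
    up front, then delete the file. Returns the size that was on disk."""
    fd, path = tempfile.mkstemp(prefix="hashjoin_tmp_")  # hypothetical prefix
    try:
        with os.fdopen(fd, "wb") as f:
            if preallocate and payload and hasattr(os, "posix_fallocate"):
                try:
                    # Ask the file system to reserve space before writing.
                    os.posix_fallocate(f.fileno(), 0, len(payload))
                except OSError:
                    pass  # file system does not support it; write normally
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
            size = os.fstat(f.fileno()).st_size
        return size
    finally:
        # Deleting the file is the point at which a trim-enabled setup
        # may issue trim commands to the SSD.
        os.unlink(path)
```

Timing this function with and without `preallocate` on a given device is one way to probe the kind of effect the abstract describes, although results will depend on the file system and SSD.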

A Study on Improvement of Buffer Cache Performance for File I/O in Deep Learning (딥러닝의 파일 입출력을 위한 버퍼캐시 성능 개선 연구)

  • Jeongha Lee;Hyokyung Bahn
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.2 / pp.93-98 / 2024
  • With the rapid advance of AI (artificial intelligence) and high-performance computing technologies, deep learning is being used in various fields. Deep learning training proceeds by randomly reading a large amount of data and repeating this process. A large number of files are therefore referenced randomly and repeatedly, which produces access characteristics different from those of traditional workloads with temporal locality. To cope with the resulting difficulty in caching, we propose a new sampling method that reduces the randomness of dataset reading and works adaptively with existing buffer cache algorithms. We show that the proposed policy reduces the buffer cache miss rate by 16% on average and by up to 33% compared with the existing method, and improves execution time by up to 24%.

Real-Time File Integrity Checker for Intrusion Recovery and Response System (침입 복구 및 대응 시스템을 위한 실시간 파일 무결성 검사)

  • Jeun Sanghoon;Hur Jinyoung;Choi Jongsun;Choi Jaeyoung
    • Journal of KIISE:Computer Systems and Theory / v.32 no.6 / pp.279-287 / 2005
  • File integrity checking is the most reliable method for examining the integrity and stability of system resources. However, it must examine all of the data every time the system's integrity is audited, and its process and results depend on the administrator's experience and ability. The existing method is therefore not appropriate for intrusion response and recovery systems, which require a fast response time. Moreover, file integrity checking can collect information about the damaged resources but not about who performed the action, which would be very useful for intrusion isolation. In this paper, we propose rtIntegrit, which combines a system call auditing function, called Syswatcher, with file integrity checking. By working together with Syswatcher, rtIntegrit can detect many activities on files or file systems in real time. Syswatcher audits the file I/O related system calls specified in its configuration. It can also cooperate easily with intrusion response and recovery systems because it generates assessment data in the standard IDMEF format.
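
For orientation, the conventional whole-data check that the abstract contrasts with can be sketched as below. This is a generic hash-baseline comparison, not rtIntegrit or Syswatcher; the function names `fingerprint`, `snapshot`, and `modified` are hypothetical. It illustrates why the baseline approach is slow: every audit rereads every file.

```python
import hashlib


def fingerprint(path) -> str:
    """Return the SHA-256 digest of a file, read in fixed-size chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(paths) -> dict:
    """Record a baseline digest for every monitored file."""
    return {str(p): fingerprint(p) for p in paths}


def modified(baseline: dict, paths) -> list:
    """Rehash every file and list those whose digest differs from baseline."""
    current = snapshot(paths)
    return sorted(p for p, digest in current.items() if baseline.get(p) != digest)
```

Note that the result names only the changed files; as the abstract points out, it carries no information about which process or user made the change.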

Data De-duplication and Recycling Technique in SSD-based Storage System for Increasing De-duplication Rate and I/O Performance (SSD 기반 스토리지 시스템에서 중복률과 입출력 성능 향상을 위한 데이터 중복제거 및 재활용 기법)

  • Kim, Ju-Kyeong;Lee, Seung-Kyu;Kim, Deok-Hwan
    • Journal of the Institute of Electronics and Information Engineers / v.49 no.12 / pp.149-155 / 2012
  • An SSD is a storage device that has a high-performance controller and a cache buffer and consists of many NAND flash memories. Because NAND flash memory does not support in-place updates, valid pages are invalidated when the file system issues update and erase operations, and the invalid pages are then completely deleted by garbage collection. However, garbage collection performs many long-latency erase operations, which reduces I/O performance and increases wear leveling in the SSD. In this paper, we propose a new method that de-duplicates valid data and recycles invalid data, thereby improving the de-duplication ratio. By reducing the number of writes and garbage collections, the method can increase I/O performance and decrease wear leveling in the SSD. Experimental results show that it reduces the number of garbage collections by up to 20% and I/O latency by 9% compared with the general case.
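
The de-duplication step can be illustrated with a small content-addressed store. This is a generic sketch under stated assumptions, not the paper's scheme: the class `DedupStore` and its fields are hypothetical, SHA-256 stands in for whatever fingerprint the authors use, and the recycling of invalid pages is not modeled.

```python
import hashlib


class DedupStore:
    """Toy page store that keeps one physical copy per distinct content."""

    def __init__(self):
        self.pages = {}     # fingerprint -> page contents (physical copy)
        self.refcount = {}  # fingerprint -> logical addresses sharing it
        self.mapping = {}   # logical address -> fingerprint

    def write(self, lba: int, data: bytes) -> bool:
        """Map lba to data. Returns True only if a new physical page
        had to be written, i.e. the content was not already stored."""
        fp = hashlib.sha256(data).hexdigest()
        old = self.mapping.get(lba)
        if old == fp:
            return False  # same content rewritten; nothing to do
        if old is not None:
            self._release(old)
        is_new = fp not in self.pages
        if is_new:
            self.pages[fp] = data
            self.refcount[fp] = 0
        self.refcount[fp] += 1
        self.mapping[lba] = fp
        return is_new

    def _release(self, fp: str) -> None:
        """Drop one reference; free the page when nothing points to it."""
        self.refcount[fp] -= 1
        if self.refcount[fp] == 0:
            del self.refcount[fp]
            del self.pages[fp]

    def read(self, lba: int) -> bytes:
        return self.pages[self.mapping[lba]]
```

Each `write` that returns False is a physical write avoided, which is the mechanism by which de-duplication can lower the number of writes and, in turn, garbage collections.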

An Efficient Metadata Journaling Scheme for In-memory File Systems (인메모리 파일시스템을 위한 효율적인 메타데이터 저널링 기법)

  • Hyokyung Bahn
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.23 no.3 / pp.107-111 / 2023
  • Journaling techniques are widely used to maintain a consistent file system state in the event of a system crash. Because existing journaling techniques are designed for block storage such as HDDs, they are not efficient for byte-addressable persistent memory media. This paper proposes a metadata journaling technique for in-memory file systems that avoids inconsistent file system states in crash situations. The proposed technique substantially reduces the amount of writing by exploiting the byte-addressable nature of memory media, and it bypasses the heavy software I/O stack. Experimental results with the IOzone benchmark show that the proposed journaling technique improves the performance of Ext4 by 49.2% on average.

Implementation of a DB-Based Virtual File System for Lightweight IoT Clouds (경량 사물 인터넷 클라우드를 위한 DB 기반 가상 파일 시스템 구현)

  • Lee, Hyung-Bong;Kwon, Ki-Hyeon
    • KIPS Transactions on Computer and Communication Systems / v.3 no.10 / pp.311-322 / 2014
  • IoT (Internet of Things) is a concept of a connected internet that pursues direct access to devices and sensors in a fused environment spanning personal, industrial, and public areas. In an IoT environment, real-time data can be accessed, and the data formats and topologies of devices are diverse. There is also bidirectional communication between users and devices in order to control actuators. In this respect, IoT differs from the conventional internet, in which data are produced by people at desktops and gathered in server systems through simple one-way internet communication. A cloud or portal service for IoT therefore needs a file management framework that supports a systematic naming service and a unified data access interface encompassing the variety of IoT things. This paper implements a DB-based virtual file system that maintains the attributes of IoT things in a UNIX-style file system view. Users logged in to the virtual shell can explore IoT things by navigating the virtual file system and can access them directly via UNIX-style file I/O APIs. The implemented virtual file system is lightweight and flexible because it maintains only the directory structure and descriptors for the distributed IoT things. A test of virtual shell primitives such as mkdir() and chdir() shows that the virtual file system functions smoothly. In addition, with a simple directory cache mechanism, the exploring performance of the file system is better than that of the Windows file system.

The Effect of C Language Output Method to the Performance of CGI Gateway in the UNIX Systems (유닉스 시스템에서 C 언어 출력 방법이 CGI 게이트웨이 성능에 미치는 영향)

  • Lee Hyung-Bong;Jeong Yeon-Chul;Kweon Ki-Hyeon
    • The KIPS Transactions:PartC / v.12C no.1 s.97 / pp.147-156 / 2005
  • CGI is a standard interface between a web server and a gateway, devised so that the gateway's standard output replaces a static web document in a UNIX environment. It is therefore common to use the standard I/O statements provided by the programming language in a CGI gateway. However, the standard I/O mechanism is one of several buffering strategies that are designed to be transparent to the operating system and optimized for generic cases. This means that it may be useful to apply further optimization to the standard I/O environment of a CGI gateway. In this paper, we introduce the standard output method and the file output method as two areas of output optimization for CGI gateways written in C on UNIX/LINUX systems, and we apply the proposed methods in each area to Debian LINUX, IBM AIX, SUN Solaris, and Digital UNIX. We then analyze their effect on execution time. The results differed from one operating system to another. Compared with the normal situation, the best case in the standard output area showed about a 10% improvement, while the worst case showed a 60% degradation in the file output area, where some performance improvement had been expected.
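
The role of the buffering strategy can be illustrated with a Python analogy; the paper itself concerns C standard I/O, and this sketch does not reproduce its methods or measurements. `CountingSink` and `emit` are hypothetical names. The sink counts how many low-level writes reach it, which stands in for the number of write system calls.

```python
import io


class CountingSink(io.RawIOBase):
    """Raw output device that only counts the writes it receives."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.nbytes = 0

    def writable(self):
        return True

    def write(self, b):
        self.calls += 1
        self.nbytes += len(b)
        return len(b)


def emit(lines, buffer_size):
    """Send lines to a fresh sink and return (raw write count, byte count).
    buffer_size == 0 means every line goes straight to the sink."""
    sink = CountingSink()
    if buffer_size == 0:
        for line in lines:
            sink.write(line)
    else:
        out = io.BufferedWriter(sink, buffer_size=buffer_size)
        for line in lines:
            out.write(line)
        out.flush()  # push whatever is still held in the buffer
    return sink.calls, sink.nbytes
```

With many short lines, the buffered path delivers the same bytes in far fewer raw writes. Whether fewer writes translate into shorter execution time is exactly what the abstract reports as varying by operating system.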

Flash Memory Shadow Paging Scheme Using Deferred Cleaning List for Portable Databases (휴대용 데이터베이스를 위한 지연된 소거 리스트를 이용하는 플래시 메모리 쉐도우 페이징 기법)

  • Byun Si-Woo
    • Journal of Information Technology Applications and Management / v.13 no.2 / pp.115-126 / 2006
  • Flash memory has recently become one of the best storage media for portable computers in mobile computing environments. We propose a new transaction recovery scheme for a flash memory database environment based on a flash media file system. We improve on traditional shadow paging schemes by reusing old data pages that would otherwise be invalidated when a new data page is written in the flash file system environment. To reuse these data pages, our flash memory shadow paging (FMSP) scheme exploits a deferred cleaning list structure. The FMSP scheme removes the additional storage overhead of keeping shadow pages and minimizes the I/O performance degradation caused by the data page distribution phenomenon of traditional shadow paging schemes. We also propose a simulation model to evaluate the performance of FMSP. Based on the results of the performance evaluation, we conclude that FMSP outperforms the traditional scheme.

Version-Aware Cooperative Caching for Multi-Node Rendering

  • Cho, Kyungwoon;Bahn, Hyokyung
    • International Journal of Internet, Broadcasting and Communication / v.14 no.2 / pp.30-35 / 2022
  • Rendering is widely used for visual effects in animations and movies. Although rendering is computing-intensive, we observe that it also involves heavy I/O because of its large input data. This becomes a technical hurdle for multi-node rendering performed on public cloud nodes. To reduce the overhead of data transmission in multi-node rendering, this paper analyzes the characteristics of rendering workloads and presents a cooperative caching scheme for multi-node rendering. Our caching scheme synchronizes the original data in local storage with the cached data in rendering nodes, and the cached data are shared among multiple rendering nodes. Measurement experiments in real system environments show that the proposed cooperative caching scheme improves on the conventional caching scheme used in the network file system by 27% on average.

Implementation of the Inverted File for Indexing Large-volume Data (대용량 데이터 색인에 적합한 역파일의 구현)

  • Sung Chae Lim
    • Proceedings of the Korea Information Processing Society Conference / 2008.11a / pp.909-912 / 2008
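  • Inverted-file indexing is widely used for keyword search over large document collections. A key consideration in implementing an inverted-file index is how to minimize disk usage during keyword search processing. If the inverted file is small, it incurs little disk I/O, and disk usage can be reduced greatly by loading the inverted file into memory when needed. However, when the index data is very large, as in web search or large library systems, the disk cost of reading the inverted file can increase sharply. This paper proposes an inverted-file structure that minimizes disk usage in search environments that use very large inverted files. The proposed structure is designed as a hierarchy that takes the query processing flow into account, and it has been applied to a real commercial system, where its stability and performance were demonstrated.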
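
As background, the basic inverted-file idea can be sketched as a flat in-memory index. This is not the hierarchical on-disk structure the paper proposes; `build_index` and `search_all` are hypothetical names, and whitespace tokenization is a simplifying assumption.

```python
from collections import defaultdict


def build_index(docs: dict) -> dict:
    """Map each term to the sorted list of document ids that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search_all(index: dict, query: str) -> list:
    """Return ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return []
    # Start from the shortest posting list so intersections stay small.
    lists = sorted((index.get(t, []) for t in terms), key=len)
    result = set(lists[0])
    for plist in lists[1:]:
        result &= set(plist)
    return sorted(result)
```

In a disk-resident index each posting list lookup is a read, so the size and layout of those lists determine the disk cost that the abstract is concerned with.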