Search | Korea Science

An Analysis of the Performance of Collective I/Os and the Subgroup Method (집합 I/O와 부분군 기법의 성능 분석)

Cha, Kwangho;Cho, Hyeyoung;Kim, Sungho
- Proceedings of the Korea Contents Association Conference / 2007.11a / pp.513-516 / 2007
Because many scientific applications process large volumes of data, the importance of parallel I/O has been increasingly recognized. Collective I/O is one of the notable features of parallel I/O and lets application programmers handle large data volumes easily. In this paper, we measure and analyze the performance of the original collective I/O operations and of the subgroup method, a way of using MPI collective I/O effectively. The experimental results show that the two kinds of subgroup method performed differently. For collective write operations, the subgroup method degraded performance. For collective read operations, however, it performed well with small data sizes.

Improving Performance of I/O Virtualization Framework based on Multi-queue SSD (다중 큐 SSD 기반 I/O 가상화 프레임워크의 성능 향상 기법)

Kim, Tae Yong;Kang, Dong Hyun;Eom, Young Ik
- Journal of KIISE / v.43 no.1 / pp.27-33 / 2016
Virtualization has become one of the most useful techniques in computing systems, and today it is prevalent in many computing environments, including desktops, data centers, and enterprises. However, because I/O layers are implemented to be oblivious to the I/O behavior of virtual machines (VMs), an I/O scalability issue remains in virtualized systems. In particular, when a multi-queue solid state drive (SSD) is used as secondary storage, the system exhibits a semantic gap that degrades the overall performance of the VM. This is due to two key problems: accelerated lock contention and the I/O parallelism issue. In this paper, we propose a novel approach, built on virtual CPU (vCPU)-dedicated queues and I/O threads, that efficiently distributes lock contention and addresses the parallelism issue of Virtio-blk-data-plane in virtualized environments. The approach allocates a dedicated queue and an I/O thread to each vCPU to reduce the semantic gap. Our experimental results with various I/O traces show that our design improves I/O operations per second (IOPS) in virtualized environments by up to 155% over existing QEMU-based systems.
https://doi.org/10.5626/JOK.2016.43.1.27

CPC: A File I/O Cache Management Policy for Compute-Bound Workloads

Bahn, Hyokyung
- International journal of advanced smart convergence / v.11 no.2 / pp.1-6 / 2022
With the emergence of the 4th industrial revolution, compute-bound workloads with large memory footprints, such as big data processing, are increasing dramatically. Even in such compute-bound workloads, however, we observe bulky I/Os while big data is loaded from storage into memory. Although the file I/O cache accelerates storage I/O, we found that the cache hit rate in such environments does not improve even when the file I/O cache capacity is increased, because of some special I/O references generated by compute-bound workloads. To cope with this situation, we propose a new file I/O cache management policy that significantly improves the cache hit rate for compute-bound workloads. Trace-driven simulations that replay file I/O reference logs of compute-bound workloads show that the proposed cache management policy improves the cache hit rate by a large margin over the well-known CLOCK algorithm.
https://doi.org/10.7236/IJASC.2022.11.2.1
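The abstract above names CLOCK only as its comparison baseline and does not describe the proposed CPC policy itself. As a reading aid, here is a minimal Python sketch of the textbook CLOCK replacement algorithm. The class name `ClockCache`, its methods, and the choice to set the reference bit when a block is first inserted are illustrative assumptions, not details taken from the paper.

```python
class ClockCache:
    """Textbook CLOCK (second-chance) replacement over a fixed number of frames."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.frames = []   # each frame is [block, reference_bit]
        self.index = {}    # block -> position in self.frames
        self.hand = 0      # clock hand: next eviction candidate
        self.hits = 0
        self.misses = 0

    def access(self, block):
        """Reference a block; return True on a hit, False on a miss."""
        pos = self.index.get(block)
        if pos is not None:
            self.frames[pos][1] = 1          # hit: grant a second chance
            self.hits += 1
            return True
        self.misses += 1
        if len(self.frames) < self.capacity:
            # Free frame available (assumption: new blocks start with bit = 1).
            self.index[block] = len(self.frames)
            self.frames.append([block, 1])
            return False
        # Sweep: clear reference bits until a frame with bit 0 is found.
        while self.frames[self.hand][1] == 1:
            self.frames[self.hand][1] = 0
            self.hand = (self.hand + 1) % self.capacity
        victim = self.frames[self.hand][0]
        del self.index[victim]
        self.frames[self.hand] = [block, 1]
        self.index[block] = self.hand
        self.hand = (self.hand + 1) % self.capacity
        return False

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def looping_scan_hit_rate(capacity, num_blocks, passes):
    """Replay a repeated sequential scan of num_blocks blocks through CLOCK."""
    cache = ClockCache(capacity)
    for _ in range(passes):
        for block in range(num_blocks):
            cache.access(block)
    return cache.hit_rate()
```

Calling `looping_scan_hit_rate(capacity, num_blocks, passes)` with `num_blocks` larger than `capacity` replays a repeated sequential scan. That is one familiar reference pattern for which enlarging the cache does not help until the whole scan fits. Whether it matches the "special I/O references" the abstract mentions is an assumption.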

A study on searching image by cluster indexing and sequential I/O (연속적 I/O와 클러스터 인덱싱 구조를 이용한 이미지 데이타 검색 연구)

Kim, Jin-Ok;Hwang, Dae-Joon
- The KIPS Transactions:PartD / v.9D no.5 / pp.779-788 / 2002
Searching multimedia data such as images, video, and audio raises many technically difficult issues because the data are massive and more complex than simple text-based data. Because exact matching is not suited to the feature matrices of multimedia data, similarity retrieval has been studied as a search method: it automatically extracts basic features of the multimedia data and then searches among the data using the extracted features. In this paper, data clustering and cluster indexing are proposed as a fast similarity-retrieval method for multimedia data. This approach clusters similar images on adjacent disk cylinders and then builds indexes to access the clusters. To minimize the search cost, hashing is used to index the clusters. In addition, to reduce I/O time, the proposed search takes just one I/O to look up the location of the cluster containing similar objects and one sequential file I/O to read in this cluster. The proposed scheme addresses the problem of high dimensionality by using clustering together with indexing, and it has higher search efficiency than content-based image retrieval that uses only a clustering or only an indexing structure.
https://doi.org/10.3745/KIPSTD.2002.9D.5.779
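The lookup path described in the abstract above (one index lookup to locate a cluster, then one sequential read of that cluster) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation. The record layout `RECORD`, nearest-centroid cluster assignment, the Python `dict` standing in for the hash index, and the in-memory `io.BytesIO` standing in for the disk file are all hypothetical choices.

```python
import io
import struct

# Assumed record layout: 32-bit image id followed by a 4-dimensional float feature vector.
RECORD = struct.Struct("<I4f")


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(centroids, vec):
    """Return the index of the centroid closest to vec."""
    return min(range(len(centroids)), key=lambda i: squared_distance(centroids[i], vec))


def build(images, centroids):
    """Store each cluster contiguously and index it by cluster id.

    images: iterable of (image_id, feature_vector) pairs.
    Returns (storage, index) where index maps cluster id -> (offset, length).
    """
    clusters = {}
    for image_id, vec in images:
        clusters.setdefault(nearest_centroid(centroids, vec), []).append((image_id, vec))
    storage = io.BytesIO()
    index = {}  # hash index: a dict stands in for the paper's hashing scheme
    for cluster_id, members in sorted(clusters.items()):
        offset = storage.tell()
        for image_id, vec in members:
            storage.write(RECORD.pack(image_id, *vec))
        index[cluster_id] = (offset, storage.tell() - offset)
    return storage, index


def search(storage, index, centroids, query, k=1):
    """Return ids of the k most similar images within the query's cluster."""
    cluster_id = nearest_centroid(centroids, query)
    if cluster_id not in index:        # one hash lookup locates the cluster
        return []
    offset, length = index[cluster_id]
    storage.seek(offset)
    blob = storage.read(length)        # one sequential read fetches the whole cluster
    records = [RECORD.unpack_from(blob, pos) for pos in range(0, length, RECORD.size)]
    ranked = sorted(records, key=lambda rec: squared_distance(rec[1:], query))
    return [rec[0] for rec in ranked[:k]]
```

A caller would build the store once with `build(images, centroids)` and then pass the returned `storage` and `index` to `search` for each query. Only the members of a single cluster are compared against the query, which is where the saving over a full scan comes from.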

Development of Realtime Parallel Data Communication Interface for Remote Control of 6-DOF Industrial Robot (산업용 6관절 로봇의 원격제어를 위한 실시간 병렬데이터통신 인터페이스)

Choi, Myoung-Hwan;Lee, Woo-Won
- Journal of Industrial Technology / v.21 no.A / pp.97-103 / 2001
This paper presents the development of an I/O interface for real-time parallel data communication between the controller of a six-axis industrial robot (CRS-A460) and an external computer. The proposed I/O interface consists of a hardware I/O interface and software that is downloaded to the robot controller and executed by the controller's operating system. The configuration of the digital I/O port of the CRS-A460 robot controller and of the digital I/O board for the IBM-PC is presented, along with the process control program of the robot controller. The protocol developed for the parallel data communication is described. The data communication is tested, and its performance is analyzed. In particular, it is shown that the real-time constraint of the robot controller process is satisfied.

Design of a Model to Structure Longitudinal Data for Medical Education Based on the I-E-O Model (I-E-O 모형에 근거한 의학교육 종단자료 구축을 위한 모형 설계)

Jung, Hanna;Lee, I Re;Kim, Hae Won;An, Shinki
- Korean Medical Education Review / v.24 no.2 / pp.156-171 / 2022
The purpose of this study was to establish a model for constructing longitudinal data for medical school, and to structure cohort and longitudinal data using data from Yonsei University College of Medicine (YUCM) according to the established input-environment-output (I-E-O) model. The study was conducted according to the following procedure. First, the data that YUCM has collected was reviewed through data analysis and interviews with the person in charge of each questionnaire. Second, the opinions of experts on the validity of the I-E-O model were collected through the first expert consultation, and as a result, a model was established for each stage of medical education based on the I-E-O model. Finally, in order to further materialize and refine the previously established model for each stage of medical education, secondary expert consultation was conducted. As a result, the survey areas and time period for collecting longitudinal data were organized according to the model for each stage of medical education, and an example of the YUCM cohort constructed according to the established model for each stage of medical education was presented. The results derived from this study constitute a basic step toward building data from universities in longitudinal form, and if longitudinal data are actually constructed through this method, they could be used as an important basis for determining major policies or reorganizing the curricula of universities. These research results have implications in terms of the management and utilization of existing survey data, the composition of cohorts, and longitudinal studies for many medical schools that are conducting surveys in various areas targeting students, such as lecture evaluation and satisfaction surveys.
https://doi.org/10.17496/kmer.2022.24.2.156

Design and Implementation of I/O Performance Benchmarking Framework for Linux Container

Oh, Gijun;Son, Suho;Yang, Junseok;Ahn, Sungyong
- International Journal of Internet, Broadcasting and Communication / v.13 no.1 / pp.180-186 / 2021
In cloud computing services, it is important to share system resources among multiple instances according to user requirements. In particular, efficiently distributing I/O resources across multiple instances has drawn attention with the rise of data-centric technologies such as big data and deep learning. However, it is difficult to evaluate the I/O resource distribution of Linux containers, one of the core technologies of cloud computing, because conventional I/O benchmarks do not support features related to container management. In this paper, we propose a new I/O performance benchmarking framework that makes it easy to evaluate the resource distribution of Linux containers with existing I/O benchmarks by supporting container-related features and an integrated user interface. According to a performance evaluation with a trace-replay benchmark, the proposed framework induces negligible performance overhead while making it convenient to evaluate the I/O performance of multiple Linux containers.
https://doi.org/10.7236/IJIBC.2021.13.1.180

Data De-duplication and Recycling Technique in SSD-based Storage System for Increasing De-duplication Rate and I/O Performance (SSD 기반 스토리지 시스템에서 중복률과 입출력 성능 향상을 위한 데이터 중복제거 및 재활용 기법)

Kim, Ju-Kyeong;Lee, Seung-Kyu;Kim, Deok-Hwan
- Journal of the Institute of Electronics and Information Engineers / v.49 no.12 / pp.149-155 / 2012
An SSD is a storage device that has a high-performance controller and a cache buffer and consists of many NAND flash memories. Because NAND flash memory does not support in-place updates, valid pages are invalidated when the file system issues update and erase operations, and the invalid pages are then completely deleted via garbage collection. However, garbage collection performs many long-latency erase operations, which reduces I/O performance and increases wear on the SSD. In this paper, we propose a new method of de-duplicating valid data and recycling invalid data. The method de-duplicates valid data and then recycles invalid data, which improves the de-duplication ratio. By reducing the number of writes and garbage collections, the method can increase I/O performance and decrease wear on the SSD. Experimental results show that, compared with the general case, it reduces the number of garbage collections by up to 20% and I/O latency by 9%.
https://doi.org/10.5573/ieek.2012.49.12.149

An I/O Interface Circuit Using CTR Code to Reduce Number of I/O Pins (CTR 코드를 사용한 I/O 핀 수를 감소 시킬 수 있는 인터페이스 회로)

Kim, Jun-Bae;Kwon, Oh-Kyong
- Journal of the Korean Institute of Telematics and Electronics D / v.36D no.1 / pp.47-56 / 1999
As the density of logic gates in VLSI chips has rapidly increased, a larger number of I/O pins has been required. This results in a bigger package size and a higher package cost. For VLSI chips with high I/O counts, the package cost is higher than the cost of the bare chip. As the density of logic gates increases, a method for reducing the number of I/O pins for a given logic complexity is required. In this paper, we propose a novel I/O interface circuit that uses CTR (Constant-Transition-Rate) code to reduce the number of I/O pins by 50%. The rising and falling edges of the symbol pulse of the CTR code each carry 2 bits of digital data. Since each symbol of the proposed CTR code carries 4 bits of digital data, the symbol rate can be reduced by a factor of 2 compared with a conventional I/O interface circuit. In addition, simultaneous switching noise (SSN) can be reduced because the transition rate is constant and the transition points of the symbols are widely distributed. The channel encoder is implemented using only logic circuits, and the channel decoder circuit is designed using the over-sampling method. Proper operation of the designed I/O interface circuit was verified by HSPICE simulation with 0.6 μm CMOS SPICE parameters. The simulation results indicate that the data transmission rate of the proposed circuit in 0.6 μm CMOS technology is more than 200 Mbps/pin. We implemented the proposed circuit on an Altera FPGA and confirmed its operation at a data transfer rate of 22.5 Mbps/pin.

Design and Implementation of An I/O System for Irregular Application under Parallel System Environments (병렬 시스템 환경하에서 비정형 응용 프로그램을 위한 입출력 시스템의 설계 및 구현)

No, Jae-Chun;Park, Seong-Sun;;Gwon, O-Yeong
- Journal of KIISE:Computer Systems and Theory / v.26 no.11 / pp.1318-1332 / 1999
In this paper, we present the design, implementation, and evaluation of a runtime system based on collective I/O techniques for irregular applications. We present two designs, namely "Collective I/O" and "Pipelined Collective I/O". In the first scheme, all processors participate in the I/O simultaneously, which makes scheduling of I/O requests simpler but creates a possibility of contention at the I/O nodes. In the second approach, processors are divided into several groups so that only one group performs I/O at a time while the next group performs communication to rearrange data, and this entire process is pipelined to reduce I/O node contention dynamically. In other words, the design provides support for dynamic contention management. We then present a software caching method that uses collective I/O to reduce I/O cost by reusing data already present in the memory of other nodes. Finally, chunking and on-line compression mechanisms are included in both models. We demonstrate that these techniques achieve significantly higher I/O performance than has been possible so far. The performance results are presented on an Intel Paragon and on the ASCI/Red teraflops machine. Application-level I/O bandwidth of up to 55% of the peak is observed.

Search Result 1,264, Processing Time 0.029 seconds
