• Title/Summary/Keyword: Small files

The Design of Method for Efficient Processing of Small Files in the Distributed System based on Hadoop Framework (하둡 프레임워크 기반 분산시스템 내의 작은 파일들을 효율적으로 처리하기 위한 방법의 설계)

  • Kim, Seung-Hyun;Kim, Young-Geun;Kim, Won-Jung
    • The Journal of the Korea institute of electronic communication sciences / v.10 no.10 / pp.1115-1122 / 2015
  • The Hadoop framework was designed for processing very large files. When it processes small files, however, it wastes the resources of the distributed system and performance degrades, and the degradation becomes more noticeable as the number of small files grows. Because the problem is caused by the small files themselves, it can be addressed by merging related small files, but existing merging methods have limitations. This paper examines the limitations of existing merging methods and designs a method of merging small files for efficient processing.
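
The abstract does not describe the proposed merging method in detail, so the following is only a minimal sketch of the general idea of merging small files: many small files are concatenated into one container, and an index records where each file lives. The function names `merge_small_files` and `read_merged` and the in-memory layout are assumptions made for illustration, not the authors' design.

```python
import io

def merge_small_files(files):
    """Concatenate small files into one blob and build an offset index.

    `files` maps a file name to its bytes. Returns (blob, index), where
    index[name] = (offset, length) locates that file inside the blob.
    """
    blob = io.BytesIO()
    index = {}
    for name, data in files.items():
        index[name] = (blob.tell(), len(data))  # record position before writing
        blob.write(data)
    return blob.getvalue(), index

def read_merged(blob, index, name):
    """Recover one original file from the merged blob."""
    offset, length = index[name]
    return blob[offset:offset + length]

# Hypothetical example data: three tiny files become one stored object.
small = {"a.log": b"alpha", "b.log": b"bravo!", "c.log": b""}
blob, index = merge_small_files(small)
```

With this layout the storage layer tracks one object instead of three, at the cost of an index lookup on every read.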

Co-Writing Multiple Files Based on Directory Locality for High Performance of Small File Writes (디렉토리 지역성을 활용한 작은 파일들의 모아 쓰기 기법)

  • Lee, Kyung-Jae;Ahn, Woo-Hyun;Oh, Jae-Won
    • The KIPS Transactions:PartA / v.15A no.5 / pp.275-286 / 2008
  • Fast File System (FFS) exploits large disk bandwidth to improve the write performance of large files, for example by writing multiple blocks of a large file in a single disk I/O. The performance of small file writes, however, is limited not by disk bandwidth but by disk access time, which is dominated by disk movements such as seeks and rotation, because FFS writes each small file in its own disk write. We propose CW-FFS (Co-Writing Fast File System), which improves the write performance of small files by minimizing the disk movements needed to write them to disk. Its key technique, the co-writing scheme, dynamically collects multiple small files named in a given directory and writes them to contiguous disk locations in a single disk I/O. Co-writing several small files in a single disk I/O reduces the multiple disk movements needed for small file writes to a single disk movement, thus increasing the overall write performance of write-intensive applications. Furthermore, a file allocation scheme is introduced to prevent the co-writing scheme from harming the on-disk spatial locality of small files named in a given directory. Measurements of our technique implemented in OpenBSD 4.0 show that CW-FFS improves the performance of small file writes over FFS by 5 to 35% in the PostMark benchmark.

A Distributed Cache Management Scheme for Efficient Accesses of Small Files in HDFS (HDFS에서 소형 파일의 효율적인 접근을 위한 분산 캐시 관리 기법)

  • Oh, Hyunkyo;Kim, Kiyeon;Hwang, Jae-Min;Park, Junho;Lim, Jongtae;Bok, Kyoungsoo;Yoo, Jaesoo
    • The Journal of the Korea Contents Association / v.14 no.11 / pp.28-38 / 2014
  • In this paper, we propose a distributed cache management scheme for efficient access to small files in the Hadoop Distributed File System (HDFS). The proposed scheme reduces the amount of metadata managed by the name node because many small files are merged and stored in a single chunk. It also reduces file access costs by keeping information about requested files in a client cache and in data node caches. The client cache keeps the small files that a user requests, together with their metadata. Each data node cache keeps the small files that users request frequently. A performance evaluation shows that the proposed scheme significantly reduces processing time compared with the existing scheme.
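
The abstract does not state which replacement policy the caches use. The sketch below assumes a least-recently-used (LRU) policy purely for illustration; the class name `SmallFileCache` and the `fetch` callback, which stands in for a read from a data node, are hypothetical.

```python
from collections import OrderedDict

class SmallFileCache:
    """A client-side cache of small files with LRU eviction (an assumption)."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity      # maximum number of cached files
        self.fetch = fetch            # called on a miss to load the file
        self.entries = OrderedDict()  # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def get(self, name):
        if name in self.entries:
            self.entries.move_to_end(name)    # mark as most recently used
            self.hits += 1
            return self.entries[name]
        self.misses += 1
        data = self.fetch(name)
        self.entries[name] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return data

# Hypothetical backing store standing in for the data nodes.
store = {"a": b"1", "b": b"2", "c": b"3"}
cache = SmallFileCache(capacity=2, fetch=store.__getitem__)
for name in ["a", "b", "a", "c", "b"]:
    cache.get(name)
```

Only the repeated request for `a` is served from the cache in this run; the other four requests go to the backing store.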

A Study on the Improving Performance of Massively Small File Using the Reuse JVM in MapReduce (MapReduce에서 Reuse JVM을 이용한 대규모 스몰파일 처리성능 향상 방법에 관한 연구)

  • Choi, Chul Woong;Kim, Jeong In;Kim, Pan Koo
    • Journal of Korea Multimedia Society / v.18 no.9 / pp.1098-1104 / 2015
  • With the widespread use of smartphones and the IoT (Internet of Things), data are being generated on a large scale, and there is increased demand for the analysis of such data. Hence, distributed processing systems have gained much attention. Hadoop, a distributed processing system, stores the metadata of stored files in name nodes. With massive numbers of small files, the main problems are that memory becomes insufficient, load increases, and scheduling and file processing time grow with the number of small files. In this paper, we propose a solution that addresses the increase in processing time caused by massive small files, and thus improves processing performance, using the Reuse JVM method provided by Hadoop. Through a configuration setting, the Reuse JVM method changes the conventional behavior of creating a JVM for every task, so that one JVM is reused to run multiple tasks sequentially. As a final outcome, the Reuse JVM method showed the best processing performance when used together with CombineFileInputFormat.

A Chinese Restaurant Game for Distributed Cooperative Caching in Small Cell Networks

  • Chen, Junliang;Wang, Gang;Wang, Fuxiang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.1 / pp.222-236 / 2019
  • Wireless content caching in small cell networks has recently been considered a promising way to alleviate backhaul congestion in emerging heterogeneous cellular networks. However, how to select the files to cache in small base stations (SBSs) and how to make SBSs work together are important issues for cooperative caching research aimed at reducing file download time. In this paper, a Cooperative-Greedy strategy (CGS) among cache-enabled SBSs in a small cell network is proposed in order to minimize the download time of files. The problem is formulated as a Chinese restaurant game. Using this game model, file caching schemes can be configured based on file popularity and the spectrum resources allocated to several adjacent SBSs. Both the existence and uniqueness of a Nash equilibrium are proved. In the theoretical analysis section, SBSs cooperate with each other in order to cache as many popular files as possible near UEs. Simulation results show that the CGS scheme outperforms other schemes in terms of file download time.

Access efficiency of small sized files in Big Data using various Techniques on Hadoop Distributed File System platform

  • Alange, Neeta;Mathur, Anjali
    • International Journal of Computer Science & Network Security / v.21 no.7 / pp.359-364 / 2021
  • In recent years, Hadoop usage has been increasing day by day. Organizations across the globe are eager for technology that provides fast access to data, and dependence on computers keeps growing. Big data is growing exponentially as the entire world works online. Large amounts of data are being produced that are very difficult to handle and process within a short time. At present, industries widely use the Hadoop framework to store and process the huge amounts of data placed on their servers and to produce results on time. Processing huge amounts of data made up of small files, and optimizing their storage, is a major problem. Various techniques, including HDFS, Sequence files, HAR, and NHAR, have already been proposed. In this paper, we discuss existing techniques developed for accessing and storing small files efficiently. Of these techniques, we have specifically tried to implement the HDFS HAR and NHAR techniques.

Development of Quality Document Management System Using Hypertext (하이퍼텍스트를 이용한 품질문서 관리시스템 구축 사례)

  • 정현석;남호수;박동준;김호균
    • Journal of Korean Society for Quality Management / v.28 no.3 / pp.104-113 / 2000
  • In this paper, we present a useful system for managing quality documents using the concept of hypertext in the HANGUEL word processor. To develop this system, we classify all manuals, procedures, and forms into files. A relationship chart of these files is constructed, and the files are hyperlinked according to this chart. We apply this hypertext-based quality document management system to a small precision manufacturing firm by analyzing all of its quality documents. We confirm that this system effectively reduces the handling time of quality documents and supports consistent revision of quality documents.

Cyclic fatigue resistance, torsional resistance, and metallurgical characteristics of M3 Rotary and M3 Pro Gold NiTi files

  • Pedulla, Eugenio;Lo Savio, Fabio;La Rosa, Giusy Rita Maria;Miccoli, Gabriele;Bruno, Elena;Rapisarda, Silvia;Chang, Seok Woo;Rapisarda, Ernesto;La Rosa, Guido;Gambarini, Gianluca;Testarelli, Luca
    • Restorative Dentistry and Endodontics / v.43 no.2 / pp.25.1-25.10 / 2018
  • Objectives: To evaluate the mechanical properties and metallurgical characteristics of the M3 Rotary and M3 Pro Gold files (United Dental). Materials and Methods: One hundred and sixty new M3 Rotary and M3 Pro Gold files (sizes 20/0.04 and 25/0.04) were used. Torque and angle of rotation at failure (n = 20) were measured according to ISO 3630-1. Cyclic fatigue resistance was tested by measuring the number of cycles to failure in an artificial stainless steel canal ($60^{\circ}$ angle of curvature and a 5-mm radius). The metallurgical characteristics were investigated by differential scanning calorimetry (DSC). Data were analyzed using analysis of variance and the Student-Newman-Keuls test. Results: Comparing the same size of the 2 different instruments, cyclic fatigue resistance was significantly higher in the M3 Pro Gold files than in the M3 Rotary files (p < 0.001). No significant difference was observed between the files in the maximum torque load, while a significantly higher angular rotation to fracture was observed for M3 Pro Gold (p < 0.05). In the DSC analysis, the M3 Pro Gold files showed 1 prominent peak on the heating curve and 2 prominent peaks on the cooling curve. In contrast, the M3 Rotary files showed 1 small peak on the heating curve and 1 small peak on the cooling curve. Conclusions: The M3 Pro Gold files showed greater flexibility and angular rotation than the M3 Rotary files, without any reduction in torque resistance. The superior flexibility of M3 Pro Gold files can be attributed to their martensite phase.

The File Splitting Distribution Scheme Using the P2P Networks with The Mesh topology (그물망 위상의 P2P 네트워크를 활용한 파일 분리 분산 방안)

  • Lee Myoung-Hoon;Park Jung-Su;Kim Jin-Hong;Jo In-June
    • Journal of the Korea Institute of Information and Communication Engineering / v.9 no.8 / pp.1669-1675 / 2005
  • Recently, small wireless terminals have had difficulty processing large files, as terminals become smaller while files grow larger. Moreover, web servers and file servers suffer from overload because many files are concentrated on them. There is also a security vulnerability in data processing because files are processed as independent units. To resolve these problems, this paper proposes a new file splitting distribution scheme using P2P networks with a mesh topology. The proposed scheme distributes the blocks of a file to arbitrary peers in the P2P network. As a result, small wireless terminals can process large files, the overload on web or file servers is relieved because files are decentralized, and the security vulnerability in data processing is mitigated because processing is distributed to the peers in units of blocks.
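
The idea of splitting a file into blocks and spreading them over peers can be sketched as follows. The abstract does not say how blocks are sized or which peer receives which block, so the fixed block size, the round-robin placement, and the function names are all assumptions made for illustration.

```python
def split_into_blocks(data, block_size):
    """Cut a byte string into consecutive blocks of at most block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def distribute(blocks, peers):
    """Assign blocks to peers round-robin (assumed policy).

    Returns a dict mapping each peer to a list of (block_number, block).
    """
    placement = {peer: [] for peer in peers}
    for number, block in enumerate(blocks):
        placement[peers[number % len(peers)]].append((number, block))
    return placement

def reassemble(placement):
    """Collect the blocks from all peers and restore the original order."""
    numbered = [pair for held in placement.values() for pair in held]
    return b"".join(block for _, block in sorted(numbered))

# Hypothetical example: a 10-byte file, 3-byte blocks, three peers.
data = b"abcdefghij"
blocks = split_into_blocks(data, 3)
placement = distribute(blocks, ["peer1", "peer2", "peer3"])
```

No single peer holds the whole file, and each peer handles only block-sized pieces.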

qtar: Design and Implementation of an Optimized tar Command with FTL-level Remapping (qtar: 플래시 변환 계층 리매핑 기법을 이용한 최적화된 tar 명령어 구현)

  • Ryoo, Jeongseok;Hahn, Sangwook Shane;Kim, Jihong
    • Journal of KIISE / v.45 no.1 / pp.9-14 / 2018
  • Tar is a Linux command that combines several files into a single file. Combining multiple small files into large files increases compression efficiency and data transfer speed. However, tar performs worse as the target files become smaller. In this paper, we show that this performance degradation occurs when tar reads data from the target files, and we propose qtar (quick tar) to solve the problem via flash-level remapping. When the size of an I/O request is less than 1 MB, I/O performance decreases in proportion to the size of the request. Since tar reads the data of files one by one, a smaller file size results in lower performance. Therefore, qtar implements the remapping technique to read data from the target files at the maximum I/O size regardless of the size of each file. Our evaluations show that the execution time with qtar is reduced by up to 3.4 times compared to that with tar.
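
As a small illustration of the first sentence, the sketch below combines several small files into one tar archive using Python's standard `tarfile` module. It shows ordinary tar behavior only; it does not implement qtar or any FTL-level remapping, and the file names and contents are made up for the example.

```python
import io
import tarfile

def pack_small_files(files):
    """Combine small files (name -> bytes) into one in-memory tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)  # the size must be set before adding
            archive.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def list_members(archive_bytes):
    """Return the member names stored in a tar archive."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as archive:
        return archive.getnames()

# Hypothetical example data.
archive_bytes = pack_small_files({"a.txt": b"hello", "b.txt": b"world"})
names = list_members(archive_bytes)
```

Each member's data is read and appended in turn, which corresponds to the one-by-one read pattern the abstract identifies as the bottleneck for small files.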