• Title/Summary/Keyword: File Cluster

Optimization of LDPC Code Replication Scheme for Cluster File System (클러스터 파일 시스템을 위한 LDPC 코드 복제 기법 최적화)

  • Kim, Se-Hoe;Lee, Won-Joo;Jeon, Chang-Ho
    • Proceedings of the Korean Society of Computer Information Conference / 2010.07a / pp.15-16 / 2010
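  • Cloud computing, a recent focus of attention, requires cluster file systems that can store and serve large volumes of data in a distributed manner. To guarantee high reliability and high availability, such cluster file systems use file replication schemes. The most widely used scheme is whole-file replication, which provides high file availability but incurs a correspondingly large storage overhead. Another replication scheme uses LDPC codes, which provide a similar level of file availability with comparatively small storage overhead. This paper therefore proposes a method for optimizing the LDPC code replication scheme for cluster file systems.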
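
The storage trade-off between whole-file replication and coded replication can be illustrated with simple arithmetic. The sketch below is not the paper's optimization method. It assumes a generic erasure code that encodes k data blocks into n stored blocks, and the function names and parameters (3 replicas, k = 10, n = 15) are hypothetical values chosen only for illustration.

```python
from fractions import Fraction


def replication_overhead(copies: int) -> Fraction:
    """Bytes stored per byte of user data when keeping `copies` full replicas."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return Fraction(copies)


def coded_overhead(data_blocks: int, total_blocks: int) -> Fraction:
    """Bytes stored per byte of user data when k data blocks become n coded blocks."""
    if not 0 < data_blocks <= total_blocks:
        raise ValueError("need 0 < data_blocks <= total_blocks")
    return Fraction(total_blocks, data_blocks)


if __name__ == "__main__":
    # Hypothetical settings: 3 full replicas versus a (n=15, k=10) code.
    print(replication_overhead(3))   # 3
    print(coded_overhead(10, 15))    # 3/2
```

Under these assumed parameters the coded scheme stores half as many bytes as three-way replication. How much availability each setting provides depends on the code and the failure model, which this sketch does not address.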

Design and Implementation of APFS Object Identification Tool for Digital Forensics

  • Cho, Gyu-Sang
    • International Journal of Internet, Broadcasting and Communication / v.14 no.1 / pp.10-18 / 2022
  • Since macOS High Sierra, APFS has been used as the main file system. It is a well-established file system that has been used stably thus far. From the perspective of digital forensics, however, many areas remain to be investigated. The Apple File System Reference is available on the Apple developer site, but it is not sufficient for a full analysis of APFS. Researchers know more about the structure of APFS than before, but they have not yet analyzed it completely. In this paper, we develop an APFS object identification tool for digital forensics. The tool performs the most basic and essential identification and analysis of APFS file system objects. The analysis in this study serves as the background for an analysis of the checkpoint operation principle and structure, including the more complex B-tree structure of APFS. The developed tool has several options, but only the results of two use cases are shown here. It is hoped that more functions will be added to the implemented tool to make it useful for faster and more accurate APFS analyses.

An Empirical Evaluation Analysis of the Performance of In-memory Bigdata Processing Platform (메모리 기반 빅데이터 처리 프레임워크의 성능개선 연구)

  • Lee, Jae hwan;Choi, Jun;Koo, Dong hun
    • Journal of Korea Society of Industrial Information Systems / v.21 no.3 / pp.13-19 / 2016
  • Spark, an in-memory big-data processing framework, is popular for real-time processing workloads. Spark can keep all intermediate data in cluster memory and thereby minimize I/O access. However, when the resident memory of a workload is larger than the physical memory of the cluster, overall performance can drop dramatically. In this paper, we experimentally analyse the bottleneck factors of a memory-intensive PageRank application, and we cluster Spark with the Tachyon File System so that memory is used to relieve the bottleneck, improving performance by about 18%.

A Performance Analysis Based on Hadoop Application's Characteristics in Cloud Computing (클라우드 컴퓨팅에서 Hadoop 애플리케이션 특성에 따른 성능 분석)

  • Keum, Tae-Hoon;Lee, Won-Joo;Jeon, Chang-Ho
    • Journal of the Korea Society of Computer and Information / v.15 no.5 / pp.49-56 / 2010
  • In this paper, we implement a Hadoop-based cluster for cloud computing and evaluate its performance according to application characteristics by executing the RandomTextWriter, WordCount, and PI applications. RandomTextWriter creates a given amount of random words and stores them in HDFS (Hadoop Distributed File System). WordCount reads an input file and determines the frequency of a given word per block. The PI application estimates the value of pi using the Monte Carlo method. During simulation, we investigate the effect of the data block size and the number of replications on the execution time of the applications. Through simulation, we confirmed that the execution time of RandomTextWriter was proportional to the number of replications, whereas the execution times of WordCount and PI were not affected by it. Moreover, the execution time of WordCount was optimal when the block size was 64 to 256 MB. These results show that the performance of a cloud computing system can be enhanced by a scheduling scheme that considers application characteristics.
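
The Monte Carlo estimate of pi mentioned above can be sketched in a few lines. This is a minimal single-process illustration, not the Hadoop PI job evaluated in the paper. The function name `estimate_pi`, the sample count, and the fixed seed are assumptions made for the example.

```python
import random


def estimate_pi(samples: int, seed: int = 0) -> float:
    """Estimate pi from the fraction of random points that fall in a quarter circle."""
    if samples < 1:
        raise ValueError("samples must be positive")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()  # uniform point in the unit square
        if x * x + y * y <= 1.0:           # point lies inside the quarter circle
            inside += 1
    # Quarter-circle area / unit-square area = pi / 4.
    return 4.0 * inside / samples


if __name__ == "__main__":
    print(estimate_pi(200_000))
```

Because each sample is independent, the counting loop can be split across workers and the partial counts summed afterwards. That independence is consistent with the reported result that the PI execution time did not depend on the number of replications.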

Design and Implementation of Big Data Cluster for Indoor Environment Monitering (실내 환경 모니터링을 위한 빅데이터 클러스터 설계 및 구현)

  • Jeon, Byoungchan;Go, Mingu
    • Journal of Korea Society of Digital Industry and Information Management / v.13 no.2 / pp.77-85 / 2017
  • As living space has expanded with population growth and lifestyle changes, most people spend their time indoors except when traveling. Changes in the indoor environment are therefore very important, affecting both people's health and the economical use of resources. However, most people do not recognize the importance of the indoor environment. A monitoring system for systematically sustaining and managing the indoor environment is therefore needed, and big data clusters should be used to store and manage the large volume of sensor data collected from many spaces. In this paper, we design a big data cluster for indoor environment monitoring that stores sensor data and monitors each unit of a large building. We implement a big data cluster-based system for analysis, building a distributed file system with Hadoop and HBase for big data processing. The various sensor data are stored as they are collected, and monitoring is expected to support effective indoor environment management and improved health.

Design and Implementation of The Windows Thesaurus WTPM using Filename of Semantics Clustering (파일명의 의미 클러스터링에 의한 윈도우 시소러스 WTPM 설계와 구현)

  • Kim, Man-pil;Tcha, Hong-jun
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.2 no.1 / pp.73-79 / 2009
  • This paper analyzes the semantics of files recorded in the user's computer file system, using C++, an object-oriented programming language that supports modular programs. Based on this analysis, it designs a scheme that clusters filenames by their semantics using a thesaurus, for user convenience. WTPM groups user-written files into clusters using the semantic structure of the thesaurus and reserved words. The WTPM process is designed with a mashup structure for displaying file icons and is implemented with an automatic classification algorithm.

Generating FE Mesh Automatically from STL File Model (STL 파일 모델로부터 유한 요소망 자동 생성)

  • Park, Jung-Min;Kwon, Ki-Youn;Lee, Byung-Chai;Chae, Soo-Won
    • Transactions of the Korean Society of Mechanical Engineers A / v.31 no.7 s.262 / pp.739-746 / 2007
  • Models in STL files are now widely used in reverse engineering processes, CAD systems, and analysis systems. However, these models have poor geometric quality and consist only of triangles, so they are not suitable for finite element analysis. This paper presents a general method that generates a finite element mesh from STL file models. Given a triangular mesh, the method evaluates the triangles and groups them into clusters. The clusters are then merged according to several geometric indices. After merging, the method applies a plane meshing algorithm, based on the domain decomposition method, to each cluster, and the resulting plane mesh is projected onto the original set of triangles. Because the algorithm uses general methods to generate the plane mesh, it can produce both triangular and quadrilateral meshes, unlike previous research. Several mechanical part models are used to show the validity of the proposed method.

User modeling based on fuzzy category and interest for web usage mining

  • Lee, Si-Hun;Lee, Jee-Hyong
    • International Journal of Fuzzy Logic and Intelligent Systems / v.5 no.1 / pp.88-93 / 2005
  • Web usage mining is a research field concerned with discovering potentially useful and valuable information in web log files. A web log file is simply a list of the pages that users access, so it is not easy to determine a user's current field of interest from it. This paper presents a web usage mining method for finding users' current interests based on fuzzy categories. We consider not only how many times a user visits pages but also when the visits occur. We describe a user's current interest as a fuzzy degree of interest in each category. Based on fuzzy categories and fuzzy interest degrees, we also propose a method for clustering users according to their interests for user modeling. For user clustering, we define a category vector space. Experiments show that our method properly reflects both the timing and the number of users' web visits.

Hybrid Channel Model in Parallel File System (병렬 파일 시스템에서의 하이브리드 채널 모델)

  • Lee, Yoon-Young;Hwangbo, Jun-Hyung;Seo, Dae-Wha
    • The KIPS Transactions:PartA / v.10A no.1 / pp.25-34 / 2003
  • A parallel file system relieves the I/O bottleneck by storing a file in a distributed manner and reading it in parallel, exchanging messages among multiple computers connected by high-speed networks. However, existing systems do not consider message characteristics, and their performance suffers as a result. Accordingly, this study proposes the Hybrid Channel Model (HCM) as a message-management method, in which the messages of a parallel file system are classified by their characteristics into control messages and file data blocks, and the communication channel is divided into a message channel and a data channel. The message channel transfers control messages reliably over TCP/IP, while the data channel, implemented with the Virtual Interface Architecture (VIA), transfers file data blocks at high speed. In tests, the proposed parallel file system implemented with HCM exhibited considerably improved performance.
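
The routing idea behind the two channels can be sketched as a small dispatcher. This is an illustrative sketch only, not the authors' implementation. The class names (`Kind`, `Message`, `Channel`, `HybridDispatcher`) are hypothetical, and the channels are in-memory stand-ins that record what they are given instead of using real TCP/IP or VIA transports.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List


class Kind(Enum):
    CONTROL = auto()     # small control message
    DATA_BLOCK = auto()  # bulk file data block


@dataclass
class Message:
    kind: Kind
    payload: bytes


@dataclass
class Channel:
    """In-memory stand-in for a transport; it only records what was sent."""
    name: str
    sent: List[Message] = field(default_factory=list)

    def send(self, msg: Message) -> None:
        self.sent.append(msg)


class HybridDispatcher:
    """Routes each message to a channel according to its kind."""

    def __init__(self, message_channel: Channel, data_channel: Channel) -> None:
        self.message_channel = message_channel
        self.data_channel = data_channel

    def dispatch(self, msg: Message) -> str:
        if msg.kind is Kind.DATA_BLOCK:
            channel = self.data_channel
        else:
            channel = self.message_channel
        channel.send(msg)
        return channel.name


if __name__ == "__main__":
    dispatcher = HybridDispatcher(Channel("message"), Channel("data"))
    print(dispatcher.dispatch(Message(Kind.CONTROL, b"open file")))
    print(dispatcher.dispatch(Message(Kind.DATA_BLOCK, bytes(4096))))
```

Keeping the classification in one place means the transport behind each channel could be changed without touching the code that produces messages.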

Detection and Recovery of Failure Node in SAN-based Cluster Shared File System $SANique^{TM}$ (SAN 기반 클러스터 공유 파일 시스템 $SANique^{TM}$의 오류 노드 탐지 및 회복 기법)

  • Lee, Kyu-Woong
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.12 / pp.2609-2617 / 2009
  • This paper gives a design overview of the shared file system $SANique^{TM}$ and proposes a method for detecting failed nodes together with a recovery management algorithm. We also describe the characteristics and system architecture of a SAN-based shared file system. To provide uninterrupted service, detection and recovery methods are proposed for all possible system failures and natural disasters. The various kinds of system failures and disasters are characterized, and detection and recovery methods are then proposed for each disconnected group of computing nodes.