• Title/Summary/Keyword: Data Tree

Search results: 3,320

Suffix Tree Constructing Algorithm for Large DNA Sequences Analysis (대용량 DNA서열 처리를 위한 서픽스 트리 생성 알고리즘의 개발)

  • Choi, Hae-Won
    • Journal of Korea Society of Industrial Information Systems / v.15 no.1 / pp.37-46 / 2010
  • A suffix tree is an efficient data structure that exposes the internal structure of a string and allows efficient solutions to a wide range of complex string problems, particularly in computational biology. However, as biological data explodes in volume, it becomes impossible to construct suffix trees in main memory, so an efficient technique is needed to construct them in secondary storage. In this paper, we present a method for constructing a suffix tree on disk for a large set of DNA strings using a new index scheme. We also show a typical application example using the on-disk suffix tree.
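A minimal in-memory suffix trie (the uncompressed cousin of a suffix tree) illustrates the idea of indexing every suffix of a DNA string for fast substring queries; the paper's disk-based index scheme is not reproduced here, and the example string is made up.

```python
class Node:
    def __init__(self):
        self.children = {}

def build_suffix_trie(s):
    """Insert every suffix of s (terminated by '$') into a trie."""
    s += "$"
    root = Node()
    for i in range(len(s)):
        node = root
        for ch in s[i:]:
            node = node.children.setdefault(ch, Node())
    return root

def contains(trie, pattern):
    """True iff pattern occurs as a substring of the indexed string."""
    node = trie
    for ch in pattern:
        if ch not in node.children:
            return False
        node = node.children[ch]
    return True

trie = build_suffix_trie("GATTACA")
print(contains(trie, "TTA"))  # True
print(contains(trie, "TGA"))  # False
```

A real suffix tree compresses chains of single-child nodes into edges, which is what keeps construction linear in the string length.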

Digital Hologram Watermarking using Quad-tree Fresnelet Transform (Quad-tree Fresnelet 변환을 이용한 디지털 홀로그램 워터마킹)

  • Seo, Young Ho;Koo, Ja Myung;Lee, Yoon Hyuk;Kim, Dong Wook
    • Journal of Korea Society of Digital Industry and Information Management / v.9 no.3 / pp.79-89 / 2013
  • This paper proposes a watermarking scheme to protect the ownership of a digital hologram, an ultra-high value-added content. It performs a pre-defined number of levels of quad-tree Fresnelet transforms and extracts the relationship among the same-positioned blocks as the digital pre-watermark. This relationship exploits two properties of a digital hologram: each hologram pixel retains all the information of the object, and equal-sized partial holograms reconstruct the same object from different viewpoints. A set of private data is then mixed with the pre-watermark, and the result is encrypted by a block cipher algorithm with a private key. Experimental results showed that the proposed scheme is very robust against various malicious and non-malicious attacks. Also, because it extracts the watermark data rather than inserting it, the watermarking process does not harm the original hologram data. It is therefore expected to serve as an invisible and robust watermark for digital holograms.
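The quad-tree decomposition underlying the scheme recursively splits the hologram into four equal blocks per level; a sketch of that splitting step (the Fresnelet transform itself is not reproduced, and the image size is illustrative):

```python
def quad_blocks(x0, y0, w, h, level):
    """Return the (x, y, w, h) blocks of a quad-tree split at the given level."""
    if level == 0:
        return [(x0, y0, w, h)]
    blocks = []
    hw, hh = w // 2, h // 2
    for dx, dy in [(0, 0), (hw, 0), (0, hh), (hw, hh)]:
        blocks += quad_blocks(x0 + dx, y0 + dy, hw, hh, level - 1)
    return blocks

print(len(quad_blocks(0, 0, 256, 256, 2)))  # 16 blocks of 64x64
```

Each level multiplies the block count by four, so level n yields 4^n same-sized partial holograms whose mutual relationships can serve as the pre-watermark.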

CANCER CLASSIFICATION AND PREDICTION USING MULTIVARIATE ANALYSIS

  • Shon, Ho-Sun;Lee, Heon-Gyu;Ryu, Keun-Ho
    • Proceedings of the KSRS Conference / v.2 / pp.706-709 / 2006
  • Cancer is one of the major causes of death; however, the survival rate can be increased if it is discovered at an early stage and treated in time. According to the 2002 statistics of the World Health Organization, breast cancer was the most prevalent cancer among women worldwide, and it accounts for 16.8% of all cancers afflicting Korean women today. To classify breast cancer as benign or malignant, this study applied discriminant analysis and the decision tree method of data mining to breast cancer data disclosed on the web. Discriminant analysis is a statistical method that seeks discriminant criteria and discriminant functions to separate population groups on the basis of observation values obtained from two or more groups, and uses those values to assign a new observation to a group. A decision tree analyzes past records of data to find the patterns in them, i.e., the combinations of attributes characteristic of each class, and builds a tree-shaped classification model. Through this type of analysis, systematic information on the factors that cause breast cancer can be obtained in advance, helping to prevent the risk of recurrence after surgery.
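The core step of decision-tree induction described above is picking the attribute/threshold split that best separates the two classes. A one-level tree ("decision stump") shows the idea; the feature values and labels below are invented for illustration, not the paper's data.

```python
def best_stump(X, y):
    """Return the (feature_index, threshold) pair minimizing misclassifications."""
    best = (None, None, len(y) + 1)
    n_features = len(X[0])
    for f in range(n_features):
        for t in sorted({row[f] for row in X}):
            errors = sum((row[f] > t) != label for row, label in zip(X, y))
            errors = min(errors, len(y) - errors)  # allow either branch polarity
            if errors < best[2]:
                best = (f, t, errors)
    return best[0], best[1]

# toy data: [cell size, shape irregularity]; 1 = malignant, 0 = benign
X = [[1, 2], [2, 1], [6, 7], [7, 6]]
y = [0, 0, 1, 1]
print(best_stump(X, y))
```

A full decision tree simply applies this split search recursively to each resulting subset until the leaves are (nearly) pure.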


Application Cases of Risk Assessment for British Railtrack System (영국철도시스템에 적용된 리스크평가 사례)

  • Lee, Dong-Ha;Jeong, Gwang-Tae
    • Journal of the Ergonomics Society of Korea / v.22 no.1 / pp.81-94 / 2003
  • The British railway safety research group has developed a risk assessment model for the railway infrastructure and major railway accidents. The major hazardous factors of the railway infrastructure were identified and classified in the model. The frequencies of critical top events were predicted by fault tree analysis, using failure data of railway system components and ratings by railway maintenance experts, and the consequences of critical top events were predicted by event tree analysis. The model classifies the losses from railway accidents into personal, commercial and environmental damages, and classifies 110 hazardous events into three categories: train accidents, movement accidents and non-movement accidents. The risk assessment model of the British railway system has been designed to take full account of both high-frequency, low-consequence events (events occurring routinely, for which there is a significant quantity of recorded data) and low-frequency, high-consequence events (events occurring rarely, for which there is little recorded data). The results for each hazardous event are presented in terms of the frequency of occurrence (number of events per year) and the risk (number of equivalent fatalities per year).
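Quantitatively, fault tree analysis propagates basic-event probabilities upward through AND/OR gates to estimate a top-event frequency. A minimal sketch, assuming independent basic events; the gate structure and probabilities below are hypothetical, not the British model's values.

```python
from functools import reduce

def gate_and(probs):
    """Probability that all independent input events occur."""
    return reduce(lambda a, b: a * b, probs, 1.0)

def gate_or(probs):
    """Probability that at least one independent input event occurs."""
    return 1.0 - reduce(lambda a, b: a * (1.0 - b), probs, 1.0)

# hypothetical top event: (driver error OR signal fault) AND protection-system failure
p_top = gate_and([gate_or([1e-3, 5e-4]), 1e-2])
print(f"top-event probability per demand ~ {p_top:.2e}")
```

Event tree analysis then runs in the opposite direction, branching from the top event through success/failure of each mitigation to weight the consequence of every outcome.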

Protection of a Multicast Connection Request in an Elastic Optical Network Using Shared Protection

  • BODJRE, Aka Hugues Felix;ADEPO, Joel;COULIBALY, Adama;BABRI, Michel
    • International Journal of Computer Science & Network Security / v.21 no.1 / pp.119-124 / 2021
  • Elastic Optical Networks (EONs) help meet the high demand for bandwidth caused by the growing number of Internet users and the explosion of multicast applications. To support multicast applications, the network operator computes a tree-shaped path, which is a set of optical channels. The bandwidth demand on an optical channel is generally so large that a single fiber failure could cause a serious interruption in data transmission and a huge loss of data. To avoid this, the tree-shaped path of a multicast connection may be protected. Several works have proposed methods to do so, but these methods may duplicate some resources after recovery from a link failure, which can lead to inefficient use of network resources. Our work proposes a protection method that eliminates the link causing the duplication, so that the final backup structure after a link failure is a tree. Evaluations and analyses show that our method uses fewer backup resources than existing methods for protecting a multicast connection.
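The repair idea, that splicing a backup path into the surviving tree must not duplicate links or close cycles, can be sketched with a union-find pass over the edges. This is an illustrative reconstruction under assumed semantics, not the paper's algorithm, and the topology is made up.

```python
def repair_tree(surviving_edges, backup_edges):
    """Merge backup edges into the surviving tree, skipping any edge that
    would close a cycle (the duplicated resource the method eliminates)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    kept = []
    for u, v in list(surviving_edges) + list(backup_edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            kept.append((u, v))
    return kept

# tree A-B-C loses link B-C; backup path B-D, D-C restores connectivity,
# while a redundant backup edge A-C would close a cycle and is dropped
print(repair_tree([("A", "B")], [("B", "D"), ("D", "C"), ("A", "C")]))
```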

Efficient Mining of Frequent Itemsets in a Sparse Data Set (희소 데이터 집합에서 효율적인 빈발 항목집합 탐사 기법)

  • Park In-Chang;Chang Joong-Hyuk;Lee Won-Suk
    • The KIPS Transactions:PartD / v.12D no.6 s.102 / pp.817-828 / 2005
  • The main research problems in mining frequent itemsets are reducing the memory usage and processing time of the mining process. Most previous algorithms for finding frequent itemsets are based on the Apriori property and are multi-scan algorithms, and their processing time increases greatly with the length of the maximal frequent itemset. To overcome this drawback, other approaches have been actively proposed to reduce processing time; however, they are not efficient on a sparse data set. This paper proposes an efficient mining algorithm for finding frequent itemsets. A novel tree structure, called an $L_2$-tree, is proposed, together with an efficient mining algorithm using it, called the $L_2$-traverse algorithm. An $L_2$-tree is constructed from $L_2$, i.e., the set of frequent itemsets of size 2, and the $L_2$-traverse algorithm obtains its mining result in a short time by traversing the $L_2$-tree once. To reduce processing time further, this paper also proposes an optimized algorithm, $C_3$-traverse, which removes in advance any itemset in $L_2$ that cannot belong to a frequent itemset of size 3. Various experiments verified that the proposed algorithms are efficient on a sparse data set.
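The starting point of the approach, computing $L_2$ (the frequent itemsets of size 2) in a single pass over the transactions, can be sketched as follows; the transactions and support threshold are illustrative, and the $L_2$-tree construction and traversal themselves are not reproduced.

```python
from itertools import combinations
from collections import Counter

def frequent_pairs(transactions, min_support):
    """Count all item pairs in one scan and keep those meeting min_support."""
    counts = Counter()
    for t in transactions:
        for pair in combinations(sorted(set(t)), 2):
            counts[pair] += 1
    return {p for p, c in counts.items() if c >= min_support}

txns = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
print(sorted(frequent_pairs(txns, 2)))  # [('a', 'b'), ('a', 'c')]
```

On a sparse data set few pairs reach the threshold, so $L_2$ (and hence the $L_2$-tree) stays small, which is what makes a single traversal cheap.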

Lossless Image Compression Using Block-Adaptive Context Tree Weighting (블록 적응적인 Context Tree Weighting을 이용한 무손실 영상 압축)

  • Oh, Eun-ju;Cho, Hyun-ji;Yoo, Hoon
    • Journal of Internet Computing and Services / v.21 no.4 / pp.43-49 / 2020
  • This paper proposes a lossless image compression method based on arithmetic coding using block-adaptive Context Tree Weighting (CTW). The CTW method predicts and compresses the input data bit by bit, and it can achieve a desirable coding distribution for tree sources with an unknown model and unknown parameters. This paper presents a method to improve the compression rate of image data, especially aerial and satellite images that require lossless compression. Aerial and satellite images are highly valuable, and they are much larger than common images, so existing methods have difficulty compressing them. For these reasons, this paper reports an experiment showing a higher compression rate when applying the CTW method to divided images than to non-divided images. The experimental results indicate that the proposed method is more effective when compressing the divided images.
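The bit-by-bit prediction inside CTW rests on the Krichevsky-Trofimov (KT) estimator, which assigns a sequential probability to a binary string from its running zero/one counts; a minimal sketch with a made-up bit string:

```python
import math

def kt_probability(bits):
    """Sequential KT probability assigned to a binary string."""
    p, zeros, ones = 1.0, 0, 0
    for b in bits:
        if b == 0:
            p *= (zeros + 0.5) / (zeros + ones + 1)
            zeros += 1
        else:
            p *= (ones + 0.5) / (zeros + ones + 1)
            ones += 1
    return p

bits = [0, 0, 0, 1, 0, 0, 0, 0]
print(-math.log2(kt_probability(bits)), "bits")  # fewer than 8 bits for this skewed string
```

CTW then mixes such estimates over all context-tree models up to a given depth, and an arithmetic coder turns the resulting probabilities into the compressed bit stream.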

A Study on the Design of Tolerance for Process Parameter using Decision Tree and Loss Function (의사결정나무와 손실함수를 이용한 공정파라미터 허용차 설계에 관한 연구)

  • Kim, Yong-Jun;Chung, Young-Bae
    • Journal of Korean Society of Industrial and Systems Engineering / v.39 no.1 / pp.123-129 / 2016
  • In the manufacturing industry, thousands of quality characteristics are measured each day because process systems have been automated through the development of computers and the improvement of techniques, and the process is monitored in a database in real time. In particular, data from the design step of the process contribute to products that meet customer requirements when useful information is extracted from them and reflected in the product design. In this study, first, the characteristics in the design-step data and the variables affecting them were analyzed by decision tree to find the relations between explanatory and target variables. Second, tolerances for the continuous variables with the primary influence on the target variable were derived by applying the decision tree algorithm C4.5. Finally, the target variable, loss, was calculated and analyzed with a Taguchi loss function. This paper compares the general method, in which the values of continuous explanatory variables are used intact rather than transformed into discrete values, with a new method in which the values of continuous explanatory variables are divided into three categories. As a result, the tolerances obtained from the new method were more effective in decreasing the target variable, loss, than those from the general method. In addition, tolerance levels were calculated for the continuous explanatory variables chosen as major variables. Further research should develop a systematic method using the decision tree of data mining to categorize continuous variables under various loss-function scenarios.
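The Taguchi loss referred to above is, in its standard nominal-is-best form, the quadratic loss L(y) = k(y - m)^2, where m is the target value and k a cost coefficient; the parameter values and measurements below are illustrative, not the study's data.

```python
def taguchi_loss(y, target, k):
    """Quadratic quality loss: cost grows with squared deviation from target."""
    return k * (y - target) ** 2

# average loss over measured values of a hypothetical process parameter
values = [9.8, 10.1, 10.3, 9.9]
avg = sum(taguchi_loss(v, target=10.0, k=50.0) for v in values) / len(values)
print(round(avg, 3))
```

Tightening a tolerance shrinks the admissible deviations (y - m) and therefore the expected loss, which is how the derived tolerances are scored against each other.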

A Design and Implementation for Dynamic Relocate Algorithm Using the Binary Tree Structure (이진트리구조를 이용한 동적 재배치 알고리즘 설계 및 구현)

  • 최강희
    • Journal of the Korea Computer Industry Society / v.2 no.6 / pp.827-836 / 2001
  • In computer systems, data is represented by file structures, but as files grow larger they become hard to manage and transmit, so in recent years many researchers have developed new data compression algorithms. We introduce a new dynamic compression technique that makes up for the weaknesses of Huffman's. The Huffman compression technique has two weaknesses: first, it needs two reading passes, one for acquiring character frequencies and the other for the actual compression; second, its compression rate is lowered by the need to store the tree information. These weaknesses can be solved by our new Dynamic Relocatable Method, which reduces the reading passes by relocating the data file into a dynamic form and then stores the tree information in a pipeline structure.
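For reference, a minimal static Huffman coder shows the two-pass structure the paper seeks to avoid: pass 1 counts symbol frequencies, pass 2 encodes. The dynamic relocation scheme itself is not reproduced here.

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Build prefix-free codes from symbol frequencies (pass 1)."""
    freq = Counter(data)
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, i, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, i, merged))
    return heap[0][2]

codes = huffman_codes("abracadabra")
encoded = "".join(codes[ch] for ch in "abracadabra")  # pass 2: encode
print(codes, len(encoded), "bits")
```

Adaptive schemes collapse the two passes into one by updating the code tree as symbols arrive, at the cost of keeping encoder and decoder trees synchronized.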


Classification of Land Cover over the Korean Peninsula Using Polar Orbiting Meteorological Satellite Data (극궤도 기상위성 자료를 이용한 한반도의 지면피복 분류)

  • Suh, Myoung-Seok;Kwak, Chong-Heum;Kim, Hee-Soo;Kim, Maeng-Ki
    • Journal of the Korean earth science society / v.22 no.2 / pp.138-146 / 2001
  • The land cover over the Korean peninsula was classified using multi-temporal NOAA/AVHRR (Advanced Very High Resolution Radiometer) data. Four types of phenological data derived from the 10-day composited NDVI (Normalized Difference Vegetation Index), the maximum and annual mean land surface temperature, and topographical data were used, not only reducing the data volume but also increasing the accuracy of the classification. A self-organizing feature map (SOFM), a kind of neural network, was used for clustering the satellite data, and a decision tree was used for classifying the clusters. When the classification results were compared with the time series of NDVI and other available ground-truth data, the urban, agricultural, deciduous-tree and evergreen-tree areas were clearly distinguished.
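NDVI, the vegetation index underlying the phenological inputs above, is computed per pixel from the near-infrared and red reflectances as (NIR - RED) / (NIR + RED); the reflectance values below are illustrative.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    return (nir - red) / (nir + red)

print(round(ndvi(0.5, 0.1), 3))  # dense vegetation: strongly positive
print(round(ndvi(0.1, 0.2), 3))  # bare soil or water: near zero or negative
```

Because healthy vegetation reflects strongly in NIR and absorbs red light, NDVI values near 1 indicate dense vegetation, while values near or below 0 indicate bare surfaces or water.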
