• Title/Summary/Keyword: Data Tree

Classification of Land Cover over the Korean Peninsula Using Polar Orbiting Meteorological Satellite Data (극궤도 기상위성 자료를 이용한 한반도의 지면피복 분류)

  • Suh, Myoung-Seok;Kwak, Chong-Heum;Kim, Hee-Soo;Kim, Maeng-Ki
    • Journal of the Korean earth science society / v.22 no.2 / pp.138-146 / 2001
  • The land cover over the Korean peninsula was classified using multi-temporal NOAA/AVHRR (Advanced Very High Resolution Radiometer) data. Four types of phenological data derived from the 10-day composited NDVI (Normalized Difference Vegetation Index), the maximum and annual mean land surface temperature, and topographical data were used, both to reduce the data volume and to increase classification accuracy. A self-organizing feature map (SOFM), a kind of neural network technique, was used to cluster the satellite data, and a decision tree was used to classify the clusters. When the classification results were compared with the NDVI time series and other available ground truth data, urban areas, agricultural areas, deciduous trees, and evergreen trees were clearly classified.
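
The NDVI compositing mentioned in this abstract can be illustrated with a minimal sketch. NDVI is the standard ratio (NIR - RED) / (NIR + RED). The use of maximum-value compositing over the 10-day window is an assumption here (the abstract only says "10-day composited"), and the function names and reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for a single pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # assumption: map zero-reflectance pixels to 0
    return (nir - red) / total


def max_value_composite(daily_ndvi):
    """Collapse a window of daily NDVI values into one composite value.

    Assumption: the composite keeps the maximum, which tends to suppress
    cloud-contaminated (low NDVI) observations.
    """
    return max(daily_ndvi)


# Hypothetical (nir, red) reflectances for one pixel within a 10-day window.
window = [(0.40, 0.10), (0.35, 0.15), (0.50, 0.10)]
daily = [ndvi(nir, red) for nir, red in window]
composite = max_value_composite(daily)
```

Here `composite` is the largest of the daily NDVI values for the pixel; a real workflow would apply the same step to every pixel of each 10-day period.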

Multivariate quantile regression tree (다변량 분위수 회귀나무 모형에 대한 연구)

  • Kim, Jaeoh;Cho, HyungJun;Bang, Sungwan
    • Journal of the Korean Data and Information Science Society / v.28 no.3 / pp.533-545 / 2017
  • Quantile regression models provide a variety of useful statistical information by estimating the conditional quantile function of the response variable. However, the traditional linear quantile regression model can lead to distorted and incorrect results when analysing real data with a nonlinear relationship between the explanatory variables and the response variables. Furthermore, as the complexity of the data increases, multiple response variables need to be analysed simultaneously, with more sophisticated interpretation. For these reasons, we propose a multivariate quantile regression tree model. We suggest a new split variable selection algorithm for the multivariate regression tree model; it selects the split variable more accurately than the previous method, without significant selection bias. We investigate the performance of the proposed method in both simulation and real data studies.
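
As background for the quantile estimation described in this abstract, the following is a minimal sketch of the standard check (pinball) loss, whose minimizer over a sample is the sample quantile. It is not the authors' split variable selection algorithm; the function names and the toy data are hypothetical.

```python
def pinball_loss(y, pred, tau):
    """Check loss for one observation at quantile level tau (0 < tau < 1)."""
    diff = y - pred
    return tau * diff if diff >= 0 else (tau - 1) * diff


def total_loss(ys, pred, tau):
    """Sum of check losses when every observation gets the same prediction."""
    return sum(pinball_loss(y, pred, tau) for y in ys)


def best_constant(ys, tau):
    """Brute-force the observed value that minimizes the total check loss.

    A tree leaf that predicts a single quantile would store a value like this.
    """
    return min(sorted(ys), key=lambda c: total_loss(ys, c, tau))


# Hypothetical responses with one large outlier.
ys = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(ys, 0.5)  # median, barely affected by the outlier
upper_fit = best_constant(ys, 0.9)   # upper quantile, pulled to the tail
```

Changing `tau` changes which part of the conditional distribution a leaf describes, which is why a quantile tree can report several quantiles for the same region of the predictor space.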

A Study of Data Mining Methodology for Effective Analysis of False Alarm Event on Mechanical Security System (기계경비시스템 오경보 이벤트 분석을 위한 데이터마이닝 기법 연구)

  • Kim, Jong-Min;Choi, Kyong-Ho;Lee, Dong-Hwi
    • Convergence Security Journal / v.12 no.2 / pp.61-70 / 2012
  • The objective of this study is to identify the most suitable data mining approach for effective analysis of false alarm events in mechanical security systems. To this end, the study investigates the causes of false alarms and suggests data conversion and analysis methods for applying several algorithms of WEKA, a data mining program, based on statistical data on the number of movement cases caused by false alarms, the false alarm rate, and the causes of false alarms. The analysis methods are used to estimate false alarms and to set more effective responses to them by applying several algorithms. To find data suitable for effective analysis of false alarm events in mechanical security systems, this study uses the Decision Tree, Naive Bayes, BayesNet, Apriori, and J48Tree algorithms, and applies the algorithm that yields the highest value.

Prediction Model for Unpaid Customers Using Big Data (빅 데이터 기반의 체납 수용가 예측 모델)

  • Jeong, Jaean;Lee, Kyouhwan;Jung, Hoekyung
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.7 / pp.827-833 / 2020
  • In this paper, to reduce the unpaid rate of local governments, the internal data elements affecting arrears in Water-INFOS are identified through interviews with meter readers in certain local governments, and candidate data affecting arrears are derived from national statistical data. The influence of each independent variable on the dependent variable was examined through information gain, a measure based on the disorder of the dependent variable in the data set. We also compared the prediction rates of the decision tree and logistic regression using n-fold cross-validation. The results confirmed that the decision tree can find customer payment patterns more accurately than logistic regression. In developing an analysis algorithm model using machine learning, the optimal values of two environmental variables, the minimum number of data and the maximum purity, which directly affect the complexity and accuracy of the decision tree, are derived to improve the accuracy of the algorithm.
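
The information gain criterion mentioned in this abstract can be sketched as the drop in entropy of the dependent variable after splitting on one independent variable. This is a generic illustration, not the paper's implementation; the feature names (`late_before`, `region`), the target name (`unpaid`), and the four records are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, feature, target):
    """Entropy of the target minus its weighted entropy after the split."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical customer records; "unpaid" is the dependent variable.
rows = [
    {"late_before": "yes", "region": "A", "unpaid": "yes"},
    {"late_before": "yes", "region": "B", "unpaid": "yes"},
    {"late_before": "no", "region": "A", "unpaid": "no"},
    {"late_before": "no", "region": "B", "unpaid": "no"},
]
gain_late = information_gain(rows, "late_before", "unpaid")
gain_region = information_gain(rows, "region", "unpaid")
```

In this toy data `late_before` separates the classes perfectly while `region` carries no information, so a decision tree that ranks variables by information gain would split on `late_before` first.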

Design and Implementation of a Trajectory-based Index Structure for Moving Objects on a Spatial Network (공간 네트워크상의 이동객체를 위한 궤적기반 색인구조의 설계 및 구현)

  • Um, Jung-Ho;Chang, Jae-Woo
    • Journal of KIISE:Databases / v.35 no.2 / pp.169-181 / 2008
  • Because moving objects usually move on spatial networks, efficient trajectory index structures are required to achieve good retrieval performance on their trajectories. However, there has been little research on trajectory index structures for spatial networks beyond structures such as the FNR-tree and MON-tree. Moreover, because the FNR-tree and MON-tree store data by the unit of a moving object's segment, they cannot support the whole trajectory of a moving object. In this paper, we propose an efficient trajectory index structure for moving objects, named the Trajectory of Moving objects on Network Tree (TMN Tree). To this end, we divide moving object data into spatial and temporal attributes and preserve each moving object's trajectory. We then design an index structure that supports not only range queries but also trajectory queries. In addition, we divide user queries into spatio-temporal area based trajectory queries, similar-trajectory queries, and k-nearest neighbor queries, and propose query processing algorithms to support them. Finally, we show that our trajectory index structure outperforms existing tree structures such as the FNR-tree and MON-tree.

Symmetric Tree Replication Protocol for Efficient Distributed Storage System (효율적인 분산 저장 시스템을 위한 대칭 트리 복제 프로토콜)

  • 최성춘;윤희용;이강신;이호재
    • Journal of KIISE:Computer Systems and Theory / v.31 no.9 / pp.503-513 / 2004
  • In large distributed systems, replication of data and services is needed to decrease communication cost, increase availability, and avoid a single-server bottleneck. The tree quorum protocol is a representative replication protocol that exploits a logical structure. However, it allows a low read cost only in the best case, and the number of replicas increases exponentially as the number of levels grows. In this paper, we therefore propose a new replication protocol, called the symmetric tree protocol, which efficiently solves this problem. The proposed symmetric tree protocol also requires a much smaller read cost than previous protocols. We conduct cost and availability analysis of the protocols, and the proposed protocol displays read availability comparable to that of the tree protocol while using a much smaller number of nodes. The symmetric tree protocol also has a much shorter response time than the logarithmic protocol.
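
To make the "low read cost only in the best case" remark concrete, here is a minimal sketch of the read rule commonly described for the baseline tree quorum protocol: read the root if it is available, otherwise obtain read quorums from a majority of its children, recursively. This is not the symmetric tree protocol proposed in the paper; the 13-node ternary tree, the node numbering, and the function name are hypothetical.

```python
def read_quorum(node, children, up):
    """Return a set of replicas forming a read quorum, or None on failure.

    children maps a node to its list of child nodes; up is the set of
    replicas that are currently reachable.
    """
    if node in up:
        return {node}  # an available node serves the read for its subtree
    kids = children.get(node, [])
    if not kids:
        return None  # an unavailable leaf cannot be replaced
    needed = len(kids) // 2 + 1  # majority of the children
    quorum, found = set(), 0
    for kid in kids:
        sub = read_quorum(kid, children, up)
        if sub is not None:
            quorum |= sub
            found += 1
            if found == needed:
                return quorum
    return None


# Hypothetical ternary tree: root 0, inner nodes 1-3, leaves 4-12.
children = {0: [1, 2, 3], 1: [4, 5, 6], 2: [7, 8, 9], 3: [10, 11, 12]}
everyone = set(range(13))

best_case = read_quorum(0, children, everyone)               # only the root
root_down = read_quorum(0, children, everyone - {0})         # two inner nodes
two_down = read_quorum(0, children, everyone - {0, 1})       # mixes in leaves
```

The quorum grows from one replica to several as nodes near the root fail, which is the cost behaviour the abstract contrasts with the proposed protocol.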

Short-term demand forecasting Using Data Mining Method (데이터마이닝을 이용한 단기부하예측)

  • Choi, Sang-Yule;Kim, Hyoung-Joong
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers / v.21 no.10 / pp.126-133 / 2007
  • This paper proposes an information technology based data mining approach to forecasting short-term power demand. Time-series analysis has been applied to power demand forecasting, but it requires both heavy computation and a large amount of coefficient data, which makes fast analysis difficult. To avoid this time-consuming process, the authors use a widely and easily available information technology based data mining technique to analyze the patterns of ordinary days and special days (holidays, etc.). The technique consists of two steps: constructing a decision tree, and then estimating and forecasting power flow using decision tree analysis. To validate its efficiency, the authors compare the estimated demand with the actual demand from the Korea Power Exchange.

Semi-supervised Model for Fault Prediction using Tree Methods (트리 기법을 사용하는 세미감독형 결함 예측 모델)

  • Hong, Euyseok
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.4 / pp.107-113 / 2020
  • A number of studies have been conducted on predicting software faults, but most of them have used supervised models trained on labeled data. Very few studies have addressed unsupervised models, which use only unlabeled data, or semi-supervised models, which use plentiful unlabeled data and only a small amount of labeled data. In this paper, we produced new semi-supervised models using tree algorithms in the self-training technique. In the model performance evaluation experiment, the newly created tree models performed better than the existing models, and CollectiveWoods, in particular, outperformed the other models. It also showed very stable performance even with very few labeled data.

Changes of Antioxidant Capacity, Total Phenolics, and Vitamin C Contents During Rubus coreanus Fruit Ripening

  • Park, Young-Ki;Kim, Sea-Hyun;Choi, Sun-Ha;Han, Jin-Gyu;Chung, Hun-Gwan
    • Food Science and Biotechnology / v.17 no.2 / pp.251-256 / 2008
  • Changes in the antioxidant activity of Rubus coreanus fruit from 3 clones (S13, S14, and S16), which were selected from different sites, were studied at different ripening stages. Antioxidant activities (free radical scavenging activity and reducing power) were determined, and their relationships to total phenolic content and ascorbic acid were analyzed. The highest free radical scavenging activities of the 3 clones (S13, S14, and S16) were 79.39, 75.80, and 81.16% at $125\;{\mu}g/mL$, respectively. In general, the antioxidant activity and the related parameters, including total phenolic content and vitamin C content, decreased during fruit ripening. Total phenolic contents of the R. coreanus fruits (S13, S14, and S16) were correlated with free radical scavenging activity ($R^2=0.8114$, 0.9186, and 0.9714). These results improve knowledge of the effect of ripening on antioxidant activity and the contents of related compounds, which could help to establish the optimum R. coreanus fruit harvest data for various usages.

Two-Stage Decision Tree Analysis for Diagnosis of Personal Sasang Constitution Medicine Type (사상체질 판별을 위한 2단계 의사결정 나무 분석)

  • Jin, Hee-Jeong;Lee, Hae-Jung;Kim, Myoung-Geun;Kim, Hong-Gie;Kim, Jong-Yeol
    • Journal of Sasang Constitutional Medicine / v.22 no.3 / pp.87-97 / 2010
  • 1. Objectives: In Sasang constitutional medicine (SCM), a person's Sasang constitution must be determined accurately before any Sasang treatment. The purpose of this study is to develop an objective method for classifying Sasang constitution. 2. Methods: We collected samples from 5 centers where SCM is practiced and applied two-stage decision tree analysis to these samples. The collected data were from subjects whose response to herbal medicine was confirmed according to Sasang constitution. 3. Results: The two-stage decision tree model shows higher classification power than a simple decision tree model. This study also suggests that gender must be considered in the first stage to improve the accuracy of classification. 4. Conclusions: We identified important factors for classifying Sasang constitutions through two-stage decision tree analysis. The two-stage decision tree model shows higher classification power than a simple decision tree model.