• Title/Abstract/Keywords: classification trees

313 results (processed in 0.029 seconds)

Decision Trees For Multiple Abstraction Level of Data

  • 정민아;이도현
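    • Korean Institute of Information Scientists and Engineers: Conference Proceedings
    • /
    • Proceedings of the Korean Institute of Information Scientists and Engineers 2001 Spring Conference, Vol.28 No.1 (B)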
    • /
    • pp.82-84
    • /
    • 2001
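  • Data classification is the task of determining the class membership of objects that have not yet been classified, based on an analysis of an already classified group of objects, i.e., the training data. Among the many classification models proposed to date, the decision tree is especially useful for exploratory data mining because it has a form that humans find easy to understand. This paper introduces the multiple abstraction level problem in decision tree classification and proposes a practical method for handling it. After showing that forcing all values to the same abstraction level cannot solve the multiple abstraction level problem of data, we present a method that makes use of the existing generalization and specialization relationships among data values while keeping them intact.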


Automatic Classification of Web Documents Using the Document Fingerprint Method

  • 김진화
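    • Korean Operations Research and Management Science Society: Conference Proceedings
    • /
    • Korean Operations Research and Management Science Society 2004 Fall Conference and General Meeting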
    • /
    • pp.407-429
    • /
    • 2004
  • As documents on the web are increasing explosively due to the rapid development of electronic documents, an efficient system for classifying documents automatically is required. In this study, a new document classification method, called the Document Finger Print Method, is suggested to classify web documents automatically and efficiently. The performance of the suggested method is evaluated along with other existing methods such as a keyword-based method, a weighted keyword-based method, neural networks, and decision trees. An experiment is designed with 10 document categories and 59 randomly selected words. The result shows that the suggested algorithm has superior classification performance compared to the other methods. The most important advantage of this method is that it works well without a limit on the number of words in documents.


State Evaluation of RC Bridge Girders by Inductive Case Learning

  • 안승수;김기현;박광림;황진하
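    • Computational Structural Engineering Institute of Korea: Conference Proceedings
    • /
    • Proceedings of the Computational Structural Engineering Institute of Korea 2000 Fall Conference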
    • /
    • pp.159-165
    • /
    • 2000
  • A new state evaluation approach for structural safety is presented in this study. To reduce the subjectivity of each expert's view and judgement, which rest on a limited body of knowledge in the cognitive and inferential process of safety assessment, we introduce an inductive learning method from AI. Inductive learning derives generalizations from experience. A decision tree induction algorithm analyzes the domain knowledge, produces rules via decision trees, and then allows us to determine the classification of an object from case examples. The training set for state evaluation is constructed according to attributes selected from working reports of RC bridge girders.

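
The decision tree induction step described in the abstract above can be illustrated with a minimal sketch. It assumes an ID3-style information-gain criterion, which the abstract does not specify. The attribute names (`crack_width`, `corrosion`), the class labels, and the toy cases are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(cases, attribute, target):
    """Entropy reduction obtained by splitting `cases` on `attribute`."""
    base = entropy([c[target] for c in cases])
    remainder = 0.0
    for value in {c[attribute] for c in cases}:
        subset = [c[target] for c in cases if c[attribute] == value]
        remainder += len(subset) / len(cases) * entropy(subset)
    return base - remainder


def best_attribute(cases, attributes, target):
    """Pick the attribute with the highest information gain (the root split)."""
    return max(attributes, key=lambda a: information_gain(cases, a, target))


# Hypothetical inspection cases; the values are illustrative only.
CASES = [
    {"crack_width": "wide",   "corrosion": "yes", "state": "poor"},
    {"crack_width": "wide",   "corrosion": "no",  "state": "poor"},
    {"crack_width": "narrow", "corrosion": "yes", "state": "good"},
    {"crack_width": "narrow", "corrosion": "no",  "state": "good"},
]

root = best_attribute(CASES, ["crack_width", "corrosion"], "state")
```

In this toy set, splitting on `crack_width` separates the two states completely, so it is chosen as the root; a full induction algorithm would repeat the same selection recursively on each subset.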

Pruning the Boosting Ensemble of Decision Trees

  • Yoon, Young-Joo;Song, Moon-Sup
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 13, No. 2
    • /
    • pp.449-466
    • /
    • 2006
  • We propose to use variable selection methods based on penalized regression for pruning decision tree ensembles. Pruning methods based on LASSO and SCAD are compared with the cluster pruning method. Comparative studies are performed on some artificial datasets and real datasets. According to the results of comparative studies, the proposed methods based on penalized regression reduce the size of boosting ensembles without decreasing accuracy significantly and have better performance than the cluster pruning method. In terms of classification noise, the proposed pruning methods can mitigate the weakness of AdaBoost to some degree.
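
As a rough illustration of pruning by penalized regression, the sketch below regresses the class labels on the base classifiers' predictions with an L1 (LASSO) penalty and keeps only the members whose coefficients are non-zero. This is a generic coordinate-descent LASSO under assumed conventions (labels coded +1/-1, objective (1/2n)·||y − Xβ||² + λ·||β||₁), not the authors' exact formulation, and the toy predictions are hypothetical.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    X[i][j] is the prediction of base classifier j on sample i.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of column j with the partial residual
            norm = 0.0  # squared norm of column j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k]
                                     for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n)
    return beta


# Hypothetical predictions (+1/-1) of three base classifiers on six samples.
y = [1, 1, 1, -1, -1, -1]
X = [
    [1,  1,  1],
    [1, -1,  1],
    [1,  1, -1],
    [-1, -1, -1],
    [-1,  1, -1],
    [-1, -1,  1],
]

beta = lasso_coordinate_descent(X, y, lam=0.1)
kept = [j for j, b in enumerate(beta) if b != 0.0]  # members surviving pruning
```

Here the first classifier reproduces the labels exactly, so the penalty shrinks the other two coefficients to zero and only index 0 is kept. Larger values of `lam` prune more aggressively.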

Study on the ensemble methods with kernel ridge regression

  • Kim, Sun-Hwa;Cho, Dae-Hyeon;Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 23, No. 2
    • /
    • pp.375-383
    • /
    • 2012
  • The purpose of ensemble methods is to increase prediction accuracy by combining many classifiers. Recent studies have shown that random forests and forward stagewise regression achieve good accuracy in classification problems. However, they have large prediction errors near the separation boundary because they use decision trees as base learners. In this study, we use kernel ridge regression instead of decision trees in random forests and boosting. The usefulness of the proposed ensemble methods is shown by simulation results on the prostate cancer and Boston housing data.

A Study on the Groves for making enclosed Village in Rural Human Settlement Circle: Focusing on the Chinan-Gun Region, Jeonbuk

  • 박재철
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • Vol. 26, No. 3
    • /
    • pp.152-161
    • /
    • 1998
  • The purpose of this study was to identify the current state of the surviving groves that enclose villages within the rural human settlement circle. The survey was carried out in the Chinan-Gun region, where traditional elements have been well conserved. Forty-eight village groves were found there through site surveys, literature review, and interviews. Of these, 27 were classified as complementing village groves according to grove character. The survey showed that many had been partially destroyed by development and human use. The results describe the general characteristics, socio-behavioral characteristics, and forest state and vegetation structure of complementing village groves. Length, area, form, type, motive, location, and the relationships among them were analyzed to identify general characteristics. Facilities, human behavior, and ownership were analyzed to identify socio-behavioral characteristics. Dominant species, appearance rate, height, width, density, and biodiversity of upper-layer trees were analyzed to identify forest state and vegetation structure. The interrelations among these factors were analyzed, and the results were compared with previous studies.


Bi-directional Reflectance Effects on Mangrove Classification of IKONOS Multi-angular Images

  • Rubio, M.C.D.;Nadaoka, K.;Paringit, E.C.
    • Korean Society of Remote Sensing: Conference Proceedings
    • /
    • Korean Society of Remote Sensing, 2003, Proceedings of ACRS 2003 ISRS
    • /
    • pp.4-6
    • /
    • 2003
  • Optical signals from an object may vary under different conditions because of differences in light source and sensor position. Knowledge of these variations is necessary to calibrate satellite images and to confirm how the sun and sensor angles influence the spectral signals from the objects. Using high-resolution IKONOS™ multi-angular images, the bi-directional reflectance effects of mangrove trees were observed when three datasets were compared. Bi-directional reflectance may affect the accuracy of interpreting satellite imagery and of obtaining biophysical parameters of mangrove and other vegetation by indirect means.


Boosting neural networks with an application to bankruptcy prediction

  • 김명종;강대기
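    • Korea Institute of Information and Communication Engineering: Conference Proceedings
    • /
    • Korean Institute of Maritime Information and Communication Sciences 2009 Spring Conference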
    • /
    • pp.872-875
    • /
    • 2009
  • In a bankruptcy prediction model, accuracy is one of the crucial performance measures because of its significant economic impact. Ensembles are among the most widely used methods for improving the performance of classification and prediction models. Two popular ensemble methods, Bagging and Boosting, have been applied with great success to various machine learning problems, mostly using decision trees as base classifiers. In this paper, we analyze whether boosted neural networks improve on the performance of traditional neural networks in bankruptcy prediction tasks. Experimental results on Korean firms indicate that boosted neural networks outperform traditional neural networks.

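
The boosting idea in the abstract above can be sketched as a single reweighting round. The abstract does not say which boosting variant was used, so this assumes discrete AdaBoost with labels coded +1/-1 and sample weights that sum to 1. The base classifier could be any model, including a neural network; the function name `adaboost_round` and the toy labels are hypothetical.

```python
import math


def adaboost_round(weights, y_true, y_pred):
    """One discrete-AdaBoost round.

    Returns (alpha, new_weights): the vote weight of the base classifier
    and the renormalized sample weights for the next round.
    Assumes labels are +1 / -1, weights sum to 1, and 0 < weighted error < 1.
    """
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Misclassified samples are up-weighted, correct ones down-weighted.
    raw = [w * math.exp(-alpha * t * p)
           for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(raw)
    return alpha, [r / total for r in raw]


# Hypothetical toy data: +1 = bankrupt, -1 = healthy (illustrative only).
y_true = [1, 1, -1, -1]
y_pred = [1, -1, -1, -1]   # the base classifier misses the second firm
weights = [0.25] * 4       # uniform initial weights

alpha, new_weights = adaboost_round(weights, y_true, y_pred)
```

After the update, the single misclassified firm carries half of the total weight, so the next base classifier is pushed to concentrate on it.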

Characterizations of five heterotrophic nanoflagellates newly recorded in Korea

  • Jeong, Dong Hyuk;Park, Jong Soo
    • Journal of Species Research
    • /
    • Vol. 10, No. 4
    • /
    • pp.356-363
    • /
    • 2021
  • Heterotrophic nanoflagellates (HNFs, 2-20 μm in size) are capable of substantially controlling bacterial abundance in aquatic environments, and microbial taxonomists have long studied ecologically important and abundant HNFs. However, classifications of HNFs based on morphology and 18S rDNA sequencing have rarely been reported in Korea. Here, five HNFs previously reported from non-Korean habitats were isolated from Korean coastal seawater or intertidal sediments for the first time. Light microscopic observations and 18S rDNA phylogenetic trees revealed that the five isolated species were Cafeteria burkhardae strain PH003, Cafeteria graefeae strain UL001, Aplanochytrium minuta (formerly Labyrinthuloides minuta) strain PH004, Neobodo curvifilus strain KM017 (formerly Procryptobia sorokini), and Ancyromonas micra (formerly Planomonas micra) strain IG005. Being morphologically and phylogenetically indistinct from their closest relatives, all isolates from Korea were regarded as the same species as those detected in other countries. Thus, this result expands the known habitat range of the five isolates in natural ecosystems.

Emerging Machine Learning in Wearable Healthcare Sensors

  • Gandha Satria Adi;Inkyu Park
    • Journal of Sensor Science and Technology
    • /
    • Vol. 32, No. 6
    • /
    • pp.378-385
    • /
    • 2023
  • Human biosignals provide essential information for diagnosing diseases such as dementia and Parkinson's disease. Owing to the shortcomings of current clinical assessments, noninvasive solutions are required. Machine learning (ML) on wearable sensor data is a promising method for the real-time monitoring and early detection of abnormalities. ML facilitates disease identification, severity measurement, and remote rehabilitation by providing continuous feedback. In the context of wearable sensor technology, ML involves training on observed data for tasks such as classification and regression with applications in clinical metrics. Although supervised ML presents challenges in clinical settings, unsupervised learning, which focuses on tasks such as cluster identification and anomaly detection, has emerged as a useful alternative. This review examines and discusses a variety of ML algorithms such as Support Vector Machines (SVM), Random Forests (RF), Decision Trees (DT), Neural Networks (NN), and Deep Learning for the analysis of complex clinical data.