• Title/Summary/Keyword: Tree classifiers


Comparison between Hyperspectral and Multispectral Images for the Classification of Coniferous Species (침엽수종 분류를 위한 초분광영상과 다중분광영상의 비교)

  • Cho, Hyunggab;Lee, Kyu-Sung
    • Korean Journal of Remote Sensing, v.30 no.1, pp.25-36, 2014
  • Classifying individual tree species from multispectral images is often difficult because of the spectral similarity among species. In this study, we analyzed the suitability of hyperspectral imagery for classifying coniferous tree species. Several image sets and classification methods were applied, and the results were compared with those from multispectral imagery. Two airborne hyperspectral images (AISA, CASI) were acquired over the study area in the Gwangneung National Forest. For the comparison, an ETM+ multispectral image with lower spectral resolution was simulated from the hyperspectral images. We also used transformed hyperspectral data to reduce the data volume for classification. Three supervised classification schemes (SAM, SVM, MLC) were applied to thirteen image sets. Overall, the hyperspectral images provided higher accuracies than the multispectral image in discriminating coniferous species. The AISA-dual image, which includes additional SWIR spectral bands, gave the best result compared with the other hyperspectral images, which include only visible and NIR bands. Furthermore, the MNF-transformed hyperspectral image provided higher classification accuracies than the full-band and other band-reduced data. Among the three classifiers, MLC showed higher classification accuracy than SAM and SVM.
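Of the three classifiers named in this abstract, SAM (spectral angle mapper) is the simplest to illustrate: each pixel spectrum is treated as a vector and assigned to the class whose reference spectrum forms the smallest angle with it. The following is a minimal sketch of that general idea, not the study's implementation; the band values and the labels `species_a` and `species_b` are hypothetical.

```python
import math

def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos_theta)

def classify_sam(pixel, references):
    """Return the label whose reference spectrum has the smallest angle."""
    return min(references, key=lambda label: spectral_angle(pixel, references[label]))

# Hypothetical 4-band reference spectra (illustrative values only).
references = {
    "species_a": [0.05, 0.08, 0.40, 0.20],
    "species_b": [0.04, 0.10, 0.30, 0.30],
}

# This pixel is a brighter copy of species_a; the angle ignores overall brightness.
label = classify_sam([0.10, 0.16, 0.80, 0.40], references)
```

Because the angle depends only on the direction of the spectrum, a uniformly brighter or darker pixel of the same material maps to the same class.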

A Study on the Implementation of SQL Primitives for Decision Tree Classification (판단 트리 분류를 위한 SQL 기초 기능의 구현에 관한 연구)

  • An, Hyoung Geun;Koh, Jae Jin
    • KIPS Transactions on Software and Data Engineering, v.2 no.12, pp.855-864, 2013
  • Decision tree classification is an important problem in data mining, and data mining has become an important task in large-database technology. Efforts to couple data mining systems with database systems have therefore led to the development of database primitives that support data mining functions such as decision tree classification. These primitives are special database operations that support the SQL implementation of decision tree classification algorithms, and they have become component modules of database systems for implementing specific algorithms. There are two aspects to developing such primitives. The first is to identify, by analysis, common database primitives that support data mining functions. The other is to provide an extension mechanism for implementing these primitives as an interface of database systems. Deciding which primitives should be stored in the DBMS is one of the difficult problems in data mining. To address this problem, we describe database primitives that construct and apply optimized decision tree classifiers. We then identify operations useful for various classification algorithms and discuss the implementation of these primitives on a commercial DBMS. We implement the primitives on a commercial DBMS and present experimental results comparing their performance.
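The abstract does not list the paper's primitives. As an illustration of the kind of operation that can be pushed into SQL for decision tree induction, the sketch below counts class labels per attribute value with a `GROUP BY` query, which is the statistic a split criterion needs. The table `train`, its columns, and the helper `class_counts` are hypothetical, and SQLite stands in for the commercial DBMS.

```python
import sqlite3

# Hypothetical training table; SQLite is used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE train (outlook TEXT, windy TEXT, label TEXT)")
conn.executemany(
    "INSERT INTO train VALUES (?, ?, ?)",
    [
        ("sunny", "no", "stay"),
        ("sunny", "yes", "stay"),
        ("rainy", "no", "go"),
        ("rainy", "yes", "stay"),
        ("overcast", "no", "go"),
    ],
)

ALLOWED_ATTRIBUTES = {"outlook", "windy"}

def class_counts(conn, attribute):
    """Return (attribute value, class label, count) rows for one attribute."""
    # Column names cannot be bound as parameters, so check a whitelist first.
    if attribute not in ALLOWED_ATTRIBUTES:
        raise ValueError(f"unknown attribute: {attribute}")
    query = (
        f"SELECT {attribute}, label, COUNT(*) FROM train "
        f"GROUP BY {attribute}, label ORDER BY {attribute}, label"
    )
    return conn.execute(query).fetchall()

counts = class_counts(conn, "outlook")
```

Keeping the counting inside the database means only the small table of counts, not the training rows, has to leave the DBMS when a split is evaluated.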

Scalable and Accurate Intrusion Detection using n-Gram Augmented Naive Bayes and Generalized k-Truncated Suffix Tree (N-그램 증강 나이브 베이스 알고리즘과 일반화된 k-절단 서픽스트리를 이용한 확장가능하고 정확한 침입 탐지 기법)

  • Kang, Dae-Ki;Hwang, Gi-Hyun
    • Journal of the Korea Institute of Information and Communication Engineering, v.13 no.4, pp.805-812, 2009
  • The n-gram approach has been widely applied in intrusion detection. However, it has shown several problems, including poor scalability and double counting of features. To address these problems, we applied n-gram augmented Naive Bayes with a k-truncated suffix tree (k-TST) storage mechanism directly to classify intrusive sequences, and compared its performance with that of Naive Bayes and Support Vector Machines (SVM) with n-gram features in experiments on host-based intrusion detection benchmark data sets. Experimental results on the University of New Mexico (UNM) benchmark data sets show that the n-gram augmented method, which solves the independence violation that occurs when n-gram features are applied directly to Naive Bayes (i.e., Naive Bayes with n-gram features), yields intrusion detectors with higher accuracy than Naive Bayes with n-gram features and accuracy comparable to SVM with n-gram features. For scalable and efficient counting, we store the n-gram features in a k-truncated suffix tree. With this storage mechanism, we tested the classifiers with up to 20-grams, which illustrates the scalability and accuracy of n-gram augmented Naive Bayes with k-truncated suffix tree storage.
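To make the n-gram features concrete, the sketch below counts overlapping n-grams in a sequence of system calls. It uses a plain `Counter` rather than the k-truncated suffix tree described in the abstract, and the trace is a made-up example, so it shows only what is being counted, not the paper's storage mechanism.

```python
from collections import Counter

def ngram_counts(sequence, n):
    """Count every overlapping window of length n in the sequence."""
    return Counter(
        tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)
    )

# Hypothetical system-call trace (illustrative only).
trace = ["open", "read", "mmap", "read", "mmap", "close"]
counts = ngram_counts(trace, 2)
```

Adjacent windows share all but one element, which is one way to see why treating n-grams as independent Naive Bayes features double counts evidence.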

Medical Diagnosis Problem Solving Based on the Combination of Genetic Algorithms and Local Adaptive Operations (유전자 알고리즘 및 국소 적응 오퍼레이션 기반의 의료 진단 문제 자동화 기법 연구)

  • Lee, Ki-Kwang;Han, Chang-Hee
    • Journal of Intelligence and Information Systems, v.14 no.2, pp.193-206, 2008
  • Medical diagnosis can be considered a classification task that identifies disease types from a patient's condition data, represented by a set of pre-defined attributes. This study proposes a hybrid genetic-algorithm-based classification method to develop classifiers for multidimensional pattern classification problems related to medical decision making. The classification problem can be solved by identifying separation boundaries that distinguish the various classes in the data pattern. The proposed method fits a finite number of regional agents to the data pattern by combining genetic algorithms and local adaptive operations. The local adaptive operations of an agent include expansion, avoidance, and relocation, one of which is performed according to the agent's fitness value. The classifier system has been tested with well-known medical data sets from the UCI machine learning database, showing superior performance to other methods such as nearest neighbor, decision trees, and neural networks.


A High Order Product Approximation Method based on the Minimization of Upper Bound of a Bayes Error Rate and Its Application to the Combination of Numeral Recognizers (베이스 에러율의 상위 경계 최소화에 기반한 고차 곱 근사 방법과 숫자 인식기 결합에의 적용)

  • Kang, Hee-Joong
    • Journal of KIISE:Software and Applications, v.28 no.9, pp.681-687, 2001
  • To raise class discrimination power by combining multiple classifiers under Bayesian decision theory, the upper bound of the Bayes error rate, which is bounded by the conditional entropy of a class variable and decision variables obtained from training data samples, should be minimized. Wang and Wong proposed a tree-dependence first-order approximation scheme for a high-order probability distribution composed of the class and multiple feature pattern variables, in order to minimize the upper bound of the Bayes error rate. This paper presents an extended high-order product approximation scheme that handles dependencies of higher order than the first-order tree dependence, based on the minimization of the upper bound of the Bayes error rate. Multiple recognizers for unconstrained handwritten numerals from CENPARMI were combined by the proposed approximation scheme using the Bayesian formalism, and high recognition rates were obtained.


Fast Modulation Classifier for Software Radio (소프트웨어 라디오를 위한 고속 변조 인식기)

  • Park, Cheol-Sun;Jang, Won;Kim, Dae-Young
    • The Journal of Korean Institute of Communications and Information Sciences, v.32 no.4C, pp.425-432, 2007
  • In this paper, we deal with automatic modulation classification capable of classifying incident signals without a priori information. Seven key features that are sensitive to modulation type and insensitive to SNR variation are selected. Numerical simulations classifying 9 modulation types with these features are performed. Four types of modulation classifiers are simulated to investigate classification accuracy and execution time, with the aim of implementing a fast modulation classifier for software radio. The simulation results indicate that DTC had the best execution time and that SVC and MDC showed good classification performance. The prototype was implemented with the DTC type. Field trials confirmed that the performance of the prototype agreed with the numerical simulation results for DTC.

Automatic Recognition of Digital Modulation Types using Wavelet Transformation (웨이브릿 변환을 이용한 디지털 변조타입 자동 인식)

  • Park, Cheol-Sun;Nah, Sun-Phil;Yang, Jong-Won;Choi, Jun-Ho
    • Journal of the Institute of Electronics Engineers of Korea TC, v.45 no.4, pp.22-30, 2008
  • In this paper, we deal with a modulation classification method using the wavelet transformation (WT) that is capable of classifying incident digital signals without a priori information. The key features should be sensitive to modulation type and insensitive to SNR variation. Four key features for modulation recognition, which are insensitive to changes in noise, are selected using WT coefficients. Numerical simulations classifying 8 digital modulation types with these features are performed. Three types of modulation classifiers (i.e., DTC, MDC, and SVMC) are simulated to investigate classification accuracy and execution time for the design of a modulation classification module in software radio. The simulation results indicate that MDC and DTC had the best execution time and that MDC and SVMC showed good classification performance.

Effective Korean sentiment classification method using word2vec and ensemble classifier (Word2vec과 앙상블 분류기를 사용한 효율적 한국어 감성 분류 방안)

  • Park, Sung Soo;Lee, Kun Chang
    • Journal of Digital Contents Society, v.19 no.1, pp.133-140, 2018
  • Accurate sentiment classification is an important research topic in sentiment analysis. This study suggests an efficient method for classifying Korean sentiment using word2vec and ensemble methods, both of which have been widely studied recently. For 200,000 Korean movie review texts, we generated a POS-based BOW feature, a word2vec feature, and an integrated feature combining the two representations. For sentiment classification, we used the single classifiers Logistic Regression, Decision Tree, Naive Bayes, and Support Vector Machine, and the ensemble classifiers Adaptive Boost, Bagging, Gradient Boosting, and Random Forest. The integrated feature representation, composed of the BOW feature including adjectives and adverbs and the word2vec feature, showed the highest sentiment classification accuracy. Empirical results show that SVM, a single classifier, has the highest performance, while the ensemble classifiers perform similarly to or slightly worse than the single classifier.

Combined Use of Data Imbalance Resolution Techniques Using a Genetic Algorithm (유전자 알고리즘을 활용한 데이터 불균형 해소 기법의 조합적 활용)

  • Jang, Yeong-Sik;Kim, Jong-U;Heo, Jun
    • Proceedings of the Korea Inteligent Information System Society Conference, 2007.05a, pp.309-320, 2007
  • The data imbalance problem, which can be encountered in data mining classification problems, typically means that there are more or fewer instances in one class than in the other classes. It causes low prediction accuracy for the minority class because classifiers tend to assign instances to the majority classes and ignore the minority class in order to reduce the overall misclassification rate. To solve the data imbalance problem, a number of techniques have been proposed based on resampling with replacement, adjusting decision thresholds, and adjusting the costs of the different classes. In this paper, we study the feasibility of combining previously proposed techniques for the data imbalance problem, and suggest a combination method that uses a genetic algorithm to find the optimal combination ratio of the techniques. To improve the prediction accuracy of the minority class, we determine the combination ratio using the F-value of the minority class as the fitness function of the genetic algorithm. To compare its performance with that of the single techniques and of a matrix-style combination with random percentages, we performed experiments on four public datasets that are commonly used to compare methods for the data imbalance problem. The experimental results demonstrate the usefulness of the proposed method.
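The fitness function named in this abstract, the F-value of the minority class, can be sketched as follows. This assumes the standard F-measure (the weighted harmonic mean of precision and recall); the abstract does not state which weighting the authors used, so `beta=1.0` and the example labels are assumptions.

```python
def f_measure(y_true, y_pred, minority, beta=1.0):
    """F-measure for the minority class; beta=1.0 is an assumed default."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == minority and p == minority)
    fp = sum(1 for t, p in pairs if t != minority and p == minority)
    fn = sum(1 for t, p in pairs if t == minority and p != minority)
    if tp == 0:
        # No correct minority predictions: precision or recall is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)

# Hypothetical labels: class 1 is the minority (3 of 10 instances).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
score = f_measure(y_true, y_pred, minority=1)
```

In this example plain accuracy is 0.8 while the minority-class F-value is about 0.67, which shows why accuracy alone is a poor fitness signal on imbalanced data.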


Land Surface Classification With Airborne Multi-spectral Scanner Image Using A Neuro-Fuzzy Model (뉴로-퍼지 모델을 이용한 항공다중분광주사기 영상의 지표면 분류)

  • Han, Jong-Gyu;Ryu, Keun-Ho;Yeon, Yeon-Kwang;Chi, Kwang-Hoon
    • The KIPS Transactions:PartD, v.9D no.5, pp.939-944, 2002
  • In this paper, we propose a new classification method and apply it to a remotely sensed image acquired by an airborne multi-spectral scanner. The method is a neuro-fuzzy image classifier derived from the generic model of a 3-layer fuzzy perceptron. We implement a classification software system based on the proposed method for land cover image classification, and compare the proposed classifier with the maximum-likelihood classifier. The results show that the neuro-fuzzy method classifies more accurately than the maximum-likelihood method. Comparing the two classification maps, the overall accuracy differs by as much as 7.96%. Most of the differences are in the "Building" and "Pine tree" classes, for which the neuro-fuzzy classifier was considerably more accurate. However, "Bare soil" is classified more correctly by the maximum-likelihood classifier than by the neuro-fuzzy classifier.