• Title/Summary/Keyword: Classification Algorithms

A Study on the Documents's Automatic Classification Using Machine Learning (기계학습을 이용한 문서 자동분류에 관한 연구)

  • Kim, Seong-Hee;Eom, Jae-Eun
    • Journal of Information Management / v.39 no.4 / pp.47-66 / 2008
  • This study applied machine learning algorithms to overcome the limitations of manual classification and to provide users with a faster and more accurate classification service. The experimental data consisted of 100 literature titles for each of eight subject categories in MeSH. The algorithms used in the experiments were a neural network, C5.0, CHAID, and KNN. The combination of the neural network and C5.0 achieved a classification accuracy of 83.75%, which was 2.5% and 3.75% higher than that of the neural network alone and C5.0 alone, respectively. This was the highest accuracy among the four classification experiments. The results suggest that using the neural network and C5.0 together yields higher accuracy than using either technique individually.

Classification Accuracy by Deviation-based Classification Method with the Number of Training Documents (학습문서의 개수에 따른 편차기반 분류방법의 분류 정확도)

  • Lee, Yong-Bae
    • Journal of Digital Convergence / v.12 no.6 / pp.325-332 / 2014
  • It is generally accepted that classification accuracy is affected by the number of training documents, but few studies show how this number influences automatic text classification. This study evaluates the deviation-based classification model, recently developed for genre-based classification, and compares it with other classification algorithms as the number of training documents changes. Experimental results show that the deviation-based classification model achieves an accuracy of 0.8 when categorizing 7 genres with only 21 training documents, exceeding the accuracy of the Bayesian and SVM classifiers. The deviation-based classification model retains strong feature selection capability even with a small number of training documents because it learns subject information within each genre, whereas the other methods use a different learning process.

Development of Feature-based Classification Software for High Resolution Satellite Imagery (고해상도 위성영상의 분류를 위한 형상 기반 분류 소프트웨어 개발)

  • Jeong, Soo;Lee, Chang-No
    • Journal of Korean Society for Geospatial Information Science / v.12 no.2 s.29 / pp.53-59 / 2004
  • In this paper, we investigated a method for feature-based classification in order to develop software suitable for classifying high-resolution satellite imagery. We developed the image segmentation and fuzzy-based classification algorithms required for feature-based classification and designed user interfaces that support user interaction, considering the various elements that feature-based classification requires. The software was evaluated using a real image. Its classification results were compared with those of eCognition, which is the only commercial software for feature-based classification. The two software packages produced essentially the same classification results, and the developed software was faster in processing.

Target Classification in Sparse Sampling Acoustic Sensor Networks using DTW-Cosine Algorithm (저비율 샘플링 음향 센서네트워크에서 DTW-Cosine 알고리즘을 이용한 목표물 식별기법)

  • Kim, Young-Soo;Kang, Jong-Gu;Kim, Dae-Young
    • Journal of KIISE:Computing Practices and Letters / v.14 no.2 / pp.221-225 / 2008
  • To avoid frequency analysis, which requires a high sampling rate, this paper presents time-warped similarity measure algorithms, which as time-series methods can classify objects even at a low sampling rate, and proposes the DTW-Cosine algorithm as the best classifier among them for wireless sensor networks. Two problems, local time shifting and spatial signal variation, must be solved to apply time-warped similarity measure algorithms to wireless sensor networks. We find that the proposed algorithm overcomes these problems very efficiently and outperforms the other algorithms by at least 10.3% in accuracy.
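
As a rough illustration of the time-warped similarity idea in this abstract, the sketch below implements classic dynamic time warping (DTW) in plain Python and uses it for nearest-template classification. It is not the authors' DTW-Cosine algorithm, whose exact definition the abstract does not give; the template signals, class labels, and function names are hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two 1-D sequences (absolute-difference cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(signal, templates):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(signal, templates[label]))


# Hypothetical low-rate templates for two target classes.
templates = {
    "vehicle": [0, 1, 3, 5, 3, 1, 0, 0],
    "person": [0, 0, 1, 1, 1, 1, 0, 0],
}

# The same vehicle-like pulse, arriving two samples later (local time shift).
shifted = [0, 0, 0, 1, 3, 5, 3, 1]
print(classify(shifted, templates))
```

Because the warping path may stretch or compress the time axis, the shifted pulse still aligns closely with its own template, which is the property that makes this family of measures tolerant of local time shifting.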

Real Time Classification System for Lead Pin Images (실시간 Lead Pin 영상 분류 시스템)

  • 장용훈
    • Journal of the Korea Computer Industry Society / v.3 no.9 / pp.1177-1188 / 2002
  • To classify lead pin images in real time, the image acquisition system in this paper was built from a CCD, an image frame grabber (DT3153), and a PC (Pentium III). I propose image processing algorithms consisting of real-time monitoring, lead pin image acquisition, image noise removal, object area detection, point detection, and pattern classification. Raw lead pin images were acquired with the system, and result images were obtained from the raw images by the image processing algorithms. In the experiments, 97 of 100 acceptable products and 95 of 100 defective products were recognized correctly, giving a recognition rate of 96% for the 200 lead pins in total.
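
The overall recognition rate follows directly from the two per-class counts reported above. The short check below only reproduces that arithmetic; the variable names are illustrative.

```python
# Counts reported in the abstract.
good_total, good_correct = 100, 97   # acceptable products recognized correctly
bad_total, bad_correct = 100, 95     # defective products recognized correctly

# Overall rate = all correct recognitions / all inspected lead pins.
overall = (good_correct + bad_correct) / (good_total + bad_total)
print(f"overall recognition rate: {overall:.0%}")
```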

A Study on Approximation Model for Optimal Predicting Model of Industrial Accidents (산업재해의 최적 예측모형을 위한 근사모형에 관한 연구)

  • Leem, Young-Moon;Ryu, Chang-Hyun
    • Journal of the Korea Safety Management & Science / v.8 no.3 / pp.1-9 / 2006
  • Recently, data mining techniques have been used for the analysis and classification of data related to industrial accidents. The main objective of this study is to compare algorithms for analyzing industrial accident data. This paper identifies an optimal predictive model among five algorithms, CHAID, CART, C4.5, LR (logistic regression), and NN (neural network), using ROC charts, lift charts, and response thresholds. It also provides an approximation model for the optimal NN-based predictive model, which can be used to make data analysis with NN easier to interpret. The study uses ten selected independent variables to group injured people according to a dependent variable in a way that reduces variation. To find the optimal predictive model among the five algorithms, a retrospective analysis was performed on 67,278 subjects. The sample was drawn from data on industrial accidents in Korea over three years (2002 to 2004). According to the analysis, NN shows excellent performance for the analysis and classification of industrial accident data.
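
The lift chart mentioned above compares the response rate among the highest-scored cases with the overall response rate. The following is a minimal sketch of that calculation, assuming binary labels and model scores; the data and the function name `lift_at` are hypothetical and are not taken from the study.

```python
def lift_at(scores, labels, fraction):
    """Lift of the top `fraction` of cases ranked by score versus the overall rate."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))      # number of cases in the top slice
    top_rate = sum(label for _, label in ranked[:k]) / k
    base_rate = sum(labels) / len(labels)          # assumes at least one positive label
    return top_rate / base_rate


# Hypothetical model scores and true outcomes (1 = event occurred).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

print(lift_at(scores, labels, 0.2))   # lift in the top 20% of cases
print(lift_at(scores, labels, 1.0))   # whole population: lift is 1 by definition
```

A model with a higher lift in the top slices concentrates more of the true events among its highest-scored cases, which is one way to rank competing algorithms.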

A Feasibility Study on the Improvement of Diagnostic Accuracy for Energy-selective Digital Mammography using Machine Learning (머신러닝을 이용한 에너지 선택적 유방촬영의 진단 정확도 향상에 관한 연구)

  • Eom, Jisoo;Lee, Seungwan;Kim, Burnyoung
    • Journal of radiological science and technology / v.42 no.1 / pp.9-17 / 2019
  • Although digital mammography is a representative method for breast cancer detection, it has limitations in detecting and classifying breast tumors because of superimposed structures. Machine learning, a field of artificial intelligence, is a method for analyzing large amounts of data with complex algorithms, recognizing patterns, and making predictions. In this study, we propose a technique to improve the diagnostic accuracy of energy-selective mammography by training a machine learning algorithm on dual-energy measurements. Dual-energy images obtained from a photon-counting detector were used as the input data for the machine learning algorithms, and we analyzed the accuracy of the predicted tumor thickness to verify the algorithms. The results showed that the classification accuracy for tumor thickness was above 95% and improved as the amount of input data increased. We therefore expect that machine learning can improve the diagnostic accuracy of energy-selective mammography.

Prediction of Academic Performance of College Students with Bipolar Disorder using different Deep learning and Machine learning algorithms

  • Peerbasha, S.;Surputheen, M. Mohamed
    • International Journal of Computer Science & Network Security / v.21 no.7 / pp.350-358 / 2021
  • In recent years, analyzing student performance has been difficult, and it is an important problem for all academic institutions. The main idea of this paper is to analyze and evaluate the academic performance of college students with bipolar disorder by applying data mining classification algorithms using Jupyter Notebook, a Python tool. This tool has been widely used as a decision-making tool with respect to students' academic performance. The classifiers used are logistic regression, random forest classifier (Gini), random forest classifier (entropy), decision tree classifier, K-Neighbours classifier, AdaBoost classifier, Extra Tree classifier, GaussianNB, and BernoulliNB. The classification models are evaluated with 13 measures such as Accuracy, Precision, Recall, F1 Measure, Sensitivity, Specificity, R Squared, Mean Absolute Error, Mean Squared Error, Root Mean Squared Error, TPR, TNR, FPR and FNR. The paper concludes that the Decision Tree Classifier performs better than the other algorithms.
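
Several of the measures listed above are derived from the same four confusion-matrix counts, and some are different names for one quantity (sensitivity, recall, and TPR coincide, as do specificity and TNR). The sketch below shows those relationships for a binary classifier; the counts are hypothetical and the function name is illustrative, not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Derive common classification measures from confusion-matrix counts.

    Assumes every denominator is non-zero (both classes occur and at least
    one positive prediction is made).
    """
    tpr = tp / (tp + fn)            # recall, also called sensitivity
    tnr = tn / (tn + fp)            # specificity
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": tpr,
        "f1": 2 * precision * tpr / (precision + tpr),
        "tpr": tpr,
        "tnr": tnr,
        "fpr": 1 - tnr,             # false positive rate
        "fnr": 1 - tpr,             # false negative rate
    }


# Hypothetical counts for illustration only.
m = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```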

Utilizing Principal Component Analysis in Unsupervised Classification Based on Remote Sensing Data

  • Lee, Byung-Gul;Kang, In-Joan
    • Proceedings of the Korean Environmental Sciences Society Conference / 2003.11a / pp.33-36 / 2003
  • Principal component analysis (PCA) was used to improve image classification by an unsupervised classification technique, K-means. To do this, I selected a Landsat TM scene of Jeju Island, Korea, and applied two PCA methods: unstandardized PCA (UPCA) and standardized PCA (SPCA). The accuracy of the image classification of the Jeju area was estimated with error matrices derived from three unsupervised classification methods. The error matrices indicated that classifications based on the first three principal components of UPCA and SPCA were more accurate than those based on the seven bands of the raw Landsat TM data. The classification of TM data by the K-means algorithm was particularly poor at distinguishing different land covers on the island. The classification results also showed that the principal-component-based classifications were independent of the unsupervised technique (numerical algorithm) used, whereas the TM-data-based classifications depended strongly on the technique. This means that PCA data have uniform characteristics for image classification that are less affected by the choice of classification scheme. Finally, UPCA gave better results than SPCA because UPCA has a wider range of digital numbers in the image.
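
The difference between the two PCA variants can be seen on a toy example: unstandardized PCA diagonalizes the covariance matrix, so a band with a wider range of digital numbers dominates the first component, while standardized PCA works on the correlation matrix and weights the bands equally. The sketch below uses two hypothetical bands and a closed-form 2x2 eigenvector; it does not use the Landsat TM data from the study, and all names are illustrative.

```python
import math
from statistics import mean, pstdev


def cov2(x, y):
    """Population covariance terms (sxx, sxy, syy) of two equally long bands."""
    mx, my, n = mean(x), mean(y), len(x)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return sxx, sxy, syy


def first_pc(sxx, sxy, syy):
    """Unit eigenvector for the largest eigenvalue of [[sxx, sxy], [sxy, syy]].

    Assumes sxy != 0, i.e. the two bands are correlated.
    """
    lam = (sxx + syy) / 2 + math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    vx, vy = sxy, lam - sxx          # solves (A - lam * I) v = 0
    norm = math.hypot(vx, vy)
    return vx / norm, vy / norm


def standardize(v):
    """Rescale a band to zero mean and unit (population) standard deviation."""
    m, s = mean(v), pstdev(v)
    return [(a - m) / s for a in v]


# Hypothetical digital numbers: band_a spans a wide range, band_b a narrow one.
band_a = [10, 20, 30, 40, 50]
band_b = [100, 102, 101, 104, 103]

upca = first_pc(*cov2(band_a, band_b))                             # covariance-based
spca = first_pc(*cov2(standardize(band_a), standardize(band_b)))   # correlation-based
print("UPCA first component:", upca)
print("SPCA first component:", spca)
```

In this toy case the UPCA component points almost entirely along the wide-range band, whereas the SPCA component weights both bands equally.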

A Hierarchical Deep Convolutional Neural Network for Crop Species and Diseases Classification (Deep Convolutional Neural Network(DCNN)을 이용한 계층적 농작물의 종류와 질병 분류 기법)

  • Borin, Min;Rah, HyungChul;Yoo, Kwan-Hee
    • Journal of Korea Multimedia Society / v.25 no.11 / pp.1653-1671 / 2022
  • Crop diseases affect crop production, at a cost of more than 30 billion USD globally. We propose a classification study of crop species and diseases using deep learning algorithms for corn, cucumber, pepper, and strawberry. Our study has three steps, species classification, disease detection, and disease classification, and is noteworthy for using captured images without additional processing. We designed a deep convolutional neural network approach based on the Mask R-CNN model to classify crop species, and used Inception and ResNet models for disease detection and disease classification in sequence. For species classification, we trained the Mask R-CNN network and achieved a loss value of 0.72 for crop species classification and segmentation. For disease detection, InceptionV3 and ResNet101-V2 models were trained for each crop species node on 1,500 images with normal and diseased labels; the InceptionV3 model gave the higher accuracy and AUC, with accuracies of 0.984, 0.969, 0.956, and 0.962 for corn, cucumber, pepper, and strawberry. For disease classification, InceptionV3 and ResNet101-V2 models were trained for each crop species node on 1,500 images with diseased labels; ResNet101 gave the higher accuracy and AUC for corn and cucumber, with accuracies of 0.995 and 0.992, whereas Inception did so for pepper and strawberry, with accuracies of 0.940 and 0.988.
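
The three-step hierarchy described above (species classification, then per-species disease detection, then per-species disease classification) can be sketched as a simple routing function. This is only an outline of the control flow under stated assumptions: the models are passed in as plain callables, the dictionary-based "image" and the disease label "rust" are hypothetical stand-ins, and none of the networks named in the abstract are loaded.

```python
from typing import Callable, Dict, Optional, Tuple

Image = dict  # placeholder type; a real pipeline would pass image arrays


def classify_crop(
    image: Image,
    species_model: Callable[[Image], str],
    detectors: Dict[str, Callable[[Image], bool]],
    disease_models: Dict[str, Callable[[Image], str]],
) -> Tuple[str, Optional[str]]:
    """Route an image through species -> disease detection -> disease classification."""
    species = species_model(image)                   # step 1: species classification
    if not detectors[species](image):                # step 2: per-species disease detection
        return species, None                         # healthy plant: stop early
    return species, disease_models[species](image)   # step 3: per-species disease class


# Hypothetical stand-ins for trained models, for illustration only.
species_model = lambda img: img["species"]
detectors = {"corn": lambda img: img["diseased"]}
disease_models = {"corn": lambda img: "rust"}

sick = classify_crop({"species": "corn", "diseased": True},
                     species_model, detectors, disease_models)
healthy = classify_crop({"species": "corn", "diseased": False},
                        species_model, detectors, disease_models)
print(sick, healthy)
```

Keeping one detector and one disease classifier per species node means each model only has to separate the classes relevant to that crop, which is the usual motivation for a hierarchical design.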