• Title/Abstract/Keywords: Classification accuracy

Search results: 3,087 items (processing time: 0.031 s)

Selecting Machine Learning Model Based on Natural Language Processing for Shanghanlun Diagnostic System Classification

  • 김영남
    • 대한상한금궤의학회지 / Vol. 14, No. 1 / pp. 41-50 / 2022
  • Objective: This study explores the most suitable machine learning algorithm for classifying the Shanghanlun diagnostic system using natural language processing (NLP). Methods: A total of 201 data items were collected from 『Shanghanlun』 and 『Clinical Shanghanlun』; 'Taeyangbyeong-gyeolhyung' and 'Eumyangyeokchahunobokbyeong' were excluded to prevent oversampling or undersampling. The data were preprocessed with a Twitter Korean tokenizer and used to train logistic regression, ridge regression, lasso regression, naive Bayes classifier, decision tree, and random forest models, and the accuracies of the models were compared. Results: Ridge regression and the naive Bayes classifier achieved an accuracy of 0.843, logistic regression and random forest 0.804, decision tree 0.745, and lasso regression 0.608. Conclusions: Ridge regression and the naive Bayes classifier are suitable NLP machine learning models for Shanghanlun diagnostic system classification.

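The accuracies reported in the abstract above are plain proportions of correctly classified items. As a minimal sketch, the snippet below shows that all four reported values are consistent with correct-prediction counts out of a 51-item test set. The test-set size (`TEST_SIZE`) and the hit counts are assumptions chosen for illustration; the abstract does not state them.

```python
def accuracy(correct: int, total: int) -> float:
    """Fraction of correctly classified items, rounded to three decimals."""
    return round(correct / total, 3)


# Hypothetical test-set size and hit counts (NOT given in the abstract).
TEST_SIZE = 51
hypothetical_hits = {
    "ridge regression / naive Bayes": 43,
    "logistic regression / random forest": 41,
    "decision tree": 38,
    "lasso regression": 31,
}

# Map each model group to the accuracy implied by its hypothetical hit count.
implied = {name: accuracy(hits, TEST_SIZE) for name, hits in hypothetical_hits.items()}

for name, acc in implied.items():
    print(f"{name}: {acc}")
```

With so few test items, one additional correct prediction moves accuracy by about 0.02, so small gaps between models should be read with caution.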

A Rule-based Urban Image Classification System for Time Series Landsat Data

  • Lee, Jin-A;Lee, Sung-Soon;Chi, Kwang-Hoon
    • 대한원격탐사학회지 / Vol. 27, No. 6 / pp. 637-651 / 2011
  • This study presents a rule-based urban image classification method for time series analysis of changes in the vicinity of Asan-si and Cheonan-si in Chungcheongnam-do, using Landsat satellite images (1991-2006). The area has been highly developed through the relocation of industrial facilities, land development, construction of a high-speed railroad, and an extension of the subway. To determine the yearly pattern of change in the urban area, eleven classes were defined according to the trend of development. The rules were generalized into an algorithm that can be applied as an unsupervised classification, without the need for training areas. The results show that the urban zone of the research area has increased by about 1.53 times, and correlation graphs confirmed the distribution of Built Up Index (BUI) values for each class. To evaluate the rule-based classification, coverage and accuracy were assessed. With an optimal allowable factor of 0.36, the coverage of the rules was 98.4%, and in the test using ground data from 1991 to 2006, overall accuracy was 99.49%. The proposed method for determining the maximum allowable factor was confirmed to agree with the accuracy test results based on ground data. The available data among the multiple images were used as fully as possible, and classification accuracy could be improved because the classification could be optimized for the objectives. The rule-based urban image classification method is expected to be applicable to time series image analyses such as thematic mapping, urban development, and monitoring of environmental changes.

An Analysis of the Landuse Classification Accuracy Using IHS Merged Images from IRS-1C PAN Data and Landsat TM Data

  • 안기원;이효성;서두천;신석효
    • 한국측량학회지 / Vol. 16, No. 2 / pp. 187-194 / 1998
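  • This study uses high-resolution IRS-1C PAN data and Landsat TM data, which cover a range of spectral bands, to create merged images with the IHS method, a representative image merging technique, and aims to identify which color composite bands are effective for land use classification with the IHS merged images and the original images. To evaluate the classification results, sample data consisting of 10 classes were generated, and the results were assessed by the overall accuracy on these sample data. When Landsat TM data and IRS-1C PAN data were merged with the IHS method for land use classification, it was best to include two of the infrared bands (TM4, TM5, and TM7); in particular, the TM 247 merged image gave the best result, with classification accuracy improved by 11.8%. In addition, accuracy was generally higher when the IRS-1C PAN image was added to the 1% original image than when three bands were merged.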

A Study on the Land Cover Classification and Cross Validation of AI-based Aerial Photograph

  • Lee, Seong-Hyeok;Myeong, Soojeong;Yoon, Donghyeon;Lee, Moung-Jin
    • 대한원격탐사학회지 / Vol. 38, No. 4 / pp. 395-409 / 2022
  • The purpose of this study is to evaluate classification performance and applicability when land cover datasets constructed for AI training are cross-validated against other areas. Gyeongsang-do and Jeolla-do in South Korea were selected as the cross-validation areas, and training datasets were obtained from AI-Hub. The datasets for each region were used to train U-Net, a semantic segmentation algorithm, and accuracy was evaluated on test areas in the same region and in the other region. Overall classification accuracy differed by about 13-15% between the same and other areas. For rice fields, fields, and buildings, accuracy was higher in the Jeolla-do test areas; for roads, it was higher in the Gyeongsang-do test areas. Comparing the trained weights, the Gyeongsang-do weights gave high accuracy for forests, while the Jeolla-do weights gave high accuracy for dry fields. The land cover classification results show that the classification performance of existing datasets differs by area. When constructing land cover maps for AI training, higher-quality datasets can be expected if the characteristics of various areas are reflected. This study is highly scalable from two perspectives. First, it applies satellite images to AI research and to the field of land cover. Second, because it can be extended on the basis of satellite images, it can be used for large-scale areas and areas that are difficult to access.

Optimal Criterion of Classification Accuracy Measures for Normal Mixture

  • 유현상;홍종선
    • Communications for Statistical Applications and Methods / Vol. 18, No. 3 / pp. 343-355 / 2011
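  • Finding and evaluating an appropriate classification point (cut-off) for data assumed to follow a mixture of two distribution functions is an important problem. This paper explains nine widely used classification accuracy measures, namely MVD, Youden's index, the closest-to-(0,1) criterion, the amended closest-to-(0,1) criterion, SSS, the symmetry point, the accuracy area, TA, and TR, and, by identifying the relationships among them, clusters the conditions of these measures into several categories. Assuming a normal mixture distribution, the classification points based on the clustered measures are obtained, and the corresponding type I error rates, type II error rates, and sums of the two error rates are computed, compared, and discussed. For an estimated mixture distribution, one can explore which classification accuracy measure yields the smallest type I and type II error rates or error-rate sum, and identify the advantages and disadvantages of frequently cited accuracy measures.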
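Two of the measures named in this abstract, Youden's index and the closest-to-(0,1) criterion, can be illustrated with a short sketch. It assumes two normal components with hypothetical parameters and a simple grid search over cut-offs; the parameter values, the grid, and the helper names are illustrative assumptions and are not taken from the paper. Which of the two error rates is called type I or type II depends on the convention used, so the code refers to them as false positive and false negative rates.

```python
import math


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """Cumulative distribution function of N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def error_rates(c, mu0, s0, mu1, s1):
    """Error rates when scores above cut-off c are assigned to component 1."""
    fpr = 1.0 - normal_cdf(c, mu0, s0)  # component-0 item scored above c
    fnr = normal_cdf(c, mu1, s1)        # component-1 item scored at or below c
    return fpr, fnr


def neg_youden(fpr, fnr):
    """Negative Youden index, so that minimising it maximises J = 1 - fpr - fnr."""
    return -(1.0 - fpr - fnr)


def dist_to_01(fpr, fnr):
    """Distance from the ROC point (fpr, 1 - fnr) to the ideal corner (0, 1)."""
    return math.hypot(fpr, fnr)


def best_cutoff(criterion, grid, params):
    """Grid point that minimises the given criterion."""
    return min(grid, key=lambda c: criterion(*error_rates(c, *params)))


# Hypothetical mixture components: N(0, 1) and N(2, 1).
params = (0.0, 1.0, 2.0, 1.0)
grid = [i / 1000 for i in range(-3000, 5001)]

youden_c = best_cutoff(neg_youden, grid, params)
closest_c = best_cutoff(dist_to_01, grid, params)

for name, c in (("Youden", youden_c), ("closest-to-(0,1)", closest_c)):
    fpr, fnr = error_rates(c, *params)
    print(f"{name}: cut-off={c:.3f}, fpr={fpr:.4f}, fnr={fnr:.4f}, sum={fpr + fnr:.4f}")
```

With equal variances, both criteria select the midpoint between the two means; the criteria generally diverge once the component variances differ, which is the situation in which comparing the measures becomes informative.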

Design and Evaluation of ANFIS-based Classification Model

  • 송희석;김재경
    • 지능정보연구 / Vol. 15, No. 3 / pp. 151-165 / 2009
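  • A fuzzy neural network model integrates the network structure representation and learning algorithms of artificial neural networks with the inference methods of fuzzy systems, and has been applied successfully in control and prediction. This study designs a classification model based on ANFIS (Adaptive Network-based Fuzzy Inference System), a fuzzy neural network model that has recently attracted attention for its high prediction accuracy, and evaluates it in terms of classification accuracy against an existing classification technique (the C5.0 decision tree). Because ANFIS inference produces a continuous value rather than a class value as its final output, a step is needed to assign an appropriate class value from the computed output. This study proposes two approaches, one that assigns class values with a decision tree technique and one that assigns them with cluster analysis, and applies both to two data sets to evaluate the accuracy of the ANFIS-based classification model.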

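The cluster-based class assignment described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: it uses a one-dimensional k-means (Lloyd's algorithm) as the clustering step, and the continuous outputs in `outputs` are made-up values standing in for ANFIS outputs. The abstract does not say which clustering method the authors used.

```python
def kmeans_1d(values, k, iterations=100):
    """Lloyd's algorithm in one dimension; returns cluster centers in ascending order."""
    ordered = sorted(values)
    # Initialise centers at evenly spaced positions in the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        new_centers = [sum(g) / len(g) if g else centers[j] for j, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers)


def assign_class(value, centers):
    """Class value = index of the nearest cluster center."""
    return min(range(len(centers)), key=lambda j: abs(value - centers[j]))


# Hypothetical continuous model outputs for a three-class problem.
outputs = [0.05, 0.12, -0.03, 0.98, 1.10, 0.91, 2.02, 1.95, 2.10]
centers = kmeans_1d(outputs, k=3)
labels = [assign_class(v, centers) for v in outputs]
print(centers)
print(labels)
```

Sorting the centers makes the class index follow the order of the output values, which is convenient when the original class values are themselves ordered.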

Land Use Classification of TM Imagery in Hilly Areas: Integration of Image Processing and Expert Knowledge

  • Ding, Feng;Chen, Wenhui;Zheng, Daxian
    • 대한원격탐사학회:학술대회논문집 / 대한원격탐사학회 2003 Proceedings of ACRS 2003 ISRS / pp. 1329-1331 / 2003
  • Improving classification accuracy has been one of the major concerns in remote sensing application research in recent years. Previous research shows that the accuracy of conventional classification methods based only on the original spectral information was usually unsatisfactory and that the results needed to be refined by manual editing. This paper describes a method that combines image processing, ancillary data (such as a digital elevation model), and expert knowledge (especially the knowledge of local professionals) to improve the efficiency and accuracy of satellite image classification in hilly land. First, the Landsat TM data were geo-referenced. Second, the individual bands of the image were intensity-normalized, and a normalized difference vegetation index (NDVI) image was generated. Third, a set of sample pixels collected in a field survey was used to find their corresponding DN (digital number) ranges in the NDVI image and to explore the relationships between land use types and their spectral features. Then, using the knowledge discovered in the previous steps together with knowledge from local professionals, and with the support of GIS technology and the ancillary data, a set of conditional statements was applied to classify the TM imagery. The results showed that integrating image processing with the spatial analysis functions of GIS improved the overall classification result compared with conventional methods.

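The NDVI step and the conditional statements mentioned in this abstract can be sketched as below. NDVI is the standard ratio (NIR - Red) / (NIR + Red). Everything else is a hypothetical illustration: the thresholds, the elevation rule, and the class names are assumptions, since the abstract does not list the actual rules or DN ranges.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index from near-infrared and red reflectance."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def classify_pixel(nir: float, red: float, elevation_m: float) -> str:
    """Rule-based land use label from NDVI plus ancillary elevation data.

    All thresholds and class names here are hypothetical placeholders.
    """
    value = ndvi(nir, red)
    if value < 0.0:
        return "water"
    if value < 0.2:
        return "built-up or bare land"
    if elevation_m > 300.0:
        return "forest"
    return "cropland"


# Hypothetical pixels: (NIR reflectance, red reflectance, elevation in metres).
pixels = [(0.50, 0.10, 500.0), (0.50, 0.10, 50.0), (0.05, 0.10, 10.0), (0.30, 0.25, 10.0)]
for nir, red, elev in pixels:
    print(f"NDVI={ndvi(nir, red):.2f}, elevation={elev:.0f} m -> {classify_pixel(nir, red, elev)}")
```

The elevation rule shows how ancillary data can separate classes that look alike spectrally, which is the role the abstract assigns to the digital elevation model and local expert knowledge.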

Feature Selection and Hyper-Parameter Tuning for Optimizing Decision Tree Algorithm on Heart Disease Classification

  • Tsehay Admassu Assegie;Sushma S.J;Bhavya B.G;Padmashree S
    • International Journal of Computer Science & Network Security / Vol. 24, No. 2 / pp. 150-154 / 2024
  • In recent years, there has been extensive research on applying machine learning to automation and decision support for medical experts during disease detection. However, the performance of machine learning still needs to improve so that models produce more accurate and reliable results for disease detection. Selecting the hyper-parameters that yield the highest possible classification accuracy on a medical dataset is the most challenging task in developing machine learning-based decision support systems for medical dataset classification. Selecting the features that best characterize a disease is another challenge in developing a machine learning model with better classification accuracy. In this study, we propose an optimized decision tree model for heart disease classification, using a heart disease dataset collected from the Kaggle data repository. The proposed model is evaluated, and experimental tests show that the performance of the decision tree improves when an optimal number of features is used for training. Overall, the accuracy of the proposed decision tree model is 98.2% for heart disease classification.

A Comparative Study on Deep Learning Models for Scaffold Defect Detection

  • 이송연;허용정
    • 반도체디스플레이기술학회지 / Vol. 20, No. 2 / pp. 109-114 / 2021
  • When scaffold defects are inspected visually, inspection performance decreases and inspection time increases. An automatic scaffold defect detection method is needed to increase detection accuracy and reduce detection time. In this paper, we built scaffold defect classification models using the CNN-based DenseNet, AlexNet, and VGGNet algorithms. We photographed scaffolds with a multi-dimension camera and trained the defect classification models on the resulting images. We then evaluated the defect classification accuracy of each model. The DenseNet model achieved 99.1%, the VGGNet model 98.3%, and the AlexNet model 96.8%. This allowed a quantitative comparison of the defect classification performance of the three CNN-based algorithms.

A Study on the Attributes Classification of Agricultural Land Based on Deep Learning Comparison of Accuracy between TIF Image and ECW Image

  • 김지영;위성승
    • 한국농공학회논문집 / Vol. 65, No. 6 / pp. 15-22 / 2023
  • In this study, we compare deep learning-based classification of agricultural field attributes using Tagged Image File (TIF) and Enhanced Compression Wavelet (ECW) images. The goal is to interpret and classify the attributes of agricultural fields by analyzing the differences between these two image formats. "FarmMap," initiated by the Ministry of Agriculture, Food and Rural Affairs in 2014, is the first digital map of agricultural land in South Korea. It comprises attributes such as paddy, field, orchard, agricultural facility, and ginseng cultivation areas. To compare deep learning-based agricultural attribute classification, we consider the location and class information of objects as well as the attribute information of FarmMap. We use the ResNet-50 instance segmentation model, which is suitable for this task, to conduct simulated experiments. The comparison between the two image formats is measured in terms of accuracy. The experimental results indicate that the accuracy with TIF images is 90.44%, while that with ECW images is 91.72%, so the ECW image model is about 1.28 percentage points more accurate. However, statistical validation with Wilcoxon rank-sum tests did not reveal a significant difference in accuracy between the two image formats.