• Title/Summary/Keyword: Classification accuracy


Extraction Method of Significant Clinical Tests Based on Data Discretization and Rough Set Approximation Techniques: Application to Differential Diagnosis of Cholecystitis and Cholelithiasis Diseases (데이터 이산화와 러프 근사화 기술에 기반한 중요 임상검사항목의 추출방법: 담낭 및 담석증 질환의 감별진단에의 응용)

  • Son, Chang-Sik;Kim, Min-Soo;Seo, Suk-Tae;Cho, Yun-Kyeong;Kim, Yoon-Nyun
    • Journal of Biomedical Engineering Research / v.32 no.2 / pp.134-143 / 2011
  • Selecting meaningful clinical tests and their reference values from high-dimensional clinical data with an imbalanced class distribution, in which one class is represented by a large number of examples and the other by only a few, is an important but difficult issue for differential diagnosis between similar diseases. For this purpose, this study introduces methods based on the discernibility matrix and discernibility function of rough set theory (RST), combined with two discretization approaches: equal-width and equal-frequency discretization. The discretization approaches are used to define the reference values for clinical tests, and the discernibility matrix and function are used to extract a subset of significant clinical tests from the resulting nominal attribute values. To show applicability to the differential diagnosis problem, we applied the methods to extract the significant clinical tests and their reference values that separate a normal group (N = 351) from an abnormal group (N = 101) with either cholecystitis or cholelithiasis. In addition, we investigated not only the selected significant clinical tests and the variations in their reference values, but also the average predictive performance on four evaluation criteria (accuracy, sensitivity, specificity, and geometric mean) during 10-fold cross-validation. The experimental results confirmed that, for both discretization approaches, the rough set approximation methods with relative frequency give better results than those with absolute frequency in terms of average geometric mean. This shows that the prediction model using relative frequency can be used effectively in classification and prediction problems on clinical data with an imbalanced class distribution.
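The two discretization schemes and the geometric-mean criterion named in the abstract above can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation: the function names, the bin count, and the sample values are assumptions made for the example, and the geometric mean is taken here as the square root of sensitivity times specificity, a common definition for imbalanced data that the abstract does not spell out.

```python
import math

def equal_width_edges(values, n_bins):
    """Cut points that split [min, max] into n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_bins
    return [lo + step * i for i in range(1, n_bins)]

def equal_frequency_edges(values, n_bins):
    """Cut points chosen so each bin holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // n_bins] for i in range(1, n_bins)]

def discretize(value, edges):
    """Index of the bin that value falls into, from 0 to len(edges)."""
    return sum(value >= e for e in edges)

def geometric_mean(tp, fn, tn, fp):
    """sqrt(sensitivity * specificity), computed from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)

# Hypothetical test results with one outlier, for illustration only.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 100.0]
width_edges = equal_width_edges(values, 3)
freq_edges = equal_frequency_edges(values, 3)
```

With these sample values the outlier stretches the equal-width bins so that almost every value lands in the first bin, while the equal-frequency cut points stay among the bulk of the data. That difference is one reason the two schemes can yield different reference values for the same test.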

New feature and SVM based advanced classification of Computer Graphics and Photographic Images (노이즈 기반의 새로운 피쳐(feature)와 SVM에 기반한 개선된 CG(Computer Graphics) 및 PI(Photographic Images) 판별 방법)

  • Jeong, DooWon;Chung, Hyunji;Hong, Ilyoung;Lee, Sangjin
    • Journal of the Korea Institute of Information Security & Cryptology / v.24 no.2 / pp.311-318 / 2014
  • As computer graphics technology has advanced, it has become hard to discriminate computer graphics from photographic images with the naked eye. Although advances in graphics technology have brought many conveniences, they also have side effects such as image forgery, malicious editing, and fraud. To cope with such problems, various algorithms that use features representing the characteristics of an image have been studied. In this paper, we directly verify the existing algorithm and propose new noise-based features that represent the characteristics of computer graphics well. We also introduce a method that uses an SVM (Support Vector Machine) with the features proposed in previous research to improve discrimination accuracy.

Korean Mobile Spam Filtering System Considering Characteristics of Text Messages (문자메시지의 특성을 고려한 한국어 모바일 스팸필터링 시스템)

  • Sohn, Dae-Neung;Lee, Jung-Tae;Lee, Seung-Wook;Shin, Joong-Hwi;Rim, Hae-Chang
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.7 / pp.2595-2602 / 2010
  • This paper introduces a mobile spam filtering system that considers the style of short text messages sent to mobile phones when detecting spam. The proposed system relies not only on the occurrence of content words, as previously suggested, but also on style information, in order to reduce critical cases in which legitimate messages containing spam words are misclassified as spam. Moreover, the accuracy of spam classification is improved by normalizing the messages through the correction of word spacing and spelling errors. Experimental results using real-world Korean text messages show that the proposed system is effective for Korean mobile spam filtering.

A Symptom based Taxonomy for Network Security (네트워크상에서의 징후를 기반으로 한 공격분류법)

  • Kim Ki-Yoon;Choi Hyoung-Kee;Choi Dong-Hyun;Lee Byoung-Hee;Choi Yoon-Sung;Bang Hyo-Chan;Na Jung-Chan
    • The KIPS Transactions:PartC / v.13C no.4 s.107 / pp.405-414 / 2006
  • We present a symptom-based taxonomy for network security. The taxonomy classifies network attacks using their early symptoms. Because it relies on symptoms, the information needed to classify an attack is relatively easy to access. Furthermore, unknown attacks can be classified because their symptoms are correlated with those of known attacks. The taxonomy classifies an attack in two stages. In the first stage, it identifies the attack within a single connection. It then combines the single connections into aggregated connections to check whether the attacks on single connections form a distributed attack over the aggregated connections. Hence, it is possible to attain high accuracy in identifying complex attacks such as DDoS, worms, and bots. We demonstrate the classification of these three major Internet attacks using the proposed taxonomy.

Learning Rules for Identifying Hypernyms in Machine Readable Dictionaries (기계가독형사전에서 상위어 판별을 위한 규칙 학습)

  • Choi Seon-Hwa;Park Hyuk-Ro
    • The KIPS Transactions:PartB / v.13B no.2 s.105 / pp.171-178 / 2006
  • Most approaches for extracting hypernyms of a noun from its definitions in a machine-readable dictionary (MRD) rely on lexical patterns compiled by human experts. Not only do these approaches incur a high cost for compiling lexical patterns, but it is also very difficult for human experts to compile a set of lexical patterns with broad coverage, because natural languages offer various expressions that represent the same concept. To alleviate these problems, this paper proposes a new method for extracting hypernyms of a noun from its definitions in an MRD. In the proposed approach, we use only syntactic (part-of-speech) patterns instead of lexical patterns to identify hypernyms, which reduces the number of patterns while keeping their coverage broad. Our experiment shows that the classification accuracy of the proposed method is 92.37%, which is significantly better than that of previous approaches.

Development of Forest Fire Occurrence Probability Model Using Logistic Regression (로지스틱 회귀모형을 이용한 산불발생확률모형 개발)

  • Lee, Byungdoo;Ryu, Gyesun;Kim, Seonyoung;Kim, Kyongha
    • Journal of Korean Society of Forest Science / v.101 no.1 / pp.1-6 / 2012
  • To achieve forest fire management goals such as early detection and quick suppression, fire resources should be allocated to areas where the probability of forest fire occurrence is high. The objective of this study was to develop and validate models that estimate spatially distributed probabilities of forest fire occurrence. The models were built by exploring the relationships between fire ignition locations and forest, terrain, and anthropogenic factors using logistic regression. Distance to forest, cemetery, fire history, forest type, elevation, and slope were chosen as the significant factors in the model. The constructed model had a good fit, and its classification accuracy was 63%. This model and map can support the optimal allocation of forest fire resources and increase effectiveness in fire prevention and planning.
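How a logistic regression model turns site factors into an occurrence probability, and how a classification accuracy such as the one reported above is computed, can be sketched as follows. The factor names, coefficients, and the 0.5 threshold are hypothetical values chosen for illustration. The fitted coefficients and the cutoff used in the study are not given in the abstract.

```python
import math

def fire_probability(features, coefficients, intercept):
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum of coefficient * factor)))."""
    z = intercept + sum(coefficients[name] * features[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))

def classification_accuracy(probabilities, observed, threshold=0.5):
    """Share of sites whose thresholded prediction matches the observed outcome."""
    predicted = [p >= threshold for p in probabilities]
    hits = sum(p == o for p, o in zip(predicted, observed))
    return hits / len(observed)

# Hypothetical coefficients; not taken from the study.
coefficients = {"distance_to_forest_km": -1.2, "slope_deg": 0.03}
site = {"distance_to_forest_km": 0.0, "slope_deg": 0.0}
p_site = fire_probability(site, coefficients, intercept=0.0)

# Hypothetical predicted probabilities and observed fire occurrences.
accuracy = classification_accuracy([0.9, 0.2, 0.6, 0.4], [True, False, False, False])
```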

Verification Test of High-activity SMEs Using Technology Appraisal Items (기술력 평가항목을 이용한 고활동성 중소기업 판별)

  • Lee, Jun-won
    • Journal of Technology Innovation / v.28 no.1 / pp.31-52 / 2020
  • This study was undertaken to verify the ex-ante discriminatory power, with respect to a firm's high activity, of the forward-looking technology appraisal model used in technology financing. The firms analyzed are classified by industry (manufacturing / non-manufacturing) and by company age (initial / non-initial). High-activity SMEs are defined as those that achieve at least twice the average asset turnover ratio of their cluster. Applying the C5.0 method, a decision tree model, as the discriminant model yielded classification accuracy above 99% across all industries and company ages, confirming that the discriminant power of the model is stable. The management expertise, capital involvement, and funding capacity items were identified as critical variables for high-activity SMEs. In addition, technology management capability and technology life cycle were confirmed as items that determine high-activity SMEs in the manufacturing industry. These results confirm some potential for ex-ante discrimination of high-activity SMEs, and for policy use, based on technology appraisal items.
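The labeling rule stated in the abstract above (asset turnover at least twice the cluster average) can be sketched as below. The firm names, cluster keys, and ratios are hypothetical, and the sketch assumes the cluster average includes the firm being labeled. The C5.0 model itself is not reproduced here.

```python
from collections import defaultdict

def label_high_activity(firms, factor=2.0):
    """Flag firms whose asset turnover is at least `factor` times their cluster mean.

    firms: list of (name, cluster, asset_turnover) tuples.
    Returns a dict mapping each firm name to True (high-activity) or False.
    """
    ratios_by_cluster = defaultdict(list)
    for _, cluster, ratio in firms:
        ratios_by_cluster[cluster].append(ratio)
    means = {c: sum(r) / len(r) for c, r in ratios_by_cluster.items()}
    return {name: ratio >= factor * means[cluster] for name, cluster, ratio in firms}

# Hypothetical firms grouped by (industry, company age).
firms = [
    ("A", ("manufacturing", "initial"), 0.5),
    ("B", ("manufacturing", "initial"), 0.5),
    ("C", ("manufacturing", "initial"), 5.0),
    ("D", ("non-manufacturing", "non-initial"), 1.0),
    ("E", ("non-manufacturing", "non-initial"), 1.0),
]
labels = label_high_activity(firms)
```

Only firm C is flagged here: its cluster mean is 2.0, so the threshold is 4.0, and its ratio of 5.0 exceeds it.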

The Integrated Methodology of Rough Set Theory and Artificial Neural Network for Business Failure Prediction (도산 예측을 위한 러프집합이론과 인공신경망 통합방법론)

  • Kim, Chang-Yun;Ahn, Byeong-Seok;Cho, Sung-Sik;Kim, Soung-Hie
    • Asia pacific journal of information systems / v.9 no.4 / pp.23-40 / 1999
  • This paper proposes a hybrid intelligent system that predicts the failure of firms from past financial performance data by combining a neural network with the rough set approach. The rough set approach yields a reduced information table, in which the number of evaluation criteria (such as financial ratios and qualitative variables) and the number of objects (i.e., firms) are reduced with no information loss. This reduced information is then used to develop classification rules and to train the neural network to infer appropriate parameters. The reduction of the information table is expected to improve the performance of the neural network. The rules developed by rough sets show the best prediction accuracy when a case matches any of the rules. The rationale of our hybrid system is therefore to use the rules developed by rough sets for an object that matches any of the rules, and the neural network for an object that matches none of them. The effectiveness of our methodology was verified by experiments comparing traditional discriminant analysis and the neural network approach with our hybrid approach. For the experiments, financial data of 2,400 Korean firms from the period 1994-1996 were selected, and k-fold validation was used.
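The dispatch logic described in the abstract above (rules first, neural network only when no rule matches) can be sketched as below. The rule format, attribute names, and the stand-in fallback function are assumptions made for illustration. A trained neural network would take the place of `fallback_model`.

```python
def hybrid_predict(case, rules, fallback_model):
    """Return (label, source): the first matching rule wins, else the fallback model decides."""
    for condition, label in rules:
        if all(case.get(attr) == value for attr, value in condition.items()):
            return label, "rule"
    return fallback_model(case), "model"

# Hypothetical rules over discretized attributes, as rough set analysis might produce.
rules = [
    ({"debt_ratio": "high", "liquidity": "low"}, "fail"),
    ({"debt_ratio": "low"}, "survive"),
]

def fallback_model(case):
    """Stand-in for a trained neural network; a toy decision on one attribute."""
    return "fail" if case.get("liquidity") == "low" else "survive"

matched = hybrid_predict({"debt_ratio": "high", "liquidity": "low"}, rules, fallback_model)
unmatched = hybrid_predict({"debt_ratio": "medium", "liquidity": "low"}, rules, fallback_model)
```

Returning the source alongside the label makes it possible to report accuracy separately for rule-matched and unmatched cases.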


Exploratory Study on the Quality Grade of Korea Black Raspberry Wines by Using Consumer Preference Data (시판 복분자주의 기호도 분석을 통한 탐색적 등급 분류)

  • Lee, Seung-Joo
    • Korean Journal of Food Science and Technology / v.46 no.3 / pp.352-357 / 2014
  • In this study, 100 consumers (men, 50; women, 50; age group, 20-50 years) rated their overall preferences for 24 Korean raspberry wines using a 9-point hedonic scale. Analysis of variance was conducted to evaluate the effects of gender, age, and sample on the preference scores of the wine products. Significant differences were observed in overall preferences for the 24 samples; however, no interactions based on preferences by age and gender groups were noted. Cluster analysis was performed to determine sample clustering based on the frequencies from the preference data. Three clusters were obtained, and they were well separated by the mean overall preference scores of the samples. Discriminant analysis based on the three clusters also confirmed the same grouping of samples with 100% accuracy.

The Comparative Study between the Effect of Herbal Formula in Schoolbook and the Effect deduced from Compositional Herbal Effects ("방제학"에 기재된 방제 효능과 본초 구성을 기반으로 도출된 효능의 비교 연구)

  • Park, Byoung-Sun;Kim, Eun-Ha;Lee, Sun-A;Lee, Byung-Wook
    • Journal of Korean Medical classics / v.21 no.1 / pp.79-92 / 2008
  • Objective: Analysis based on the effects of herbal formulas is a general tool in traditional medicine. For the effective application of herbal formulas, Korean herbal medicine has traditionally classified them by their curative effects, which arise from the various compositions of the formulas. However, the effects of herbal formulas were not recorded systematically in the ancient literature, and the standards for confirming their effects are not clear. Thus, it is not easy to classify herbal formulas according to their curative effects. Furthermore, there are no standards for estimating the effects of prescriptions frequently used in the clinic. In this study, we aimed to provide a methodology for classifying the curative effects of herbal formulas by calculating the combined effects of each compositional herb through DB systems. Results: By comparing the effects of herbal formulas with those of their compositional herbs, we found that about 25-50% of the herbal effects were included in the effects of the herbal formulas. These results show that prospective estimation of a herbal formula's effects may be possible through DB systems that record herbal effects. To explain a herbal formula's effects more accurately, further studies are needed that give prominence to major effects and subtract minor effects.
