• Title/Summary/Keyword: Classification of Terms

Performance Comparison of Naive Bayesian Learning and Centroid-Based Classification for e-Mail Classification (전자메일 분류를 위한 나이브 베이지안 학습과 중심점 기반 분류의 성능 비교)

  • Kim, Kuk-Pyo;Kwon, Young-S.
    • IE interfaces / v.18 no.1 / pp.10-21 / 2005
  • With the increasing proliferation of the World Wide Web, electronic mail systems have become very widely used communication tools. Research on e-mail classification is important because an e-mail classification system is a major engine for e-mail response management systems, which mine unstructured e-mail messages and automatically categorize them. In this research we compare the performance of Naive Bayesian learning and Centroid-Based Classification using different data sets from an online shopping mall and a credit card company. We analyze which method performs better under which conditions. We compare their classification accuracy as it depends on the structure and size of the training set and on an increasing number of classes. The experimental results indicate that Naive Bayesian learning performs better, while Centroid-Based Classification is more robust in terms of classification accuracy.
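The two classifiers compared in this abstract can be illustrated with a minimal sketch. It assumes a multinomial Naive Bayes with add-one smoothing and a centroid classifier that uses cosine similarity over raw term counts; the abstract does not specify the variants the authors used, and the toy messages, labels, and function names below are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training set: (message text, class label).
TRAIN = [
    ("refund order delivery late", "shopping"),
    ("order shipping delivery status", "shopping"),
    ("card payment limit increase", "card"),
    ("card statement payment due", "card"),
]

def tokenize(text):
    return text.lower().split()

def train_naive_bayes(docs):
    """Count documents per class and term occurrences per class."""
    class_docs, term_counts, vocab = Counter(), defaultdict(Counter), set()
    for text, label in docs:
        class_docs[label] += 1
        for tok in tokenize(text):
            term_counts[label][tok] += 1
            vocab.add(tok)
    return class_docs, term_counts, vocab

def predict_naive_bayes(model, text):
    """Pick the class with the highest log posterior (add-one smoothing)."""
    class_docs, term_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)
        denom = sum(term_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # ignore terms never seen in training
                score += math.log((term_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def train_centroids(docs):
    """Average the term-count vectors of each class into one centroid."""
    sums, counts = defaultdict(Counter), Counter()
    for text, label in docs:
        counts[label] += 1
        sums[label].update(tokenize(text))
    return {lab: {t: c / counts[lab] for t, c in vec.items()}
            for lab, vec in sums.items()}

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def predict_centroid(centroids, text):
    """Pick the class whose centroid is most similar to the message."""
    vec = Counter(tokenize(text))
    return max(centroids, key=lambda lab: cosine(vec, centroids[lab]))

if __name__ == "__main__":
    nb_model = train_naive_bayes(TRAIN)
    centroids = train_centroids(TRAIN)
    for msg in ("delivery of my order", "card payment due date"):
        print(msg, predict_naive_bayes(nb_model, msg), predict_centroid(centroids, msg))
```

Comparing the two on held-out messages while varying the training-set size and the number of classes would mirror the kind of experiment the abstract describes.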

A Study on the classification scheme for the design of Directory Search Engine on the web (web 데이터베이스의 디렉토리 설계를 위한 분류체계 연구)

  • 이명희
    • Journal of the Korean BIBLIA Society for library and Information Science / v.10 no.1 / pp.243-268 / 1999
  • The purpose of this study is to develop a classification scheme for a subject-based directory search engine for educational research information on the web. Five classification systems (Yahoo Korea, Argus Clearinghouse, DDC, ERIC thesaurus and KEDI thesaurus) were measured in terms of coverage of subject fields, system logic, accuracy of terminology and efficiency of searching. For the design of the classification scheme, this study considered the content of subject areas, the features of information resources and user-based efficiency. Finally, the classification scheme was established with 16 main divisions and 47 sub-divisions for educational research information.

The Methodic Study on a Standard of Classification of Pulse Condition - a Focus of 「The Pulse Studies of Bin-Ho (瀕湖脈學)」 - (맥상 분류 기준에 대한 방법론적 고찰 - "빈호맥학(瀕湖脈學)"을 중심으로 -)

  • Lee, Ju-Ho;Choi, Hwan-Soo;Kim, Chul-Jung
    • Korean Journal of Oriental Medicine / v.10 no.1 / pp.49-61 / 2004
  • Standardizing the terminology of pulse studies (脈學) is necessary for the development of the discipline. To correct existing misused terms, this study examines the terminology of pulse studies from a modern and objective standpoint. Focusing on 「The Pulse Studies of Bin-Ho (瀕湖脈學)」, we study a new classification of pulse conditions. The errors in existing technical books on pulse studies stem from the lack of a standardized classification of pulse conditions. To correct the misused terms, we express each pulse condition objectively in terms of the blood vessel, which is the subject of the pulse condition. The description of the blood vessel comprises its depth, the speed of pulsation, its curvature, its thickness, its diameter during expansion and contraction, the interval between expansion and contraction, the distinctness of its boundary, the speed of blood flow within it, the volume of blood flow within it, the condition of the blood within it, and its propelling power. These form the standard for the new classification of pulse conditions.

Image Fusion for Improving Classification

  • Lee, Dong-Cheon;Kim, Jeong-Woo;Kwon, Jay-Hyoun;Kim, Chung;Park, Ki-Surk
    • Proceedings of the KSRS Conference / 2003.11a / pp.1464-1466 / 2003
  • Classification of satellite images provides information about land cover and/or land use. The quality of the classification result depends mainly on the spatial and spectral resolutions of the images. In this study, image fusion, in terms of resolution merging and band integration of multi-source satellite images (Landsat ETM+ and Ikonos), was carried out to improve classification. Resolution merging and band integration can generate high-resolution imagery with more spectral bands. Precise image co-registration is required to remove geometric distortion between the different image sources. A combination of unsupervised and supervised classification of the fused imagery was implemented to improve classification. A 3D display of the results was made possible by combining a DEM with the classification result, so that interpretability could be improved.

The Study on the Trends of Research Papers Published in the Journal of Oriental Medical Surgery, Ophthalmology & Otolaryngology Society (大韓眼耳鼻咽喉皮膚科學會誌에 揭載된 硏究論文들의 傾向性 考察)

  • Kwon, Kang;Seo, Hyung-sik
    • The Journal of Korean Medicine Ophthalmology and Otolaryngology and Dermatology / v.16 no.1 / pp.1-32 / 2003
  • Objectives: This study was carried out to analyze and understand the trends of research papers published in the Journal of Oriental Medical Surgery, Ophthalmology & Otolaryngology Society (JOMSOOS). Methods: We studied the 308 research papers published in JOMSOOS from Vol. 1, No. 1 (1988) to Vol. 15, No. 2 (2002). The analysis proceeded in four stages: 1) we classified all the papers into three categories: original articles, review articles and case reports; 2) we classified the original articles in terms of methodology; 3) we analyzed the case reports according to the type of disease each paper dealt with; 4) we took a further statistical approach to each paper to determine the distribution of diagnoses in detail. Results: We obtained the following outcomes from our analysis of the papers across the four stages. 1. Overall Analysis. 1) Classification of 308 research papers between 1988 and 2002: 137 original articles (44.48%), 111 review articles (36.04%), 56 case reports (18.18%). 2) Language used: Korean (99.03%), English (0.97%). 3) Number of authors: 2 persons (42.86%), 3 persons (29.87%), 1 person (14.61%). 2. Original Article Analysis. 1) Classification of 137 original articles in terms of methodology: 90 experimental studies (65.69%), 46 descriptive studies (33.58%), 1 analytic study (0.73%). 2) Classification of the original articles according to the use of statistical methods: no statistical methods (36.42%), descriptive methods only (1.99%), not defined (23.18%), t-test (24.50%), ANOVA (3.97%), multiple comparison (2.65%), non-parametric test (2.65%), other methods (1.32%). 3) Classification of 46 descriptive articles in terms of diseases: otorhinolaryngology (43.48%), dermatology (23.91%), ophthalmology (13.04%), facial palsy (13.04%). 4) Classification of descriptive articles in terms of the number of patients: the most common was 'more than 26 but less than 50 persons' (19 articles, 41.30%). 5) Classification of descriptive articles in terms of the patient observation period: the most common was 'more than 9 but less than 12 months' (34.78%). Out of the 34.78%, the number of articles with patients observed for more than 12 months was 13 (28.26%). 3. Case Report Analysis. 1) Classification of 56 case reports in terms of the type of disease: dermatology (44.64%), ophthalmology (19.64%), otorhinolaryngology (14.29%), facial palsy (8.93%). 2) Classification in terms of the number of patients: 1 person (50%), 3 persons (16.07%), 2 persons (14.29%). 4. Diagnosis Distribution of Each Disease. 1) Studies regarding ophthalmology: the percentage of 'strabismus' cases was the highest (33.33%). 2) Studies regarding otorhinolaryngology: nasal inflammation (34.48%), tinnitus (20.69%). 3) Studies regarding dermatology: the percentage of 'allergic skin disease' was the highest (33.33%). Conclusions: We analyzed in detail the trends of the research papers published in JOMSOOS and came to understand those trends through this study. However, we acknowledge that we adopted only the quantitative method out of the various possible methods of analysis. For further studies, we strongly urge the adoption of qualitative methods as well.

Classification Index and Grade Levels for Energy Efficiency Classification of Agricultural Heaters in Korea

  • Shin, Chang Seop;Jang, Ji Hoon;Kim, Young Tae;Kim, Kyeong Uk
    • Journal of Biosystems Engineering / v.38 no.4 / pp.264-269 / 2013
  • Purpose: This study was carried out to develop a classification index and grade levels to rate agricultural heaters for energy efficiency classification. Methods: The classification index was developed mainly by taking simplicity of calculation and easy access to relevant data into consideration. The grade levels were developed on the basis of a 5-grade classification system in which graded heaters are to be normally distributed over the grades. The values of each grade level were determined in terms of the classification index values calculated using the published performance data of agricultural heaters tested at the FACT in Korea over the past 12 years. Results: The thermal efficiency of agricultural heaters based on the enthalpy method was proposed as a reasonable classification index. The grade levels were proposed in equation form for three types of agricultural heaters: fossil fuel heaters, wood pellet heaters and wood pellet boilers. A reasonable energy efficiency classification of agricultural heaters could be performed using the proposed classification index and grade levels. Conclusions: It is expected that energy saving programs will be extended to agricultural machines in the near future. The classification index and grade levels to rate agricultural heaters for energy efficiency classification were developed and proposed in anticipation of that extension.

Applying Traditional Korean Medical Terms to SUI in the Unified Medical Language System(UMLS) Metathesaurus

  • Hong, Seong-Cheon;Jeong, Heon-Young;Jeon, Byong-Uk
    • Journal of the Korean Institute of Oriental Medical Informatics / v.16 no.1 / pp.1-8 / 2010
  • Objective: Controlled vocabularies such as thesauri and classifications allow information to be reused and shared effectively by defining distinct concepts and linking terms to one another. The UMLS (Unified Medical Language System) is one of the most universal medical terminology systems. Various methods are needed to share and reuse information in traditional Korean medicine. We investigate how to adopt the SUI of the UMLS, the de facto standard among medical terminology systems, for traditional Korean medical terminology. Method: We describe the major problems and the application process encountered when we tried to add the meridian-related part of traditional Korean medicine to the UMLS Metathesaurus. Comparing western medical terms and traditional Korean medical terms for application to the UMLS Metathesaurus shows not only many consistencies but also differences. Result: We confirmed the differences and consistencies between western medical terms and traditional Korean medical terms, and then reviewed methods for applying the CUI, LUI and SUI to traditional Korean medical terms. Traditional Korean medical terms are not distinguished by singular or plural strings. In addition, their strings vary under the initial law (the law of the initial sound of a syllable). Terms are written in Korean, traditional Chinese, modern Chinese, and so on. The SUI takes a distinct value according to meaning, language and initial law. Conclusion: There are many differences between western medical terms and traditional Korean medical terms in applying the UMLS. For better integration of traditional Korean medicine into the UMLS, further research is needed on the standardization and classification of traditional Korean medical terms, medical information systems, and related areas. We hope this study helps the future implementation of the UMLS, EHR and knowledge-based systems in Oriental medicine.

The Prediction Performance of the CART Using Bank and Insurance Company Data (CART의 예측 성능:은행 및 보험 회사 데이터 사용)

  • Park, Jeong-Seon
    • The Transactions of the Korea Information Processing Society / v.3 no.6 / pp.1468-1472 / 1996
  • In this study, the performance of CART (Classification and Regression Tree) is compared with that of the discriminant analysis method. In most experiments using bank data, discriminant analysis shows better performance in terms of the total cost. In contrast, most experiments using insurance data show that CART is better than discriminant analysis in terms of the total cost. The contradictory results are analysed using the characteristics of the data sets. The performance of both CART and discriminant analysis depends on the following parameters: failure prior probability, data used, type I error, type II error cost, and validation method.
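The "total cost" criterion can be made concrete with a small sketch. The abstract does not give the authors' formula, so the expression below is an assumption: the common expected misclassification cost, in which a type I error means classifying a failing case as non-failing. The function name, parameter names, and numbers are all hypothetical.

```python
def expected_total_cost(prior_fail, type1_rate, type2_rate, cost_type1, cost_type2):
    """Expected misclassification cost per case (assumed formula, not the paper's).

    prior_fail  -- prior probability that a case is a failure
    type1_rate  -- share of failing cases classified as non-failing
    type2_rate  -- share of non-failing cases classified as failing
    cost_type1  -- cost of one type I error
    cost_type2  -- cost of one type II error
    """
    return (prior_fail * type1_rate * cost_type1
            + (1.0 - prior_fail) * type2_rate * cost_type2)

if __name__ == "__main__":
    # Hypothetical error rates for two classifiers on the same data set.
    tree_cost = expected_total_cost(0.1, 0.20, 0.05, cost_type1=10, cost_type2=1)
    lda_cost = expected_total_cost(0.1, 0.30, 0.02, cost_type1=10, cost_type2=1)
    print("tree:", tree_cost, "discriminant analysis:", lda_cost)
```

Under such a formula, which classifier wins can flip as the failure prior or the cost ratio changes, which is consistent with the abstract's remark that performance depends on these parameters.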

Category Factor Based Feature Selection for Document Classification

  • Kang Yun-Hee
    • International Journal of Contents / v.1 no.2 / pp.26-30 / 2005
  • With the fast growth of information on the Internet, it is becoming increasingly difficult to find and organize useful information. To reduce information overload, automatic text classification is needed to handle the enormous number of documents. Support Vector Machine (SVM) is a model that is calculated as a weighted sum of kernel function outputs. This paper describes a document classifier for web documents in the field of Information Technology that uses SVM to learn a model constructed from the training sets and their representative terms. The basic idea is to exploit the distribution of the representative terms' meanings in coherent thematic texts of each category using simple statistical methods. A vector-space model is applied to represent the documents in the categories, using a feature selection scheme based on TF-IDF. We apply to the feature selection a category factor, which represents the effect of a term within a category. Experiments show the results of categorization and the correlation of vector length.
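A minimal sketch of TF-IDF feature selection weighted by a per-category factor follows. The abstract does not define the category factor, so the one used here (the share of documents containing a term that belong to the category) is purely a hypothetical stand-in, as are the toy documents and function names.

```python
import math
from collections import Counter

def tfidf_scores(docs):
    """docs: list of non-empty token lists. Returns one {term: tf-idf} dict per doc."""
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # document frequency counts each doc once
    scores = []
    for tokens in docs:
        tf = Counter(tokens)
        scores.append({t: (c / len(tokens)) * math.log(n / df[t])
                       for t, c in tf.items()})
    return scores

def category_factor(term, category, docs, labels):
    """Hypothetical factor: fraction of docs containing `term` that are in `category`."""
    containing = [lab for tokens, lab in zip(docs, labels) if term in tokens]
    return containing.count(category) / len(containing) if containing else 0.0

def select_features(docs, labels, category, k):
    """Rank a category's terms by summed tf-idf times the category factor."""
    totals = Counter()
    for tokens, lab, sc in zip(docs, labels, tfidf_scores(docs)):
        if lab == category:
            for term, score in sc.items():
                totals[term] += score * category_factor(term, category, docs, labels)
    return [term for term, _ in totals.most_common(k)]

# Hypothetical toy corpus.
DOCS = [
    ["network", "router", "protocol"],
    ["network", "database", "query"],
    ["database", "index", "query"],
]
LABELS = ["net", "db", "db"]

if __name__ == "__main__":
    print(select_features(DOCS, LABELS, "db", 3))
```

The selected terms would then define the dimensions of the document vectors passed to the classifier; the SVM training step itself is omitted here.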

A Real-Time Concept-Based Text Categorization System using the Thesaurus Tool (시소러스 도구를 이용한 실시간 개념 기반 문서 분류 시스템)

  • 강원석;강현규
    • Journal of KIISE:Software and Applications / v.26 no.1 / pp.167-167 / 1999
  • The majority of text categorization systems use the term-based classification method. However, because there are too many terms, this method is not effective for classifying documents in a real-time environment. This paper presents a real-time concept-based text categorization system, which classifies texts using a thesaurus. The system consists of a Korean morphological analyzer, a thesaurus tool, and a probability-vector similarity measurer. The thesaurus tool acquires the meanings of input terms and represents the text not with a term vector but with a concept vector. Because the concept vector consists of a small number of semantic units, it enables the system to analyze text in real time. Since it represents the meanings of the text, the vector supports concept-based classification. The probability-vector similarity measurer decides the subject of the text by calculating the vector similarity between the input text and each subject. In the experimental results, we show that the proposed system can effectively analyze texts in real time and perform concept-based classification. Moreover, the experiment indicates that the thesaurus tool must be expanded to improve the system.
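The pipeline described in this abstract (terms mapped to concepts, then a similarity comparison against each subject) can be sketched as follows. This is an illustration under stated assumptions, not the authors' system: the thesaurus is a hypothetical dictionary, the concept vector is normalized to sum to 1, cosine similarity stands in for the unspecified probability-vector similarity measure, and morphological analysis is replaced by pre-tokenized input.

```python
import math
from collections import Counter

# Hypothetical thesaurus mapping terms to concepts.
THESAURUS = {
    "dog": "animal", "puppy": "animal", "cat": "animal",
    "stock": "finance", "bond": "finance", "bank": "finance",
}

# Hypothetical subject profiles expressed over the same concepts.
SUBJECTS = {
    "pets": {"animal": 1.0},
    "markets": {"finance": 1.0},
}

def concept_vector(tokens, thesaurus):
    """Map tokens to concepts and normalize the counts to sum to 1."""
    counts = Counter(thesaurus[t] for t in tokens if t in thesaurus)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}

def cosine(a, b):
    dot = sum(w * b.get(c, 0.0) for c, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(tokens, subjects, thesaurus):
    """Return the subject whose concept profile is most similar to the text."""
    vec = concept_vector(tokens, thesaurus)
    return max(subjects, key=lambda s: cosine(vec, subjects[s]))

if __name__ == "__main__":
    print(classify(["dog", "puppy", "bank"], SUBJECTS, THESAURUS))
```

Because many terms collapse onto few concepts, the vectors stay small, which is the property the abstract credits for real-time operation.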