• Title/Summary/Keyword: 주 범주 (main category)

An Analytical Study on Performance Factors of Automatic Classification based on Machine Learning (기계학습에 기초한 자동분류의 성능 요소에 관한 연구)

  • Kim, Pan Jun
    • Journal of the Korean Society for Information Management
    • /
    • v.33 no.2
    • /
    • pp.33-59
    • /
    • 2016
  • This study examined the factors affecting the performance of machine-learning-based automatic classification of domestic conference papers. Specifically, using the Rocchio algorithm to automatically assign class labels to papers in the Proceedings of the Conference of the Korean Society for Information Management, I investigated the characteristics of the key factors (classifier formation methods, training set size, weighting schemes, label assigning methods) through a range of experiments. The results show that it is more effective to apply appropriate parameters (${\beta}$, ${\lambda}$) and training set size (more than 5 years) according to the classification environment and the properties of the document set. When performance is equivalent, simpler methods (single weighting schemes) are more efficient. Also, because the classification of domestic papers is a multi-label task, in which more than one label is assigned to an article, it is necessary to develop an optimal classification model based on the characteristics of the key factors with this environment in mind.
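
The Rocchio setup described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: the prototype formula (mean of positive examples minus `beta` times the mean of negative examples), the cosine scoring, the hypothetical `threshold` rule for multi-label assignment, and the toy documents are all assumptions. The abstract does not say what ${\lambda}$ controls, so it is not modelled here.

```python
import math
from collections import defaultdict


def rocchio_prototypes(docs, labels, beta=0.25):
    """Build one prototype vector per label.

    docs:   list of {term: weight} dictionaries (assumed already weighted)
    labels: list of label sets, one per document (multi-label)
    beta:   weight of the negative examples (assumed role of the parameter)
    """
    prototypes = {}
    for label in set().union(*labels):
        pos = [d for d, ls in zip(docs, labels) if label in ls]
        neg = [d for d, ls in zip(docs, labels) if label not in ls]
        proto = defaultdict(float)
        for d in pos:
            for term, w in d.items():
                proto[term] += w / len(pos)
        for d in neg:
            for term, w in d.items():
                proto[term] -= beta * w / len(neg)
        # Drop non-positive weights so the prototype stays a plain term profile.
        prototypes[label] = {t: w for t, w in proto.items() if w > 0}
    return prototypes


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign_labels(doc, prototypes, threshold=0.3):
    """Assign every label scoring at least `threshold`; fall back to the best one."""
    scores = {label: cosine(doc, proto) for label, proto in prototypes.items()}
    chosen = {label for label, s in scores.items() if s >= threshold}
    return chosen or {max(scores, key=scores.get)}


# Toy training set (hypothetical terms and labels).
train_docs = [
    {"retrieval": 1, "index": 1},
    {"retrieval": 1, "query": 1},
    {"library": 1, "catalog": 1},
    {"library": 1, "retrieval": 1},
]
train_labels = [{"IR"}, {"IR"}, {"LIB"}, {"IR", "LIB"}]

protos = rocchio_prototypes(train_docs, train_labels, beta=0.25)
print(assign_labels({"retrieval": 1, "query": 1}, protos))
print(assign_labels({"library": 1, "retrieval": 1}, protos))
```

Varying `beta`, the number of training documents, or the assignment rule in a sketch like this mirrors the kinds of factors the study compares.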

Analysis on the Key Words related to Healthcare Issues of the Prevention and Control of COVID-19 in Major Korean Newspapers, 2020 (2020년 코로나-19 관련 한국 주요 신문에서 방역관련 주요 주제어 분석)

  • Kim, Min-Young;Gu, Bo-Kyung;Yoon, Bo-Ra;Baek, Jin-Won;Lee, Moo-Sik
    • Journal of agricultural medicine and community health
    • /
    • v.46 no.3
    • /
    • pp.153-161
    • /
    • 2021
  • Background: This study analyzed the main keywords of newspaper articles related to COVID-19 in 2020, by category of quarantine measure and by epidemic period. Methods: We analyzed articles related to COVID-19 in three major Korean newspapers published between February 17 and December 31, 2020. We targeted the front-page articles on Mondays and Thursdays. The relationship between the two variables was tested with the chi-square test. Results: When the main keywords were analyzed by category of quarantine measure, non-pharmaceutical interventions were the most common at 54.3%, followed by 3Ts (test, tracing, treatment) and vaccine at 31.9%. Within the non-pharmaceutical intervention category, social distancing was the most common at 33.9%. Within the 3Ts (test, tracing, treatment) and vaccine category, diagnostic tests were the most common at 41.8%. Conclusions: Non-pharmaceutical interventions were the most commonly reported, and the reporting of main keywords by category of quarantine measure differed across the epidemic periods of COVID-19 in 2020.
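
For readers unfamiliar with the chi-square test of independence mentioned in the Methods, the statistic can be computed as in the sketch below. The counts are invented purely for illustration and are not the study's data; the function name `chi_square_independence` is likewise hypothetical.

```python
def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for a contingency table.

    table: list of rows, each a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows = two epidemic periods, columns = two keyword categories.
counts = [[30, 10],
          [20, 40]]
stat, dof = chi_square_independence(counts)
print(round(stat, 3), dof)
```

The statistic is then compared with the chi-square distribution for the given degrees of freedom (for example, a 5% critical value of about 3.841 when there is one degree of freedom).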

Automatic Text Categorization Using Passage-based Weight Function and Passage Type (문단 단위 가중치 함수와 문단 타입을 이용한 문서 범주화)

  • Joo, Won-Kyun;Kim, Jin-Suk;Choi, Ki-Seok
    • The KIPS Transactions:PartB
    • /
    • v.12B no.6 s.102
    • /
    • pp.703-714
    • /
    • 2005
  • Research in text categorization has been confined to whole-document-level classification, probably due to the lack of full-text test collections. However, the full-length documents available today in large quantities have renewed interest in text classification. A document is usually written with an organized structure to present its main topic(s). This structure can be expressed as a sequence of sub-topic text blocks, or passages. To reflect the sub-topic structure of a document, we propose a new passage-level (passage-based) text categorization model, which segments a test document into several passages, assigns categories to each passage, and merges the passage categories into document categories. Compared with traditional document-level categorization, this model requires two additional steps: passage splitting and category merging. Using four subsets of the Reuters text categorization test collection and a full-text test collection whose documents range from tens to hundreds of kilobytes, we evaluated the proposed model, in particular the effectiveness of various passage types and the importance of passage location in category merging. Our results show that simple windows are best for all test collections used in these experiments. We also found that passages contribute to the main topic(s) to different degrees, depending on their location in the test document.
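
The two extra steps named in this abstract, passage splitting and category merging, can be sketched as below. This is only an illustration under stated assumptions: passages are fixed-size, non-overlapping token windows, the per-passage scorer is a hypothetical keyword counter, and merging sums location-weighted passage scores. The paper's actual weight function and passage types are not given in the abstract.

```python
def split_into_windows(tokens, size):
    """Split a token list into consecutive, non-overlapping windows."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def score_passage(passage, category_terms):
    """Hypothetical passage scorer: count the category keywords in the passage."""
    return {cat: sum(1 for tok in passage if tok in terms)
            for cat, terms in category_terms.items()}


def categorize_document(tokens, category_terms, window_size=4, location_weight=None):
    """Score each passage, then merge passage scores into document scores.

    location_weight: optional function (passage_index, passage_count) -> float,
    an assumed stand-in for a passage-location weighting scheme.
    """
    passages = split_into_windows(tokens, window_size)
    totals = {cat: 0.0 for cat in category_terms}
    for index, passage in enumerate(passages):
        weight = location_weight(index, len(passages)) if location_weight else 1.0
        for cat, score in score_passage(passage, category_terms).items():
            totals[cat] += weight * score
    return max(totals, key=totals.get), totals


# Toy example with invented keywords.
terms = {"finance": {"bank", "stock"}, "sports": {"goal", "match"}}
text = "bank stock rises today match goal goal later".split()

print(categorize_document(text, terms))
# Weighting the first passage more heavily can change the merged category.
print(categorize_document(text, terms,
                          location_weight=lambda i, n: 2.0 if i == 0 else 1.0))
```

The second call shows why passage location matters in merging: the same passage scores lead to a different document category once early passages count for more.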

Ordering Variables and Categories on the Mosaic Plot (모자이크 플롯에서 변수와 범주의 순서화)

  • Lee, Moon-Joo;Huh, Myung-Hoe
    • The Korean Journal of Applied Statistics
    • /
    • v.21 no.5
    • /
    • pp.875-888
    • /
    • 2008
  • Mosaic plots, proposed by Hartigan and Kleiner (1981, 1984), are very useful for visualizing categorical data. In a mosaic plot, multi-way classified cell frequencies are represented by rectangles with proportional area. The plot is easy to understand while preserving the information contained in the data. The plot's appearance, however, changes substantially depending on the order in which variables are entered into the plot and the order of the categories within each variable. In this study, we propose algorithms for ordering the variables and categories of categorical data to be explored via mosaic plots. We demonstrate our methods on three well-known datasets: Titanic, Housing and PreSex.
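
To make the geometry concrete, the sketch below computes the tiles of a two-way mosaic plot in the unit square and shows that reordering categories moves the tiles without changing their areas. It does not implement the ordering algorithms proposed in the paper, which the abstract does not describe; the function name `mosaic_tiles` and the counts are hypothetical.

```python
def mosaic_tiles(table, row_order=None, col_order=None):
    """Compute mosaic tiles for a two-way table.

    table: {row_category: {col_category: count}}
    Returns a list of (row, col, x, y, width, height) tuples in the unit square.
    Widths follow the row margins; heights follow the conditional proportions.
    """
    rows = row_order or list(table)
    cols = col_order or list(table[rows[0]])
    total = sum(table[r][c] for r in rows for c in cols)
    tiles = []
    x = 0.0
    for r in rows:
        row_total = sum(table[r][c] for c in cols)
        width = row_total / total
        y = 0.0
        for c in cols:
            height = table[r][c] / row_total if row_total else 0.0
            tiles.append((r, c, x, y, width, height))
            y += height
        x += width
    return tiles


# Invented counts for illustration.
counts = {"A": {"yes": 30, "no": 10}, "B": {"yes": 20, "no": 40}}

for tile in mosaic_tiles(counts):
    print(tile)
# Same data, different category order: tile positions change, areas do not.
for tile in mosaic_tiles(counts, row_order=["B", "A"], col_order=["no", "yes"]):
    print(tile)
```

Because only positions change, an ordering method is free to choose the arrangement that makes patterns in the data easiest to see.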

A Study on the Comparison between E-MDR and D-MDR in Continuous Data (연속형 데이터에서 E-MDR과 D-MDR방법 비교)

  • Lee, Jea-Young;Lee, Ho-Guen
    • Communications for Statistical Applications and Methods
    • /
    • v.16 no.4
    • /
    • pp.579-586
    • /
    • 2009
  • The multifactor dimensionality reduction (MDR) method is generally used to study interaction effects in statistical models. However, the MDR method cannot be applied in all cases: it can be applied only to case-control data. We therefore propose two methods, E-MDR and D-MDR, which use a regression tree algorithm and dummy variables. We applied these methods to identify interaction effects of single nucleotide polymorphisms (SNPs) responsible for longissimus dorsi muscle area (LMA), carcass cold weight (CWT) and average daily gain (ADG) in a Hanwoo beef cattle population. Finally, we compare the results using a permutation test.
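
The restriction to case-control data follows from how standard MDR pools genotype combinations: each cell is labelled high or low risk from its case-to-control ratio, which has no direct analogue for a continuous trait. The sketch below shows that pooling step only. It is an illustration of conventional MDR under assumed conventions (a cell is high risk when cases are at least `threshold` times controls), not the E-MDR or D-MDR procedures, and the genotype data are invented.

```python
from collections import Counter


def mdr_high_risk_cells(genotypes, status, threshold=1.0):
    """Label multi-locus genotype cells as high risk.

    genotypes: list of tuples, one genotype combination per individual
    status:    list of 1 (case) or 0 (control)
    threshold: assumed case-to-control ratio cutoff
    """
    cases, controls = Counter(), Counter()
    for cell, s in zip(genotypes, status):
        (cases if s == 1 else controls)[cell] += 1
    # Comparing counts avoids dividing by zero for cells with no controls.
    return {cell for cell in set(cases) | set(controls)
            if cases[cell] >= threshold * controls[cell]}


def mdr_accuracy(genotypes, status, high_risk):
    """Fraction of individuals whose status matches their cell's risk label."""
    hits = sum(1 for cell, s in zip(genotypes, status)
               if (cell in high_risk) == (s == 1))
    return hits / len(status)


# Invented two-SNP data.
geno = [("AA", "BB"), ("AA", "BB"), ("AA", "BB"),
        ("Aa", "Bb"), ("Aa", "Bb"), ("Aa", "Bb"),
        ("aa", "bb")]
stat = [1, 1, 0, 0, 0, 1, 0]

high = mdr_high_risk_cells(geno, stat)
print(high, mdr_accuracy(geno, stat, high))
```

With a continuous outcome such as carcass weight there are no cases and controls to count, which is the gap the two proposed methods address.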

The effects of attribute alignment on category learning (속성간의 대응이 범주학습에 미치는 효과)

  • 이태연
    • Korean Journal of Cognitive Science
    • /
    • v.12 no.4
    • /
    • pp.29-39
    • /
    • 2001
  • Kaplan (2000) reported that instances were categorized more accurately in the aligned condition than in the non-aligned condition, irrespective of the similarity between instances [16]. This study investigated whether Kaplan's (2000) results could be explained by the stimulus types she used, and whether alignment effects in categorization were due to selective attention to aligned attributes. In Experiment 1, I examined whether attribute alignment produced significant effects on similarity and categorization, and whether aligned attributes were recalled more than non-aligned ones. Results showed that instances were rated as more similar and categories were learned more rapidly in the aligned condition than in the non-aligned condition. The rapid learning in the aligned condition can be explained by attribute alignment increasing within-category similarity. However, the finding that aligned attributes were recalled more than non-aligned ones in the attribute recall test implies that alignment effects in categorization can be partially independent of the similarity between instances. In Experiment 2, I used equal numbers of attributes defining the two categories and instructed subjects to attend only to categorization-relevant dimensions. Results showed that the dimension instruction facilitated category learning in the non-aligned condition only, but categories were learned more rapidly in the aligned condition than in the non-aligned condition irrespective of instruction type. In conclusion, attribute alignment in categorization may facilitate selective attention to categorization-relevant attributes.

Some Application Principles of Categorial Grammars for Korean Syntactic Analysis and Sentence Generation (한국어 구문 분석과 문장 생성을 위한 범주 문법 적용의 몇 가지 원칙)

  • Song, Do-Gyu;Cha, Keon-Hoe;Park, Jay-Duke
    • Annual Conference on Human and Language Technology
    • /
    • 1997.10a
    • /
    • pp.353-359
    • /
    • 1997
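  • Categorial grammar, developed mainly for the syntactic analysis of configurational languages such as English and French, introduced the notion of directionality in accordance with the syntactic characteristics of those languages, in which the position of constituents within a sentence is largely fixed and syntactic function is assigned by position. However, this notion of directionality is difficult to apply as it stands to non-configurational languages such as Korean, in which the position of constituents within a sentence is relatively free. Even when applied to configurational languages, it fails to properly analyze sentences with inversion or extraposition, or sentences containing unbounded dependency constructions. For these reasons, this paper reconsiders the directionality introduced into categorial grammar and proposes five principles for applying categorial grammar to Korean syntactic analysis and sentence generation.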

Improving Classification Accuracy for Numerical and Nominal Data using Virtual Examples (가상예제를 이용한 수치 및 범주 속성 데이터의 분류 성능 향상)

  • Lee, Yu-Jung;Kang, Jae-Ho;Kang, Byoung-Ho;Ryu, Kwang-Ryel
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2006.10b
    • /
    • pp.183-188
    • /
    • 2006
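  • This paper proposes a method for improving classification performance on data with nominal and numerical attributes by using virtual examples that are generated and evaluated on the basis of a Bayesian network. Whereas previous studies using virtual examples mainly targeted numerical-attribute data, this study also applies virtual examples to nominal-attribute data and confirms their effect. Moreover, unlike previous studies that aimed to improve the performance of a specific learning algorithm by exploiting knowledge specific to the target domain, this study does not use domain-specific knowledge. Instead, it generates virtual examples from a Bayesian network built from the given training set and selects an example as useful if it contributes to increasing the conditional likelihood of the network. This generation and selection process is repeated to collect a virtual example set of appropriate size. Experiments on nominal-attribute data and on data that include numerical attributes confirmed that the performance of several learning models improved.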

A Grounded Theory Approach to the Process of Conflict between Early Childhood Teacher and Parent on the Perspectives of Teachers (유아교사의 관점에서 본 교사와 학부모의 갈등과정 : 근거이론적 접근)

  • Kim, Young Ju;Lee, Kyeong Hwa
    • Korean Journal of Childcare and Education
    • /
    • v.11 no.5
    • /
    • pp.237-260
    • /
    • 2015
  • This study sought to explain the process of conflict between early childhood teachers and parents (T-P conflict) and was guided by three questions: (a) how does a T-P conflict begin? (b) how does a T-P conflict develop over time? and (c) how does a T-P conflict end? One hundred cases were provided by private kindergarten teachers with experience of T-P conflict. A qualitative grounded theory design was used to analyze the data. Open coding and axial coding resulted in six categories: (a) "causes of conflict", (b) "conditional context of conflict", (c) "state of conflict", (d) "amplification of conflict", (e) "problem solving strategies of conflict", and (f) "cease of conflict". Selective coding drew out three core categories: (a) "prelude with tuneless instruments", (b) "duet for discords and concords", and (c) "splendid finale vs. unplanned intermission". Additionally, the study raised doubts about current early childhood education policies based on neo-liberalism and their impact on relationships between teachers and parents.