• Title/Summary/Keyword: 다중레이블 분류방법

Search Result 24, Processing Time 0.021 seconds

Research on Mining Technology for Explainable Decision Making (설명가능한 의사결정을 위한 마이닝 기술)

  • Kyungyong Chung
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.24 no.4
    • /
    • pp.186-191
    • /
    • 2023
  • Data processing techniques play a critical role in decision-making, including handling missing and outlier data, prediction, and recommendation models. This requires a clear explanation of the validity, reliability, and accuracy of all processes and results. In addition, it is necessary to solve data problems through explainable models using decision trees, inference, etc., and proceed with model lightweight by considering various types of learning. The multi-layer mining classification method that applies the sixth principle is a method that discovers multidimensional relationships between variables and attributes that occur frequently in transactions after data preprocessing. This explains how to discover significant relationships using mining on transactions and model the data through regression analysis. It develops scalable models and logistic regression models and proposes mining techniques to generate class labels through data cleansing, relevance analysis, data transformation, and data augmentation to make explanatory decisions.

Internet Multicast Routing Protocol Model using MPLS Networks (MPLS망을 이용한 인터넷 멀티캐스트 라우팅 프로토콜 모델)

  • Kim, Young-Jun
    • The KIPS Transactions:PartC
    • /
    • v.10C no.1
    • /
    • pp.77-86
    • /
    • 2003
  • This paper describes the new method for Internet multicast routing protocols using MPLS (Multiprotocol Label Switching) networks. Internet multicast routing protocols are divided into three categories in terms if tree types and tree characteristics : a shortest path tree a shared tree and hybrid tree types. MPLS should support various multicast mechanisms because of extremely different IP multicast architectures, such as uni-/bi-directional link, Flooding/prune tree maintenance mechanism. the existence of different tree types with the same group, etc. There are so many problems over MPLS multicast that the solutions can't be easily figured out. In this Paper, we make a few assumptions on which the solutions of IP multicast routing protocols over MPLS networks are given. A broadcasting label is defined for the shortest path tree types. Cell interleaving problems of the shared tree types is solved by using block-based transmission mechanism. Finally, the existing hybrid-type multicast routing protocol is reasonably modified Shortest Path tree type to support MPLS multicast. It has been shown that these modifications give better performance (transmission delay) than the orignal method.

Recognition of Answer Type for WiseQA (WiseQA를 위한 정답유형 인식)

  • Heo, Jeong;Ryu, Pum Mo;Kim, Hyun Ki;Ock, Cheol Young
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.4 no.7
    • /
    • pp.283-290
    • /
    • 2015
  • In this paper, we propose a hybrid method for the recognition of answer types in the WiseQA system. The answer types are classified into two categories: the lexical answer type (LAT) and the semantic answer type (SAT). This paper proposes two models for the LAT detection. One is a rule-based model using question focuses. The other is a machine learning model based on sequence labeling. We also propose two models for the SAT classification. They are a machine learning model based on multiclass classification and a filtering-rule model based on the lexical answer type. The performance of the LAT detection and the SAT classification shows F1-score of 82.47% and precision of 77.13%, respectively. Compared with IBM Watson for the performance of the LAT, the precision is 1.0% lower and the recall is 7.4% higher.

Mapping Categories of Heterogeneous Sources Using Text Analytics (텍스트 분석을 통한 이종 매체 카테고리 다중 매핑 방법론)

  • Kim, Dasom;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.4
    • /
    • pp.193-215
    • /
    • 2016
  • In recent years, the proliferation of diverse social networking services has led users to use many mediums simultaneously depending on their individual purpose and taste. Besides, while collecting information about particular themes, they usually employ various mediums such as social networking services, Internet news, and blogs. However, in terms of management, each document circulated through diverse mediums is placed in different categories on the basis of each source's policy and standards, hindering any attempt to conduct research on a specific category across different kinds of sources. For example, documents containing content on "Application for a foreign travel" can be classified into "Information Technology," "Travel," or "Life and Culture" according to the peculiar standard of each source. Likewise, with different viewpoints of definition and levels of specification for each source, similar categories can be named and structured differently in accordance with each source. To overcome these limitations, this study proposes a plan for conducting category mapping between different sources with various mediums while maintaining the existing category system of the medium as it is. Specifically, by re-classifying individual documents from the viewpoint of diverse sources and storing the result of such a classification as extra attributes, this study proposes a logical layer by which users can search for a specific document from multiple heterogeneous sources with different category names as if they belong to the same source. Besides, by collecting 6,000 articles of news from two Internet news portals, experiments were conducted to compare accuracy among sources, supervised learning and semi-supervised learning, and homogeneous and heterogeneous learning data. It is particularly interesting that in some categories, classifying accuracy of semi-supervised learning using heterogeneous learning data proved to be higher than that of supervised learning and semi-supervised learning, which used homogeneous learning data. This study has the following significances. First, it proposes a logical plan for establishing a system to integrate and manage all the heterogeneous mediums in different classifying systems while maintaining the existing physical classifying system as it is. This study's results particularly exhibit very different classifying accuracies in accordance with the heterogeneity of learning data; this is expected to spur further studies for enhancing the performance of the proposed methodology through the analysis of characteristics by category. In addition, with an increasing demand for search, collection, and analysis of documents from diverse mediums, the scope of the Internet search is not restricted to one medium. However, since each medium has a different categorical structure and name, it is actually very difficult to search for a specific category insofar as encompassing heterogeneous mediums. The proposed methodology is also significant for presenting a plan that enquires into all the documents regarding the standards of the relevant sites' categorical classification when the users select the desired site, while maintaining the existing site's characteristics and structure as it is. This study's proposed methodology needs to be further complemented in the following aspects. First, though only an indirect comparison and evaluation was made on the performance of this proposed methodology, future studies would need to conduct more direct tests on its accuracy. That is, after re-classifying documents of the object source on the basis of the categorical system of the existing source, the extent to which the classification was accurate needs to be verified through evaluation by actual users. In addition, the accuracy in classification needs to be increased by making the methodology more sophisticated. Furthermore, an understanding is required that the characteristics of some categories that showed a rather higher classifying accuracy of heterogeneous semi-supervised learning than that of supervised learning might assist in obtaining heterogeneous documents from diverse mediums and seeking plans that enhance the accuracy of document classification through its usage.