• Title/Summary/Keyword: learning classification

A Study on Automatic Classification of Record Text Using Machine Learning (기계학습을 이용한 기록 텍스트 자동분류 사례 연구)

  • Kim, Hae Chan Sol;An, Dae Jin;Yim, Jin Hee;Rieh, Hae-Young
    • Journal of the Korean Society for Information Management / v.34 no.4 / pp.321-344 / 2017
  • Research on the automatic classification of records and documents has been conducted for a long time. Recently, artificial intelligence technology has advanced by combining machine learning and deep learning. In this study, we first reviewed the process of automatic document classification and the learning methods of artificial intelligence. We also discussed the need to apply artificial intelligence technology to records management, drawing on various cases of machine learning, especially supervised methods. We then conducted a test that automatically classified the public records of the Seoul Metropolitan Government into BRM using ETRI's Exobrain, based on a supervised machine learning method. From this, we identified the issues that records management agencies should consider at each step when automatically classifying records into various classification schemes.

Deep Meta Learning Based Classification Problem Learning Method for Skeletal Maturity Indication (골 성숙도 판별을 위한 심층 메타 학습 기반의 분류 문제 학습 방법)

  • Min, Jeong Won;Kang, Dong Joong
    • Journal of Korea Multimedia Society / v.21 no.2 / pp.98-107 / 2018
  • In this paper, we propose a method to classify skeletal maturity from a small number of hand-wrist X-ray images using deep learning-based meta-learning. General deep learning techniques require large amounts of data, but in many cases such data sets are not available for practical applications. The lack of training data is usually addressed through transfer learning using models pre-trained on large data sets. However, transfer learning performance may degrade because of overfitting on an unknown new task with little data, which results in poor generalization capability. In addition, obtaining labeled medical images requires costly resources such as professional manpower and much time. Therefore, in this paper, we use meta-learning, which can classify using only a small amount of new data by means of models pre-trained on various learning tasks. First, we train the meta-model using a separate data set composed of various learning tasks. The network then learns to classify bone maturity using bone maturity data composed of wrist radiographs. Finally, we compare the classification results of the conventional learning algorithm with those of meta-learning using the same amount of training data.

Analysis of the Relation between Biological Classification Ability and Cortisol-hormonal Change of Middle School Students

  • Bae, Ye-Jun;Lee, Il-Sun;Byeon, Jung-Ho;Kwon, Yong-Ju
    • Journal of The Korean Association For Science Education / v.32 no.6 / pp.1063-1071 / 2012
  • The purpose of this study is to investigate the relation between the classification ability quotient and cortisol-hormonal change of middle school students. Thirty-three second-grade middle school students performed a classification task that can serve as an indicator of classification ability, and the amount of hormone secreted during task performance was analyzed. The results were as follows. First, the students' classification methods were mostly visual and qualitative, and their classification patterns for each subject were static, partial, and non-comparative. Second, the amount of stress hormone secreted by students during the experiment decreased overall after the free classification; it seemed that student-centered activity relieved stress. Third, the classification ability quotient was significantly correlated with the stress hormone, which means that there was a close relationship between classification ability and stress level. It was also considered that stress had a positive effect on the improvement of classification ability. This study provided physiologically more accurate information on the stress that increases in the learning process than conventional studies based on reports or interviews. Finally, researchers could recognize the effect of stress on cognitive activity and the need to find an appropriate level of stress in learning processes.

An Analytical Study on Automatic Classification of Domestic Journal articles Based on Machine Learning (기계학습에 기초한 국내 학술지 논문의 자동분류에 관한 연구)

  • Kim, Pan Jun
    • Journal of the Korean Society for Information Management / v.35 no.2 / pp.37-62 / 2018
  • This study examined the factors affecting the performance of machine learning-based automatic classification of domestic journal articles in the field of LIS. In particular, with respect to the performance of automatically assigning class labels to articles in the "Journal of the Korean Society for Information Management", I investigated the characteristics of the key factors (weighting schemes, training set size, classification algorithms, label assigning methods) through a range of experiments. The results show that it is effective to apply each element appropriately according to the classification environment and the characteristics of the document set, and that fairly good performance can be obtained with a simpler model. In addition, the classification of domestic journals can be treated as multi-label classification, which assigns more than one category to a specific article. Therefore, considering this environment, I proposed an optimal classification model that uses a simple and fast classification algorithm and a small training set.

A multi-layered neural network learning procedure and generating architecture method for improving neural network learning capability (다층신경망의 학습능력 향상을 위한 학습과정 및 구조설계)

  • 이대식;이종태
    • Korean Management Science Review / v.18 no.2 / pp.25-38 / 2001
  • The well-known back-propagation algorithm for multi-layered neural networks has been successfully applied to pattern classification problems with remarkable flexibility. Recently, the multi-layered neural network has been used as a powerful data mining tool. Nevertheless, in many cases with a complex classification boundary, successful learning is not guaranteed, and the problems of long learning time and attraction to local minima restrict field application. In this paper, an improved learning procedure for multi-layered neural networks is proposed. The procedure is based on the generalized delta rule, but it is distinctive in that the network architecture is not fixed but enlarged during learning. That is, the number of hidden nodes or hidden layers is increased to help find the classification boundary, and this procedure is controlled by entropy evaluation. The learning speed and the pattern classification performance are analyzed and compared with those of the back-propagation algorithm.

An Automatic Document Classification with Bayesian Learning (베이지안 학습을 이용한 문서의 자동분류)

  • Kim, Jin-Sang;Shin, Yang-Kyu
    • Journal of the Korean Data and Information Science Society / v.11 no.1 / pp.19-30 / 2000
  • As the number of online documents increases enormously with the expansion of information technology, automatic document classification is becoming much more important. In this paper, an automatic document classification method is investigated and applied to UseNet 20 newsgroup articles to test its efficacy. The classification system uses the Naive Bayes classification algorithm, and the experimental results show that a randomly selected newsgroup article can be classified into its own category with over 77% accuracy.
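
The abstract above names Naive Bayes as the classification algorithm but does not describe the implementation. As a rough illustration only, the following is a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing. The whitespace tokenization, the function names, and the toy documents (labeled with two newsgroup names) are assumptions made for this example, not details taken from the paper.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs):
    """docs: list of (label, text) pairs. Returns the counts used at prediction time."""
    class_doc_counts = Counter()          # number of training documents per class
    word_counts = defaultdict(Counter)    # word frequencies per class
    vocab = set()
    for label, text in docs:
        class_doc_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_doc_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log posterior (add-one smoothing)."""
    class_doc_counts, word_counts, vocab = model
    total_docs = sum(class_doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_doc_counts.items():
        score = math.log(n_docs / total_docs)             # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:                         # skip words never seen in training
                continue
            score += math.log((word_counts[label][word] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data (hypothetical, for illustration only).
toy_docs = [
    ("rec.autos", "engine car brakes dealer"),
    ("rec.autos", "car tires engine oil"),
    ("sci.space", "orbit launch rocket nasa"),
    ("sci.space", "shuttle orbit moon launch"),
]
model = train_naive_bayes(toy_docs)
print(predict(model, "the rocket reached orbit"))  # prints sci.space
```

Working in log space avoids numeric underflow when many word probabilities are multiplied, and the smoothing keeps a single unseen word from zeroing out a class.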

Classification of Traffic Flows into QoS Classes by Unsupervised Learning and KNN Clustering

  • Zeng, Yi;Chen, Thomas M.
    • KSII Transactions on Internet and Information Systems (TIIS) / v.3 no.2 / pp.134-146 / 2009
  • Traffic classification seeks to assign packet flows to an appropriate quality of service (QoS) class based on flow statistics, without the need to examine packet payloads. Classification proceeds in two steps: classification rules are first built by analyzing traffic traces, and the rules are then evaluated using test data. In this paper, we use the self-organizing map and K-means clustering as unsupervised machine learning methods to identify the inherent classes in traffic traces. Three clusters were discovered, corresponding to transactional, bulk data transfer, and interactive applications. The K-nearest neighbor classifier was found to be highly accurate for the traffic data and significantly better than a minimum mean distance classifier.
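
The two-step idea in the abstract above (discover classes by clustering, then classify new flows with K-nearest neighbors) can be sketched roughly as follows. This is a simplified illustration, not the authors' procedure: the self-organizing map step is omitted, the k-means centroids are simply initialized from the first k points, and the two flow features and their values are hypothetical.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means; centroids start at the first k points (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def knn_predict(train_points, train_labels, query, k=3):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(range(len(train_points)),
                     key=lambda i: math.dist(train_points[i], query))[:k]
    votes = [train_labels[i] for i in nearest]
    return max(set(votes), key=votes.count)


# Hypothetical, normalized flow statistics: (mean packet size, flow duration).
flows = [(0.1, 0.1), (0.9, 0.1), (0.5, 0.9),
         (0.15, 0.12), (0.88, 0.15), (0.52, 0.85),
         (0.12, 0.08), (0.92, 0.12), (0.48, 0.92)]
cluster_ids = kmeans(flows, k=3)               # step 1: discover classes
print(cluster_ids)
print(knn_predict(flows, cluster_ids, (0.9, 0.1)))  # step 2: classify a new flow
```

The cluster indices from the first step serve as class labels for the second, so no manually labeled traffic is needed to train the classifier.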

Domain Adaptation for Opinion Classification: A Self-Training Approach

  • Yu, Ning
    • Journal of Information Science Theory and Practice / v.1 no.1 / pp.10-26 / 2013
  • Domain transfer is a widely recognized problem for machine learning algorithms because models built upon one data domain generally do not perform well in another data domain. This is especially a challenge for tasks such as opinion classification, which often has to deal with insufficient quantities of labeled data. This study investigates the feasibility of self-training in dealing with the domain transfer problem in opinion classification by leveraging labeled data in non-target data domain(s) and unlabeled data in the target domain. Specifically, self-training is evaluated for effectiveness in sparse data situations and feasibility for domain adaptation in opinion classification. Three types of Web content are tested: edited news articles, semi-structured movie reviews, and the informal and unstructured content of the blogosphere. Findings of this study suggest that, when there are limited labeled data, self-training is a promising approach for opinion classification, although the contributions vary across data domains. Significant improvement was demonstrated for the most challenging data domain, the blogosphere, when a domain transfer-based self-training strategy was implemented.
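
To make the self-training loop in the abstract above concrete, here is a minimal sketch under stated assumptions. The base learner is a nearest-centroid classifier chosen only for brevity (the abstract does not say which classifier the study used), confidence is measured as the distance gap between the two closest centroids, and the feature vectors, the `min_margin` threshold, and all function names are hypothetical.

```python
import math


def centroids_of(labeled):
    """Mean feature vector per class from (features, label) pairs."""
    groups = {}
    for features, label in labeled:
        groups.setdefault(label, []).append(features)
    return {label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()}


def predict_with_margin(centroids, features):
    """Return (label, margin); margin is the distance gap between the two closest centroids."""
    ranked = sorted((math.dist(features, c), label) for label, c in centroids.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else math.inf
    return ranked[0][1], margin


def self_train(labeled, unlabeled, min_margin=0.5, max_rounds=10):
    """Repeatedly add confidently pseudo-labeled examples to the training set."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        centroids = centroids_of(labeled)
        confident, remaining = [], []
        for features in pool:
            label, margin = predict_with_margin(centroids, features)
            (confident if margin >= min_margin else remaining).append((features, label))
        if not confident:
            break                       # nothing new to learn from
        labeled.extend(confident)       # pseudo-labeled target-domain examples
        pool = [features for features, _ in remaining]
    return centroids_of(labeled)


# Labeled source-domain data and unlabeled target-domain data (toy values).
source = [((0.0, 0.0), "negative"), ((0.2, 0.0), "negative"),
          ((2.0, 0.0), "positive"), ((2.2, 0.0), "positive")]
target = [(0.6, 0.0), (1.0, 0.0), (1.3, 0.0), (2.9, 0.0), (3.3, 0.0)]

before = predict_with_margin(centroids_of(source), (1.3, 0.0))[0]
after = predict_with_margin(self_train(source, target), (1.3, 0.0))[0]
print(before, after)  # prints positive negative
```

In this toy run the target data are shifted relative to the source data, so the pseudo-labeled examples move the class centroids and the borderline point changes class. Whether such a shift helps or hurts in practice depends on the quality of the pseudo-labels.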

Development of e-Mail Classifiers for e-Mail Response Management Systems (전자메일 자동관리 시스템을 위한 전자메일 분류기의 개발)

  • Kim, Kuk-Pyo;Kwon, Young-S.
    • Journal of Information Technology Services / v.2 no.2 / pp.87-95 / 2003
  • With the increasing proliferation of the World Wide Web, electronic mail systems have become very widely used communication tools. Research on e-mail classification is important because the e-mail classification system is a major engine of e-mail response management systems, which mine unstructured e-mail messages and automatically categorize them. In this research, we develop e-mail classifiers for e-mail Response Management Systems (ERMS) using naive Bayesian learning and centroid-based classification. We analyze which method performs better under which conditions, comparing classification accuracies, which may depend on the structure and size of the training data set and the number of classes, using different data sets from an on-line shopping mall and a credit card company. The developed e-mail classifiers have been successfully implemented in practice. The experimental results show that naive Bayesian learning performs better, while centroid-based classification is more robust in terms of classification accuracy.
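
Centroid-based classification, one of the two methods named in the abstract above, can be illustrated with a short sketch. The version below assumes raw word counts as features and cosine similarity as the comparison measure; the paper's actual features and similarity measure are not given in the abstract, and the category names and messages are invented for the example.

```python
import math
from collections import Counter, defaultdict


def bag_of_words(text):
    """Word-count vector for a message (whitespace tokenization, an assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def train_centroids(examples):
    """Sum the word counts of each class; cosine similarity ignores the scale,
    so the sum behaves like the class mean."""
    centroids = defaultdict(Counter)
    for label, text in examples:
        centroids[label].update(bag_of_words(text))
    return centroids


def classify(centroids, text):
    """Assign the class whose centroid is most similar to the message."""
    vector = bag_of_words(text)
    return max(centroids, key=lambda label: cosine(vector, centroids[label]))


# Hypothetical customer e-mails with invented category names.
examples = [
    ("refund", "please refund my order"),
    ("refund", "refund for damaged item"),
    ("delivery", "where is my delivery"),
    ("delivery", "delivery is late again"),
]
centroids = train_centroids(examples)
print(classify(centroids, "item damaged please refund"))  # prints refund
```

Each class is summarized by a single vector, so classification cost grows with the number of classes rather than with the number of training messages.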

An Optimal Weighting Method in Supervised Learning of Linguistic Model for Text Classification

  • Mikawa, Kenta;Ishida, Takashi;Goto, Masayuki
    • Industrial Engineering and Management Systems / v.11 no.1 / pp.87-93 / 2012
  • This paper discusses a new weighting method for text analysis from the viewpoint of supervised learning. The term frequency and inverse document frequency measure (tf-idf measure) is a well-known weighting method for information retrieval, and it can also be used for text analysis. However, it is an empirical weighting method for information retrieval whose effectiveness has not been established theoretically. Therefore, a more effective weighting measure may be obtainable for document classification problems. In this study, we propose an optimal weighting method for document classification problems from the viewpoint of supervised learning. Because it makes use of training data, the proposed measure is better suited to text classification than the tf-idf measure. The effectiveness of our proposal is demonstrated by simulation experiments on the text classification of newspaper articles and of customer reviews posted on a web site.
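
For reference, the tf-idf baseline mentioned in the abstract above can be computed as in the sketch below. It uses one common variant (raw term count multiplied by the natural log of N divided by document frequency); other variants exist, and the abstract does not say which one the authors compared against. The proposed supervised weighting itself is not reproduced here, and the sample documents are invented.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document, using tf * ln(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({term: count * math.log(n_docs / doc_freq[term])
                        for term, count in counts.items()})
    return weights


# Invented review snippets for illustration.
reviews = ["good camera good price", "bad camera", "good service"]
weights = tf_idf(reviews)
for term, weight in sorted(weights[0].items()):
    print(f"{term}: {weight:.3f}")
```

Terms that occur in few documents receive larger weights, and a term that appears in every document would receive a weight of zero. The weights depend only on term statistics, not on class labels, which is the gap a supervised weighting method aims to fill.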