• Title/Summary/Keyword: Automatic Machine Learning

Search Result 298, Processing Time 0.025 seconds

An Empirical Comparison of Machine Learning Models for Classifying Emotions in Korean Twitter (한국어 트위터의 감정 분류를 위한 기계학습의 실증적 비교)

  • Lim, Joa-Sang;Kim, Jin-Man
    • Journal of Korea Multimedia Society
    • /
    • v.17 no.2
    • /
    • pp.232-239
    • /
    • 2014
  • As online texts have been rapidly growing, their automatic classification gains more interest with machine learning methods. Nevertheless, comparatively few research could be found, aiming for Korean texts. Evaluating them with statistical methods are also rare. This study took a sample of tweets and used machine learning methods to classify emotions with features of morphemes and n-grams. As a result, about 76% of emotions contained in tweets was correctly classified. Of the two methods compared in this study, Support Vector Machines were found more accurate than Na$\ddot{i}$ve Bayes. The linear model of SVM was not inferior to the non-linear one. Morphological features did not contribute to accuracy more than did the n-grams.

An Automatic Document Classification with Bayesian Learning (베이지안 학습을 이용한 문서의 자동분류)

  • Kim, Jin-Sang;Shin, Yang-Kyu
    • Journal of the Korean Data and Information Science Society
    • /
    • v.11 no.1
    • /
    • pp.19-30
    • /
    • 2000
  • As the number of online documents increases enormously with the expansion of information technology, the importance of automatic document classification is greatly enlarged. In this paper, an automatic document classification method is investigated and applied to UseNet 20 newsgroup articles to test its efficacy. The classification system uses Naive Bayes classification algorithm and the experimental result shows that a randomly selected newsgroup arcicle can be classified into its own category over 77% accuracy.

  • PDF

A fully deep learning model for the automatic identification of cephalometric landmarks

  • Kim, Young Hyun;Lee, Chena;Ha, Eun-Gyu;Choi, Yoon Jeong;Han, Sang-Sun
    • Imaging Science in Dentistry
    • /
    • v.51 no.3
    • /
    • pp.299-306
    • /
    • 2021
  • Purpose: This study aimed to propose a fully automatic landmark identification model based on a deep learning algorithm using real clinical data and to verify its accuracy considering inter-examiner variability. Materials and Methods: In total, 950 lateral cephalometric images from Yonsei Dental Hospital were used. Two calibrated examiners manually identified the 13 most important landmarks to set as references. The proposed deep learning model has a 2-step structure-a region of interest machine and a detection machine-each consisting of 8 convolution layers, 5 pooling layers, and 2 fully connected layers. The distance errors of detection between 2 examiners were used as a clinically acceptable range for performance evaluation. Results: The 13 landmarks were automatically detected using the proposed model. Inter-examiner agreement for all landmarks indicated excellent reliability based on the 95% confidence interval. The average clinically acceptable range for all 13 landmarks was 1.24 mm. The mean radial error between the reference values assigned by 1 expert and the proposed model was 1.84 mm, exhibiting a successful detection rate of 36.1%. The A-point, the incisal tip of the maxillary and mandibular incisors, and ANS showed lower mean radial error than the calibrated expert variability. Conclusion: This experiment demonstrated that the proposed deep learning model can perform fully automatic identification of cephalometric landmarks and achieve better results than examiners for some landmarks. It is meaningful to consider between-examiner variability for clinical applicability when evaluating the performance of deep learning methods in cephalometric landmark identification.

Music Emotion Classification Based On Three-Level Structure (3 레벨 구조 기반의 음악 무드분류)

  • Kim, Hyoung-Gook;Jeong, Jin-Guk
    • The Journal of the Acoustical Society of Korea
    • /
    • v.26 no.2E
    • /
    • pp.56-62
    • /
    • 2007
  • This paper presents the automatic music emotion classification on acoustic data. A three-level structure is developed. The low-level extracts the timbre and rhythm features. The middle-level estimates the indication functions that represent the emotion probability of a single analysis unit. The high-level predicts the emotion result based on the indication function values. Experiments are carried out on 695 homogeneous music pieces labeled with four emotions, including pleasant, calm, sad, and excited. Three machine learning methods, GMM, MLP, and SVM, are compared on the high-level. The best result of 90.16% is obtained by MLP method.

A Study on the Automatic Extraction of Fomulation and Properties in Chemical Field Patent Document by Using Machine Learning Technology (기계학습 기술을 활용한 화학분야 특허문서의 조성/물성 정보 자동추출 방법 연구)

  • Kim, Hongki;Lee, Hayoung;Park, Jinwoo
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2019.07a
    • /
    • pp.277-280
    • /
    • 2019
  • 본 논문에서는 화학분야 특허 문서에 존재하는 도표(TABLE) 데이터를 인공지능 기술을 활용하여 자동으로 추출하고 정형화된 형태로 가공하는 방법을 제안한다. 특허 문서에서 도표 데이터는 실시예에서 실험결과나 비교결과를 간결하고 가시적으로 표현하기 위하여 주로 사용되나, 셀의 속성을 정의하는 헤더부분과 수치가 표현되는 값 부분의 경계가 모호하여 구조화하는데 어려움이 있다. 본 논문에서 제안하는 방법은 소량의 학습데이터를 구축하고 기계학습을 통해 도표에 존재하는 셀의 속성을 예측하고, 예측된 속성을 토대로 조성과 물성 정보를 자동으로 구분하여 추출하는 방법을 제시한다. 제시된 방법을 활용하여 화학 분야 조성물 특허의 도표데이터에 시뮬레이션 결과 각 항목별 98.17%의 속성 예측 정확도를 나타내었으며 기존 규칙기반 연구보다 작업난이도, 예측정확도에서 우수한 성과를 보인다.

  • PDF

Tongue Image Segmentation Using CNN and Various Image Augmentation Techniques (콘볼루션 신경망(CNN)과 다양한 이미지 증강기법을 이용한 혀 영역 분할)

  • Ahn, Ilkoo;Bae, Kwang-Ho;Lee, Siwoo
    • Journal of Biomedical Engineering Research
    • /
    • v.42 no.5
    • /
    • pp.201-210
    • /
    • 2021
  • In Korean medicine, tongue diagnosis is one of the important diagnostic methods for diagnosing abnormalities in the body. Representative features that are used in the tongue diagnosis include color, shape, texture, cracks, and tooth marks. When diagnosing a patient through these features, the diagnosis criteria may be different for each oriental medical doctor, and even the same person may have different diagnosis results depending on time and work environment. In order to overcome this problem, recent studies to automate and standardize tongue diagnosis using machine learning are continuing and the basic process of such a machine learning-based tongue diagnosis system is tongue segmentation. In this paper, image data is augmented based on the main tongue features, and backbones of various famous deep learning architecture models are used for automatic tongue segmentation. The experimental results show that the proposed augmentation technique improves the accuracy of tongue segmentation, and that automatic tongue segmentation can be performed with a high accuracy of 99.12%.

An Automatic Diagnosis System for Hepatitis Diseases Based on Genetic Wavelet Kernel Extreme Learning Machine

  • Avci, Derya
    • Journal of Electrical Engineering and Technology
    • /
    • v.11 no.4
    • /
    • pp.993-1002
    • /
    • 2016
  • Hepatitis is a major public health problem all around the world. This paper proposes an automatic disease diagnosis system for hepatitis based on Genetic Algorithm (GA) Wavelet Kernel (WK) Extreme Learning Machines (ELM). The classifier used in this paper is single layer neural network (SLNN) and it is trained by ELM learning method. The hepatitis disease datasets are obtained from UCI machine learning database. In Wavelet Kernel Extreme Learning Machine (WK-ELM) structure, there are three adjustable parameters of wavelet kernel. These parameters and the numbers of hidden neurons play a major role in the performance of ELM. Therefore, values of these parameters and numbers of hidden neurons should be tuned carefully based on the solved problem. In this study, the optimum values of these parameters and the numbers of hidden neurons of ELM were obtained by using Genetic Algorithm (GA). The performance of proposed GA-WK-ELM method is evaluated using statical methods such as classification accuracy, sensitivity and specivity analysis and ROC curves. The results of the proposed GA-WK-ELM method are compared with the results of the previous hepatitis disease studies using same database as well as different database. When previous studies are investigated, it is clearly seen that the high classification accuracies have been obtained in case of reducing the feature vector to low dimension. However, proposed GA-WK-ELM method gives satisfactory results without reducing the feature vector. The calculated highest classification accuracy of proposed GA-WK-ELM method is found as 96.642 %.

Implementation of a Machine Learning-based Recommender System for Preventing the University Students' Dropout (대학생 중도탈락 예방을 위한 기계 학습 기반 추천 시스템 구현 방안)

  • Jeong, Do-Heon
    • Journal of the Korea Convergence Society
    • /
    • v.12 no.10
    • /
    • pp.37-43
    • /
    • 2021
  • This study proposed an effective automatic classification technique to identify dropout patterns of university students, and based on this, an intelligent recommender system to prevent dropouts. To this end, 1) a data processing method to improve the performance of machine learning was proposed based on actual enrollment/dropout data of university students, and 2) performance comparison experiments were conducted using five types of machine learning algorithms. 3) As a result of the experiment, the proposed method showed superior performance in all algorithms compared to the baseline method. The precision rate of discrimination of enrolled students was measured to be up to 95.6% when using a Random Forest(RF), and the recall rate of dropout students was measured to be up to 80.0% when using Naive Bayes(NB). 4) Finally, based on the experimental results, a method for using a counseling recommender system to give priority to students who are likely to drop out was suggested. It was confirmed that reasonable decision-making can be conducted through convergence research that utilizes technologies in the IT field to solve the educational issues, and we plan to apply various artificial intelligence technologies through continuous research in the future.

A Study of Big Data Domain Automatic Classification Using Machine Learning (머신러닝을 이용한 빅데이터 도메인 자동 판별에 관한 연구)

  • Kong, Seongwon;Hwang, Deokyoul
    • The Journal of Bigdata
    • /
    • v.3 no.2
    • /
    • pp.11-18
    • /
    • 2018
  • This study is a study on domain automatic classification for domain - based quality diagnosis which is a key element of big data quality diagnosis. With the increase of the value and utilization of Big Data and the rise of the Fourth Industrial Revolution, the world is making efforts to create new value by utilizing big data in various fields converged with IT such as law, medical, and finance. However, analysis based on low-reliability data results in critical problems in both the process and the result, and it is also difficult to believe that judgments based on the analysis results. Although the need of highly reliable data has also increased, research on the quality of data and its results have been insufficient. The purpose of this study is to shorten the work time to automizing the domain classification work which was performed from manually to using machine learning in the domain - based quality diagnosis, which is a key element of diagnostic evaluation for improving data quality. Extracts information about the characteristics of the data that is stored in the database and identifies the domain, and then featurize it, and automizes the domain classification using machine learning. We will use it for big data quality diagnosis and contribute to quality improvement.

Regression Algorithms Evaluation for Analysis of Crosstalk in High-Speed Digital System

  • Minhyuk Kim
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.6
    • /
    • pp.1449-1461
    • /
    • 2024
  • As technology advances, processor speeds are increasing at a rapid pace and digital systems require a significant amount of data bandwidth. As a result, careful consideration of signal integrity is required to ensure reliable and high-speed data processing. Crosstalk has become a vital area of research in signal integrity for electronic packages, mainly because of the high level of integration. Analytic formulas were analyzed in this study to identify the features that can predict crosstalk in multi-conductor transmission lines. Through the analysis, five variables were found and obtained a dataset consisting of 302,500, data points. The study evaluated the performance of various regression models for optimization via automatic machine learning by comparing the machine learning predictions with the analytic solution. Extra tree regression consistently outperformed other algorithms, with coefficients of determination exceeding 0.9 and root mean square logarithmic errors below 0.35. The study also notes that different algorithms produced varied predictions for the two metrics.