• Title/Abstract/Keywords: Machine training

Search results: 1,190 items (processing time: 0.026 seconds)

Support Vector Machine Classification Using Training Sets of Small Mixed Pixels: An Appropriateness Assessment of IKONOS Imagery

  • Yu, Byeong-Hyeok;Chi, Kwang-Hoon
    • Korean Journal of Remote Sensing
    • /
    • Vol. 24, No. 5
    • /
    • pp.507-515
    • /
    • 2008
  • Many studies have used a large number of pure pixels as the approach to training set design. How the training set is used, however, varies between classifiers. Recent research has reported that small mixed pixels between classes are actually more useful than larger pure pixels of each class in Support Vector Machine (SVM) classification. We evaluated the usability of small mixed pixels as a training set for the classification of high-resolution satellite imagery. We presented an advanced approach for obtaining mixed pixels readily, and evaluated its appropriateness for land cover classification of IKONOS satellite imagery. The results showed that the accuracy of the classification based on small mixed pixels is nearly identical to that of the classification based on large pure pixels. However, they also revealed a limitation: the small mixed pixels used may provide insufficient information to separate the classes. Small mixed pixels from the class border region provide cost-effective training sets, but combining them with other pixels should be considered for high-resolution satellite imagery or relatively complex land cover situations.

A Smart Bench Press Machine: Automatic Weight Control Sensitive to User Tiredness

  • Kim, Jihun;Jo, Han-jin;Kim, Kiyoung;Ji, Hae-geun;Kim, Jaehyo
    • International Journal of Advanced Culture Technology
    • /
    • Vol. 7, No. 1
    • /
    • pp.209-215
    • /
    • 2019
  • To provide a safe free-weight training environment for people without workout trainers, we propose a smart bench press machine with an automatic weight control system sensitive to user tiredness. The physical weight plates on the machine are replaced with a hydraulic cylinder as the press load, and the cylinder knob is coupled with a step motor to change its tensile force automatically between lifting exercises. Three subjects participated to verify the usability of the smart bench press machine. They were asked to lift a 6-RM press load 10 times under 3 different lifting conditions: 1) no assistance, 2) human assistance, and 3) automatic weight control. None of the subjects was able to complete the 10 sets without assistance due to tiredness, but all finished the full sets under the two assistive conditions. Average lifting speeds were most consistent under the automatic weight control condition. Normalized quasi-tension data based on surface electromyogram signals of both pectoralis major muscles revealed that the subjects maintained the target muscle activation level above 50% but not more than 80% throughout the 10 sets. Therefore, the smart bench press machine is expected to both keep pace with the lifting exercise and reduce the risk of injuries due to excessive muscle tension.

Comparison of Machine Learning-Based Radioisotope Identifiers for Plastic Scintillation Detector

  • Jeon, Byoungil;Kim, Jongyul;Yu, Yonggyun;Moon, Myungkook
    • Journal of Radiation Protection and Research
    • /
    • Vol. 46, No. 4
    • /
    • pp.204-212
    • /
    • 2021
  • Background: Identification of radioisotopes for plastic scintillation detectors is challenging because their spectra have poor energy resolution and lack photopeaks. To overcome this weakness, many researchers have conducted radioisotope identification studies using machine learning algorithms; however, the effect of data normalization on radioisotope identification has not yet been addressed. Furthermore, studies on machine learning-based radioisotope identifiers for plastic scintillation detectors are limited. Materials and Methods: In this study, machine learning-based radioisotope identifiers were implemented, and their performance under different data normalization methods was compared. Eight classes of radioisotopes consisting of combinations of 22Na, 60Co, and 137Cs, and the background, were defined. The training set was generated by random sampling based on probability density functions acquired by experiments and simulations, and the test set was acquired by experiments. A support vector machine (SVM), an artificial neural network (ANN), and a convolutional neural network (CNN) were implemented as radioisotope identifiers with six data normalization methods, and trained using the generated training set. Results and Discussion: The implemented identifiers were evaluated on test sets acquired by experiments with and without gain shifts to confirm the robustness of the identifiers against the gain shift effect. Among the three machine learning-based radioisotope identifiers, prediction accuracy followed the order SVM > ANN > CNN, and training time ranked in the same order, with the SVM training fastest. Conclusion: The SVM exhibited the highest prediction accuracy for the combined test sets, and its training time was the shortest among the three identifiers. The CNN exhibited the smallest variation in prediction accuracy across classes, even though it had the lowest prediction accuracy for the combined test sets among the three identifiers.
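
The training-set generation and normalization steps described in this abstract can be illustrated with a minimal sketch. It assumes a per-channel probability density is available as a list of weights. The 8-channel `toy_pdf`, the 1,000-event count, and the two normalization functions are hypothetical choices for illustration; the abstract does not name the six normalization methods that were compared.

```python
import random

def sample_spectrum(pdf, n_counts, rng):
    """Draw n_counts events from a per-channel density and histogram them."""
    channels = list(range(len(pdf)))
    spectrum = [0] * len(pdf)
    for ch in rng.choices(channels, weights=pdf, k=n_counts):
        spectrum[ch] += 1
    return spectrum

def normalize_by_total(spectrum):
    """Scale counts so that they sum to 1 (assumed normalization, for illustration)."""
    total = sum(spectrum)
    return [c / total for c in spectrum]

def normalize_min_max(spectrum):
    """Scale counts linearly into [0, 1] (assumed normalization, for illustration)."""
    lo, hi = min(spectrum), max(spectrum)
    if hi == lo:
        return [0.0 for _ in spectrum]
    return [(c - lo) / (hi - lo) for c in spectrum]

# Hypothetical density over 8 energy channels; not measured data.
toy_pdf = [0.30, 0.25, 0.18, 0.12, 0.08, 0.04, 0.02, 0.01]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
spectrum = sample_spectrum(toy_pdf, n_counts=1000, rng=rng)
by_total = normalize_by_total(spectrum)
by_min_max = normalize_min_max(spectrum)
```

Repeating `sample_spectrum` with different seeds and event counts would yield many synthetic spectra per class, which is the general idea behind generating a training set from a density rather than from repeated measurements.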

Domain Adaptation for Opinion Classification: A Self-Training Approach

  • Yu, Ning
    • Journal of Information Science Theory and Practice
    • /
    • Vol. 1, No. 1
    • /
    • pp.10-26
    • /
    • 2013
  • Domain transfer is a widely recognized problem for machine learning algorithms because models built on one data domain generally do not perform well in another. This is especially challenging for tasks such as opinion classification, which often must deal with insufficient quantities of labeled data. This study investigates the feasibility of self-training for the domain transfer problem in opinion classification by leveraging labeled data in non-target data domain(s) and unlabeled data in the target domain. Specifically, self-training is evaluated for effectiveness in sparse data situations and feasibility for domain adaptation in opinion classification. Three types of Web content are tested: edited news articles, semi-structured movie reviews, and the informal and unstructured content of the blogosphere. The findings suggest that, when labeled data are limited, self-training is a promising approach for opinion classification, although its contribution varies across data domains. Significant improvement was demonstrated for the most challenging data domain, the blogosphere, when a domain transfer-based self-training strategy was implemented.
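
The self-training idea in this abstract (train on labeled source-domain data, then repeatedly add confidently predicted target-domain examples to the training set) can be sketched as follows. This is a toy illustration only: the one-dimensional features, the nearest-centroid classifier, the `min_margin` confidence rule, and all data values are assumptions, not the classifier or features used in the study.

```python
def centroids(points, labels):
    """Mean feature value per class."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(cents, x):
    """Return the nearest class and its margin over the runner-up class."""
    ranked = sorted(cents, key=lambda y: abs(x - cents[y]))
    best, second = ranked[0], ranked[1]
    margin = abs(x - cents[second]) - abs(x - cents[best])
    return best, margin

def self_train(labeled_x, labeled_y, unlabeled_x, min_margin=1.0, max_rounds=10):
    """Iteratively pseudo-label confident target examples and retrain."""
    train_x, train_y = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    for _ in range(max_rounds):
        cents = centroids(train_x, train_y)
        confident, remaining = [], []
        for x in pool:
            label, margin = predict(cents, x)
            (confident if margin >= min_margin else remaining).append((x, label))
        if not confident:
            break  # nothing new to learn from; stop early
        for x, label in confident:
            train_x.append(x)
            train_y.append(label)
        pool = [x for x, _ in remaining]
    return centroids(train_x, train_y), pool

# Hypothetical labeled source-domain data and unlabeled target-domain data.
source_x = [1.0, 2.0, 8.0, 9.0]
source_y = ["neg", "neg", "pos", "pos"]
target_x = [3.0, 4.0, 6.5, 7.0, 5.2]

final_centroids, still_unlabeled = self_train(source_x, source_y, target_x)
```

In this toy run the class centroids shift toward the target-domain values, while the ambiguous example at 5.2 never clears the margin threshold and stays unlabeled.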

Korean Machine Reading Comprehension for Patent Consultation Using BERT

  • 민재옥;박진우;조유정;이봉건
    • KIPS Transactions on Software and Data Engineering
    • /
    • Vol. 9, No. 4
    • /
    • pp.145-152
    • /
    • 2020
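  • Machine reading comprehension (MRC) is an artificial intelligence natural language processing task in which a machine reads a document related to a user query and then infers the answer; MRC can be used in automated consultation services such as chatbots. The BERT language model, which has recently shown the highest performance in natural language processing, solves problems by pre-training on a large amount of data, fine-tuning for each natural language processing task, and then performing inference with the trained model. In this paper, we build a patent consultation dataset for a BERT-based patent consultation MRC task and describe how it was constructed. We also propose a way to improve performance on this task by adding a Patent-BERT model pre-trained on a patent corpus and a language processing algorithm suited to training a patent consultation model. Using the proposed method, we show improved performance in determining answers to patent consultation queries.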

A Machine Learning-Based Vocational Training Dropout Prediction Model Considering Structured and Unstructured Data

  • 하만석;안현철
    • The Journal of the Korea Contents Association
    • /
    • Vol. 19, No. 1
    • /
    • pp.1-15
    • /
    • 2019
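  • One of the greatest difficulties in vocational training is dropout. Many students drop out of each training course, which wastes the national budget and hinders improvement of the youth employment rate. Unlike previous studies, which mainly analyzed the causes of dropout, this study proposes a machine learning-based model that uses various kinds of trainee information to predict dropout in advance. In particular, the proposed model aims to raise prediction accuracy by considering not only structured trainee data but also unstructured data in the form of instructors' counseling journals. The unstructured data were analyzed using Word2vec, a text analysis technique that has recently attracted attention, and a convolutional neural network. Applying the proposed model to real data from a vocational training institution in Korea, we confirmed that prediction accuracy improved by up to 20% when unstructured data were considered together with structured data, compared with predicting dropout from structured data alone. In addition, when structured and unstructured data were combined and analyzed with a Support Vector Machine, the model showed high prediction accuracy in the upper 90% range on the validation dataset.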

Prediction of critical heat flux for narrow rectangular channels in a steady state condition using machine learning

  • Kim, Huiyung;Moon, Jeongmin;Hong, Dongjin;Cha, Euiyoung;Yun, Byongjo
    • Nuclear Engineering and Technology
    • /
    • Vol. 53, No. 6
    • /
    • pp.1796-1809
    • /
    • 2021
  • The subchannel of a research reactor used to generate high power density is designed to be narrow and rectangular and comprises plate-type fuels operating under downward flow conditions. Critical heat flux (CHF) is a crucial parameter for estimating the safety of a nuclear fuel; hence, this parameter should be accurately predicted. Here, machine learning is applied to the prediction of CHF in a narrow rectangular channel. Although machine learning can effectively analyze large amounts of complex data, its application to CHF, particularly for narrow rectangular channels, remains challenging because of the limited flow conditions available in existing experimental databases. To resolve this problem, we used four CHF correlations to generate pseudo-data for training an artificial neural network. We also propose a network architecture that includes pre-training and prediction stages to predict and analyze the CHF. The trained neural network predicted the CHF with an average error of 3.65% and a root-mean-square error of 17.17% for the test pseudo-data; for the experimental data, which were not considered during training, the respective errors were 0.9% and 26.4%. Finally, machine learning was applied to quantitatively investigate the parametric effects on the CHF in narrow rectangular channels under downward flow conditions.
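
The two error figures quoted in this abstract, an average error and a root-mean-square error, can differ widely for the same predictions. A minimal sketch of one common way to compute them is given here. It assumes signed relative errors, which may not match the paper's exact definitions, and the CHF values are made-up numbers used only to show the arithmetic.

```python
import math

def relative_errors(predicted, reference):
    """Signed relative error of each prediction with respect to its reference."""
    return [(p - r) / r for p, r in zip(predicted, reference)]

def mean_error_percent(predicted, reference):
    """Average signed relative error, in percent."""
    errs = relative_errors(predicted, reference)
    return 100.0 * sum(errs) / len(errs)

def rms_error_percent(predicted, reference):
    """Root-mean-square relative error, in percent."""
    errs = relative_errors(predicted, reference)
    return 100.0 * math.sqrt(sum(e * e for e in errs) / len(errs))

# Hypothetical CHF values in arbitrary units; not data from the paper.
reference = [1.0, 2.0, 4.0]
predicted = [1.1, 1.8, 4.0]

mean_pct = mean_error_percent(predicted, reference)
rms_pct = rms_error_percent(predicted, reference)
```

With these toy values the signed errors of +10% and -10% cancel in the mean but not in the RMS, which is one way a small average error can coexist with a much larger root-mean-square error.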

Classification Performance Analysis of Cross-Language Text Categorization using Machine Translation

  • 이용구
    • Journal of the Korean Society for Library and Information Science
    • /
    • Vol. 43, No. 1
    • /
    • pp.313-332
    • /
    • 2009
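  • Cross-language text categorization (CLTC) can automatically classify documents using a training set written in a different language. This study extracted a test collection suitable for CLTC from KTSET and used a machine translator to compare the classification performance of several possible CLTC methods. An SVM classifier was used. In the experiments, the poly-lingual training method showed the best classification performance among the CLTC methods, followed by the training-set translation method and then the test-set translation method. However, the training-set translation method is efficient in terms of machine translation, can be easily applied in general environments, and has relatively good classification performance, so it showed the highest practical applicability among the CLTC methods. Meanwhile, when machine translation was used in CLTC, performance degraded because of feature reduction during translation and translation into features that lack topical characteristics.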

Machine learning application for predicting the strawberry harvesting time

  • Yang, Mi-Hye;Nam, Won-Ho;Kim, Taegon;Lee, Kwanho;Kim, Younghwa
    • Korean Journal of Agricultural Science
    • /
    • Vol. 46, No. 2
    • /
    • pp.381-393
    • /
    • 2019
  • A smart farm is a system that combines information and communication technology (ICT), the internet of things (IoT), and agricultural technology to enable a farm to operate with minimal labor and to automatically control the greenhouse environment. Machine learning based on recent data-driven techniques has emerged alongside big data technologies and high-performance computing, creating opportunities to quantify data-intensive processes in agricultural operational environments. This paper presents research on applying machine learning to diagnose the growth status of crops and to predict the harvest time of strawberries in a greenhouse using image processing techniques. To classify the growth stages of the strawberries, we used object inference and detection with a machine learning model based on deep neural networks and TensorFlow. Classification accuracy was compared across training data volumes and training epochs. The model classified with an accuracy of over 90% with 200 training images and 8,000 training steps. The maturity of the strawberries was detected and classified with an accuracy of over 90% at the mature and overmature stages. The experimental results are promising: they show that this approach can be applied to develop a machine learning model for predicting the strawberry harvesting time and can provide key decision support information to both farmers and policy makers about optimal harvest times and harvest planning.

Improving the Subject Independent Classification of Implicit Intention By Generating Additional Training Data with PCA and ICA

  • Oh, Sang-Hoon
    • International Journal of Contents
    • /
    • Vol. 14, No. 4
    • /
    • pp.24-29
    • /
    • 2018
  • EEG-based brain-computer interfaces have focused on explicitly expressed intentions to assist physically impaired patients. For EEG-based brain-computer interfaces to function effectively, they should be able to understand users' implicit information. Since it is hard to gather EEG signals from human brains, there is not enough training data, which is essential for proper classification of implicit intention. In this paper, we improve the subject-independent classification of implicit intention by generating additional training data. In the first stage, we perform PCA (principal component analysis) on the training data to remove redundant components within the input data. After dimension reduction by PCA, we train an ICA (independent component analysis) network whose outputs are statistically independent. We obtain additional training data by adding Gaussian noise to the ICA outputs and projecting them back to the input data domain. In simulations with EEG data provided by CNSL, KAIST, we improve the classification performance from 65.05% to 66.69% with Gamma components. The proposed sample generation method can be applied to any machine learning problem with few samples.
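
The sample generation idea in this abstract (transform the data, add Gaussian noise in the transformed space, and project back to the input space) can be sketched in a simplified form. This sketch is a PCA-only stand-in under stated assumptions: it works on two-dimensional points, keeps only the first principal component, and omits the ICA stage entirely. The function names, data points, and noise level are hypothetical.

```python
import math
import random

def fit_pca_2d(points):
    """Return the mean and first principal axis of 2-D points.

    Uses the closed-form orientation of the major axis of a 2x2
    covariance matrix: theta = 0.5 * atan2(2*sxy, sxx - syy).
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mx, my), (math.cos(theta), math.sin(theta))

def augment(points, n_new, noise_std, rng):
    """Generate n_new samples by perturbing first-component scores."""
    mean, axis = fit_pca_2d(points)
    new_points = []
    for _ in range(n_new):
        x, y = rng.choice(points)
        # Project onto the first principal axis (dimension reduction).
        score = (x - mean[0]) * axis[0] + (y - mean[1]) * axis[1]
        # Add Gaussian noise in the reduced space.
        score += rng.gauss(0.0, noise_std)
        # Map the perturbed score back to the input space.
        new_points.append((mean[0] + score * axis[0], mean[1] + score * axis[1]))
    return new_points

# Hypothetical training points lying on the line y = x.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]

rng = random.Random(0)  # fixed seed for reproducibility
mean, axis = fit_pca_2d(points)
generated = augment(points, n_new=5, noise_std=0.5, rng=rng)
```

Because the noise is added along the retained component only, the generated points stay on the same line as the original data, so the new samples vary along the direction the data already varies in.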