• Title/Summary/Keyword: 준지도분류

Search Result 28, Processing Time 0.029 seconds

Graph Construction Based on Fast Low-Rank Representation in Graph-Based Semi-Supervised Learning (그래프 기반 준지도 학습에서 빠른 낮은 계수 표현 기반 그래프 구축)

  • Oh, Byonghwa;Yang, Jihoon
    • Journal of KIISE
    • /
    • v.45 no.1
    • /
    • pp.15-21
    • /
    • 2018
  • Low-Rank Representation (LRR) based methods are widely used in many practical applications, such as face clustering and object detection, because they can guarantee high prediction accuracy when used to constructing graphs in graph - based semi-supervised learning. However, in order to solve the LRR problem, it is necessary to perform singular value decomposition on the square matrix of the number of data points for each iteration of the algorithm; hence the calculation is inefficient. To solve this problem, we propose an improved and faster LRR method based on the recently published Fast LRR (FaLRR) and suggests ways to introduce and optimize additional constraints on the underlying optimization goals in order to address the fact that the FaLRR is fast but actually poor in classification problems. Our experiments confirm that the proposed method finds a better solution than LRR does. We also propose Fast MLRR (FaMLRR), which shows better results when the goal of minimizing is added.

Semi-Supervised SAR Image Classification via Adaptive Threshold Selection (선별적인 임계값 선택을 이용한 준지도 학습의 SAR 분류 기술)

  • Jaejun Do;Minjung Yoo;Jaeseok Lee;Hyoi Moon;Sunok Kim
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • v.27 no.3
    • /
    • pp.319-328
    • /
    • 2024
  • Semi-supervised learning is a good way to train a classification model using a small number of labeled and large number of unlabeled data. We applied semi-supervised learning to a synthetic aperture radar(SAR) image classification model with a limited number of datasets that are difficult to create. To address the previous difficulties, semi-supervised learning uses a model trained with a small amount of labeled data to generate and learn pseudo labels. Besides, a lot of number of papers use a single fixed threshold to create pseudo labels. In this paper, we present a semi-supervised synthetic aperture radar(SAR) image classification method that applies different thresholds for each class instead of all classes sharing a fixed threshold to improve SAR classification performance with a small number of labeled datasets.

Korean Automated Scoring System for Supply-Type Items using Semi-Supervised Learning (준지도학습 방법을 이용한 한국어 서답형 문항 자동채점 시스템)

  • Cheon, Min-Ah;Seo, Hyeong-Won;Kim, Jae-Hoon;Noh, Eun-Hee;Sung, Kyung-Hee;Lim, EunYoung
    • Annual Conference on Human and Language Technology
    • /
    • 2014.10a
    • /
    • pp.112-116
    • /
    • 2014
  • 서답형 문항은 학생들의 종합적인 사고능력을 판단하는데 매우 유용하지만 채점할 때, 시간과 비용이 매우 많이 소요되고 채점자의 공정성을 확보해야 하는 어려움이 있다. 이러한 문제를 개선하기 위해 본 논문에서는 서답형 문항에 대한 자동채점 시스템을 제안한다. 본 논문에서 제안하는 시스템은 크게 언어 처리 단계와 채점 단계로 나뉜다. 첫 번째로 언어 처리 단계에서는 형태소 분석과 같은 한국어 정보처리 시스템을 이용하여 학생들의 답안을 분석한다. 두 번째로 채점 단계를 진행하는데 이 단계는 아래와 같은 순서로 진행된다. 1) 첫 번째 단계에서 분석 결과가 완전히 일치하는 답안들을 하나의 유형으로 간주하여 각 유형에 속한 답안의 빈도수가 높은 순서대로 정렬하여 인간 채점자가 고빈도 학생 답안을 수동으로 채점한다. 2) 현재까지 채점된 결과와 모범답안을 학습말뭉치로 간주하여 자질 추출 및 자질 가중치 학습을 수행한다. 3) 2)의 학습 결과를 토대로 미채점 답안들을 군집화하여 분류한다. 4) 분류된 결과 중에서 신뢰성이 높은 채점 답안에 대해서 인간 채점자가 확인하고 학습말뭉치에 추가한다. 5) 이와 같은 방법으로 미채점 답안이 존재하지 않을 때까지 반복한다. 제안된 시스템을 평가하기 위해서 2013년 학업성취도 평가의 사회(중3) 및 국어(고2) 과목의 서답형 문항을 사용하였다. 각 과목에서 1000개의 학생 답안을 추출하여 채점시간과 정확률을 평가하였다. 채점시간을 전체적으로 약 80% 이상 줄일 수 있었고 채점 정확률은 사회 및 국어 과목에 대해 각각 98.7%와 97.2%로 나타났다. 앞으로 자동 채점 시스템의 성능을 개선하고 인간 채점자의 집중도를 높일 수 있도록 인터페이스를 개선한다면 국가수준의 대단위 평가에 충분히 활용할 수 있을 것으로 생각한다.

  • PDF

Semi-Supervised Learning to Predict Default Risk for P2P Lending (준지도학습 기반의 P2P 대출 부도 위험 예측에 대한 연구)

  • Kim, Hyun-jung
    • Journal of Digital Convergence
    • /
    • v.20 no.4
    • /
    • pp.185-192
    • /
    • 2022
  • This study investigates the effect of the semi-supervised learning(SSL) method on predicting default risk of peer-to-peer(P2P) loans. Despite its proven performance, the supervised learning(SL) method requires labeled data, which may require a lot of effort and resources to collect. With the rapid growth of P2P platforms, the number of loans issued annually that have no clear final resolution is continuously increasing leading to abundance in unlabeled data. The research data of P2P loans used in this study were collected on the LendingClub platform. This is why an SSL model is needed to predict the default risk by using not only information from labeled loans(fully paid or defaulted) but also information from unlabeled loans. The results showed that in terms of default risk prediction and despite the use of a small number of labeled data, the SSL method achieved a much better default risk prediction performance than the SL method trained using a much larger set of labeled data.

A Study on the User-Based Small Fishing Boat Collision Alarm Classification Model Using Semi-supervised Learning (준지도 학습을 활용한 사용자 기반 소형 어선 충돌 경보 분류모델에대한 연구)

  • Ho-June Seok;Seung Sim;Jeong-Hun Woo;Jun-Rae Cho;Jaeyong Jung;DeukJae Cho;Jong-Hwa Baek
    • Journal of Navigation and Port Research
    • /
    • v.47 no.6
    • /
    • pp.358-366
    • /
    • 2023
  • This study aimed to provide a solution for improving ship collision alert of the 'accident vulnerable ship monitoring service' among the 'intelligent marine traffic information system' services of the Ministry of Oceans and Fisheries. The current ship collision alert uses a supervised learning (SL) model with survey labels based on large ship-oriented data and its operators. Consequently, the small ship data and the operator's opinion are not reflected in the current collision-supervised learning model, and the effect is insufficient because the alarm is provided from a longer distance than the small ship operator feels. In addition, the supervised learning (SL) method requires a large number of labeled data, and the labeling process requires a lot of resources and time. To overcome these limitations, in this paper, the classification model of collision alerts for small ships using unlabeled data with the semi-supervised learning (SSL) algorithms (Label Propagation and TabNet) was studied. Results of real-time experiments on small ship operators using the classification model of collision alerts showed that the satisfaction of operators increased.

Detection Fastener Defect using Semi Supervised Learning and Transfer Learning (준지도 학습과 전이 학습을 이용한 선로 체결 장치 결함 검출)

  • Sangmin Lee;Seokmin Han
    • Journal of Internet Computing and Services
    • /
    • v.24 no.6
    • /
    • pp.91-98
    • /
    • 2023
  • Recently, according to development of artificial intelligence, a wide range of industry being automatic and optimized. Also we can find out some research of using supervised learning for deteceting defect of railway in domestic rail industry. However, there are structures other than rails on the track, and the fastener is a device that binds the rail to other structures, and periodic inspections are required to prevent safety accidents. In this paper, we present a method of reducing cost for labeling using semi-supervised and transfer model trained on rail fastener data. We use Resnet50 as the backbone network pretrained on ImageNet. At first we randomly take training data from unlabeled data and then labeled that data to train model. After predict unlabeled data by trained model, we adopted a method of adding the data with the highest probability for each class to the training data by a predetermined size. Futhermore, we also conducted some experiments to investigate the influence of the number of initially labeled data. As a result of the experiment, model reaches 92% accuracy which has a performance difference of around 5% compared to supervised learning. This is expected to improve the performance of the classifier by using relatively few labels without additional labeling processes through the proposed method.

Semi-Supervised Data Augmentation Method for Korean Fact Verification Using Generative Language Models (자연어 생성 모델을 이용한 준지도 학습 기반 한국어 사실 확인 자료 구축)

  • Jeong, Jae-Hwan;Jeon, Dong-Hyeon;Kim, Seon-Hun;Gang, In-Ho
    • Annual Conference on Human and Language Technology
    • /
    • 2021.10a
    • /
    • pp.105-111
    • /
    • 2021
  • 한국어 사실 확인 과제는 학습 자료의 부재로 인해 연구에 어려움을 겪고 있다. 본 논문은 수작업으로 구성된 학습 자료를 토대로 자연어 생성 모델을 이용하여 한국어 사실 확인 자료를 구축하는 방법을 제안한다. 본 연구는 임의의 근거를 기반으로 하는 주장을 생성하는 방법 (E2C)과 임의의 주장을 기반으로 근거를 생성하는 방법 (C2E)을 모두 실험해보았다. 이때 기존 학습 자료에 위 두 학습 자료를 각각 추가하여 학습한 사실 확인 분류기가 기존의 학습 자료나 영문 사실 확인 자료 FEVER를 국문으로 기계 번역한 학습 자료를 토대로 구성된 분류기보다 평가 자료에 대해 높은 성능을 기록하였다. 또한, C2E 방법의 경우 수작업으로 구성된 자료 없이 기존의 자연어 추론 과제 자료와 HyperCLOVA Few Shot 예제만으로도 높은 성능을 기록하여, 비지도 학습 방식으로 사실 확인 자료를 구축할 수 있는 가능성 역시 확인하였다.

  • PDF

Semi-Supervised Learning for Fault Detection and Classification of Plasma Etch Equipment (준지도학습 기반 반도체 공정 이상 상태 감지 및 분류)

  • Lee, Yong Ho;Choi, Jeong Eun;Hong, Sang Jeen
    • Journal of the Semiconductor & Display Technology
    • /
    • v.19 no.4
    • /
    • pp.121-125
    • /
    • 2020
  • With miniaturization of semiconductor, the manufacturing process become more complex, and undetected small changes in the state of the equipment have unexpectedly changed the process results. Fault detection classification (FDC) system that conducts more active data analysis is feasible to achieve more precise manufacturing process control with advanced machine learning method. However, applying machine learning, especially in supervised learning criteria, requires an arduous data labeling process for the construction of machine learning data. In this paper, we propose a semi-supervised learning to minimize the data labeling work for the data preprocessing. We employed equipment status variable identification (SVID) data and optical emission spectroscopy data (OES) in silicon etch with SF6/O2/Ar gas mixture, and the result shows as high as 95.2% of labeling accuracy with the suggested semi-supervised learning algorithm.

Multi-view semi-supervised learning for 3D human pose estimation (3 차원 휴먼 자세 추정을 위한 다시점 준지도 학습)

  • Kim, Do Yeop;Chang, Ju Yong
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • fall
    • /
    • pp.134-138
    • /
    • 2021
  • 3 차원 휴먼 자세 추정 모델은 다시점 모델과 단시점 모델로 분류될 수 있다. 일반적으로 다시점 모델은 단시점 모델에 비하여 뛰어난 자세 추정 성능을 보인다. 단시점 모델의 경우 3 차원 자세 추정 성능의 향상은 많은 양의 학습 데이터를 필요로 한다. 하지만 3 차원 자세에 대한 참값을 획득하는 것은 쉬운 일이 아니다. 이러한 문제를 다루기 위해, 우리는 다시점 모델로부터 다시점 휴먼 자세 데이터에 대한 의사 참값을 생성하고, 이를 단시점 모델의 학습에 활용하는 방법을 제안한다. 또한, 우리는 각각의 다시점 영상으로부터 추정된 자세의 일관성을 고려하는 다시점 일관성 손실함수를 제안하여, 이것이 단시점 모델의 효과적인 학습에 도움을 준다는 것을 보인다.

  • PDF

A Study on GPR Image Classification by Semi-supervised Learning with CNN (CNN 기반의 준지도학습을 활용한 GPR 이미지 분류)

  • Kim, Hye-Mee;Bae, Hye-Rim
    • The Journal of Bigdata
    • /
    • v.6 no.1
    • /
    • pp.197-206
    • /
    • 2021
  • GPR data is used for underground exploration. The data gathered are interpreted by experts based on experience as the underground facilities often reflect GPR. In addition, GPR data are different in the noise and characteristics of the data depending on the equipment, environment, etc. This often results in insufficient data with accurate labels. Generally, a large amount of training data have to be obtained to apply CNN models that exhibit high performance in image classification problems. However, due to the characteristics of GPR data, it makes difficult to obtain sufficient data. Finally, this makes neural networks unable to learn based on general supervised learning methods. This paper proposes an image classification method considering data characteristics to ensure that the accuracy of each label is similar. The proposed method is based on semi-supervised learning, and the image is classified using clustering techniques after extracting the feature values of the image from the neural network. This method can be utilized not only when the amount of the labeled data is insufficient, but also when labels that depend on the data are not highly reliable.