• Title/Summary/Keyword: Unlabeled

Semi-supervised Model for Fault Prediction using Tree Methods (트리 기법을 사용하는 세미감독형 결함 예측 모델)

  • Hong, Euyseok
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.4 / pp.107-113 / 2020
  • A number of studies have been conducted on predicting software faults, but most of them propose supervised models trained on labeled data. Very few studies have addressed unsupervised models, which use only unlabeled data, or semi-supervised models, which combine plentiful unlabeled data with a small amount of labeled data. In this paper, we built new semi-supervised models by using tree algorithms within the self-training technique. In the performance evaluation experiments, the new tree models outperformed the existing models, and CollectiveWoods in particular outperformed the other models. It also showed very stable performance even when very few labeled data were available.

The Functions of Lipophorin in Insect Hemolymph (곤충혈림프에 존재하는 리포포린의 기능)

  • Jung, Eun-Suk;Joe, Jun-Ho;Yun, Hwa-Kyung
    • Proceedings of the KAIS Fall Conference / 2006.11a / pp.287-289 / 2006
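  • Lipophorin in insect hemolymph selectively transports lipids to lipid-utilizing and lipid-storing organs. This study examined lipid transport to the larval fat body, adult ovary, and testis, and the uptake of lipophorin itself by the larval fat body and adult ovary. FITC-labeled and DiI-labeled lipophorin were used to investigate these functions. Incubating the larval fat body, adult ovary, and testis with DiI-labeled lipophorin showed that lipids are transported from lipophorin to each organ, and incubation with suramin, an inhibitor of receptor-mediated endocytosis, and unlabeled lipophorin markedly reduced the amount of lipid transported from lipophorin to each organ. In addition, incubating the larval fat body and adult ovary with FITC-labeled lipophorin showed that, besides the lipids mentioned above, lipophorin itself is taken up by each organ for use as an energy source, and incubation with suramin and unlabeled lipophorin markedly reduced the amount of lipophorin taken up. These results indicate that both lipid transport by lipophorin and the uptake of lipophorin itself occur through receptor-mediated endocytosis.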

Software Fault Prediction using Semi-supervised Learning Methods (세미감독형 학습 기법을 사용한 소프트웨어 결함 예측)

  • Hong, Euyseok
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.3 / pp.127-133 / 2019
  • Most studies of software fault prediction have focused on supervised learning models that use only labeled training data. Although supervised learning usually shows high prediction performance, most development groups do not have sufficient labeled data. Unsupervised learning models, which use only unlabeled data for training, are difficult to build and perform poorly. Semi-supervised learning models, which use both labeled and unlabeled data, can solve these problems. The self-training technique requires the fewest assumptions and constraints among semi-supervised techniques. In this paper, we implemented several models using self-training algorithms and evaluated them using accuracy and AUC. YATSI showed the best performance.
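
The self-training idea mentioned in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the nearest-centroid base classifier, the margin-based confidence test, and the names `self_train`, `min_margin`, and `max_rounds` do not come from the paper, which evaluates other learners such as YATSI.

```python
def euclid(a, b):
    """Euclidean distance between two equal-length coordinate sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

def centroids(X, y):
    """Mean feature vector per class label."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def predict(cents, x):
    """Return (label, margin): nearest centroid and its gap to the runner-up."""
    ranked = sorted((euclid(x, c), label) for label, c in cents.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else float("inf")
    return ranked[0][1], margin

def self_train(X_lab, y_lab, X_unlab, min_margin=1.0, max_rounds=10):
    """Repeatedly pseudo-label confident unlabeled points and retrain.

    Hypothetical helper: returns the final centroids and the points
    that never became confident enough to be pseudo-labeled.
    """
    X, y, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        cents = centroids(X, y)
        confident, rest = [], []
        for x in pool:
            label, margin = predict(cents, x)
            (confident if margin >= min_margin else rest).append((x, label))
        if not confident:  # nothing new to learn from, so stop early
            break
        for x, label in confident:
            X.append(x)
            y.append(label)
        pool = [x for x, _ in rest]
    return centroids(X, y), pool
```

With two labeled modules, say `(0.0, 0.0)` as class 0 and `(10.0, 10.0)` as class 1, unlabeled points near either one are absorbed in the first round, while a point equidistant from both stays in the returned pool.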

Patch based Semi-supervised Linear Regression for Face Recognition

  • Ding, Yuhua;Liu, Fan;Rui, Ting;Tang, Zhenmin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.8 / pp.3962-3980 / 2019
  • To deal with single sample face recognition, this paper presents a patch based semi-supervised linear regression (PSLR) algorithm, which draws facial variation information from unlabeled samples. Each facial image is divided into overlapping patches, and a regression model with a mapping matrix is constructed on each patch. Then, we adjust these matrices by mapping unlabeled patches to $[1,1,{\cdots},1]^T$. The solutions of all the mapping matrices are integrated into an overall objective function, which uses ${\ell}_{2,1}$-norm minimization constraints to improve the discrimination ability of the mapping matrices and reduce the impact of noise. After the mapping matrices are computed, we adopt a majority-voting strategy to classify the probe samples. To further learn the discrimination information between probe samples and obtain more robust mapping matrices, we also propose a multistage PSLR (MPSLR) algorithm, which iteratively updates the training dataset by adding reliably labeled probe samples to it. The effectiveness of our approaches is evaluated using three public facial databases. Experimental results show that our approaches are robust to illumination, expression and occlusion.
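
Two building blocks named in this abstract, the ${\ell}_{2,1}$-norm and majority voting over patches, can be sketched as follows. This is an illustration only: the row-wise convention for the norm (sum of the Euclidean norms of the rows) is the common one but is assumed here, and the function names `l21_norm` and `majority_vote` are hypothetical, not taken from the paper.

```python
from collections import Counter

def l21_norm(W):
    """Sum over rows of each row's Euclidean norm (assumed row-wise convention)."""
    return sum(sum(v * v for v in row) ** 0.5 for row in W)

def majority_vote(patch_labels):
    """Label predicted by the most patches; ties go to the label seen first."""
    return Counter(patch_labels).most_common(1)[0][0]

# A matrix with an all-zero row: that row adds nothing to the norm,
# which is why minimizing it favors row-sparse mapping matrices.
W = [[3.0, 4.0], [0.0, 0.0], [5.0, 12.0]]
print(l21_norm(W))                      # 5 + 0 + 13 = 18.0
print(majority_vote(["A", "B", "A"]))   # A
```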

K-Means Clustering with Deep Learning for Fingerprint Class Type Prediction

  • Mukoya, Esther;Rimiru, Richard;Kimwele, Michael;Mashava, Destine
    • International Journal of Computer Science & Network Security / v.22 no.3 / pp.29-36 / 2022
  • In deep learning classification tasks, most models assume that labels are available for all training data. As a result, strategies for learning new concepts from unlabeled datasets are scarce. In fingerprint classification tasks, most fingerprint datasets are labelled by subject/individual, and datasets labelled with finger type classes are scarce. In this paper, the authors develop approaches for classifying fingerprint images using the commonly known fingerprint classes. Our study provides a flexible method to learn new classes of fingerprints. Our classifier model combines clustering and deep learning to cluster the fingerprint images and hence label them with appropriate classes. The K-means clustering strategy explores the label uncertainty and high-density regions of the unlabeled data to be clustered. Using a similarity index, five clusters are created. Deep learning is then used to train a model on a publicly known fingerprint dataset with known finger class types. A prediction technique is then employed to predict the classes of the clusters from the trained model. Our proposed model performs better and has a lower computational cost in learning new classes, and hence significantly reduces the cost of labelling fingerprint images.

Development of a Steel Plate Surface Defect Detection System Based on Small Data Deep Learning (소량 데이터 딥러닝 기반 강판 표면 결함 검출 시스템 개발)

  • Gaybulayev, Abdulaziz;Lee, Na-Hyeon;Lee, Ki-Hwan;Kim, Tae-Hyong
    • IEMEK Journal of Embedded Systems and Applications / v.17 no.3 / pp.129-138 / 2022
  • Collecting and labeling sufficient training data, which is essential to deep learning-based visual inspection, is difficult for manufacturers because it is very expensive. This paper presents a steel plate surface defect detection system that achieves industrial-grade detection performance by training on a small number of steel plate surface images, both labeled and unlabeled. To overcome the lack of training data, we propose two data augmentation techniques: program-based augmentation, which generates defect images geometrically, and generative model-based augmentation, which learns the distribution of the labeled data. We also propose a 4-step semi-supervised learning method that uses pseudo labels and consistency training with fixed-size augmentation in order to utilize unlabeled data for training. The proposed technique obtained about 99% defect detection performance for four defect types by using 100 real images including labeled and unlabeled data.

Automatic Text Classification by Learning from Unlabeled Data (레이블이 없는 데이터로부터의 학습에 의한 자동 문서 분류)

  • 박성배;김유환;장병탁
    • Proceedings of the Korean Information Science Society Conference / 2001.04b / pp.265-267 / 2001
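  • This paper presents a new automatic text classification method that uses unlabeled data. The method is based on a series of classifiers that are first trained on a small number of labeled examples and then augmented with a large number of unlabeled examples. Because unlabeled data are exploited, fewer labeled examples are needed and classification accuracy improves. Experiments on two standard data sets showed that using unlabeled data increases classification accuracy. Using only two-thirds of the full data, accuracy increased by about 7.9% on the NIPS 2000 workshop data set and by 9.2% on the WebKB data set.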

Semisupervised support vector quantile regression

  • Seok, Kyungha
    • Journal of the Korean Data and Information Science Society / v.26 no.2 / pp.517-524 / 2015
  • Unlabeled examples are easier and less expensive to obtain than labeled examples. In this paper, a semisupervised approach is used to exploit such examples in an effort to enhance predictive performance in nonlinear quantile regression problems. We propose a semisupervised quantile regression method named semisupervised support vector quantile regression (S2SVQR), which is based on the support vector machine. A generalized approximate cross validation method is used to choose the hyper-parameters that affect the performance of the estimator. The experimental results confirm the successful performance of the proposed S2SVQR.
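
For readers unfamiliar with quantile regression, the check (pinball) loss that quantile regression methods generally minimize can be sketched as below. This is the standard textbook loss, given here as background under the assumption that it applies. The abstract does not state the paper's exact objective, and the name `pinball_loss` is illustrative.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean check loss at quantile level tau (0 < tau < 1).

    Under-predictions (y > prediction) cost tau per unit and
    over-predictions cost (1 - tau) per unit, so minimizing the loss
    pushes the prediction toward the tau-th conditional quantile.
    """
    total = 0.0
    for y, q in zip(y_true, y_pred):
        r = y - q
        total += tau * r if r >= 0 else (tau - 1) * r
    return total / len(y_true)
```

At `tau = 0.5` the loss is half the mean absolute error, so the minimizer is the conditional median.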

Synthesis and Mass Spectrometry of Deuterium Labeled Tranylcypromine Hydrochloride

  • Kang, Gun-Il;Hong, Suk-Gil
    • Archives of Pharmacal Research / v.8 no.2 / pp.77-84 / 1985
  • [$^{2}$H$_{2}$]Tranylcypromine hydrochloride (trans-3,3-dideuterio-2-phenylcyclopropylamine HCl) was synthesized for use in metabolic studies. Mass fragmentation processes for tranylcypromine and its two synthetic intermediates, $\gamma$-phenyl-$\gamma$-butyrolactone and trans-2-phenylcyclopropanecarboxylic acid, were described based on comparisons between labeled and unlabeled compounds.