• Title/Summary/Keyword: Training Samples

Training Sample and Feature Selection Methods for Pseudo Sample Neural Networks (의사 샘플 신경망에서 학습 샘플 및 특징 선택 기법)

  • Heo, Gyeongyong;Park, Choong-Shik;Lee, Chang-Woo
    • Journal of the Korea Society of Computer and Information / v.18 no.4 / pp.19-26 / 2013
  • The pseudo sample neural network (PSNN) is a variant of the traditional neural network that uses pseudo samples to mitigate convergence to local optima when the training set is small. Pseudo samples smooth the solution space, which PSNN exploits during training. PSNN itself addresses the quantity of training samples; this paper presents methods that address their quality in order to further improve PSNN performance. Typical samples and highly correlated features help in training. Kernel density estimation is therefore used to select typical samples, and a correlation factor is introduced to select features. A debris flow data set is used to demonstrate the usefulness of the proposed methods.
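
The two selection steps named in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the Gaussian kernel, the fixed `bandwidth`, the use of the Pearson coefficient as the "correlation factor", and all function names are assumptions made for the example.

```python
import math


def kde_scores(samples, bandwidth=1.0):
    """Gaussian kernel density at each sample, up to a constant factor.

    The normalising constant is omitted because only the ranking matters here.
    """
    n = len(samples)
    scores = []
    for x in samples:
        total = 0.0
        for y in samples:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
            total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        scores.append(total / n)
    return scores


def select_typical(samples, keep, bandwidth=1.0):
    """Indices of the `keep` samples lying in the densest regions (assumed rule)."""
    scores = kde_scores(samples, bandwidth)
    ranked = sorted(range(len(samples)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def select_features(samples, targets, threshold):
    """Indices of features whose |correlation| with the target meets the threshold."""
    n_features = len(samples[0])
    return [
        j for j in range(n_features)
        if abs(pearson([s[j] for s in samples], targets)) >= threshold
    ]
```

With a pool such as `[[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]`, the isolated last point receives the lowest density score and is the first to be dropped.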

Semi-supervised Software Defect Prediction Model Based on Tri-training

  • Meng, Fanqi;Cheng, Wenying;Wang, Jingdong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.11 / pp.4028-4042 / 2021
  • Software defect prediction is difficult when labelled defect samples are scarce and the classes are imbalanced. To address this, a semi-supervised software defect prediction model is proposed that combines feature normalization, oversampling, and the Tri-training algorithm. First, feature normalization smooths the feature data to eliminate the influence of excessively large or small feature values on classification performance. Second, oversampling expands the data to correct the class imbalance of the labelled samples. Finally, the Tri-training algorithm learns from the training samples and establishes the defect prediction model. The novelty of the model is that it combines these three techniques to address both the shortage of labelled samples and class imbalance. Simulation experiments on the NASA software defect prediction dataset show that the proposed method outperforms four existing supervised and semi-supervised learning methods in terms of Precision, Recall, and F-Measure.
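
The three stages described in this abstract can be outlined in code. This is only a sketch under stated assumptions: the abstract does not say which normalization or oversampling variant is used, so min-max scaling and random duplication of minority rows are stand-ins, and the function names are hypothetical. The last function shows the agreement rule at the core of tri-training: an unlabelled sample is pseudo-labelled for one classifier when the other two classifiers agree on its label.

```python
import random


def min_max_normalize(rows):
    """Scale every feature column to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def pseudo_labels_for(i, predictions):
    """Tri-training agreement rule for classifier `i`.

    `predictions` holds three lists of predicted labels on the unlabelled pool.
    Returns {sample index: label} where the two other classifiers agree.
    """
    first, second = [p for k, p in enumerate(predictions) if k != i]
    return {
        idx: a for idx, (a, b) in enumerate(zip(first, second)) if a == b
    }
```

In a full tri-training loop, each classifier would be retrained on the labelled set plus its pseudo-labelled samples, and the process would repeat until no classifier changes.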

Generic Training Set based Multimanifold Discriminant Learning for Single Sample Face Recognition

  • Dong, Xiwei;Wu, Fei;Jing, Xiao-Yuan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.1 / pp.368-391 / 2018
  • Face recognition (FR) with a single sample per person (SSPP) is common in real-world applications. In this scenario, it is hard to predict the intra-class variations of query samples from gallery samples because training samples are insufficient. Inspired by the fact that similar faces have similar intra-class variations, we propose a virtual sample generating algorithm called k nearest neighbors based virtual sample generating (kNNVSG) to enrich the intra-class variation information of training samples. Furthermore, to use the intra-class variation information of the virtual samples generated by kNNVSG, we propose the image set based multimanifold discriminant learning (ISMMDL) algorithm. ISMMDL learns a projection matrix for each manifold, modeled by the local patches of the images of each class, with the aim of minimizing intra-manifold margins and maximizing inter-manifold margins simultaneously in a low-dimensional feature space. Finally, by combining kNNVSG and ISMMDL, we propose the k nearest neighbor virtual image set based multimanifold discriminant learning (kNNMMDL) approach for single sample face recognition (SSFR) tasks. Experimental results on the AR, Multi-PIE and LFW face datasets demonstrate that our approach shows promise for SSFR under expression, illumination and disguise variations.

Imbalanced SVM-Based Anomaly Detection Algorithm for Imbalanced Training Datasets

  • Wang, GuiPing;Yang, JianXi;Li, Ren
    • ETRI Journal / v.39 no.5 / pp.621-631 / 2017
  • Abnormal samples are usually difficult to obtain in production systems, resulting in imbalanced training sample sets: the number of positive samples is far smaller than the number of negative samples. Traditional Support Vector Machine (SVM)-based anomaly detection algorithms perform poorly on highly imbalanced datasets: the learned classification hyperplane skews toward the positive samples, resulting in a high false-negative rate. This article proposes a new imbalanced SVM (termed ImSVM)-based anomaly detection algorithm, which assigns a different weight to each positive support vector in the decision function. ImSVM adjusts the learned classification hyperplane so that the decision function achieves the maximum GMean value on the dataset. This problem is converted into an unconstrained optimization problem to search for the optimal weight vector. To evaluate ImSVM, experiments are carried out on both Cloud datasets and Knowledge Discovery and Data Mining datasets, from which highly imbalanced training sample sets are constructed. The experimental results show that ImSVM outperforms over-sampling techniques and several existing imbalanced SVM-based techniques.
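
The GMean measure that ImSVM maximizes is commonly defined as the geometric mean of the true-positive rate and the true-negative rate. The short sketch below computes it from label lists; the function name and the `positive` parameter are illustrative choices, not taken from the article.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (TPR) and specificity (TNR)."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(tpr * tnr)
```

A classifier that labels every sample negative can reach high accuracy on an imbalanced set, but its true-positive rate is zero, so its GMean is zero as well. This is why the measure suits anomaly detection with few positive samples.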

A New Method for Hyperspectral Data Classification

  • Dehghani, Hamid;Ghassemian, Hassan
    • Proceedings of the KSRS Conference / 2003.11a / pp.637-639 / 2003
  • As the number of spectral bands in high spectral resolution data increases, the capability to detect more detailed classes should increase, and classification accuracy should increase as well. Often, however, not enough training pixels are available for supervised classification, and traditional classification methods then perform poorly. In this paper, we propose a new classification model based on decision fusion. Learning is performed in two steps: the first step uses only the training samples, and the second step uses semilabeled samples in addition to the original training samples. The method begins by dividing the spectral bands into several small groups. The information in each group is treated as a new source and classified separately. Each of these primary classifiers has its own characteristics and partitions the spectral space in its own way. Drawing on the strengths of all the primary classifiers makes the fused local decisions sufficiently accurate. In the decision fusion center, a set of rules determines the final class of each pixel. The method is applied to real remote sensing data. The results show that classification performance improves, and the method may overcome the shortage of training samples in high-dimensional data and mitigate the Hughes phenomenon.
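
The band grouping and fusion steps can be illustrated with a small sketch. The abstract does not state how bands are grouped or which fusion rules are used, so contiguous fixed-size groups and a simple majority vote are assumptions here, and both function names are hypothetical.

```python
from collections import Counter


def group_bands(n_bands, group_size):
    """Split band indices 0..n_bands-1 into contiguous groups (assumed scheme)."""
    return [
        list(range(start, min(start + group_size, n_bands)))
        for start in range(0, n_bands, group_size)
    ]


def fuse_by_majority(votes):
    """Fuse per-group class decisions for one pixel by majority vote.

    Ties are broken in favour of the class that appears first in `votes`.
    """
    counts = Counter(votes)
    best = max(counts.values())
    for vote in votes:
        if counts[vote] == best:
            return vote
```

Each group of bands would feed its own primary classifier, and `fuse_by_majority` would then be applied to the list of class labels those classifiers produce for a pixel.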

Effect of Training Sequence Control in On-line Learning for Multilayer Perceptron (다계층 퍼셉트론의 온라인 학습에서 학습 순서 제어의 효과)

  • Lee, Jae-Young;Kim, Hwang-Soo
    • Journal of KIISE:Software and Applications / v.37 no.7 / pp.491-502 / 2010
  • When human beings acquire and develop knowledge through education, their prior knowledge influences subsequent learning. Because machine learning should take this into account, we need to examine how controlling the order of the training sequence affects it. In this research, the role of the supervisor is extended from merely providing the target values for classification problems to also controlling the order of training samples. The supervisor presents the training examples, categorized by a self-organizing map (SOM), in sequence to the learning model, which in this case is a multilayer perceptron (MLP). The proposed method is distinctive in that it selects the most instructive example from the categories formed by the SOM to assist the learning progress, whereas other methods use the SOM only to preprocess training samples. The results show that the method is effective in terms of the number of samples used and the time taken in training.

Utilizing Experiences of Supervisor in Sequential Learning for Multilayer Perceptron (지도 경험을 활용한 다계층 퍼셉트론의 순차적 학습 방법)

  • Lee, Jae-Young;Kim, Hwang-Soo
    • Journal of KIISE:Software and Applications / v.37 no.10 / pp.723-735 / 2010
  • Evaluating a learner's level of achievement and providing knowledge appropriate to that level greatly influences human learning. This shows the importance of training order, which should also be considered in machine learning. In this research, to assess the influence of training order, we propose a method that controls the order of training samples by utilizing the experience of the supervisor in training an MLP. The supervisor determines the current state of the MLP using teaching experience and student evaluation, and then selects the most instructive sample for the MLP in that state. We use a CRF to represent and utilize the experience of the supervisor. Although the proposed method resembles active learning in that it selects samples, it differs fundamentally: the aim of selection is not to reduce the number of samples used but to assist the learning progress. Results on a classification problem show that the method is usually more effective than random selection in terms of training time.

Discrimination between earthquake and explosion by using seismic spectral characteristics and linear discriminant analysis (지진파 스펙트럼특성과 선형판별분석을 이용한 자연지진과 인공지진 식별)

  • 제일영;전정수;이희일
    • Proceedings of the Earthquake Engineering Society of Korea Conference / 2003.09a / pp.13-19 / 2003
  • A discriminant method using seismic signals was studied for the discrimination of surface explosions. Multivariate discriminant analysis was performed on the basis of seismic spectral characteristics. Four single discriminant techniques based on seismic source theory (Pg/Lg, Lg1/Lg2, Pg1/Pg2, and Rg/Lg) were applied to explosion and earthquake training data sets. The Pg/Lg discriminant was the most effective of the four, but it could not perfectly discriminate the samples of the training data sets. In this study, a compound linear discriminant analysis was defined using the common characteristics of the training data sets for the single discriminants, with the single discriminants serving as independent variables. With this analysis, all the samples of the training data sets were correctly discriminated, and the probability of misclassification was lowered to 0.7%.

X-Ray Diffraction line profile analysis of defects and precipitates in high displacement damage neutron-irradiated austenitic stainless steels

  • Shreevalli M.;Ran Vijay Kumar;Divakar R.;Ashish K.;Padmaprabu C.;Karthik V.;Archna Sagdeo
    • Nuclear Engineering and Technology / v.56 no.1 / pp.114-122 / 2024
  • Irradiation-induced defects and precipitates in SS 316, the wrapper material of the Indian Fast Breeder Test Reactor (FBTR), are analyzed using the synchrotron source-based Angle Dispersive X-Ray Diffraction (ADXRD) technique with X-rays of energy 17.185 keV (wavelength ~0.72146 Å). The differences and similarities among the high displacement damage samples are studied as a function of dpa (displacement per atom) and dpa rate in the range of 2.9 × 10⁻⁷ to 9 × 10⁻⁷ dpa/s. Ferrite and M₂₃C₆ are commonly observed in the present set of high displacement damage (40-74 dpa) SS 316 samples irradiated at temperatures in the range of 400-483 ℃. The dislocation density also increases with irradiation dose. The quantified X-ray diffraction peak profile parameters, such as peak shift and asymmetry, show that the irradiation-induced defects are sensitive to the combination of dpa rate and irradiation temperature. The increase in yield strength as a function of displacement damage is also correlated with the dislocation density.

Effect of Dental Practicality Index training using an online video on decision-making and confidence level in treatment planning by dental undergraduates

  • Zhai Wei See;Ming Sern Lee;Abhishek Parolia;Shalini Kanagasingam;Shilpa Gunjal;Shanon Patel
    • Restorative Dentistry and Endodontics / v.49 no.1 / pp.8.1-8.12 / 2024
  • Objectives: The purpose of this study was to evaluate the effect of Dental Practicality Index (DPI) training using an online video on the treatment planning decisions and confidence level of dental undergraduates (DUs). Materials and Methods: Ninety-four DUs were shown 15 clinical case scenarios and asked to decide on treatment plans based on 4 treatment options. The most appropriate treatment plan had been decided by a consensus panel of experienced dentists. DUs then underwent DPI training using an online video. In a post-DPI-training test, DUs were shown the same clinical case scenarios and asked to assign the best treatment option. After 6 weeks, DUs were retested to assess their knowledge retention. In all 3 tests, DUs completed the confidence level scale questionnaire. Data were analyzed using the related-samples Wilcoxon signed rank test and the independent-samples Mann-Whitney U test with the level of significance set at p < 0.05. Results: DPI training significantly improved the mean scores of the DUs from 7.53 in the pre-DPI-training test to 9.01 in the post-DPI-training test (p < 0.001). After 6 weeks, the mean scores decreased marginally to 8.87 in the retention test (p = 0.563). DPI training increased their confidence level from 5.68 pre-DPI training to 7.09 post-DPI training. Conclusions: Training DUs using DPI with an online video improved their decision-making and confidence level in treatment planning.