• Title/Abstract/Keywords: training data

Search results: 7,301 items (processing time: 0.034 s)

A Web Based Training Service for Product Data Management

  • 도남철
    • 한국CDE학회논문집 / Vol. 9, No. 3 / pp. 260-265 / 2004
  • This paper proposes a Web-based training service for product data management that supports an integrated product data management system, various technical documents, and efficient communication systems. It also supports a general product development process and a consistent product data model that enable participants to experience the management of consistent product information throughout the product development life cycle. The Web-based environment of the service also provides participants with a collaborative workplace shared with other participants and a Web portal for all the components of the service.

Tri-training algorithm based on cross entropy and K-nearest neighbors for network intrusion detection

  • Zhao, Jia;Li, Song;Wu, Runxiu;Zhang, Yiying;Zhang, Bo;Han, Longzhe
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 12 / pp. 3889-3903 / 2022
  • To address the low detection accuracy caused by training noise from mislabeled data when Tri-training is applied to network intrusion detection (NID), we propose a Tri-training algorithm based on cross entropy and K-nearest neighbors (TCK). The proposed algorithm replaces the classification error rate with cross-entropy to better capture the difference between the actual and predicted distributions of the model and to reduce the prediction bias that mislabeled data introduce for unlabeled data; K-nearest neighbors are used to remove mislabeled data and reduce their number. To verify the effectiveness of the proposed algorithm, experiments were conducted on 12 UCI datasets and the NSL-KDD network intrusion dataset, and four indexes (accuracy, recall, F-measure, and precision) were used for comparison. The experimental results show that TCK outperforms the conventional Tri-training algorithm and Tri-training algorithms that use only the cross-entropy or the K-nearest neighbor strategy.
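
The two ingredients named in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' TCK implementation: the function names `cross_entropy` and `knn_filter`, the Euclidean distance, the value of `k`, and the majority-vote rule are all assumptions made here.

```python
import math
from collections import Counter

def cross_entropy(true_labels, predicted_probs, eps=1e-12):
    """Mean cross-entropy between true class indices and predicted class probabilities."""
    total = 0.0
    for y, probs in zip(true_labels, predicted_probs):
        # Clamp the probability to avoid log(0).
        total -= math.log(max(probs[y], eps))
    return total / len(true_labels)

def knn_filter(labeled, candidates, k=3):
    """Keep pseudo-labeled candidates whose label agrees with the majority
    label of their k nearest labeled neighbors (assumed filtering rule)."""
    kept = []
    for x, pseudo_label in candidates:
        # Sort labeled points by Euclidean distance to the candidate.
        neighbors = sorted(labeled, key=lambda item: math.dist(x, item[0]))[:k]
        majority = Counter(label for _, label in neighbors).most_common(1)[0][0]
        if majority == pseudo_label:
            kept.append((x, pseudo_label))
    return kept
```

In a Tri-training loop, a filter of this kind would typically be applied to the samples that two classifiers label for the third, before those samples are added to its training set.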

Improvement of generalization of linear model through data augmentation based on Central Limit Theorem

  • 황두환
    • 지능정보연구 / Vol. 28, No. 2 / pp. 19-31 / 2022
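  • Machine learning models are built using training data, and their accuracy and generalization performance are judged using test data that was not used during training. A model with low generalization performance shows markedly lower prediction accuracy on newly received data, and such a model is said to be overfitted. This study concerns a method that generates data based on the central limit theorem, combines it with the existing training data to form new training data, increases the normality of the data, and thereby improves the generalization performance of the model. To this end, data were generated for each feature using the sample mean and standard deviation, drawing on the properties of the central limit theorem, and a Kolmogorov-Smirnov normality test was conducted to assess how much the normality of the new training data increased; the test confirmed that the new training data were more normal than the existing data. Generalization performance was measured by the difference in prediction accuracy between the training data and the test data. The newly generated data were used to train K-Nearest Neighbors (KNN), Logistic Regression, and Linear Discriminant Analysis (LDA) models, and the degree of improvement in generalization performance was examined; generalization performance improved for KNN, a non-parametric technique, and for LDA, which assumes normality when the model is constructed.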
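
One possible reading of this augmentation idea is sketched below. The abstract does not give the exact generation procedure, so everything here is an assumption: each synthetic value is the mean of a small random subsample of one feature (such means tend toward a normal distribution by the central limit theorem), the rows are assumed to belong to a single class, and the names `clt_augment`, `subsample_size`, and `generalization_gap` are hypothetical.

```python
import random
import statistics

def clt_augment(rows, n_new, subsample_size=5, seed=0):
    """Return the original rows plus n_new synthetic rows built from
    per-feature subsample means (assumed procedure, not the paper's)."""
    rng = random.Random(seed)
    columns = list(zip(*rows))  # one tuple of observed values per feature
    synthetic = []
    for _ in range(n_new):
        # Each synthetic feature value is the mean of a random subsample
        # (drawn with replacement) of that feature's observed values.
        synthetic.append([
            statistics.fmean(rng.choices(col, k=subsample_size))
            for col in columns
        ])
    return [list(r) for r in rows] + synthetic

def generalization_gap(train_accuracy, test_accuracy):
    """Difference between training and test accuracy; the sign convention
    (train minus test) is an assumption."""
    return train_accuracy - test_accuracy
```

A smaller gap after training on the augmented data would correspond to the improvement in generalization that the abstract reports for KNN and LDA.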

A Study on the Degree of Satisfaction on Clinical Practice for the Students in the Depart of Physical Therapy Located in Gwang-ju and Jeonnam

  • 조남정;정준성
    • 대한통합의학회지 / Vol. 1, No. 2 / pp. 13-22 / 2013
  • Purpose: This study aims to improve the effect of clinical practice by examining students' satisfaction with clinical training, practical training, content, training supervision, and the evaluation system. The clinical training examined was that of some universities in Gwangju and Jeonnam. Method: The subjects were physiotherapy students enrolled in three- or four-year colleges in Gwangju and Jeonnam. Data were collected from 21 November 2012 to 1 February. We explained the data collection procedure and obtained the students' consent before they filled in the questionnaire. The data were processed with the SPSS 10.1 program using independent-samples tests, descriptive statistics, cross-tabulation, regression analysis, and frequency analysis. Results: In terms of general characteristics, the average age of the subjects was 24. Students in three-year courses numbered 58 (39.1%), and students in four-year courses numbered 90 (60.9%). There were 72 males (48.6%) and 76 females (51.4%). We examined satisfaction with clinical training, practical training, content, the environment of the training institutions, trainee management, and the evaluation method. The average overall satisfaction with clinical training was 1.90. The average satisfaction with the clinical training period and content was 1.83. The average satisfaction with the environment of the training institutions was 1.88. The average satisfaction with the training institutions' trainee management and evaluation was 1.94. Conclusion: Clinical training is important because it helps students form a concrete picture of their future and of what they can do in clinical settings after graduation; improving satisfaction with clinical training would therefore have an impact on prospective physical therapists.

Speaker Verification with the Constraint of Limited Data

  • Kumari, Thyamagondlu Renukamurthy Jayanthi;Jayanna, Haradagere Siddaramaiah
    • Journal of Information Processing Systems / Vol. 14, No. 4 / pp. 807-823 / 2018
  • The performance of a speaker verification system depends on the utterance of each speaker. To verify a speaker, important information has to be captured from the utterance. Under the constraint of limited data, speaker verification becomes a challenging task: the training and testing data last only a few seconds. The feature vectors extracted by single frame size and rate (SFSR) analysis are not sufficient for training and testing speakers in speaker verification. This leads to poor speaker modeling during training and may not yield good decisions during testing. The problem can be addressed by increasing the number of feature vectors obtained from training and testing data of the same duration. To that end, we use multiple frame size (MFS), multiple frame rate (MFR), and multiple frame size and rate (MFSR) analysis techniques for speaker verification under the limited data condition. These analysis techniques extract relatively more feature vectors during training and testing and yield improved modeling and testing for limited data. To demonstrate this, we use mel-frequency cepstral coefficients (MFCC) and linear prediction cepstral coefficients (LPCC) as features. The Gaussian mixture model (GMM) and the GMM-universal background model (GMM-UBM) are used for modeling the speaker. The database used is NIST-2003. The experimental results indicate that MFS, MFR, and MFSR analysis perform markedly better than SFSR analysis, and that LPCC-based MFSR analysis performs better than the other analysis techniques and feature extraction techniques.
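
A rough sketch of why these analysis schemes yield more feature vectors from the same signal is given below. It only shows the framing step; the function names, the sample-based units, and the specific sizes and shifts are illustrative assumptions, and the MFCC or LPCC extraction applied to each frame is not shown.

```python
def frame_signal(signal, frame_size, frame_shift):
    """Split a signal into overlapping frames of frame_size samples,
    advancing by frame_shift samples each time."""
    frames = []
    start = 0
    while start + frame_size <= len(signal):
        frames.append(signal[start:start + frame_size])
        start += frame_shift
    return frames

def multi_frame_analysis(signal, frame_sizes, frame_shifts):
    """Pool the frames from every (size, shift) combination.

    One size and one shift   -> SFSR
    Several sizes, one shift -> MFS
    One size, several shifts -> MFR
    Several sizes and shifts -> MFSR
    """
    frames = []
    for size in frame_sizes:
        for shift in frame_shifts:
            frames.extend(frame_signal(signal, size, shift))
    return frames
```

For a 100-sample signal, a single size of 20 with a shift of 10 gives 9 frames, whereas pooling sizes 10 and 20 with shifts 5 and 10 gives 55, so each frame can contribute one feature vector to a much larger training set.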

MCE Training Algorithm for a Speech Recognizer Detecting Mispronunciation of a Foreign Language

  • 배민영;정용주;권철홍
    • 음성과학 / Vol. 11, No. 4 / pp. 43-52 / 2004
  • Model parameters in HMM-based speech recognition systems are normally estimated using Maximum Likelihood Estimation (MLE). The MLE method is based mainly on the principle of statistical data fitting, in the sense of increasing the HMM likelihood. The optimality of this training criterion is conditioned on the availability of an infinite amount of training data and the correct choice of model. In practice, however, neither of these conditions is satisfied. In this paper, we propose a Minimum Classification Error (MCE) training algorithm to improve the performance of a speech recognizer that detects mispronunciation of a foreign language. During conventional MLE training, the model parameters are adjusted to increase the likelihood of the word strings corresponding to the training utterances without taking account of the probability of other possible word strings. In contrast to MLE, the MCE training scheme takes account of possible competing word hypotheses and tries to reduce the probability of incorrect hypotheses. The discriminative training method using MCE shows better recognition results than the MLE method.
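
The contrast with MLE can be illustrated with a widely used MCE formulation: a misclassification measure that compares the score of the correct hypothesis with a soft maximum over competing hypotheses, passed through a sigmoid. The abstract does not state the exact form used in the paper, so the sketch below is an assumption, and the names `eta`, `gamma`, and `theta` are illustrative smoothing parameters.

```python
import math

def misclassification_measure(scores, correct, eta=1.0):
    """d = -g_correct + (1/eta) * log(mean over competitors of exp(eta * g_j)).
    Negative d means the correct hypothesis outscores its competitors."""
    competitors = [g for i, g in enumerate(scores) if i != correct]
    soft_max = math.log(
        sum(math.exp(eta * g) for g in competitors) / len(competitors)
    ) / eta
    return -scores[correct] + soft_max

def mce_loss(scores, correct, gamma=1.0, theta=0.0, eta=1.0):
    """Smooth 0-1 loss: a sigmoid of the misclassification measure."""
    d = misclassification_measure(scores, correct, eta)
    return 1.0 / (1.0 + math.exp(-gamma * d + theta))
```

Because the loss depends on the competing scores as well as the correct one, minimizing it pushes down the scores of incorrect hypotheses, which MLE training does not do.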

A Development of Fire Training Simulator Based on Computational Fluid Dynamics Simulation

  • 차무현;이재경;박성환;최병일
    • 한국CDE학회논문집 / Vol. 14, No. 4 / pp. 271-280 / 2009
  • An experience-based training system covering various fire situations that may result in many casualties is required to support rapid decision making and improve responsiveness. Recently, there has been a growing need for virtual reality (VR) based training systems that can replace dangerous full-scale fire training and be easily adopted into training or evaluation processes. This study constructed the virtual environment according to pre-defined scenarios and used FDS (Fire Dynamics Simulator), a three-dimensional computational fire analysis program, to derive numerically simulated data on the propagation of fire. Finally, by visualizing realistic fire and smoke behavior through virtual reality techniques and implementing real-time interaction, we developed a VR-based fire training simulator. In addition, to ensure both the sense of reality of the virtual world and real-time performance, we proposed appropriate data processing and space search algorithms and demonstrated the value of the proposed method through experiments.

The Development of a Social Skill Training Program for ADHD Children and It's Effect

  • 이혜숙
    • 초등상담연구 / Vol. 6, No. 1 / pp. 171-191 / 2007
  • The purpose of this study is to develop a social skill training program to reduce problematic behaviors and improve peer relations for elementary school students who have ADHD (Attention Deficit Hyperactivity Disorder) and then to verify its effectiveness. The research questions are as follows: First, is social skill training for students with ADHD effective in enhancing their self-esteem? Second, is social skill training for students with ADHD effective in reducing their inattention, hyperactivity, and impulsivity? Third, is social skill training for students with ADHD effective in improving peer relations? The subjects were six 5th grade children selected with the ADHD-SC4 at P elementary school in Pyeongtaek. The social skill training consisted of 10 sessions that included forming friendships, recognizing, making friends, solving problems, reeducation, and evaluation. Quantitative data were collected through a self-esteem inventory, a peer-relation test, self-reported scales for children, and the Conners' Teacher Rating Scale for children with ADHD. The collected data were analysed with a t-test. Qualitative data were collected through teacher interviews and observation of the children. The results of the study were as follows: First, the social skill training did not have a significant effect on enhancing the self-esteem of the children with ADHD. Second, the social skill training had a positive effect on reducing the inattentiveness, hyperactivity, and impulsive behavior of the children with ADHD. Third, the social skill training did not have a significant effect on improving the peer relations of the children with ADHD. Fourth, the qualitative data showed that the social skill training had a positive effect on overall classroom behavior.

A Feature Selection Technique based on Distributional Differences

  • Kim, Sung-Dong
    • Journal of Information Processing Systems / Vol. 2, No. 1 / pp. 23-27 / 2006
  • This paper presents a feature selection technique based on distributional differences for efficient machine learning. The initial training data consist of records with many features and a target value. We classified the records into positive and negative data based on the target value. We then divided the range of each feature's values into 10 intervals and calculated the distribution over the intervals in the positive and in the negative data. Next, we selected the features, and the intervals of those features, for which the distributional difference exceeds a certain threshold. Using the selected intervals and features, we obtained reduced training data. In the experiments, we show that the reduced training data cut the training time of the neural network by about 40% and that the trained functions also yield more profit in simulated stock trading.
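
The selection procedure described in this abstract can be sketched as follows. This is an illustration under assumptions, not the author's code: the names `interval_distribution` and `select_intervals`, the equal-width binning, the rule that a target above zero counts as positive, and the threshold value are all hypothetical.

```python
def interval_distribution(values, lo, hi, n_bins=10):
    """Fraction of values falling in each of n_bins equal-width intervals of [lo, hi]."""
    if not values:
        return [0.0] * n_bins
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        # The maximum value is placed in the last interval.
        idx = 0 if width == 0 else min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]

def select_intervals(rows, targets, threshold=0.2, n_bins=10):
    """Return {feature index: [interval indices]} for intervals where the
    positive and negative distributions differ by more than threshold."""
    selected = {}
    for j in range(len(rows[0])):
        column = [r[j] for r in rows]
        lo, hi = min(column), max(column)
        pos = [r[j] for r, t in zip(rows, targets) if t > 0]
        neg = [r[j] for r, t in zip(rows, targets) if t <= 0]
        p = interval_distribution(pos, lo, hi, n_bins)
        q = interval_distribution(neg, lo, hi, n_bins)
        bins = [b for b in range(n_bins) if abs(p[b] - q[b]) > threshold]
        if bins:
            selected[j] = bins
    return selected
```

Records whose values fall outside the selected intervals, and features with no selected interval, could then be dropped to form the reduced training data.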

Development of Personal-Credit Evaluation System Using Real-Time Neural Learning Mechanism

  • Park, Jong U.;Park, Hong Y.;Yoon Chung
    • 정보기술과데이타베이스저널 / Vol. 2, No. 2 / pp. 71-85 / 1995
  • Many results reported by neural network researchers claim that the classification accuracy of neural networks is superior, or at least equal, to that of conventional methods. However, a series of neural network classifications showed that the classification accuracy depends strongly on the characteristics of the training data set. Although many research reports note that the classification accuracy of neural networks can vary with the composition and architecture of the networks, the training algorithm, and the test data set, very few studies have addressed the problem of classification accuracy when the basic assumption of data monotonicity is violated. This paper describes a development project for an automated credit evaluation system. The finding was that the arrangement of the training data, so as to maintain the monotonicity of the data set, is critical to successful neural training and to enhancing the classification accuracy of neural networks.
