• Title/Summary/Keyword: Number of training data

EFFECTS OF RANDOMIZING PATTERNS AND TRAINING UNEQUALLY REPRESENTED CLASSES FOR ARTIFICIAL NEURAL NETWORKS

  • Kim, Young-Sup;Coleman Tommy L.
    • Proceedings of the Korea Spatial Information System Society Conference (한국공간정보시스템학회:학술대회논문집)
    • /
    • 2002.03a
    • /
    • pp.45-52
    • /
    • 2002
  • Artificial neural networks (ANNs) have been successfully used for classifying remotely sensed imagery. However, ANNs are still not preferred over conventional classification methods such as the maximum likelihood classifier commonly used in industry production environments. This can be attributed to the stochastic process built into ANNs, which creates difficulties in dealing with unequally represented training classes, and to their training speed. In this paper we examine some practical aspects of training classes when using a back-propagation neural network model for remotely sensed imagery. During the classification of remotely sensed imagery, representative training patterns for each class are collected with polygons or with a region-growing method over the imagery. The number of training patterns collected for each class may vary from several pixels to thousands. Such unequally populated training data may cause the significant problems that some empirical neural network models, such as back-propagation, have experienced. We investigate the effects of training with over- or under-represented patterns in classes and propose a pattern repopulation algorithm and an adaptive alpha adjustment (AAA) algorithm to handle unequally represented classes. We also show the performance improvement obtained when input patterns are presented in random order during back-propagation training.
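
The two ideas in this abstract, topping up under-represented classes and presenting patterns in random order, can be illustrated with a short sketch. The abstract does not describe the authors' pattern repopulation algorithm, so the rule below (replicate randomly chosen patterns until every class matches the largest one) is an assumption, and the names `repopulate` and `epoch_order` and the sample data are hypothetical.

```python
import random
from collections import Counter, defaultdict

def repopulate(patterns, labels, rng):
    """Replicate patterns of smaller classes until every class has as many
    patterns as the largest class (an assumed rule, not the paper's)."""
    by_class = defaultdict(list)
    for x, y in zip(patterns, labels):
        by_class[y].append(x)
    target = max(len(xs) for xs in by_class.values())
    balanced = []
    for y, xs in by_class.items():
        balanced.extend((x, y) for x in xs)
        # Fill the gap to the target size with randomly drawn copies.
        balanced.extend((rng.choice(xs), y) for _ in range(target - len(xs)))
    return balanced

def epoch_order(balanced, rng):
    """Return a shuffled copy so patterns are presented in random order."""
    order = list(balanced)
    rng.shuffle(order)
    return order

rng = random.Random(0)
patterns = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = ["water", "water", "water", "water", "urban"]
balanced = repopulate(patterns, labels, rng)
first_epoch = epoch_order(balanced, rng)
class_counts = Counter(y for _, y in balanced)
```

Calling `epoch_order` once per training epoch would give each epoch a different presentation order while leaving the balanced set itself unchanged.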

Training for Huge Data set with On Line Pruning Regression by LS-SVM

  • Kim, Dae-Hak;Shim, Joo-Yong;Oh, Kwang-Sik
    • Proceedings of the Korean Statistical Society Conference
    • /
    • 2003.10a
    • /
    • pp.137-141
    • /
    • 2003
  • The least squares support vector machine (LS-SVM) is a widely applicable and useful machine learning technique for classification and regression analysis. LS-SVM can be a good substitute for statistical methods, but computational difficulties remain in inverting the matrix that arises from a huge data set. In the modern information society, huge data sets are easily obtained in online or batch mode. For these kinds of huge data sets, we suggest an online pruning regression method based on LS-SVM. With a relatively small number of pruned support vectors, we obtain almost the same performance as regression with the full data set.

Semi-supervised learning for sentiment analysis in mass social media (대용량 소셜 미디어 감성분석을 위한 반감독 학습 기법)

  • Hong, Sola;Chung, Yeounoh;Lee, Jee-Hyong
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.24 no.5
    • /
    • pp.482-488
    • /
    • 2014
  • This paper aims to analyze users' emotions automatically by analyzing Twitter, a representative social network service (SNS). Creating sentiment analysis models with machine learning techniques requires sentiment labels that represent positive or negative emotions. However, obtaining sentiment labels for tweets is very expensive. In this paper, we therefore propose a sentiment analysis model that uses a self-training technique to utilize "data without sentiment labels" as well as "data with sentiment labels". In self-training, the labels of "data without sentiment labels" are determined using "data with sentiment labels", and the model is then updated using both "data with sentiment labels" and the newly labeled data. This technique improves sentiment analysis performance gradually. However, misclassifications of unlabeled data at an early stage affect model updates throughout the whole learning process, because the labels of unlabeled data never change once they are determined. Thus, the labels of "data without sentiment labels" need to be determined carefully. To achieve high performance with self-training, we propose three policies for updating "data with sentiment labels" and conduct a comparative analysis. The first policy selects, among the newly labeled data, the data whose confidence is higher than a given threshold. The second policy chooses the same number of positive and negative data from the newly labeled data to avoid the imbalanced class learning problem. The third policy limits the newly labeled data to fewer than a given maximum number, so that a large amount of data is not added at once and the model is updated gradually. Experiments are conducted on the Stanford data set, which is classified into positive and negative. As a result, the learned model achieves higher performance than models learned using only "data with sentiment labels" and than self-training with a regular model update policy.
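
As a rough illustration of the three update policies, the sketch below filters one round of pseudo-labeled tweets. The paper compares the policies; combining all three in a single function is a simplification made here, and the function name `select_pseudo_labels`, the tuple layout, the threshold, and the cap are hypothetical values chosen for the example.

```python
def select_pseudo_labels(predictions, threshold=0.9, max_new=4):
    """Pick newly labeled items to add to the labeled set.

    predictions: list of (item_id, label, confidence) tuples,
    with label either "pos" or "neg" (assumed layout).
    """
    # Policy 1: keep only predictions whose confidence exceeds the threshold.
    confident = [p for p in predictions if p[2] > threshold]
    confident.sort(key=lambda p: p[2], reverse=True)
    pos = [p for p in confident if p[1] == "pos"]
    neg = [p for p in confident if p[1] == "neg"]
    # Policy 2: take the same number of positive and negative items.
    # Policy 3: cap the total number of items added in one update.
    per_class = min(len(pos), len(neg), max_new // 2)
    return pos[:per_class] + neg[:per_class]

predictions = [
    ("t1", "pos", 0.97), ("t2", "pos", 0.95), ("t3", "pos", 0.92),
    ("t4", "neg", 0.93), ("t5", "neg", 0.60), ("t6", "pos", 0.55),
]
selected = select_pseudo_labels(predictions)
selected_ids = [item_id for item_id, _, _ in selected]
```

In this example only one negative prediction passes the threshold, so the balance policy limits the update to one positive and one negative tweet.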

Studies on the induction of pregnancy and the number of fetuses during pregnancy in rats

  • Choi, Seung-Hee;Cho, Yong-Seong;Kim, Min-Ji;Lee, Chae-Hyeok;Seong, Hwan-Hoo;Baek, Soon-Hwa;Lee, Jang-Hee
    • Journal of Animal Reproduction and Biotechnology
    • /
    • v.35 no.3
    • /
    • pp.232-238
    • /
    • 2020
  • This study used adult Wistar-based rats to observe the sexual cycle through the morphological characteristics of vaginal epithelial cells in vaginal smears, and investigated the number of fetuses after mating with male rats of the same strain. The animals were 12- to 13-week-old Wistar-based mature unlighted rats (weight 220 g to 240 g) that had been acclimated for at least 2 weeks at a room temperature of 23 ± 2℃ under 14 hours of artificial lighting (05:00 to 19:00) and 10 hours of darkness (19:00 to 05:00). The animals had free access to pellet feed and water. Vaginal smears were taken every morning to observe the sexual cycle, and animals were used in the experiments after a normal sexual cycle of 4 or 5 days was confirmed to have repeated for at least 2 cycles. As a result, proestrus showed few red blood cells, the cells and nuclei were rather large and round, and many nucleated cells were identified. In estrus, the cells were large, the nuclei were not stained, and mostly keratinocytes were found. In metestrus and diestrus, there were many white blood cells, and nucleated epithelial cells and keratinocytes were significantly reduced. The pregnancy period was 21 ± 1.8 days, and the number of live births per delivery was 11.9 on average. The numbers of fetuses on the 8th and 10th days of pregnancy were 15.2 ± 0.4 and 15.4 ± 0.3, respectively. In contrast, the number of fetuses on the 12th day of pregnancy was 12.9 ± 0.6, a significant decrease (p < 0.05) compared to the 10th day, and the number of fetuses remained similar until delivery. Regarding birth weight and the change in body weight by growth stage after delivery, the birth weights of females and males were 9.2 ± 2.0 g and 9.8 ± 2.5 g, respectively. After that, females and males showed similarly moderate weight gain until the 16th day, followed by rapid weight gain until the 21st day of lactation. The results of this study are expected to serve as basic data for determining the mating time of rodents and for controlling pregnancy and fetal number.

Kinematic characteristics of the ankle joint and RPM during the supra maximal training in cycling (사이클링 초최대운동(Supra maximal training)시 RPM과 족관절의 운동학적 분석)

  • Lee, Yong-Woo
    • Korean Journal of Applied Biomechanics
    • /
    • v.15 no.4
    • /
    • pp.75-83
    • /
    • 2005
  • The purpose of this study was to determine the kinematic characteristics of the ankle joint and RPM (repetitions per minute) during supra maximal training in cycling. For this study, 8 national representative cyclists, distance cyclists in track and road, were selected. During supra maximal pedaling, kinematic data were collected using a six-camera (240 Hz) Qualisys system. The room coordinate system was right-handed and fixed at the back of the cycling roller, with right-handed orthogonal segment coordinate systems defined for the leg and foot. Lateral kinematic data were recorded for at least 3 minutes while the participants pedaled on a roller. Two-dimensional Cartesian coordinates for each marker were determined at the time of recording using a nonlinear transformation technique. Coordinate data were low-pass filtered using a fourth-order Butterworth recursive filter with a cutoff frequency of 15 Hz. The variables analyzed in this study were compared using a one-factor (time) ANOVA with repeated measures. The results suggest that the number of pedal revolutions decreased over time during supra maximal pedaling. The maximum angle of the ankle joint changed little over time compared with the minimum angle.

Modeling High Power Semiconductor Device Using Backpropagation Neural Network (역전파 신경망을 이용한 고전력 반도체 소자 모델링)

  • Kim, Byung-Whan;Kim, Sung-Mo;Lee, Dae-Woo;Roh, Tae-Moon;Kim, Jong-Dae
    • The Transactions of the Korean Institute of Electrical Engineers D
    • /
    • v.52 no.5
    • /
    • pp.290-294
    • /
    • 2003
  • Using a backpropagation neural network (BPNN), a high power semiconductor device was empirically modeled. The device modeled is an n-LDMOSFET, and its electrical characteristics were measured with an HP4156A and a Tektronix curve tracer 370A. The drain-source current ($I_{DS}$) was measured over a drain-source voltage ($V_{DS}$) ranging from 1 V to 200 V at each gate-source voltage ($V_{GS}$). For each $V_{GS}$, the BPNN was trained with 100 training data points, and the trained model was tested with another 100 test data points not included in the training data. The prediction accuracy of each $V_{GS}$ model was optimized as a function of training factors, including training tolerance, number of hidden neurons, initial weight distribution, and two gradients of activation functions. Predictions from the optimized models were highly consistent with actual measurements.

The Effects of Swimming Training on Lymphocyte Proliferation and ROS Production in Spleen Lymphocytes of BALB/c Mice (규칙적인 수영훈련이 마우스 비장세포의 ROS생성과 림프구 증식에 미치는 영향)

  • Kwak, Yi-Sub;Park, Jeon-Han;Kim, Se-Jong;Jang, Yun-Soo;Lee, Bong-Ki
    • IMMUNE NETWORK
    • /
    • v.2 no.2
    • /
    • pp.96-101
    • /
    • 2002
  • Background: Aerobic training can be defined as any physical exercise that increases the heart rate and enhances the body's intake of oxygen long enough to benefit the condition of the body. Running, cycling, and swimming are examples of aerobic activities. This type of exercise optimises immune functions. Several recent experimental findings suggest that regular swimming training increases the immune response, but very few reports have compared warm water exercise with cold water exercise in spleen lymphocytes. Methods: This study was designed to examine the effects of regular swimming training on the spleen index, the number of lymphocytes, proliferative activity, and the production of reactive oxygen species (ROS) by splenocytes in BALB/c mice. Thirty-six mice (6 weeks old) underwent 10 weeks of regular swimming training and were divided into 6 groups according to the training (CRG: control resting group, CEG: control exercise group, WRG: warm water trained resting group, WEG: warm water trained exercise group, CORG: cold water trained resting group, COEG: cold water exercise group). The items analyzed were weight change, spleen index, the number of lymphocytes, proliferative activity, and production of ROS. All data were expressed as mean and standard deviation using the SPSS package (ver. 10.0). Results: Swimming training significantly decreased body weight and increased the spleen index, the number of lymphocytes, and proliferative activity, both in the presence and in the absence of Con A and LPS. In the WRG and CORG, the quantity of ROS from splenocytes was higher than in the CRG, whereas ROS production by spleen lymphocytes was lower following 90 min of acute exercise stress. Conclusion: These results suggest that swimming training increases not only the number of lymphocytes but also the proliferative activity of splenocytes in vitro.

A Machine Learning-Based Vocational Training Dropout Prediction Model Considering Structured and Unstructured Data (정형 데이터와 비정형 데이터를 동시에 고려하는 기계학습 기반의 직업훈련 중도탈락 예측 모형)

  • Ha, Manseok;Ahn, Hyunchul
    • The Journal of the Korea Contents Association
    • /
    • v.19 no.1
    • /
    • pp.1-15
    • /
    • 2019
  • One of the biggest difficulties in the vocational training field is the dropout problem. A large number of students drop out during the training process, which wastes the state budget and hampers improvement of the youth employment rate. Previous studies have mainly analyzed the causes of dropout. The purpose of this study is to propose a machine learning-based model that predicts dropout in advance using various information about learners. In particular, this study aims to improve the accuracy of the prediction model by considering not only structured data but also unstructured data. Unstructured data were analyzed using Word2vec and a convolutional neural network (CNN), which are among the most popular text analysis technologies. Applying the proposed model to actual data from a domestic vocational training institute improved the prediction accuracy by up to 20%. In addition, the support vector machine-based prediction model using both structured and unstructured data showed high prediction accuracy, in the upper 90% range.

Analysis of scientific military training data using zero-inflated and Hurdle regression (영과잉 및 허들 회귀모형을 이용한 과학화 전투훈련 자료 분석)

  • Kim, Jaeoh;Bang, Sungwan;Kwon, Ojeong
    • Journal of the Korean Data and Information Science Society
    • /
    • v.28 no.6
    • /
    • pp.1511-1520
    • /
    • 2017
  • The purpose of this study is to analyze military combat training data in order to improve military operations and training methods and to verify required military doctrine. Using scientific military training data on the offensive operations of a reinforced infantry battalion, we set the response variable to the number of enemies that individual combatants disable with their weapons. This response variable has more zero observations than a traditional GLM such as Poisson regression would allow for. We used zero-inflated regression and hurdle regression for the data analysis to account for the over-dispersion and excess zero problems. Our results can serve as a reference for verifying military doctrine for small units and for analyzing various operational and tactical factors.
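
The difference between the two count models named in this abstract can be sketched with their probability mass functions. The Poisson-based forms below are the textbook definitions; the abstract does not state which count distribution or covariates the authors used, so the Poisson choice and the parameter values (`lam`, `pi`, `p_zero`) are assumptions for illustration only.

```python
import math

def poisson_pmf(k, lam):
    """Ordinary Poisson probability of observing k events."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: with probability pi the count is a structural
    zero, otherwise it is drawn from Poisson(lam)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

def hurdle_pmf(k, lam, p_zero):
    """Hurdle Poisson: zeros occur with probability p_zero; positive counts
    follow a Poisson(lam) truncated at zero."""
    if k == 0:
        return p_zero
    return (1.0 - p_zero) * poisson_pmf(k, lam) / (1.0 - math.exp(-lam))

lam, pi, p_zero = 2.0, 0.3, 0.5  # hypothetical parameter values
zip_total = sum(zip_pmf(k, lam, pi) for k in range(60))
hurdle_total = sum(hurdle_pmf(k, lam, p_zero) for k in range(60))
```

In the zero-inflated form, zeros come from two sources (structural zeros and ordinary Poisson zeros), while the hurdle form models all zeros with one probability and treats positive counts separately.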

Meta-Analysis on the Effects of Action Observation Training on Stroke Patients' Walking; Focused on Domestic Research (뇌졸중 환자의 동작관찰훈련이 보행에 미치는 효과에 대한 메타분석; 국내연구를 중심으로)

  • Lee, Jeongwoo;Ko, Un;Doo, Yeongtaek
    • Journal of The Korean Society of Integrative Medicine
    • /
    • v.7 no.4
    • /
    • pp.119-130
    • /
    • 2019
  • Purpose: The purpose of this study was to conduct a meta-analysis of the effects of action observation training on stroke patients' walking. Methods: Domestic databases (DBpia, KISS, NDSL, and RISS) were searched for randomized controlled trials (RCTs) of action observation training in adults after stroke. The outcomes searched for were items associated with walking function. The 18 studies included were analyzed by meta-analysis in R. A random-effects model was used to analyze the effect size because of the significant heterogeneity among the studies. Sub-group and meta-regression analyses were also used. Egger's regression test was conducted to assess publication bias. Cumulative meta-analysis and sensitivity analysis were also performed to check for data errors. Results: The mean effect size was 2.77. The sub-group analysis showed a statistically significant difference according to the number of training sessions per week. No statistically significant difference was found in the meta-regression analysis. Publication bias was found in the data, but the results of the trim-and-fill method showed that this bias did not affect the results. The cumulative meta-analysis and sensitivity analysis showed no data errors. Conclusion: The meta-analysis of randomized controlled trials revealed that action observation training effectively improved walking in chronic stroke patients.
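
A random-effects pooled effect size of the kind reported here can be sketched as follows. The abstract does not say which between-study variance estimator was used, so the DerSimonian-Laird estimator below is an assumption, and the effect sizes and variances are made-up numbers, not data from the 18 studies.

```python
import math

def random_effects(effects, variances):
    """Pool study effects with a DerSimonian-Laird random-effects model.

    Returns (pooled effect, standard error, tau^2, Cochran's Q).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q measures heterogeneity around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance, floored at 0
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2, q

effects = [1.8, 3.4, 2.2, 4.1]          # hypothetical study effect sizes
variances = [0.30, 0.45, 0.25, 0.60]    # hypothetical within-study variances
pooled, se, tau2, q = random_effects(effects, variances)
```

When the studies are heterogeneous, `tau2` is positive and the random-effects weights become more equal across studies than the fixed-effect weights, which widens the standard error of the pooled estimate.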