• Title/Summary/Keyword: bootstrap learning

Search results: 17

Improvement of Support Vector Clustering using Evolutionary Programming and Bootstrap

  • Jun, Sung-Hae
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.8 no.3
    • /
    • pp.196-201
    • /
    • 2008
  • Statistical learning theory provides three analytical tools: the support vector machine, support vector regression, and support vector clustering, for classification, regression, and clustering respectively. In general they perform well because they are constructed by convex optimization. However, the methods share a problem: the kernel-function parameters and the regularization constant are determined subjectively, by the art of the researcher, and the results of the learning machines depend on the selected parameters. In this paper, we propose an efficient method for objective determination of the parameters of support vector clustering, the clustering method of statistical learning theory. Using an evolutionary algorithm and the bootstrap method, we select the kernel parameters and the regularization constant objectively. To verify the improved performance of the proposed method, we compare it with established learning algorithms using data sets from the UCI Machine Learning Repository and synthetic data.
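The paper's core loop, evolving candidate parameter values and scoring each candidate with a bootstrap estimate of generalization error, can be sketched in miniature. This is a hedged illustration, not the authors' implementation: a single threshold of a toy classifier stands in for the kernel and regularization parameters of support vector clustering, and the population size, mutation scale, and out-of-bag scoring scheme below are assumptions.

```python
import random

def bootstrap_oob_error(data, threshold, rng, n_boot=30):
    # Score a candidate parameter by averaging the error of a simple
    # threshold classifier over the out-of-bag points of bootstrap resamples.
    errs = []
    for _ in range(n_boot):
        sample = [rng.choice(data) for _ in data]
        oob = [p for p in data if p not in sample]
        if not oob:
            continue
        wrong = sum(1 for x, y in oob if (x > threshold) != y)
        errs.append(wrong / len(oob))
    return sum(errs) / len(errs)

def evolve_threshold(data, generations=40, pop_size=10, seed=1):
    # Evolutionary programming: mutate candidates with Gaussian noise and
    # keep the best-scoring half of parents plus children each generation.
    rng = random.Random(seed)
    pop = [rng.uniform(0.0, 1.0) for _ in range(pop_size)]
    for _ in range(generations):
        children = [t + rng.gauss(0.0, 0.1) for t in pop]
        scored = sorted(pop + children,
                        key=lambda t: bootstrap_oob_error(data, t, rng))
        pop = scored[:pop_size]
    return pop[0]

# Toy data: class True above 0.5, class False below.
data = [(i / 10, i / 10 > 0.5) for i in range(11)]
best = evolve_threshold(data)
```

On this separable toy set the evolved threshold settles near the class boundary at 0.5, without any hand-tuning of the parameter.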

Improving an Ensemble Model by Optimizing Bootstrap Sampling (부트스트랩 샘플링 최적화를 통한 앙상블 모형의 성능 개선)

  • Min, Sung-Hwan
    • Journal of Internet Computing and Services
    • /
    • v.17 no.2
    • /
    • pp.49-57
    • /
    • 2016
  • Ensemble classification combines multiple classifiers to obtain more accurate predictions than individual models, and ensemble learning techniques are known to be very useful for improving prediction accuracy. Bagging, one of the most popular ensemble learning techniques, has proven successful in increasing the accuracy of individual classifiers: it draws bootstrap samples from the training data, applies the classifier to each bootstrap sample, and then combines the predictions of these classifiers to obtain the final classification result. Because bootstrap samples are simple random samples selected from the original training data, not all of them are equally informative. In this study, we propose a new method for improving the performance of the standard bagging ensemble by optimizing its bootstrap samples: a genetic algorithm is used to optimize the bootstrap samples so as to improve the prediction accuracy of the ensemble model. The proposed model is applied to a bankruptcy prediction problem using a real data set of Korean companies. The experimental results show the effectiveness of the proposed model.
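The standard bagging procedure this abstract builds on (draw bootstrap samples, fit a classifier to each, combine by majority vote) can be sketched with only the standard library. The 1-nearest-neighbor base learner and the toy one-dimensional data are illustrative stand-ins, not the paper's setup, and the genetic-algorithm optimization of the samples is omitted.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    # Draw n points with replacement from the original training data.
    return [rng.choice(data) for _ in data]

def nearest_neighbor_predict(train, x):
    # 1-NN base learner: return the label of the closest training point.
    return min(train, key=lambda p: abs(p[0] - x))[1]

def bagging_predict(data, x, n_models=25, seed=0):
    # Train one base learner per bootstrap sample, then majority-vote.
    rng = random.Random(seed)
    votes = [nearest_neighbor_predict(bootstrap_sample(data, rng), x)
             for _ in range(n_models)]
    return Counter(votes).most_common(1)[0][0]

data = [(0.0, "A"), (0.2, "A"), (0.4, "A"),
        (0.6, "B"), (0.8, "B"), (1.0, "B")]
label = bagging_predict(data, 0.1)
```

Even when an individual bootstrap sample happens to be uninformative (for instance, containing no nearby points), the vote across many resamples stabilizes the prediction, which is exactly the randomness the paper's genetic algorithm then tries to exploit more deliberately.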

A Novel Text Sample Selection Model for Scene Text Detection via Bootstrap Learning

  • Kong, Jun;Sun, Jinhua;Jiang, Min;Hou, Jian
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.2
    • /
    • pp.771-789
    • /
    • 2019
  • Text detection has been a popular research topic in computer vision, but it is difficult for prevalent text detection algorithms to avoid dependence on their training datasets. To overcome this problem, we propose a novel unsupervised text detection algorithm inspired by bootstrap learning. First, a text candidate in a novel superpixel form is proposed to improve the text recall rate through image segmentation. Second, we propose a unique text sample selection model (TSSM) to extract text samples from the current image and eliminate the dataset dependency. Specifically, to improve the precision of the samples, we combine maximally stable extremal regions (MSERs) and the saliency map to generate sample reference maps with a double-threshold scheme. Finally, a multiple kernel boosting method is developed to generate a strong text classifier by combining multiple single-kernel SVMs based on the samples selected by the TSSM. Experimental results on standard datasets demonstrate that our text detection method is robust to complex backgrounds and multilingual text and shows stable performance across different standard datasets.
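The final step, boosting several weak classifiers into one strong classifier, can be illustrated with a generic AdaBoost-style loop. This is only a sketch of the boosting idea: one-dimensional decision stumps stand in for the paper's single-kernel SVMs, and the toy data and round count are assumptions, not the actual multiple kernel boosting implementation.

```python
import math

def stump_predict(threshold, x):
    # Weak learner: a one-dimensional decision stump with labels in {-1, +1}.
    return 1 if x > threshold else -1

def adaboost(data, thresholds, rounds=10):
    # Boosting loop: pick the stump with the lowest weighted error,
    # derive its vote weight alpha, then reweight the training samples.
    n = len(data)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        errs = [(sum(wi for wi, (x, y) in zip(w, data)
                     if stump_predict(t, x) != y), t) for t in thresholds]
        err, best = min(errs)
        err = max(err, 1e-10)                      # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, best))
        w = [wi * math.exp(-alpha * y * stump_predict(best, x))
             for wi, (x, y) in zip(w, data)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def boosted_predict(ensemble, x):
    # Strong classifier: sign of the alpha-weighted sum of weak votes.
    score = sum(a * stump_predict(t, x) for a, t in ensemble)
    return 1 if score > 0 else -1

data = [(0.1, -1), (0.2, -1), (0.3, -1), (0.7, 1), (0.8, 1), (0.9, 1)]
thresholds = [i / 20 for i in range(1, 20)]
ensemble = adaboost(data, thresholds)
```

In the paper, each weak learner would instead be an SVM with a different kernel, trained on the samples selected by the TSSM.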

Path Loss Prediction Using an Ensemble Learning Approach

  • Beom Kwon;Eonsu Noh
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.2
    • /
    • pp.1-12
    • /
    • 2024
  • Predicting path loss is an important factor in wireless network design, for example when selecting the installation locations of base stations in cellular networks. In the past, path loss values were measured through numerous field tests to determine the optimal installation location of a base station, which has the disadvantage of requiring a great deal of measurement time. To solve this problem, we propose a path loss prediction method based on machine learning (ML). In particular, an ensemble learning approach is applied to improve prediction performance: bootstrap datasets are used to obtain models with different hyperparameter configurations, and the final model is built by ensembling these models. We evaluated and compared the performance of the proposed ensemble-based path loss prediction method against various ML-based methods using publicly available path loss datasets. The experimental results show that the proposed method outperforms the existing methods and predicts path loss values accurately.
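The ensembling scheme described (members trained on different bootstrap datasets with different hyperparameter configurations, predictions averaged) can be sketched generically. The k-NN regressors with varying k, the member counts, and the noise-free toy path-loss curve below are illustrative assumptions, not the paper's models or data.

```python
import random
from statistics import mean

def knn_regress(train, x, k):
    # k-NN regression: average the targets of the k nearest training points.
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return mean(y for _, y in nearest)

def ensemble_predict(train, x, ks=(1, 3, 5), n_boot=10, seed=0):
    # Each member gets its own bootstrap resample and its own
    # hyperparameter k; the ensemble averages all member predictions.
    rng = random.Random(seed)
    preds = []
    for k in ks:
        for _ in range(n_boot):
            sample = [rng.choice(train) for _ in train]
            preds.append(knn_regress(sample, x, k))
    return mean(preds)

# Toy path-loss curve: loss (dB) grows linearly with distance.
train = [(d / 10, 40 + 20 * d / 10) for d in range(1, 21)]
pred = ensemble_predict(train, 1.0)  # near the underlying value of 60
```

Averaging over both the bootstrap resamples and the hyperparameter settings reduces the variance of any single configuration, which is the effect the abstract attributes to the ensemble.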

Realization of home appliance classification system using deep learning (딥러닝을 이용한 가전제품 분류 시스템 구현)

  • Son, Chang-Woo;Lee, Sang-Bae
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.21 no.9
    • /
    • pp.1718-1724
    • /
    • 2017
  • Recently, smart plugs for real-time monitoring of household appliances based on the IoT (Internet of Things) have become widespread. They allow consumers to save energy by monitoring real-time energy consumption at all times and to reduce power consumption through alarm functions based on consumer settings. In this paper, we measure the alternating current from a wall power outlet for real-time monitoring. The current pattern of each household appliance was classified, and a deep learning experiment was conducted to determine which product is operating. We used a cross-validation method and a bootstrap verification method to evaluate the classification performance for each type of appliance, and confirmed that the cost function and the learning success rate were consistent between the training data and the test data.

Ensemble Learning Algorithm of Specialized Networks (전문화된 네트워크들의 결합에 의한 앙상블 학습 알고리즘)

  • 신현정;이형주;조성준
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2000.10b
    • /
    • pp.308-310
    • /
    • 2000
  • The Observational Learning Algorithm (OLA) is a method in which each component model of an ensemble network is trained on both virtual data, obtained by observing the other models, and the real data initially drawn by bootstrap sampling. In this paper, we improve the existing observational learning algorithm by partitioning the initial training data set and specializing each component model of the ensemble on one of the partitions. Comparative experiments against bagging and boosting verify that the proposed algorithm achieves equal or better performance with fewer component models.


Prediction of Germination of Korean Red Pine (Pinus densiflora) Seed using FT NIR Spectroscopy and Binary Classification Machine Learning Methods (FT NIR 분광법 및 이진분류 머신러닝 방법을 이용한 소나무 종자 발아 예측)

  • Yong-Yul Kim;Ja-Jung Ku;Da-Eun Gu;Sim-Hee Han;Kyu-Suk Kang
    • Journal of Korean Society of Forest Science
    • /
    • v.112 no.2
    • /
    • pp.145-156
    • /
    • 2023
  • In this study, Fourier-transform near-infrared (FT-NIR) spectra of Korean red pine seeds stored at -18℃ and 4℃ for 18 years were analyzed. To develop seed-germination prediction models, the performance of seven machine learning methods, namely XGBoost, Boosted Tree, Bootstrap Forest, Neural Networks, Decision Tree, Support Vector Machine, and PLS-DA, was compared. The predictive performance, assessed by accuracy, misclassification rate, and area under the curve (0.9722, 0.0278, and 0.9735 for XGBoost; 0.9653, 0.0347, and 0.9647 for Boosted Tree), was better for the XGBoost and Boosted Tree models than for the other models. The 54 wave-number variables of the two models were of high relative importance in seed-germination prediction and were grouped into six spectral ranges (811~1,088 nm, 1,137~1,273 nm, 1,336~1,453 nm, 1,666~1,671 nm, 1,879~2,045 nm, and 2,058~2,409 nm) corresponding to aromatic amino acids, cellulose, lignin, starch, fatty acids, and moisture, respectively. Use of the NIR spectral data and the two machine learning models developed in this study gave >96% accuracy for the prediction of pine-seed germination after long-term storage, indicating this approach could be useful for non-destructive viability testing of stored seed genetic resources.

A comparative assessment of bagging ensemble models for modeling concrete slump flow

  • Aydogmus, Hacer Yumurtaci;Erdal, Halil Ibrahim;Karakurt, Onur;Namli, Ersin;Turkan, Yusuf S.;Erdal, Hamit
    • Computers and Concrete
    • /
    • v.16 no.5
    • /
    • pp.741-757
    • /
    • 2015
  • In the last decade, several modeling approaches have been proposed and applied to estimate high-performance concrete (HPC) slump flow. Since HPC is a highly complex material, modeling its behavior is very difficult, so the selection and application of proper modeling methods remains a crucial task. Like many other applications, HPC slump flow prediction suffers from noise, which negatively affects prediction accuracy and increases variance. In recent years, ensemble learning methods have been introduced to improve prediction accuracy and reduce prediction error. This study investigates the potential of bagging (Bag), one of the most popular ensemble learning methods, for building ensemble models. Four well-known artificial intelligence models (classification and regression trees CART, support vector machines SVM, multilayer perceptron MLP, and radial basis function neural networks RBF) are deployed as base learners. The bagging ensemble models (Bag-SVM, Bag-RT, Bag-MLP, and Bag-RBF) are found to be superior to their base learners (SVM, CART, MLP, and RBF), and bagging noticeably improves the prediction accuracy and reduces the prediction error of the proposed predictive models.

A Study on the Influence of Perceived Usefulness, Perceived Ease of Use, Self-Efficacy, and Depression on the Learning Satisfaction and Intention to Continue Studying in Distance Education Due to COVID-19 (코로나19로 인한 원격 교육에서 인지된 유용성과 인지된 사용용이성, 자기효능감, 우울이 대학생들의 학습만족도와 학업 지속의향에 미치는 영향에 관한 연구)

  • Kim, Hyojung
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.18 no.1
    • /
    • pp.79-91
    • /
    • 2022
  • In this study, the effects of self-efficacy, perceived usefulness, perceived ease of use, and depression on college students' intention to continue studying during the COVID-19 epidemic and the resulting non-face-to-face education situation were examined, with learning satisfaction as a mediating variable. In the second semester of 2020, a survey was conducted on students enrolled in a four-year university in Daegu, and the data were statistically analyzed. Path coefficients were estimated with the SmartPLS bootstrap method and their significance was verified, and a Sobel test was conducted to verify the mediating effect on the intention to continue studying. The results can be summarized as follows. First, self-efficacy and perceived usefulness had a significant influence on learning satisfaction. Second, learning satisfaction had a significant influence on the intention to continue studying. Third, depression and ease of use showed no significant influence on learning satisfaction. Finally, a Sobel test with self-efficacy, usefulness, ease of use, and depression as independent variables and learning satisfaction as the mediator showed, in both regression analyses, that the β values decreased, indicating that learning satisfaction had a mediating effect. These results suggest that research on increasing learner satisfaction should be conducted in parallel with the development of diverse contents that raise self-efficacy and perceived usefulness and thereby the effectiveness of education. This study can serve as baseline data for establishing measures that help college students continue their studies in natural disaster or psychological crisis situations such as COVID-19.
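The two mediation checks the abstract relies on can be sketched numerically: the Sobel statistic for an indirect effect a·b, and the percentile-bootstrap interval for the same effect, which is what the SmartPLS bootstrap estimates. The formulas are standard; the ordinary-least-squares slopes and the perfectly linear toy data below are illustrative assumptions, not the study's survey data.

```python
import math
import random

def sobel_z(a, se_a, b, se_b):
    # Sobel statistic for the indirect effect a*b, where a is the
    # X -> mediator path and b is the mediator -> Y path.
    return (a * b) / math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)

def slope(xs, ys):
    # Ordinary least-squares slope of y on x.
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def bootstrap_indirect(x, m, y, n_boot=1000, seed=0):
    # Percentile bootstrap of the indirect effect a*b: resample cases,
    # refit both paths, and take the 2.5% and 97.5% quantiles.
    rng = random.Random(seed)
    effects = []
    for _ in range(n_boot):
        s = [rng.randrange(len(x)) for _ in range(len(x))]
        a = slope([x[i] for i in s], [m[i] for i in s])
        b = slope([m[i] for i in s], [y[i] for i in s])
        effects.append(a * b)
    effects.sort()
    return effects[int(0.025 * n_boot)], effects[int(0.975 * n_boot)]

# Perfectly linear toy data: X -> M with a = 2, M -> Y with b = 3,
# so the indirect effect is exactly 6.
x = list(range(20))
m = [2 * v for v in x]
y = [3 * v + 1 for v in m]
lo, hi = bootstrap_indirect(x, m, y)
```

With noise-free data the bootstrap interval collapses onto the true indirect effect; with real survey data the interval width reflects sampling uncertainty, which is why the bootstrap is often preferred to the Sobel approximation.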

Deep Learning Model Validation Method Based on Image Data Feature Coverage (영상 데이터 특징 커버리지 기반 딥러닝 모델 검증 기법)

  • Lim, Chang-Nam;Park, Ye-Seul;Lee, Jung-Won
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.10 no.9
    • /
    • pp.375-384
    • /
    • 2021
  • Deep learning techniques have proven to deliver high performance in image processing and are applied in various fields. The most widely used methods for validating a deep learning model are the hold-out method, k-fold cross-validation, and the bootstrap method. These legacy methods consider the balance of the ratio between classes when dividing the data set, but do not consider the ratio of the various features that exist within the same class; if these features are not considered, verification results may be biased toward some features. Therefore, we propose a deep learning model validation method based on data feature coverage for image classification that improves on the legacy methods. The proposed technique defines a data feature coverage measure that quantifies how well the training and evaluation data sets reflect the features of the entire data set. With this method, the data set can be divided while guaranteeing coverage of all features of the entire data set, and the evaluation result of the model can be analyzed in units of feature clusters. As a result, by providing feature cluster information together with the evaluation result of the trained model, feature information of the data that affects the trained model can be provided.
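The three legacy splitting schemes the abstract names can be sketched side by side; this is a minimal illustration of the splits themselves (the feature-coverage measure the paper proposes is not reproduced here), and the fractions and fold counts are conventional defaults, not values from the paper.

```python
import random

def holdout_split(data, test_frac=0.2, seed=0):
    # Hold-out: shuffle once and reserve a fixed fraction for evaluation.
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_frac))
    return shuffled[:cut], shuffled[cut:]

def kfold_splits(data, k=5):
    # k-fold: every point is in the evaluation fold exactly once.
    folds = [data[i::k] for i in range(k)]
    for i in range(k):
        train = [p for j, f in enumerate(folds) if j != i for p in f]
        yield train, folds[i]

def bootstrap_split(data, seed=0):
    # Bootstrap: sample with replacement for training; the out-of-bag
    # points (about 36.8% of the data on average) form the evaluation set.
    rng = random.Random(seed)
    train = [rng.choice(data) for _ in data]
    oob = [p for p in data if p not in train]
    return train, oob

data = list(range(100))
train, test = holdout_split(data)
boot_train, oob = bootstrap_split(data)
```

None of these splits inspects what the points look like, which is exactly the gap the paper's feature-coverage criterion targets: two splits with identical class ratios can still cover very different within-class features.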