• Title/Summary/Keyword: bootstrap algorithm

Improving an Ensemble Model by Optimizing Bootstrap Sampling (부트스트랩 샘플링 최적화를 통한 앙상블 모형의 성능 개선)

  • Min, Sung-Hwan
    • Journal of Internet Computing and Services / v.17 no.2 / pp.49-57 / 2016
  • Ensemble classification combines multiple classifiers to obtain more accurate predictions than those obtained using individual models. Bagging, one of the most popular ensemble learning techniques, is known to improve on the prediction accuracy of individual classifiers. It draws bootstrap samples from the training sample, applies the classifier to each bootstrap sample, and then combines the predictions of these classifiers to produce the final classification result. Because bootstrap samples are simple random samples drawn from the original training data, not all of them are equally informative. In this study, we propose a new method for improving the performance of the standard bagging ensemble by optimizing its bootstrap samples: a genetic algorithm optimizes the bootstrap samples of the ensemble to improve the prediction accuracy of the ensemble model. The proposed model is applied to a bankruptcy prediction problem using a real dataset of Korean companies. The experimental results show the effectiveness of the proposed model.
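
The standard bagging procedure described in this abstract (draw bootstrap samples, fit one classifier per sample, combine by majority vote) can be sketched as follows. This is only a minimal illustration of plain bagging, not the paper's genetic-algorithm optimization of the samples; the one-feature threshold classifier `fit_stump`, the toy data, and all function names are assumptions made for the example.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]

def fit_stump(sample):
    """Fit a one-feature threshold classifier (hypothetical base learner).

    Tries every observed x as a threshold, in both orientations, and keeps
    the rule with the fewest training errors on the sample.
    """
    best = None
    for threshold, _ in sample:
        for above in (0, 1):
            errors = sum(
                (above if x > threshold else 1 - above) != y for x, y in sample
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, above)
    _, threshold, above = best
    return lambda x: above if x > threshold else 1 - above

def bagging_fit(data, n_estimators, seed=0):
    """Fit one base classifier per bootstrap sample."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(data, rng)) for _ in range(n_estimators)]

def bagging_predict(models, x):
    """Combine the base classifiers by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Toy training data (assumption): the label is 1 exactly when x > 5.
train = [(x, int(x > 5)) for x in range(11)]
models = bagging_fit(train, n_estimators=25)
low_prediction = bagging_predict(models, 0)
high_prediction = bagging_predict(models, 10)
```

Each base classifier sees a different random resample, which is why some bootstrap samples end up more informative than others.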

Construction of a Design Curve for Fatigue Model Using Bootstrap Method (붓스트랩방법을 이용한 피로모형의 설계곡선 설정)

  • 서순근;조유희
    • Journal of Korean Society for Quality Management / v.30 no.4 / pp.106-119 / 2002
  • The fatigue curve with estimated parameters represents the estimate of the median or mean life at a given applied stress. However, to assist a designer in making decisions regarding the fatigue failure mode, it is common practice to construct a design curve on the lower, or safe, side of the data. Shen, Wirsching, and Cashman suggested an approximate design curve for nonlinear models using a tolerance interval constructed by Owen's method, but their method has limitations (e.g., no runout, equal variance, and the quality of the approximation). To overcome these limitations, this study proposes an algorithm that finds design curves under the fatigue model using a parametric bootstrap method, and illustrates it with multiple fatigue data sets.

Improvement of Support Vector Clustering using Evolutionary Programming and Bootstrap

  • Jun, Sung-Hae
    • International Journal of Fuzzy Logic and Intelligent Systems / v.8 no.3 / pp.196-201 / 2008
  • Statistical learning theory has three analytical tools: support vector machine, support vector regression, and support vector clustering, for classification, regression, and clustering respectively. In general, they perform well because they are constructed by convex optimization. However, the methods have some problems. One is that the parameters of the kernel function and the regularization are determined subjectively, at the researcher's discretion. Moreover, the results of the learning machines depend on the selected parameters. In this paper, we propose an efficient method for objectively determining the parameters of support vector clustering, the clustering method of statistical learning theory. Using an evolutionary algorithm and the bootstrap method, we select the parameters of the kernel function and the regularization constant objectively. To verify the improved performance of the proposed method, we compare it with established learning algorithms using data sets from the UCI Machine Learning Repository and synthetic data.

Optimization of Blind Adaptive Decorrelating PIC Detector Performance in DS-CDMA System

  • Sirijiamrat, S.;Benjangkaprasert, C.;Sangaroon, O.
    • 제어로봇시스템학회:학술대회논문집 / 2004.08a / pp.1962-1965 / 2004
  • In this paper, a new algorithm for the blind adaptive decorrelating parallel interference cancellation detector (BAD/PIC) in synchronous direct-sequence code division multiple access (DS-CDMA) communication systems is proposed. The goal is to improve the performance of the BAD/PIC detector. The proposed blind adaptive decorrelating detector uses a bootstrap algorithm with an optimum step-size technique as the initial stage of the PIC, which does not require a training sequence. The algorithm therefore uses bandwidth more efficiently and reduces the computational complexity of inverting the cross-correlation matrix. Computer simulation results show that the bit error rate performance of the proposed algorithm with the new detector structure is better than that of other detectors such as matched filters, the conventional PIC, and the blind adaptive decorrelating PIC detector.

Bootstrap Estimation for GEE Models (일반화추정방정식(GEE)에 대한 부스트랩의 적용)

  • Park, Chong-Sun;Jeon, Yong-Moon
    • The Korean Journal of Applied Statistics / v.24 no.1 / pp.207-216 / 2011
  • The bootstrap is a resampling technique for estimating parameters or evaluating an estimate. It has been used to estimate parameters in the linear model (LM) and the generalized linear model (GLM). In this paper, we explore the possibility of applying bootstrapping residuals, bootstrapping pairs, and bootstrapping an estimating equation, the methods most widely used in LM and GLM, to the generalized estimating equation (GEE) algorithm for modelling repeatedly measured regression data sets. We compared the three bootstrapping methods with the coefficient and standard error estimates of GEE models on one simulated and one real data set. Overall, the estimates obtained from the bootstrap methods are quite comparable, except that the estimates from bootstrapping pairs differ somewhat from the others. We conjecture that the strange behavior of the estimates from bootstrapping pairs comes from the inconsistency of those estimates. However, a more thorough simulation study is needed to generalize this, since these results come from only two small data sets.
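
To make the difference between bootstrapping pairs and bootstrapping residuals concrete, the following minimal sketch applies both to an ordinary least-squares line. It is a swapped-in simplification, not the GEE procedure studied in the paper; the data, the function names, and the choice of 500 resamples are assumptions made for illustration.

```python
import random

def ols(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def pairs_bootstrap(xs, ys, rng):
    """Resample whole (x, y) observations with replacement."""
    while True:
        idx = [rng.randrange(len(xs)) for _ in xs]
        if len({xs[i] for i in idx}) > 1:  # a slope needs two distinct x values
            return [xs[i] for i in idx], [ys[i] for i in idx]

def residual_bootstrap(xs, ys, rng):
    """Keep x fixed; add resampled residuals to the fitted values."""
    intercept, slope = ols(xs, ys)
    fitted = [intercept + slope * x for x in xs]
    residuals = [y - f for y, f in zip(ys, fitted)]
    return xs, [f + rng.choice(residuals) for f in fitted]

def bootstrap_se(xs, ys, resampler, n_resamples=500, seed=0):
    """Bootstrap standard error of the slope under the given resampler."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_resamples):
        bx, by = resampler(xs, ys, rng)
        slopes.append(ols(bx, by)[1])
    mean = sum(slopes) / len(slopes)
    return (sum((s - mean) ** 2 for s in slopes) / (len(slopes) - 1)) ** 0.5

# Hypothetical data: y = 1 + 2x plus small fixed noise.
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.0]
xs = list(range(10))
ys = [1 + 2 * x + e for x, e in zip(xs, noise)]

slope_hat = ols(xs, ys)[1]
se_pairs = bootstrap_se(xs, ys, pairs_bootstrap)
se_residuals = bootstrap_se(xs, ys, residual_bootstrap)
```

Pairs resampling treats the design as random, while residual resampling keeps it fixed and relies on the fitted model being adequate, so the two standard errors need not agree.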

On Employing Nonparametric Bootstrap Technique in Oscillometric Blood Pressure Measurement for Confidence Interval Estimation

  • Lee, Yong-Kook;Lee, Im-Bong;Chang, Joon-Hyuk;Lee, Soo-Jeong
    • Journal of Korea Multimedia Society / v.17 no.2 / pp.200-207 / 2014
  • Blood pressure (BP) is an important vital sign for determining the health of an individual subject. Although the mean arterial blood pressure can be estimated using oscillometric blood pressure techniques, there are no established techniques in the literature for obtaining a confidence interval (CI) for the systolic blood pressure (SBP) and diastolic blood pressure (DBP) estimates obtained from such BP measurements. This paper proposes a nonparametric bootstrap technique to obtain the CI from a small number of BP measurements. The proposed algorithm uses pseudo measurements generated by the nonparametric bootstrap to derive the pseudo maximum amplitudes (PMA) and the pseudo envelopes (PE). The SBP and DBP are then derived using the new relationships between PMA and PE, together with the CIs for these estimates. Applied to an experimental dataset of 85 patients with five sets of measurements per patient, the proposed method yielded a smaller CI than the conventional Student's t-method.
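
The general idea of a nonparametric bootstrap CI from a handful of measurements can be sketched with a percentile interval for the mean. This is a generic illustration only and does not reproduce the paper's PMA/PE procedure; the readings, the function name `bootstrap_ci`, and the 2000 resamples are assumptions.

```python
import random
import statistics

def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of values."""
    rng = random.Random(seed)
    n = len(values)
    # Mean of each pseudo sample drawn with replacement from the data.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Hypothetical systolic readings (mmHg) from five measurements of one subject.
readings = [118.0, 121.0, 119.5, 123.0, 120.5]
low, high = bootstrap_ci(readings)
```

With only five readings, the interval rests entirely on the observed values, so it can never extend beyond their minimum and maximum.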

Application of Bayesian Computational Techniques in Estimation of Posterior Distributional Properties of Lognormal Distribution

  • Begum, Mun-Ni;Ali, M. Masoom
    • Journal of the Korean Data and Information Science Society / v.15 no.1 / pp.227-237 / 2004
  • In this paper, we present a Bayesian inference approach for estimating the location and scale parameters of the lognormal distribution using the iterative Gibbs sampling algorithm. We also present estimation of the location parameter by two non-iterative methods, importance sampling and the weighted bootstrap, assuming the scale parameter is known. The estimates from the non-iterative techniques do not depend on the specification of hyperparameters, which is optimal from the Bayesian point of view. The estimates obtained by the more sophisticated Gibbs sampler vary slightly with the choice of hyperparameters. The objective of this paper is to illustrate these tools in a simpler setup, which may be essential in more complicated situations.
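
The weighted bootstrap mentioned here (also known as sampling-importance resampling) can be sketched for the lognormal location parameter with a known scale. The flat prior on [-2, 2], the data values, the sample sizes, and the function name are all assumptions made for this illustration; the paper's actual priors and settings are not given in the abstract.

```python
import math
import random

def weighted_bootstrap_mu(data, sigma, n_proposals=5000, n_draws=2000, seed=0):
    """Approximate posterior draws of the lognormal location mu, sigma known."""
    rng = random.Random(seed)
    logs = [math.log(x) for x in data]
    # Assumption: flat prior on mu over [-2, 2], used directly as the proposal.
    proposals = [rng.uniform(-2.0, 2.0) for _ in range(n_proposals)]
    # Log-likelihood of each proposal (terms not involving mu are dropped).
    loglik = [
        -sum((v - mu) ** 2 for v in logs) / (2 * sigma ** 2) for mu in proposals
    ]
    top = max(loglik)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(value - top) for value in loglik]
    # Resample proposals with probability proportional to their weights.
    return rng.choices(proposals, weights=weights, k=n_draws)

# Hypothetical positive observations assumed to be lognormal.
data = [1.2, 0.8, 1.5, 1.1, 0.9, 1.3]
draws = weighted_bootstrap_mu(data, sigma=0.5)
posterior_mean = sum(draws) / len(draws)
log_mean = sum(math.log(x) for x in data) / len(data)
```

Under a flat prior the posterior mean of the location equals the mean of the log observations, so `posterior_mean` should land close to `log_mean`; no iteration or convergence check is needed, unlike Gibbs sampling.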

Determination of Optimal Cluster Size Using Bootstrap and Genetic Algorithm (붓스트랩 기법과 유전자 알고리즘을 이용한 최적 군집 수 결정)

  • 박민재;전성해;오경환
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2002.12a / pp.263-266 / 2002
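  • When clustering data, the choice of the optimal number of clusters strongly affects the quality of the clustering result. In the K-means method in particular, performance varies considerably with the initial number of clusters K. In most cluster analyses, however, the initial number of clusters is chosen subjectively, based on experience. As the number of objects and attributes grows, this choice becomes harder, and there is no guarantee that the chosen number of clusters is optimal. In this paper, we propose a method based on a genetic algorithm for determining the optimal number of clusters, so that the number of clusters is determined automatically and the validity of the result is ensured. An initial population is generated based on the attributes of the data, and crossover operations are performed to find the optimized number of clusters within the population. The fitness value is defined as the inverse of the sum of dissimilarities over the whole clustering, so the search converges toward better overall clustering performance. A mutation operation is used to address the problem of local optima. Finally, the bootstrap technique is applied to reduce the training-time cost of the genetic algorithm.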

A Novel Text Sample Selection Model for Scene Text Detection via Bootstrap Learning

  • Kong, Jun;Sun, Jinhua;Jiang, Min;Hou, Jian
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.2 / pp.771-789 / 2019
  • Text detection has been a popular research topic in the field of computer vision. It is difficult for prevalent text detection algorithms to avoid dependence on datasets. To overcome this problem, we propose a novel unsupervised text detection algorithm inspired by bootstrap learning. First, a text candidate in a novel superpixel form is proposed to improve the text recall rate through image segmentation. Second, we propose a unique text sample selection model (TSSM) to extract text samples from the current image and eliminate database dependency. Specifically, to improve the precision of the samples, we combine maximally stable extremal regions (MSERs) and the saliency map to generate sample reference maps with a double threshold scheme. Finally, a multiple kernel boosting method is developed to generate a strong text classifier by combining multiple single-kernel SVMs based on the samples selected by TSSM. Experimental results on standard datasets demonstrate that our text detection method is robust to complex backgrounds and multilingual text and shows stable performance across different standard datasets.