• Title/Summary/Keyword: Bootstrap method

Resampling Methods on Frequency Domains for Time Series (시계열분석을 위한 주파수 공간상에서의 재표집 기법)

  • Yeo In-Kwon;Yoon Wha-Hyung;Cho Sin-Sup
    • The Korean Journal of Applied Statistics / v.19 no.1 / pp.121-134 / 2006
  • This paper presents a resampling method for time series data in the frequency domain, obtained by using discrete cosine transforms (DCT). The advantage of the proposed method over existing bootstrap methods is that it generates bootstrap samples in the time domain. For stationary time series, the statistical properties of the DCT coefficients are investigated and used to verify the proposed procedure.

Analysis of the Frailty Model with Many Ties (동측치가 많은 FRAILTY 모형의 분석)

  • Kim Yongdai;Park Jin-Kyung
    • The Korean Journal of Applied Statistics / v.18 no.1 / pp.67-81 / 2005
  • Most of the previously proposed methods for the frailty model do not work well when there are many tied observations. This is partly because the empirical likelihood used is not suitable for tied observations. In this paper, we propose a new method for the frailty model with many ties. The proposed method obtains the posterior distribution of the parameters using the binomial form of the empirical likelihood and the Bayesian bootstrap. It yields stable results and is computationally fast. We conduct simulations to compare the proposed method with the maximum marginal likelihood approach.
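
For readers unfamiliar with the Bayesian bootstrap mentioned in this abstract, the following is a minimal sketch of the generic technique only, not of the paper's frailty-model procedure. It assumes the simplest possible target (the mean of a sample) and uses Dirichlet(1, ..., 1) weights on the observations; the function name `bayesian_bootstrap_mean` and the data are hypothetical.

```python
import random


def bayesian_bootstrap_mean(data, n_draws=2000, seed=0):
    """Return draws from the Bayesian-bootstrap posterior of the mean."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Normalized Exp(1) variates form a Dirichlet(1, ..., 1) weight vector.
        g = [rng.expovariate(1.0) for _ in data]
        total = sum(g)
        draws.append(sum(w * x for w, x in zip(g, data)) / total)
    return draws


# Hypothetical data with many tied values (illustration only).
sample = [1, 1, 1, 2, 2, 3, 3, 3, 3, 5]
posterior = sorted(bayesian_bootstrap_mean(sample))
lower = posterior[int(0.025 * len(posterior))]
upper = posterior[int(0.975 * len(posterior)) - 1]
print(f"95% credible interval for the mean: ({lower:.2f}, {upper:.2f})")
```

Because the weights are continuous, tied observations pose no special difficulty here: ties simply share weight, whereas the ordinary bootstrap reweights by integer resampling counts.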

A Study on the Relationship between Vertical Separation and Operational Efficiency of Railway Industry (철도산업의 수직분리와 운영효율성의 관련성에 관한 연구)

  • Kim, Seong-Ho;Choi, Tae-Sung
    • Journal of the Korean Society for Railway / v.12 no.6 / pp.844-851 / 2009
  • Since the 1990s, the European railway sector has undergone both vertical separation and vertical integration. Recently, Simar and Wilson (2008) provided a bootstrap test procedure for testing whether the mean efficiencies of two groups are equivalent. The purpose of this paper is to ascertain the relationship between vertical separation and the operational efficiency of the railway industry using Simar and Wilson's bootstrap test procedure, which has not been used in previous studies, with a data set of 20 European countries from 1998 to 2005. The value of the test statistic suggests that the mean operational efficiency of vertically separated railway industries was higher than that of vertically integrated ones. However, the p-value indicates that the difference in mean operational efficiency is not significant at any meaningful level.

Reproducibility of Hypothesis Testing and Confidence Interval (가설검정과 신뢰구간의 재현성)

  • Huh, Myung-Hoe
    • The Korean Journal of Applied Statistics / v.27 no.4 / pp.645-653 / 2014
  • The p-value is the probability of observing the current sample, or other samples departing equally or more extremely from the null hypothesis toward the postulated alternative hypothesis. When the p-value is less than a certain level $\alpha$ (= 0.05), researchers claim that the alternative hypothesis is supported empirically. Unfortunately, some findings discovered in that way are not reproducible, partly because the p-value itself is a statistic vulnerable to random variation. Boos and Stefanski (2011) suggest calculating the upper limit of the p-value in hypothesis testing, using a bootstrap predictive distribution. To determine the sample size of a replication study, this study proposes thought experiments that simulate boosted bootstrap samples of different sizes from the given observations. The method is illustrated for two-group comparison and multiple linear regression. This study also addresses the reproducibility of the points in a given 95% confidence interval. Numerical examples show that the center point is covered by 95% confidence intervals generated from bootstrap resamples, whereas the end points are covered with only a 50% chance. Hence this study draws the graph of the reproducibility rate for each parameter value in the confidence interval.

Gene Selection Based on Support Vector Machine using Bootstrap (붓스트랩 방법을 활용한 SVM 기반 유전자 선택 기법)

  • Song, Seuck-Heun;Kim, Kyoung-Hee;Park, Chang-Yi;Koo, Ja-Yong
    • The Korean Journal of Applied Statistics / v.20 no.3 / pp.531-540 / 2007
  • Recursive feature elimination for support vector machines is known to be useful in selecting relevant genes. Since the criterion for choosing relevant genes is the absolute value of a coefficient, recursive feature elimination may suffer from a scaling problem. We propose a modified version of the recursive feature elimination algorithm using the bootstrap. In our method, the criterion for determining relevant genes is the absolute value of a coefficient divided by its standard error, which accounts for the statistical variability of the coefficient. Through numerical examples, we illustrate that our method is effective in gene selection.

Is There Timing Ability in Korean Equity Funds? (국내 주식형 펀드의 타이밍 능력은 존재하는가?)

  • Kim, Sang-Bae;Park, Jong-Goo
    • The Korean Journal of Financial Management / v.26 no.2 / pp.93-112 / 2009
  • The purpose of this study is to examine market timing and volatility timing abilities in Korean equity funds by distinguishing 'skill' from 'luck' for individual funds. We use funds that exist for more than 24 consecutive months in non-overlapping periods. This procedure leaves 545 of the 1,904 funds in the sample period from January 2001 to December 2007. To derive the 'luck' distribution, the cross-sectional bootstrap approach is adopted. When the traditional regression approach is adopted, only a few Korean equity funds are found to possess market timing and volatility timing abilities. However, based on the 'luck' distributions derived from the cross-sectional bootstrap approach, the market timing and volatility timing abilities of Korean equity funds are found to result merely from 'luck' rather than 'skill'.

Estimation of confidence interval in exponential distribution for the greenhouse gas inventory uncertainty by the simulation study (모의실험에 의한 온실가스 인벤토리 불확도 산정을 위한 지수분포 신뢰구간 추정방법)

  • Lee, Yung-Seop;Kim, Hee-Kyung;Son, Duck Kyu;Lee, Jong-Sik
    • Journal of the Korean Data and Information Science Society / v.24 no.4 / pp.825-833 / 2013
  • Estimation of confidence intervals is essential for calculating the uncertainty of greenhouse gas inventories. Confidence intervals for parameters are generally constructed under the assumption that the population is normally distributed. However, when the data distribution is asymmetric, as in a non-normal or positively skewed distribution, the traditional method of estimating confidence intervals is not adequate. This study compares two classes of confidence interval estimation methods, parametric and non-parametric, for the exponential distribution as an example of an asymmetric distribution. In the simulation study, coverage probability, confidence interval length, and relative bias are used to evaluate the computed confidence intervals. The results show that the chi-square method and the standardized t-bootstrap method are the better choices among the parametric and non-parametric methods, respectively.
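
As an illustration of the non-parametric side of this comparison, here is a minimal sketch of a studentized (bootstrap-t) confidence interval for the mean of exponentially distributed data. It assumes that the abstract's "standardized t-bootstrap" refers to the standard bootstrap-t construction; the abstract does not give details, so the function name `bootstrap_t_interval`, the sample size, and the true mean of 2.0 are all hypothetical choices.

```python
import math
import random
import statistics


def bootstrap_t_interval(data, alpha=0.05, n_boot=2000, seed=0):
    """Bootstrap-t confidence interval for the population mean."""
    rng = random.Random(seed)
    n = len(data)
    mean = statistics.fmean(data)
    se = statistics.stdev(data) / math.sqrt(n)
    t_stats = []
    for _ in range(n_boot):
        resample = rng.choices(data, k=n)  # draw n values with replacement
        se_b = statistics.stdev(resample) / math.sqrt(n)
        if se_b == 0:  # skip degenerate resamples with no spread
            continue
        t_stats.append((statistics.fmean(resample) - mean) / se_b)
    t_stats.sort()
    lo_q = t_stats[int((alpha / 2) * len(t_stats))]
    hi_q = t_stats[int((1 - alpha / 2) * len(t_stats)) - 1]
    # The quantiles swap roles: the upper t quantile gives the lower limit.
    return mean - hi_q * se, mean - lo_q * se


# Simulated exponential sample (assumed true mean 2.0, n = 30).
rng = random.Random(1)
sample = [rng.expovariate(1 / 2.0) for _ in range(30)]
low, high = bootstrap_t_interval(sample)
print(f"sample mean = {statistics.fmean(sample):.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

For right-skewed data the resulting interval is typically asymmetric around the sample mean, which is the property that a normal-theory interval lacks.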

Comparison of Survival Function Estimators for the Cox's Regression Model using Bootstrap Method (Cox 회귀모형(回歸模型)에서 붓스트랩방법(方法)에 의한 생존함수추정량(生存函數推定量)의 비교연구(比較硏究))

  • Cha, Young-Joon
    • Journal of the Korean Data and Information Science Society / v.4 / pp.1-11 / 1993
  • Cox's regression model is frequently used to assess covariate effects in survival data analysis. However, much of the statistical work has focused on asymptotic behavior, so small-sample evaluation has been neglected. In this paper, we compare the small- and moderate-sample performance of survival function estimators for Cox's regression model using the bootstrap method. The smoothed PL type estimator and the Link estimator are slightly better than the corresponding PL type estimator and the Nelson type estimator in terms of the achieved error rates.

A Comparison of Alternative Approaches to Determinants of DEA Efficiency Scores (DEA효율성점수의 결정요인 분석방법 비교)

  • Kim, Seong-Ho
    • Journal of the Korean Operations Research and Management Science Society / v.35 no.2 / pp.19-35 / 2010
  • Many papers have used a two-stage approach of first calculating DEA efficiency scores and then seeking to correlate these scores with various environmental variables. Most of these studies have not checked whether such a two-stage approach is statistically valid for identifying significant environmental variables. Recently, Simar and Wilson (2007) (SW) introduced a sensible data generating process and a bootstrap procedure based on truncated regression for the two-stage approach. Banker and Natarajan (2008) (BN) provided a statistical foundation for the two-stage approach comprising DEA followed by ordinary least squares or maximum likelihood estimation. Researchers have to identify an approach suitable for their research circumstances in terms of its properties, merits, demerits, and robustness to plausible departures from its chosen data generating process. We summarize the foundations and properties of the two-stage procedures suggested by SW and BN and discuss their merits and demerits. Using Monte Carlo simulation, we also assess their relative performance under several misspecified settings.

Determination of Optimal Cluster Size Using Bootstrap and Genetic Algorithm (붓스트랩 기법과 유전자 알고리즘을 이용한 최적 군집 수 결정)

  • 박민재;전성해;오경환
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2002.12a / pp.263-266 / 2002
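  • When clustering data, the choice of the optimal number of clusters strongly affects the quality of the clustering result. In the K-means method in particular, performance varies considerably with the initial number of clusters K. In most cluster analyses, however, the initial number of clusters is chosen subjectively on the basis of experience. This choice becomes harder as the numbers of objects and attributes grow, and there is no guarantee that the chosen number of clusters is optimal. In this paper, we propose a method based on a genetic algorithm for determining the optimal number of clusters, so that the number of clusters is determined automatically and the validity of the result is ensured. An initial population is generated from the attributes of the data, and crossover operations are performed to find the optimized number of clusters within the population. The fitness value is defined as the reciprocal of the sum of dissimilarities over the whole clustering, so that the search converges in the direction of improving overall clustering performance. A mutation operation is used to deal with local optima. Finally, the bootstrap technique is applied to reduce the training-time cost of the genetic algorithm.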