• Title/Abstract/Keywords: statistics based method

K-means Clustering using Grid-based Representatives

  • Park, Hee-Chang;Lee, Sun-Myung
    • Journal of the Korean Data and Information Science Society / Vol. 16, No. 4 / pp. 759-768 / 2005
  • K-means clustering has been widely used in many applications, such as pattern analysis, data analysis, and market research. It can identify dense and sparse regions among data attributes or object attributes. However, the k-means algorithm can require many hours to obtain k clusters because it is a primitive, exploratory procedure. In this paper we propose a new k-means clustering method that uses a grid-based representative value (arithmetic mean or trimmed mean) for the sample. It is faster than traditional clustering methods while maintaining accuracy.

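The grid-based idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`grid_representatives`, `weighted_kmeans`), the equal-width grid, and the use of cell counts as weights are assumptions made for illustration, and only the arithmetic mean is used as the representative (the abstract also mentions a trimmed mean).

```python
import numpy as np

def grid_representatives(X, bins=10):
    """Bin points into an equal-width grid; return one mean per occupied cell and its count."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero on constant columns
    # Integer cell index per dimension; the maximum value is clipped into the last bin.
    idx = np.minimum(((X - lo) / span * bins).astype(int), bins - 1)
    cells, inverse, counts = np.unique(idx, axis=0, return_inverse=True, return_counts=True)
    inverse = inverse.ravel()  # some NumPy versions return a 2-D inverse when axis is set
    sums = np.zeros((len(cells), X.shape[1]))
    np.add.at(sums, inverse, X)  # accumulate coordinates per occupied cell
    return sums / counts[:, None], counts

def _assign(points, centers):
    """Index of the nearest center (squared Euclidean distance) for each point."""
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def weighted_kmeans(reps, weights, k, n_iter=50, seed=0):
    """Lloyd's algorithm on the representatives, each weighted by its cell count."""
    rng = np.random.default_rng(seed)
    centers = reps[rng.choice(len(reps), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = _assign(reps, centers)
        for j in range(k):
            mask = labels == j
            if mask.any():
                centers[j] = np.average(reps[mask], axis=0, weights=weights[mask])
    return centers, _assign(reps, centers)
```

Because k-means then runs on at most `bins ** d` representatives instead of all n points, each iteration is cheaper; weighting by cell count keeps each center equal to the mean of the original points in its assigned cells.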

A Unit Root Test Based on Bootstrapping

  • Shin, Key-Il;Kang, Hee-Jeong
    • Communications for Statistical Applications and Methods / Vol. 3, No. 1 / pp. 257-265 / 1996
  • We consider a nonstationary autoregressive process with infinite-variance errors. In the infinite-variance case, the limiting distribution of the estimated coefficient differs from that under the finite-variance assumption. In this paper we show, through empirical studies, that the bootstrap method can be used to approximate the distribution of the ordinary least squares estimator of the coefficient in a first-order random walk process with infinite variance, and we suggest a bootstrap-based test procedure for the unit root test.

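A bootstrap unit root test of the kind described here might be sketched as follows. The abstract does not give the resampling scheme, so everything specific below is an assumption: the statistic n(ρ̂ − 1), mean-centred differences as residuals under the null, and an optional m-out-of-n resample size `m` (often suggested for heavy-tailed errors). The function names are hypothetical.

```python
import numpy as np

def ols_rho(y):
    """OLS estimate of rho in y_t = rho * y_{t-1} + e_t (no intercept)."""
    y_lag, y_cur = y[:-1], y[1:]
    return float(y_lag @ y_cur / (y_lag @ y_lag))

def bootstrap_unit_root_test(y, n_boot=999, m=None, seed=0):
    """Left-tailed bootstrap test of H0: rho = 1 against rho < 1."""
    y = np.asarray(y, dtype=float)
    n = len(y) - 1
    m = n if m is None else m              # m < n gives an m-out-of-n bootstrap
    rng = np.random.default_rng(seed)
    stat = n * (ols_rho(y) - 1.0)
    resid = np.diff(y)                     # residuals when rho = 1 is imposed
    resid = resid - resid.mean()           # centring is a simplifying assumption
    boot = np.empty(n_boot)
    for b in range(n_boot):
        e_star = rng.choice(resid, size=m, replace=True)
        y_star = np.concatenate(([0.0], np.cumsum(e_star)))  # random walk under H0
        boot[b] = m * (ols_rho(y_star) - 1.0)
    p_value = (1 + np.sum(boot <= stat)) / (n_boot + 1)
    return stat, p_value
```

A small p-value indicates that the observed statistic lies far in the left tail of the bootstrap null distribution, which is evidence against a unit root.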

Bayesian Multiple Comparison of Binomial Populations based on Fractional Bayes Factor

  • Kim, Dal-Ho;Kang, Sang-Gil;Lee, Woo-Dong
    • Journal of the Korean Data and Information Science Society / Vol. 17, No. 1 / pp. 233-244 / 2006
  • In this paper, we develop a Bayesian multiple comparisons procedure for the binomial distribution. We suggest a Bayesian procedure based on the fractional Bayes factor when noninformative priors are applied to the parameters. An example illustrates the proposed method. For this example, the suggested method is straightforward to specify distributionally and to implement computationally, with output readily adapted for the required comparisons. A simulation study was also performed.

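To make the fractional Bayes factor concrete, here is a minimal sketch for the simplest comparison: all binomial proportions equal (H0) versus all different (H1). It assumes uniform priors on each proportion, which may differ from the noninformative priors used in the paper, and it does not enumerate the full set of hypotheses that a multiple comparison procedure would consider. The function names and the choice of fraction `b` are illustrative assumptions.

```python
from math import lgamma

def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_q(x, n, b):
    """log of [integral of L(p) dp] / [integral of L(p)^b dp] under a uniform prior on p.

    L(p) is the binomial kernel p^x (1-p)^(n-x); the binomial coefficient is
    omitted because it cancels in the ratio q1 / q0.
    """
    return log_beta(x + 1, n - x + 1) - log_beta(b * x + 1, b * (n - x) + 1)

def log_fractional_bayes_factor_10(xs, ns, b):
    """log B_10^F for H1 (separate proportions) against H0 (one common proportion)."""
    log_q1 = sum(log_q(x, n, b) for x, n in zip(xs, ns))
    log_q0 = log_q(sum(xs), sum(ns), b)
    return log_q1 - log_q0
```

For example, `log_fractional_bayes_factor_10([25, 25], [50, 50], b=0.1)` is negative (the data favour equal proportions), whereas `[0, 50]` successes out of `[50, 50]` gives a large positive value. A common choice is to set `b` to the minimal training sample size divided by the total sample size.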

Bootstrap Confidence Intervals for a One Parameter Model using Multinomial Sampling

  • Jeong, Hyeong-Chul;Kim, Dae-Hak
    • Journal of the Korean Data and Information Science Society / Vol. 10, No. 2 / pp. 465-472 / 1999
  • We consider a bootstrap method for constructing confidence intervals for a one-parameter model using multinomial sampling. The convergence rates of the proposed bootstrap method are calculated for model-based maximum likelihood estimators (MLE) using multinomial sampling. Monte Carlo simulation was used to compare the performance of bootstrap methods with normal approximations in terms of the average coverage probability criterion.

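As an illustration of a model-based bootstrap interval under multinomial sampling, the sketch below uses the Hardy-Weinberg model, where the three cell probabilities depend on a single parameter θ. This model is chosen here only as a familiar one-parameter example; the abstract does not say which model the paper studies. The percentile interval and the function names are likewise assumptions.

```python
import numpy as np

def hw_probs(theta):
    """Hardy-Weinberg cell probabilities (theta^2, 2*theta*(1-theta), (1-theta)^2)."""
    return np.array([theta**2, 2 * theta * (1 - theta), (1 - theta) ** 2])

def hw_mle(counts):
    """Closed-form MLE of theta; works on one count vector or on rows of count vectors."""
    counts = np.asarray(counts, dtype=float)
    return (2 * counts[..., 0] + counts[..., 1]) / (2 * counts.sum(axis=-1))

def bootstrap_ci(counts, n_boot=2000, level=0.95, seed=0):
    """Parametric (model-based) bootstrap percentile interval for theta."""
    rng = np.random.default_rng(seed)
    n = int(np.sum(counts))
    theta_hat = float(hw_mle(counts))
    # Resample multinomial counts from the fitted model, then re-estimate theta.
    boot_counts = rng.multinomial(n, hw_probs(theta_hat), size=n_boot)
    boot_theta = hw_mle(boot_counts)
    alpha = 1 - level
    lo, hi = np.quantile(boot_theta, [alpha / 2, 1 - alpha / 2])
    return theta_hat, float(lo), float(hi)
```

For counts `[36, 48, 16]` the MLE is 0.6, and the bootstrap interval can be compared with the normal approximation θ̂ ± 1.96·sqrt(θ̂(1 − θ̂)/(2n)).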

Probabilistic penalized principal component analysis

  • Park, Chongsun;Wang, Morgan C.;Mo, Eun Bi
    • Communications for Statistical Applications and Methods / Vol. 24, No. 2 / pp. 143-154 / 2017
  • A variable selection method based on probabilistic principal component analysis (PCA) using a penalized likelihood method is proposed. The proposed method is a two-step variable reduction method. The first step is based on the probabilistic principal component idea to identify principal components. The penalty function is used to identify important variables in each component. We then build a model on the original data space instead of building on the rotated data space through latent variables (principal components) because the proposed method achieves the goal of dimension reduction through identifying important observed variables. Consequently, the proposed method is of more practical use. The proposed estimators perform as the oracle procedure and are root-n consistent with a proper choice of regularization parameters. The proposed method can be successfully applied to high-dimensional PCA problems with a relatively large portion of irrelevant variables included in the data set. It is straightforward to extend our likelihood method in handling problems with missing observations using EM algorithms. Further, it could be effectively applied in cases where some data vectors exhibit one or more missing values at random.
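The first step relies on probabilistic PCA. One standard way to fit that model, sketched below, is the closed-form maximum likelihood solution of Tipping and Bishop; whether the paper uses this form or an EM fit is not stated in the abstract. The penalized second step, which selects variables within each component, is not shown. The function name `ppca_mle` is hypothetical.

```python
import numpy as np

def ppca_mle(X, q):
    """Closed-form ML estimates (W, sigma2) for probabilistic PCA with q components.

    Model: x = W z + mu + eps, with z ~ N(0, I_q) and eps ~ N(0, sigma2 * I_d).
    Requires q < d so that at least one eigenvalue is left for the noise estimate.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / len(X)                    # ML sample covariance
    eigval, eigvec = np.linalg.eigh(S)        # eigenvalues in ascending order
    eigval, eigvec = eigval[::-1], eigvec[:, ::-1]
    sigma2 = eigval[q:].mean()                # average of the discarded eigenvalues
    # Loadings: leading eigenvectors scaled by sqrt(lambda_i - sigma2).
    W = eigvec[:, :q] * np.sqrt(np.maximum(eigval[:q] - sigma2, 0.0))
    return W, sigma2
```

The fitted covariance `W @ W.T + sigma2 * I` reproduces the leading q eigenvalues of the sample covariance and its trace, which gives a quick sanity check on the fit.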

Classification via principal differential analysis

  • Jang, Eunseong;Lim, Yaeji
    • Communications for Statistical Applications and Methods / Vol. 28, No. 2 / pp. 135-150 / 2021
  • We propose principal differential analysis based classification methods. Computations of squared multiple correlation function (RSQ) and principal differential analysis (PDA) scores are reviewed; in addition, we combine principal differential analysis results with the logistic regression for binary classification. In the numerical study, we compare the principal differential analysis based classification methods with functional principal component analysis based classification. Various scenarios are considered in a simulation study, and principal differential analysis based classification methods classify the functional data well. Gene expression data is considered for real data analysis. We observe that the PDA score based method also performs well.

Carrier Phase Based Navigation Algorithm Design Using Carrier Phase Statistics in the Weak Signal Environment

  • Park, Sul Gee;Cho, Deuk Jae;Park, Chansik
    • Journal of Positioning, Navigation, and Timing / Vol. 1, No. 1 / pp. 7-14 / 2012
  • Maritime accidents continue to occur because of inaccurate estimates for safe navigation. To address this, precise positioning technology using carrier phase information is used, but tall buildings near inland waterways or inclination can weaken or block satellite signals for some time. In such a weak signal environment, the GPS raw measurements become less accurate, making it difficult to search for and maintain the integer ambiguity of the carrier phase. In this paper, a method to generate code and carrier phase measurements under this environment and maintain resilient navigation is proposed. In the weak signal environment, the position of the receiver is estimated using an inertial sensor, and with this information, the distance between the satellite and the receiver is calculated to generate code measurements using IGS products and models. The carrier phase measurements are then generated based on the statistics for generating the fractional phase. To verify its performance, the proposed method was compared for fixed blockage durations. It was confirmed that when satellite signals are weak or blocked for 1 to 5 minutes, the proposed method gives better results than inertial navigation alone, maintaining stable positioning accuracy within 1 m.

Generating Multidimensional Random Tables

  • 최현집
    • 응용통계연구 / Vol. 19, No. 3 / pp. 545-554 / 2006
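  • We propose a method for generating multidimensional random contingency tables based on log-linear models. To this end, we apply the method of generating joint distributions by linear combination proposed by Lee (1997), and we propose using the Pearson statistic as the measure of association. Because a three-dimensional joint distribution in which the three variables are perfectly associated with one another can be generated, the proposed method can be extended to the problem of generating random tables of four or more dimensions.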

A Comparative Study of Determining the Number of Clusters with a Method Proposed

  • 채성산;임남규
    • 응용통계연구 / Vol. 18, No. 2 / pp. 329-341 / 2005
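  • Using asymptotic results for Rand's (1971) statistics $C_k$, $k = 2, 3, \ldots, N-1$, which are used when comparing clustering methods, we propose a method for predicting the number of clusters present in the data. A simulation study was conducted to compare the proposed method with the methods of Chae and Warde (1991) and of 허명회 and 이용구 (2004), which predict the number of clusters from the pattern of change in the $C_k$ statistic. In view of practical considerations, the bootstrap method was used to form repeated resamples for real data.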
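The statistic behind $C_k$ is Rand's measure of agreement between two partitions of the same N objects: the proportion of object pairs that both partitions treat the same way (together in both, or apart in both). A minimal sketch follows. The function name is hypothetical, and how the paper turns the sequence $C_2, \ldots, C_{N-1}$ into a predicted number of clusters is not reproduced here.

```python
from itertools import combinations

def rand_statistic(labels_a, labels_b):
    """Proportion of object pairs on which two partitions agree (Rand, 1971)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same objects")
    agree = 0
    total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += same_a == same_b   # pair is together in both or apart in both
        total += 1
    return agree / total
```

To obtain $C_k$ for a given k, one would cut the results of two clustering methods at k clusters and pass the two label vectors to `rand_statistic`. The value does not depend on how clusters are numbered, so `[0, 0, 1, 1]` and `[1, 1, 0, 0]` agree perfectly.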

Blind identification of nonminimum phase FIR systems from second-order statistics and absolute mean

  • 박양수;박강민;송익호;김형명
    • 한국통신학회논문지 / Vol. 21, No. 2 / pp. 357-364 / 1996
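  • In this paper, we propose a new method for blind identification of nonminimum phase FIR systems that does not use higher-order statistics. The proposed method follows from the observation that the absolute mean of a second-order white signal can be used to determine whether the signal is also higher-order white. The proposed method can serve as a new alternative to methods based on higher-order statistics. Computer simulations show that the absolute mean is estimated accurately and that the proposed method resolves several drawbacks of methods based on higher-order statistics.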
