• Title/Summary/Keyword: statistics based method


Credit Scoring Using Splines (스플라인을 이용한 신용 평점화)

  • Koo Ja-Yong;Choi Daewoo;Choi Min-Sung
    • The Korean Journal of Applied Statistics
    • /
    • v.18 no.3
    • /
    • pp.543-553
    • /
    • 2005
  • Linear logistic regression is one of the most widely used methods for credit scoring in credit risk management. This paper deals with credit scoring using splines based on logistic regression. Linear splines and an automatic basis selection algorithm are adopted. The final model is an example of the generalized additive model. A simulation using a real data set illustrates the performance of the spline method.
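As a rough illustration of how linear splines enter a logistic model, the sketch below builds a truncated-linear ("hinge") basis for one predictor and evaluates a logistic probability from it. The function names, the hinge parameterization, and the hand-picked knots are assumptions for illustration; the abstract does not specify the basis form or the automatic basis selection algorithm, which is not reproduced here.

```python
import math

def linear_spline_basis(x, knots):
    """Return [x, (x - k1)+, (x - k2)+, ...] for a single predictor value.

    Assumed parameterization: truncated-linear (hinge) functions.
    """
    return [x] + [max(0.0, x - k) for k in knots]

def default_probability(x, intercept, coefs, knots):
    """Logistic model whose linear predictor is piecewise linear in x."""
    basis = linear_spline_basis(x, knots)
    eta = intercept + sum(c * b for c, b in zip(coefs, basis))
    return 1.0 / (1.0 + math.exp(-eta))
```

With several predictors, summing one such piecewise-linear term per predictor gives an additive model, which is consistent with the abstract's description of the final model as a generalized additive model.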

A Sampling Design for the livestock (Korean Native Beef Cattle, Milk Cow, Pig, Chicken) Statistics (가축통계 표본조사설계)

  • 윤기중;박상언
    • The Korean Journal of Applied Statistics
    • /
    • v.11 no.2
    • /
    • pp.233-246
    • /
    • 1998
  • We developed a sample design for livestock statistics covering the next five years, based on the population as of 1995. The design uses stratified one-stage sampling in which the sample size is determined by a prespecified coefficient of variation. To stratify the population, we used the complete linkage method and chose the number of strata that yields the minimum sample size. We also list some difficulties we encountered, to help improve future sample designs.

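The dependence of sample size on a prespecified coefficient of variation can be sketched with the textbook formula for stratified sampling under Neyman allocation. The allocation rule, the function name, and the input layout are assumptions; the abstract for the livestock design does not say which allocation was actually used.

```python
import math

def neyman_sample_size(strata, target_cv):
    """Total sample size needed so the estimated population total has the
    target coefficient of variation, assuming Neyman allocation.

    strata: list of (N_h, S_h, mean_h) = stratum size, standard deviation, mean.
    """
    total = sum(N * m for N, _, m in strata)
    target_var = (target_cv * total) ** 2          # required variance of the total
    numerator = sum(N * S for N, S, _ in strata) ** 2
    denominator = target_var + sum(N * S * S for N, S, _ in strata)
    return math.ceil(numerator / denominator)
```

Evaluating this function for several candidate stratifications and keeping the one with the smallest result mirrors the idea of choosing the number of strata that minimizes the sample size.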

Statistical Fingerprint Recognition Matching Method with an Optimal Threshold and Confidence Interval

  • Hong, C.S.;Kim, C.H.
    • The Korean Journal of Applied Statistics
    • /
    • v.25 no.6
    • /
    • pp.1027-1036
    • /
    • 2012
  • Among various biometric recognition systems, we consider statistical fingerprint matching methods that use the minutiae of fingerprints. We define similarity distance measures based on the coordinates and angles of the minutiae, and suggest a fingerprint recognition model that follows statistical distributions. From the distance distributions we obtain confidence intervals of the similarity distance for the same person and for different persons, as well as optimal thresholds that minimize the two kinds of error rates. We find that the two confidence intervals do not overlap and that the optimal threshold lies between them. Hence, an alternative statistical matching method can be built on the non-overlapping confidence intervals and optimal thresholds obtained from the distributions of similarity distances.
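A threshold that balances the two error rates can be sketched empirically. The code assumes that smaller distances mean more similar fingerprints and that "optimal" means minimizing the sum of the false rejection and false acceptance rates; both are assumptions, since the abstract does not state the exact criterion, and the paper works with fitted distributions rather than raw samples.

```python
def error_rates(genuine, impostor, threshold):
    """Return (FRR, FAR) when a pair is accepted if its distance <= threshold.

    genuine:  distances between prints of the same person
    impostor: distances between prints of different persons
    """
    frr = sum(d > threshold for d in genuine) / len(genuine)
    far = sum(d <= threshold for d in impostor) / len(impostor)
    return frr, far

def optimal_threshold(genuine, impostor):
    """Smallest observed distance that minimizes FRR + FAR (assumed criterion)."""
    candidates = sorted(set(genuine) | set(impostor))
    return min(candidates, key=lambda t: sum(error_rates(genuine, impostor, t)))
```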

Geodesic Clustering for Covariance Matrices

  • Lee, Haesung;Ahn, Hyun-Jung;Kim, Kwang-Rae;Kim, Peter T.;Koo, Ja-Yong
    • Communications for Statistical Applications and Methods
    • /
    • v.22 no.4
    • /
    • pp.321-331
    • /
    • 2015
  • The K-means clustering algorithm is a popular and widely used method for clustering. This paper considers a geodesic clustering algorithm for data consisting of symmetric positive definite (SPD) matrices, such as covariance matrices, which combines the K-means framework with the Riemannian (non-Euclidean) geometric structure of SPD matrices. A K-means clustering algorithm consists of two main steps, which require a dissimilarity measure between two matrix data points and a way of computing centroids for the observations in each cluster. To use the Riemannian structure, we adopt the geodesic distance and the intrinsic mean for symmetric positive definite matrices. We demonstrate the proposed method through simulations as well as an application to real financial data.
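One common choice of geodesic distance on SPD matrices is the affine-invariant metric, d(A, B) = sqrt(sum_i log^2 lambda_i), where lambda_i are the eigenvalues of A^{-1}B. The sketch below assumes that metric (the abstract does not name one) and is restricted to 2x2 matrices so that the eigenvalues have a closed form; the function name is hypothetical.

```python
import math

def geodesic_distance_2x2(a, b):
    """Affine-invariant distance between 2x2 SPD matrices.

    Each matrix is given as ((m11, m12), (m12, m22)).
    """
    (a11, a12), (_, a22) = a
    (b11, b12), (_, b22) = b
    det_a = a11 * a22 - a12 * a12
    det_b = b11 * b22 - b12 * b12
    # Trace and determinant of A^{-1} B, using the closed-form 2x2 inverse.
    trace = (a22 * b11 - 2.0 * a12 * b12 + a11 * b22) / det_a
    det = det_b / det_a
    disc = math.sqrt(max(0.0, trace * trace - 4.0 * det))
    lam1 = (trace + disc) / 2.0
    lam2 = det / lam1  # eigenvalue product equals the determinant
    return math.sqrt(math.log(lam1) ** 2 + math.log(lam2) ** 2)
```

In a K-means style loop, this distance would replace the Euclidean distance in the assignment step; the centroid step would need an iterative intrinsic-mean computation, which is not shown.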

Maximum product of spacings under a generalized Type-II progressive hybrid censoring scheme

  • Jeon, Young Eun;Kang, Suk-Bok;Seo, Jung-In
    • Communications for Statistical Applications and Methods
    • /
    • v.29 no.6
    • /
    • pp.665-677
    • /
    • 2022
  • This paper proposes a new estimation method based on the maximum product of spacings for estimating the unknown parameters of the three-parameter Weibull distribution under a generalized Type-II progressive hybrid censoring scheme, which guarantees a constant number of observations and an appropriate experiment duration. The proposed approach is appropriate for situations where maximum likelihood estimation is invalid, especially when the shape parameter is less than unity. Furthermore, Monte Carlo simulation shows that it improves performance in terms of bias. In particular, the approach remains superior even when maximum likelihood estimation satisfies the classical asymptotic properties. Finally, to illustrate the practical application of the proposed approach, a real data analysis is conducted, and the superiority of the proposed method is demonstrated through a simple goodness-of-fit test.
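The maximum product of spacings idea can be sketched for the simpler case of a complete (uncensored) sample: the estimate is the parameter vector that maximizes the mean log-spacing of the fitted CDF values. The sketch below is not the censored-data objective of the paper; the function names and the treatment of ties (returning negative infinity) are simplifying assumptions.

```python
import math

def weibull3_cdf(x, loc, scale, shape):
    """CDF of the three-parameter Weibull distribution."""
    if x <= loc:
        return 0.0
    return 1.0 - math.exp(-(((x - loc) / scale) ** shape))

def mps_objective(data, loc, scale, shape):
    """Mean log-spacing of fitted CDF values for a complete sample.

    Larger is better; an optimizer would maximize this over the parameters.
    """
    u = [0.0] + [weibull3_cdf(x, loc, scale, shape) for x in sorted(data)] + [1.0]
    spacings = [hi - lo for lo, hi in zip(u, u[1:])]
    if min(spacings) <= 0.0:
        return float("-inf")  # ties, or an observation at or below loc
    return sum(math.log(s) for s in spacings) / len(spacings)
```

Unlike the likelihood, this objective stays bounded above as the location parameter approaches the smallest observation, which is one reason the approach is attractive when the shape parameter is below one.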

Optimizing the maximum reported cluster size for normal-based spatial scan statistics

  • Yoo, Haerin;Jung, Inkyung
    • Communications for Statistical Applications and Methods
    • /
    • v.25 no.4
    • /
    • pp.373-383
    • /
    • 2018
  • The spatial scan statistic is a widely used method to detect spatial clusters. The method imposes a large number of scanning windows with pre-defined shapes and varying sizes on the entire study region. The likelihood ratio test statistic comparing the inside versus the outside of each window is then calculated, and the window with the maximum value of the test statistic becomes the most likely cluster. The results of cluster detection are sensitive to the shape and the maximum size of the scanning windows. The shape of the scanning window has been extensively studied; however, relatively little attention has been paid to the maximum scanning window size (MSWS) or the maximum reported cluster size (MRCS). The Gini coefficient has recently been proposed by Han et al. (International Journal of Health Geographics, 15, 27, 2016) as a powerful tool to determine the optimal value of MRCS for the Poisson-based spatial scan statistic. In this paper, we apply the Gini coefficient to normal-based spatial scan statistics. Through a simulation study, we evaluate the performance of the proposed method. We illustrate the method using a real data example of female colorectal cancer incidence rates in South Korea for the year 2009.
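For readers unfamiliar with the Gini coefficient itself, a minimal sketch of the generic computation on a list of non-negative values follows. How the coefficient is built from the reported clusters for each candidate MRCS is not described in the abstract, so that step is not shown; the function name is hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum form, equivalent to the mean absolute difference form.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n
```

Under the MRCS selection idea, one such coefficient would be computed per candidate MRCS value and the candidate with the largest coefficient retained.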

Sample size determination based on placements for non-inferiority trials (비열등성 시험에서 위치 방법에 기초한 표본 수 결정)

  • Kim, Jiyeon;Kim, Dongjae
    • Journal of the Korean Data and Information Science Society
    • /
    • v.24 no.6
    • /
    • pp.1349-1357
    • /
    • 2013
  • In clinical research, sample size determination is one of the most important issues. For non-inferiority trials, the sample size can be determined by a parametric method based on the t-test or by the non-parametric method suggested by Kim and Kim (2007), which is based on Wilcoxon's rank sum test. In this paper, we propose a sample size calculation method for non-inferiority trials based on the placements method suggested by Orban and Wolfe (1982), using the power calculated by Kim (1994). We compare the proposed sample size with those obtained from Kim and Kim (2007)'s formula and from the t-test. The sample size calculated by the proposed placements-based method is the smallest. Therefore, the proposed method is preferable to parametric methods when it is hard to assume a specific distribution function for the population, and it is more efficient in terms of time and cost than the method based on Wilcoxon's rank sum test.
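The placements on which the method rests are simple to compute: the placement of a treatment observation is the number of control observations that do not exceed it. The sketch below shows only that building block, with hypothetical argument names; the power and sample size formulas of the paper are not reproduced.

```python
def placements(x_sample, y_sample):
    """For each y, count how many x observations are less than or equal to it."""
    return [sum(x <= y for x in x_sample) for y in y_sample]
```

The sum of the placements counts the (x, y) pairs with x <= y, which ties the placements approach to the Mann-Whitney form of the rank sum statistic.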

Improvement of Genetic Programming Based Nonlinear Regression Using ADF and Application for Prediction MOS of Wind Speed (ADF를 사용한 유전프로그래밍 기반 비선형 회귀분석 기법 개선 및 풍속 예보 보정 응용)

  • Oh, Seungchul;Seo, Kisung
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.64 no.12
    • /
    • pp.1748-1755
    • /
    • 2015
  • Linear regression is widely used for prediction problems, but it has difficulty handling the irregular nature of nonlinear systems. Although nonlinear regression methods have been adopted, most of them are suited only to problems of low, limited structure with a small number of independent variables. However, real-world problems such as weather prediction require complex nonlinear regression with a large number of variables. The GP (Genetic Programming) based evolutionary nonlinear regression method is an efficient approach to attack this challenging problem. This paper introduces an improvement of a GP-based nonlinear regression method using ADFs (Automatically Defined Functions). ADFs are believed to allow the evolution of modular solutions and, consequently, to improve the performance of the GP technique. The suggested ADF-based GP nonlinear regression methods are compared with UM, MLR, and the previous GP method for 3-day wind speed prediction using MOS (Model Output Statistics) for some regions of South Korea. UM and KLAPS data from 2007-2009 and 2011-2013 are used for the experiments.

Analysis on Effects of Design Variable Uncertainty on the Performance of MEMS Gyroscope Based on Sample Statistics (샘플 통계에 근거한 MEMS 자이로스코프의 설계변수 불확정성이 성능에 미치는 영향 분석 방법)

  • Kim, Yong-Woo;Yoo, Hong-Hee
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference
    • /
    • 2009.10a
    • /
    • pp.119-123
    • /
    • 2009
  • Recently, MEMS gyroscopes have been widely fabricated and used owing to the development of micromachining. However, the actual product differs from the design model, and this difference can cause variation in the performance of a MEMS gyroscope. A classical design method does not accurately estimate that performance. Therefore, a design process that accounts for design variable uncertainty has to be employed. In this paper, the equations of motion of a MEMS gyroscope model are derived to analyze its performance, and the effects of the design variables on that performance are investigated. Finally, the performance of the MEMS gyroscope is estimated through a statistical analysis based on sample statistics.


Goodness-of-Fit Test for the Normality based on the Generalized Lorenz Curve

  • Cho, Youngseuk;Lee, Kyeongjun
    • Communications for Statistical Applications and Methods
    • /
    • v.21 no.4
    • /
    • pp.309-316
    • /
    • 2014
  • Testing normality is very important because normality is the most common assumption in statistical analysis. We propose a new plot and test statistic for a goodness-of-fit test for normality based on the generalized Lorenz curve. We compare the new plot with the Q-Q plot. We also compare the new test statistic with the Kolmogorov-Smirnov (KS), Cramer-von Mises (CVM), Anderson-Darling (AD), Shapiro-Francia (SF), and Shapiro-Wilk (W) test statistics in terms of power, using the Monte Carlo method. As a result, the new plot distinguishes normality from non-normality more clearly than the Q-Q plot; in addition, the new test statistic is more powerful than the other test statistics for asymmetric distributions. We check the proposed test statistic and plot using Hodgkin's disease data.
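The empirical generalized Lorenz curve underlying such a plot can be sketched as follows: at p = i/n its ordinate is the sum of the i smallest observations divided by n, so the curve ends at the sample mean. The function name is hypothetical, and the proposed plot and test statistic themselves are not described in the abstract and are not reproduced.

```python
def generalized_lorenz(values):
    """Return points (p, GL(p)) of the empirical generalized Lorenz curve."""
    xs = sorted(values)
    n = len(xs)
    points, running = [(0.0, 0.0)], 0.0
    for i, x in enumerate(xs, start=1):
        running += x
        points.append((i / n, running / n))  # cumulative sum scaled by n
    return points
```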