Contribution of Principal Components Based on the Broken-Stick Model

  • Kang, Y.J. (Lotte Card Co., Ltd.) ;
  • Byun, J.H. (Department of Statistics, Korea University) ;
  • Ki, K.Y. (Department of Statistics, Korea University)
  • Received : 2010.06
  • Reviewed : 2010.06
  • Published : 2010.08.31

Abstract

Frontier (1976) proposed a criterion for choosing the number of principal components to retain, based on the expected lengths of ordered random intervals under the Broken-Stick model (Barton and David, 1956). It has been reported as one of the methods that provide the most consistent simulation results (Jackson, 1993). This study proposes a Broken-Stick model (BSM) p-value criterion, which uses the distribution of the ordered interval lengths under the BSM to assess probabilistically the size of each principal component's relative contribution. In addition, we examine the overall evenness of those contributions through Lorenz curves, originally devised to depict inequality in income distribution, and several types of the corresponding Gini indices.
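
The quantities named in the abstract can be illustrated with a minimal sketch. It assumes the standard broken-stick expectation, in which the k-th largest of p pieces has expected proportion (1/p)(1/k + ... + 1/p), and the common sorted-value formula for the Gini index. The function names are hypothetical and are not taken from the paper, and the proposed BSM p-value criterion itself is not reproduced here.

```python
def broken_stick_expected(p):
    """Expected proportions of p ordered pieces of a unit stick, largest first."""
    return [sum(1.0 / i for i in range(k, p + 1)) / p for k in range(1, p + 1)]


def frontier_retained(eigenvalues):
    """Count leading components whose variance share exceeds the
    broken-stick expectation (a Frontier-style rule; stops at first failure)."""
    total = sum(eigenvalues)
    shares = sorted((ev / total for ev in eigenvalues), reverse=True)
    expected = broken_stick_expected(len(shares))
    retained = 0
    for share, bound in zip(shares, expected):
        if share <= bound:
            break
        retained += 1
    return retained


def gini(values):
    """Gini index of non-negative values: 0 means perfectly even contributions."""
    xs = sorted(values)
    n = len(xs)
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * sum(xs)) - (n + 1.0) / n
```

For illustrative eigenvalues such as `[2.8, 0.7, 0.3, 0.2]`, `frontier_retained` compares the shares 0.7, 0.175, 0.075, 0.05 with the expectations returned by `broken_stick_expected(4)`, and `gini` applied to the same shares summarizes how unevenly the total variance is spread across components.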

Keywords

References

  1. Almorza, D. A. and Garcia, M. H. (2008). Results of exploratory data analysis in the broken stick model, Journal of Applied Statistics, 35, 979-983. https://doi.org/10.1080/02664760802187536
  2. Anand, S. (1983). Inequality and Poverty in Malaysia, Oxford University Press, New York.
  3. Anderson, T. W. (1963). An asymptotic expansion for the distribution of the latent roots of an estimated covariance matrix, The Annals of Mathematical Statistics, 36, 1153-1173. https://doi.org/10.1214/aoms/1177699989
  4. Bartlett, M. S. (1950). Tests of significance in factor analysis, British Journal of Psychology (Statistical Section), 3, 77-85. https://doi.org/10.1111/j.2044-8317.1950.tb00285.x
  5. Barton, D. E. and David, F. N. (1956). Some notes on ordered random intervals, Journal of the Royal Statistical Society, Series B, 18, 79-94.
  6. Benasseni, J. (2005). A concentration study of principal components, Journal of Applied Statistics, 32, 947-957. https://doi.org/10.1080/02664760500163664
  7. Box, G. E. P., Hunter, W. G., MacGregor, J. F. and Erjavec, J. (1973). Some problems associated with the analysis of multiresponse data, Technometrics, 15, 33-51. https://doi.org/10.2307/1266823
  8. Cattell, R. B. (1966). The scree test for the number of factors, Multivariate Behavioral Research, 1, 245-276. https://doi.org/10.1207/s15327906mbr0102_10
  9. Frontier, S. (1976). Étude de la décroissance des valeurs propres dans une analyse en composantes principales: Comparaison avec le modèle du bâton brisé, Journal of Experimental Marine Biology and Ecology, 25, 67-75. https://doi.org/10.1016/0022-0981(76)90076-9
  10. Gini, C. (1912). Variabilità e mutabilità, Reprinted in Memorie di metodologica statistica (Eds. Pizetti, E. and Salvemini, T.), Libreria Eredi Virgilio Veschi, Rome, 1955.
  11. Hastie, T., Buja, A. and Tibshirani, R. (1995). Penalized discriminant analysis, The Annals of Statistics, 23, 73-102. https://doi.org/10.1214/aos/1176324456
  12. Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis, Psychometrika, 30, 179-185. https://doi.org/10.1007/BF02289447
  13. Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components, Journal of Educational Psychology, 24, 417-441. https://doi.org/10.1037/h0071325
  14. Jackson, D. A. (1993). Stopping rules in principal components analysis: A comparison of heuristical and statistical approaches, Ecology, 74, 2204-2214.
  15. Jackson, J. E. (1959). Quality control methods for several related variables, Technometrics, 1, 359-377. https://doi.org/10.2307/1266717
  16. Jackson, J. E. (1960). Multivariate analysis illustrated by Nike-Hercules, In Proceedings of the Thirtieth Conference on the Design of Experiments in Army Research, Development, and Testing, U.S. Army Research Office, Durham, N.C. 307-327.
  17. Jackson, J. E. (1991). A User's Guide to Principal Components, John Wiley & Sons, Inc.
  18. Jolliffe, I. T. (1972). Discarding variables in a principal component analysis I: Artificial data, Journal of the Royal Statistical Society, Series C (Applied Statistics), 21, 160-173.
  19. King, R. J. and Jackson, D. A. (1999). Variable selection in large environmental data sets using principal components analysis, Environmetrics, 10, 67-77. https://doi.org/10.1002/(SICI)1099-095X(199901/02)10:1<67::AID-ENV336>3.0.CO;2-0
  20. Lambert, Z. V., Wildt, A. R. and Durand, R. M. (1990). Assessing sampling variation relative to number-of-factors criteria, Educational and Psychological Measurement, 50, 33-48. https://doi.org/10.1177/0013164490501004
  21. Lorenz, M. O. (1905). Methods of measuring the concentration of wealth, Publications of the American Statistical Association, 9, 209-219. https://doi.org/10.2307/2276207
  22. MacArthur, R. H. (1957). On the relative abundance of bird species, Proceedings of the National Academy of Sciences USA, 43, 293-295. https://doi.org/10.1073/pnas.43.3.293
  23. Pielou, E. C. (1975). Ecological Diversity, Wiley, New York.
  24. Cangelosi, R. and Goriely, A. (2007). Component retention in principal component analysis with application to cDNA microarray data, Biology Direct, 2, 2. https://doi.org/10.1186/1745-6150-2-2
  25. Velicer, W. F. (1976). Determining the number of components from the matrix of partial correlations, Psychometrika, 41, 321-327. https://doi.org/10.1007/BF02293557
  26. Zwick, W. R. and Velicer, W. F. (1986). Comparison of five rules for determining the number of components to retain, Psychological Bulletin, 99, 432-442. https://doi.org/10.1037/0033-2909.99.3.432