Contribution of Principal Components Based on the Broken-Stick Model

Kang, Y.J.; Byun, J.H.; Kim, K.Y.

doi:10.5351/KJAS.2010.23.4.767

The Korean Journal of Applied Statistics (응용통계연구)

Volume 23, Issue 4 / Pages 767-776 / 2010 / 1225-066X (pISSN) / 2383-5818 (eISSN)

한국통계학회 (The Korean Statistical Society)

Broken-Stick 모형에 기초한 주성분 공헌도평가

Kang, Y.J. (강유정; Lotte Card Co., Ltd.);
Byun, J.H. (변자현; Department of Statistics, Korea University);
Kim, K.Y. (김기영; Department of Statistics, Korea University)

Received: June 2010
Reviewed: June 2010
Published: August 31, 2010

https://doi.org/10.5351/KJAS.2010.23.4.767

Abstract

Frontier (1976) proposed a criterion for choosing the number of principal components to retain (the effective dimension) based on the expected lengths of ordered random intervals under the broken-stick model (Barton and David, 1956). It has been reported to be among the methods that give the most consistent simulation results (Jackson, 1993). This study proposes a BSM p-value criterion, which uses the distribution of the ordered interval lengths under the broken-stick model (BSM) to evaluate probabilistically the size of each principal component's relative contribution. In addition, we use the Lorenz curve, originally a graphical representation of inequality in an income distribution, together with several types of the corresponding Gini index to examine the overall evenness of the principal component contributions.
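As a rough illustration of the two ingredients named in the abstract, the Python sketch below computes the expected broken-stick proportions, b_k = (1/p) * sum_{i=k}^{p} 1/i, applies the usual reading of Frontier's rule (retain the leading components whose share of total variance exceeds b_k), and computes a Gini index from the Lorenz curve of the component shares. The function names and the example eigenvalues are hypothetical and do not come from the paper, and the BSM p-value criterion proposed in the paper is not reproduced here.

```python
import numpy as np


def broken_stick_expected(p):
    """Expected share of the k-th longest piece (k = 1..p) when a unit
    stick is broken at p - 1 independent uniform points."""
    inv = 1.0 / np.arange(1, p + 1)
    # Reverse cumulative sum gives sum_{i=k}^{p} 1/i for each k.
    return np.cumsum(inv[::-1])[::-1] / p


def frontier_retain(eigenvalues):
    """Number of leading components whose variance share exceeds the
    broken-stick expectation (assumed reading of Frontier's rule)."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    share = lam / lam.sum()
    expected = broken_stick_expected(len(lam))
    k = 0
    for s, e in zip(share, expected):
        if s <= e:
            break  # stop at the first component that does not exceed b_k
        k += 1
    return k


def gini_index(shares):
    """Gini index as 1 - 2 * (area under the Lorenz curve)."""
    x = np.sort(np.asarray(shares, dtype=float))  # ascending order
    lorenz = np.concatenate([[0.0], np.cumsum(x) / x.sum()])
    # Trapezoid rule on an equally spaced grid with step 1/n.
    area = np.sum((lorenz[1:] + lorenz[:-1]) / 2.0) / len(x)
    return 1.0 - 2.0 * area


if __name__ == "__main__":
    eig = [2.5, 0.9, 0.4, 0.2]  # hypothetical eigenvalues
    print("expected shares:", broken_stick_expected(len(eig)))
    print("components retained:", frontier_retain(eig))
    print("Gini index of shares:", gini_index(eig))
```

A Gini index near 0 indicates that the components contribute almost equally, while a value near (p - 1)/p indicates that a single component carries nearly all of the variance.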

References

Almorza, D. A. and Garcia, M. H. (2008). Results of exploratory data analysis in the broken stick model, Journal of Applied Statistics, 35, 979-983. https://doi.org/10.1080/02664760802187536
Anand, S. (1983). Inequality and Poverty in Malaysia, Oxford University Press, New York.
Anderson, T. W. (1963). An asymptotic expansion for the distribution of the latent roots of an estimated covariance matrix, The Annals of Mathematical Statistics, 36, 1153-1173. https://doi.org/10.1214/aoms/1177699989
Bartlett, M. S. (1950). Tests of significance in factor analysis, British Journal of Psychology (Statistical Section), 3, 77-85. https://doi.org/10.1111/j.2044-8317.1950.tb00285.x
Barton, D. E. and David, F. N. (1956). Some notes on ordered random intervals, Journal of the Royal Statistical Society, Series B, 18, 79-94.
Benasseni, J. (2005). A concentration study of principal components, Journal of Applied Statistics, 32, 947-957. https://doi.org/10.1080/02664760500163664
Box, G. E. P., Hunter, W. G., MacGregor, J. F. and Erjavec, J. (1973). Some problems associated with the analysis of multiresponse data, Technometrics, 15, 33-51. https://doi.org/10.2307/1266823
Cattell, R. B. (1966). The scree test for the number of factors, Multivariate Behavioral Research, 1, 245-276. https://doi.org/10.1207/s15327906mbr0102_10
Frontier, S. (1976). Étude de la décroissance des valeurs propres dans une analyse en composantes principales: Comparaison avec le modèle du bâton brisé, Journal of Experimental Marine Biology and Ecology, 25, 67-75. https://doi.org/10.1016/0022-0981(76)90076-9
Gini, C. (1912). Variabilità e mutabilità, Reprinted in Memorie di metodologica statistica (Eds. Pizetti, E. and Salvemini, T.), Libreria Eredi Virgilio Veschi, Rome, 1955.
Hastie, T., Buja, A. and Tibshirani, R. (1995). Penalized discriminant analysis, The Annals of Statistics, 23, 73-102. https://doi.org/10.1214/aos/1176324456
Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis, Psychometrika, 30, 179-185. https://doi.org/10.1007/BF02289447
Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components, Journal of Educational Psychology, 24, 417-441. https://doi.org/10.1037/h0071325
Jackson, D. A. (1993). Stopping rules in principal components analysis: A comparison of heuristical and statistical approaches, Ecology, 74, 2204-2214.
Jackson, J. E. (1959). Quality control methods for several related variables, Technometrics, 1, 359-377. https://doi.org/10.2307/1266717
Jackson, J. E. (1960). Multivariate analysis illustrated by Nike-Hercules, In Proceedings of the Thirtieth Conference on the Design of Experiments in Army Research, Development, and Testing, U.S. Army Research Office, Durham, N.C. 307-327.
Jackson, J. E. (1991). A User's Guide to Principal Components, John Wiley & Sons, Inc.
Jolliffe, I. T. (1972). Discarding variables in a principal component analysis I: Artificial data, Journal of the Royal Statistical Society, Series C (Applied Statistics), 21, 160-173.
King, R. J. and Jackson, D. A. (1999). Variable selection in large environmental data sets using principal components analysis, Environmetrics, 10, 67-77. https://doi.org/10.1002/(SICI)1099-095X(199901/02)10:1<67::AID-ENV336>3.0.CO;2-0
Lambert, Z. V., Wildt, A. R. and Durand, R. M. (1990). Assessing sampling variation relative to number-of-factors criteria, Educational and Psychological Measurement, 50, 33-48. https://doi.org/10.1177/0013164490501004
Lorenz, M. O. (1905). Methods of measuring the concentration of wealth, Publications of the American Statistical Association, 9, 209-219. https://doi.org/10.2307/2276207
MacArthur, R. H. (1957). On the relative abundance of bird species, Proceedings of the National Academy of Sciences USA, 43, 293-295. https://doi.org/10.1073/pnas.43.3.293
Pielou, E. C. (1975). Ecological Diversity, Wiley, New York.
Cangelosi, R. and Goriely, A. (2007). Component retention in principal component analysis with application to cDNA microarray data, Biology Direct, 2, 2. https://doi.org/10.1186/1745-6150-2-2
Velicer, W. F. (1976). Determining the number of components from the matrix of partial correlations, Psychometrika, 41, 321-327. https://doi.org/10.1007/BF02293557
Zwick, W. R. and Velicer, W. F. (1986). Comparison of five rules for determining the number of components to retain, Psychological Bulletin, 99, 432-442. https://doi.org/10.1037/0033-2909.99.3.432
