http://dx.doi.org/10.5351/KJAS.2005.18.2.395

Analysis of Large Tables  

Choi, Hyun-Jip (Department of Applied Information Statistics, Kyonggi University)
Publication Information
The Korean Journal of Applied Statistics / v.18, no.2, 2005, pp. 395-410
Abstract
For the analysis of large tables formed by many categorical variables, we suggest a method that partitions the variables into several disjoint groups such that the variables within each group are completely associated. To find the groups, we use a simple function of the Kullback-Leibler divergence as a similarity measure. Because the groups are complete hierarchical sets, the association structure of a large table can be identified through marginal log-linear models. Examples illustrate the suggested method.
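The grouping idea can be illustrated with a minimal sketch. The abstract does not say which function of the Kullback-Leibler divergence serves as the similarity measure, so the code below assumes the mutual information between two variables (the Kullback-Leibler divergence between their empirical joint distribution and the product of their marginals). The threshold-based single-linkage merging, the function names `mutual_information` and `group_variables`, and the toy data are likewise hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations
from math import log


def mutual_information(xs, ys):
    """KL divergence (in nats) between the empirical joint distribution
    of (xs, ys) and the product of its marginals.
    Assumed stand-in for the paper's similarity measure."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def group_variables(data, threshold):
    """Partition variables into disjoint groups by merging every pair whose
    similarity exceeds `threshold` (single linkage via union-find).
    `data` maps a variable name to its list of category labels."""
    parent = {name: name for name in data}

    def find(v):
        # Follow parent links to the group representative, compressing the path.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in combinations(data, 2):
        if mutual_information(data[a], data[b]) > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in data:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Toy data: A and B are identical, D is the complement of C,
# and the pair (A, B) is independent of the pair (C, D).
data = {
    "A": [0, 0, 1, 1, 0, 0, 1, 1],
    "B": [0, 0, 1, 1, 0, 0, 1, 1],
    "C": [0, 1, 0, 1, 0, 1, 0, 1],
    "D": [1, 0, 1, 0, 1, 0, 1, 0],
}
print(group_variables(data, threshold=0.1))  # [['A', 'B'], ['C', 'D']]
```

In this sketch each resulting group could then be analysed as its own marginal table; the threshold value is an arbitrary choice made for the toy data.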
Keywords
Large tables; Collapsibility; Kullback-Leibler divergence; Marginal log-linear models