An application of mutual information in mathematical statistics education

  • Yi, Seongbaek (Department of Statistics, Pukyong National University)
  • Jang, Dae-Heung (Department of Statistics, Pukyong National University)
  • Received : 2015.06.19
  • Accepted : 2015.07.20
  • Published : 2015.07.31

Abstract

In mathematical statistics education, we can use mutual information as a tool for evaluating the degree of dependency between two random variables. The ordinary correlation coefficient measures only linear dependency and provides no information on any nonlinear relationship between the two variables. In this paper, as measures of the degree of dependency between two random variables, we suggest the use of the symmetric uncertainty and $\lambda$, which are defined in terms of mutual information. They can also be considered generalized correlation coefficients that capture both linear and nonlinear dependence between random variables.
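For concreteness, the symmetric uncertainty is commonly defined as $U(X,Y) = 2\,I(X;Y)/(H(X)+H(Y))$ (as in Witten and Frank, 2005), and a natural reading of the abstract's $\lambda$ is the informational coefficient of correlation $\lambda = \sqrt{1 - e^{-2I(X;Y)}}$, which reduces to $|\rho|$ for bivariate normal data since there $I(X;Y) = -\tfrac{1}{2}\ln(1-\rho^2)$. The sketch below is an illustration under those assumed definitions, not the authors' implementation; it uses a simple histogram plug-in estimate of the mutual information, and names such as `dependence_measures` are hypothetical.

```python
import numpy as np

def entropy(counts):
    """Shannon entropy (in nats) from a table of bin counts."""
    p = counts / counts.sum()
    p = p[p > 0]                       # drop empty bins (0 log 0 := 0)
    return -np.sum(p * np.log(p))

def dependence_measures(x, y, bins=10):
    """Histogram plug-in estimates of I(X;Y), the symmetric
    uncertainty, and the lambda coefficient (assumed definitions)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    h_x = entropy(joint.sum(axis=1))   # marginal entropy H(X)
    h_y = entropy(joint.sum(axis=0))   # marginal entropy H(Y)
    h_xy = entropy(joint)              # joint entropy H(X,Y)
    mi = h_x + h_y - h_xy              # mutual information I(X;Y)
    su = 2.0 * mi / (h_x + h_y)        # symmetric uncertainty in [0, 1]
    lam = np.sqrt(1.0 - np.exp(-2.0 * mi))  # lambda in [0, 1)
    return mi, su, lam

rng = np.random.default_rng(0)
x = rng.normal(size=5000)
y = x**2 + 0.1 * rng.normal(size=5000)  # purely nonlinear dependence

mi, su, lam = dependence_measures(x, y)
print(f"Pearson r = {np.corrcoef(x, y)[0, 1]:.3f}")            # near 0
print(f"I(X;Y) = {mi:.3f}, SU = {su:.3f}, lambda = {lam:.3f}")  # clearly > 0
```

On this sample the Pearson correlation is near zero while the symmetric uncertainty and $\lambda$ remain clearly positive, which is exactly the behavior the abstract attributes to these mutual-information-based measures.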

Keywords
