Direct Divergence Approximation between Probability Distributions and Its Applications in Machine Learning |
Sugiyama, Masashi (Department of Computer Science, Tokyo Institute of Technology)
Liu, Song (Department of Computer Science, Tokyo Institute of Technology)
du Plessis, Marthinus Christoffel (Department of Computer Science, Tokyo Institute of Technology)
Yamanaka, Masao (Department of Computational Intelligence and Systems Science, Tokyo Institute of Technology)
Yamada, Makoto (NTT Communication Science Laboratories, NTT Corporation)
Suzuki, Taiji (Department of Mathematical Informatics, The University of Tokyo)
Kanamori, Takafumi (Department of Computer Science and Mathematical Informatics, Nagoya University) |
1 | M. Yamada and M. Sugiyama, "Direct density-ratio estimation with dimensionality reduction via hetero-distributional subspace analysis," in Proceedings of the 25th AAAI Conference on Artificial Intelligence, San Francisco, CA, 2011, pp. 549-554. |
2 | M. Yamada, M. Sugiyama, G. Wichern, and J. Simm, "Direct importance estimation with a mixture of probabilistic principal component analyzers," IEICE Transactions on Information and Systems, vol. 93, no. 10, pp. 2846-2849, 2010. |
3 | A. Keziou, "Dual representation of
4 | R. T. Rockafellar, Convex Analysis. Princeton, NJ: Princeton University Press, 1970. |
5 | R. Tibshirani, "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society B, vol. 58, no. 1, pp. 267-288, 1996. |
6 | R. Tomioka, T. Suzuki, and M. Sugiyama, "Super-linear convergence of dual augmented Lagrangian algorithm for sparsity regularized estimation," Journal of Machine Learning Research, vol. 12, pp. 1537-1586, 2011. |
7 | B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, "Least angle regression," The Annals of Statistics, vol. 32, no. 2, pp. 407-499, 2004. |
8 | O. Chapelle, B. Schölkopf, and A. Zien, Semi-Supervised Learning. Cambridge, MA: MIT Press, 2006. |
9 | R. Rifkin, G. Yeo, and T. Poggio, "Regularized least-squares classification," in Advances in Learning Theory: Methods, Models and Applications, J. A. K. Suykens, G. Horvath, S. Basu, C. Micchelli, and J. Vandewalle, Eds. Amsterdam, the Netherlands: IOS Press, 2003, pp. 131-154. |
10 | M. Sugiyama, M. Krauledat, and K.-R. Müller, "Covariate shift adaptation by importance weighted cross validation," Journal of Machine Learning Research, vol. 8, pp. 985-1005, 2007. |
11 | T. Liu, Z. Yuan, J. Sun, J. Wang, N. Zheng, X. Tang, and H. Y. Shum, "Learning to detect a salient object," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 2, pp. 353-367, 2011. |
12 | C. E. Shannon, "A mathematical theory of communication," Bell System Technical Journal, vol. 27, pp. 379-423, 1948. |
13 | T. M. Cover and J. A. Thomas, Elements of Information Theory, 2nd ed., Hoboken, NJ: John Wiley & Sons Inc., 2006. |
14 | K. Torkkola, "Feature extraction by non-parametric mutual information maximization," Journal of Machine Learning Research, vol. 3, pp. 1415-1438, 2003. |
15 | J. Sainui and M. Sugiyama, "Direct approximation of quadratic mutual information and its application to dependence-maximization clustering," IEICE Transactions on Information and Systems, 2013, submitted for publication. |
16 | M. Sugiyama, M. Kawanabe, and P. L. Chui, "Dimensionality reduction for density ratio estimation in high-dimensional spaces," Neural Networks, vol. 23, no. 1, pp. 44-59, 2010. |
17 | M. Sugiyama, M. Yamada, P. von Bünau, T. Suzuki, T. Kanamori, and M. Kawanabe, "Direct density-ratio estimation with dimensionality reduction via least-squares hetero-distributional subspace search," Neural Networks, vol. 24, no. 2, pp. 183-198, 2011. |
18 | M. Sugiyama, T. Suzuki, S. Nakajima, H. Kashima, P. von Bünau, and M. Kawanabe, "Direct importance estimation for covariate shift adaptation," Annals of the Institute of Statistical Mathematics, vol. 60, no. 4, pp. 699-746, 2008. |
19 | T. Kanamori, S. Hido, and M. Sugiyama, "A least-squares approach to direct importance estimation," Journal of Machine Learning Research, vol. 10, pp. 1391-1445, 2009. |
20 | X. Nguyen, M. J. Wainwright, and M. I. Jordan, "Estimating divergence functionals and the likelihood ratio by convex risk minimization," IEEE Transactions on Information Theory, vol. 56, no. 11, pp. 5847-5861, 2010. |
21 | M. Yamada, T. Suzuki, T. Kanamori, H. Hachiya, and M. Sugiyama, "Relative density-ratio estimation for robust distribution comparison," Neural Computation, vol. 25, no. 5, pp. 1324-1370, 2013. |
22 | M. Sugiyama, T. Suzuki, T. Kanamori, M. C. du Plessis, S. Liu, and I. Takeuchi, "Density difference estimation," Neural Computation, 2013, to appear. |
23 | S. Kullback and R. A. Leibler, "On information and sufficiency," The Annals of Mathematical Statistics, vol. 22, no. 1, pp. 79-86, 1951. |
24 | S. Amari and H. Nagaoka, Methods of Information Geometry, Providence, RI: American Mathematical Society, 2000. |
25 | M. Sugiyama, T. Suzuki, and T. Kanamori, Density Ratio Estimation in Machine Learning, New York, NY: Cambridge University Press, 2012. |
26 | C. Cortes, Y. Mansour, and M. Mohri, "Learning bounds for importance weighting," in Advances in Neural Information Processing Systems 23, J. Lafferty, C. K. I. Williams, R. Zemel, J. Shawe-Taylor, and A. Culotta, Eds., La Jolla, CA: Neural Information Processing Systems, 2010, pp. 442-450. |
27 | K. Pearson, "On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling," Philosophical Magazine, vol. 50, no. 302, pp. 157-175, 1900. |
28 | S. M. Ali and S. D. Silvey, "A general class of coefficients of divergence of one distribution from another," Journal of the Royal Statistical Society B, vol. 28, no. 1, pp. 131-142, 1966. |
29 | M. Sugiyama, T. Suzuki, and T. Kanamori, "Density ratio matching under the Bregman divergence: a unified framework of density ratio estimation," Annals of the Institute of Statistical Mathematics, vol. 64, no. 5, pp. 1009-1044, 2012. |
30 | I. Csiszár, "Information-type measures of difference of probability distributions and indirect observation," Studia Scientiarum Mathematicarum Hungarica, vol. 2, pp. 299-318, 1967. |
31 | Y. Tsuboi, H. Kashima, S. Hido, S. Bickel, and M. Sugiyama, "Direct density ratio estimation for large-scale covariate shift adaptation," Information and Media Technologies, vol. 4, no. 2, pp. 529-546, 2009. |
32 | M. Yamada and M. Sugiyama, "Direct importance estimation with Gaussian mixture models," IEICE Transactions on Information and Systems, vol. 92, no. 10, pp. 2159-2162, 2009. |
33 | M. Yamanaka, M. Matsugu, and M. Sugiyama, "Salient object detection based on direct density ratio estimation," IPSJ Transactions on Mathematical Modeling and Its Applications, 2013, to appear. |
34 | M. Yamanaka, M. Matsugu, and M. Sugiyama, "Detection of activities and events without explicit categorization," IPSJ Transactions on Mathematical Modeling and Its Applications, 2013, to appear. |
35 | S. Liu, M. Yamada, N. Collier, and M. Sugiyama, "Changepoint detection in time-series data by relative density-ratio estimation," Neural Networks, vol. 43, pp. 72-83, 2013. |
36 | M. Sugiyama, "Machine learning with squared-loss mutual information," Entropy, vol. 15, no. 1, pp. 80-112, 2013. |
37 | M. Sugiyama and T. Suzuki, "Least-squares independence test," IEICE Transactions on Information and Systems, vol. 94, no. 6, pp. 1333-1336, 2011. |
38 | T. Suzuki, M. Sugiyama, T. Kanamori, and J. Sese, "Mutual information estimation reveals global associations between stimuli and biological processes," BMC Bioinformatics, vol. 10, no. 1, p. S52, 2009. |
39 | T. Suzuki and M. Sugiyama, "Sufficient dimension reduction via squared-loss mutual information estimation," Neural Computation, vol. 25, no. 3, pp. 725-758, 2013. |
40 | W. Jitkrittum, H. Hachiya, and M. Sugiyama, "Feature selection via L1-penalized squared-loss mutual information," IEICE Transactions on Information and Systems, 2013, to appear. |
41 | M. Yamada, G. Niu, J. Takagi, and M. Sugiyama, "Computationally efficient sufficient dimension reduction via squared-loss mutual information," JMLR Workshop and Conference Proceedings, vol. 20, pp. 247-262, 2011. |
42 | M. Karasuyama and M. Sugiyama, "Canonical dependency analysis based on squared-loss mutual information," Neural Networks, vol. 34, pp. 46-55, 2012. |
43 | M. Yamada and M. Sugiyama, "Cross-domain object matching with model selection," JMLR Workshop and Conference Proceedings, vol. 15, pp. 807-815, 2011. |
44 | T. Suzuki and M. Sugiyama, "Least-squares independent component analysis," Neural Computation, vol. 23, no. 1, pp. 284-301, 2011. |
45 | M. Sugiyama, M. Yamada, M. Kimura, and H. Hachiya, "On information-maximization clustering: tuning parameter selection and analytic solution," in Proceedings of the 28th International Conference on Machine Learning, Bellevue, WA, 2011, pp. 65-72. |
46 | M. Kimura and M. Sugiyama, "Dependence maximization clustering with least-squares mutual information," Journal of Advanced Computational Intelligence and Intelligent Informatics, vol. 15, no. 7, pp. 800-805, 2011. |
47 | M. Yamada and M. Sugiyama, "Dependence minimizing regression with model selection for nonlinear causal inference under non-Gaussian noise," in Proceedings of the 24th AAAI Conference on Artificial Intelligence, Atlanta, GA, 2010, pp. 643-648. |
48 | V. N. Vapnik, Statistical Learning Theory, New York, NY: Wiley, 1998. |
49 | T. Kanamori, T. Suzuki, and M. Sugiyama, "f-divergence estimation and two-sample homogeneity test under semiparametric density-ratio models," IEEE Transactions on Information Theory, vol. 58, no. 2, pp. 708-720, 2012. |
50 | M. Sugiyama, T. Suzuki, Y. Itoh, T. Kanamori, and M. Kimura, "Least-squares two-sample test," Neural Networks, vol. 24, no. 7, pp. 735-751, 2011. |
51 | Y. Kawahara and M. Sugiyama, "Sequential change-point detection based on direct density-ratio estimation," Statistical Analysis and Data Mining, vol. 5, no. 2, pp. 114-127, 2012. |
52 | M. C. du Plessis and M. Sugiyama, "Semi-supervised learning of class balance under class-prior change by distribution matching," in Proceedings of the 29th International Conference on Machine Learning, Edinburgh, Scotland, 2012, pp. 823-830. |