STOCHASTIC GRADIENT METHODS FOR L2-WASSERSTEIN LEAST SQUARES PROBLEM OF GAUSSIAN MEASURES

  • YUN, SANGWOON (DEPARTMENT OF MATHEMATICS EDUCATION, SUNGKYUNKWAN UNIVERSITY) ;
  • SUN, XIANG (SCHOOL OF MATHEMATICAL SCIENCES, OCEAN UNIVERSITY OF CHINA) ;
  • CHOI, JUNG-IL (SCHOOL OF MATHEMATICS AND COMPUTING (COMPUTATIONAL SCIENCE & ENGINEERING), YONSEI UNIVERSITY)
  • Received : 2021.09.15
  • Accepted : 2021.12.08
  • Published : 2021.12.25

Abstract

This paper proposes stochastic methods for finding an approximate solution of the L2-Wasserstein least squares problem of Gaussian measures. The variable of the problem lies in a set of positive definite matrices. The first proposed method is a classical stochastic gradient method combined with projection, and the second is a variance-reduced method with projection. Their global convergence is analyzed within the framework of proximal stochastic gradient methods. Convergence of the classical stochastic gradient method with projection is established under a diminishing learning rate rule, in which the learning rate decreases as the epoch increases, whereas convergence of the variance-reduced method with projection can be established with a constant learning rate. The numerical results show that the proposed algorithms, with a proper learning rate, outperform a gradient projection method.
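The two schemes described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the objective is the average of squared L2-Wasserstein distances from a centered Gaussian N(0, X) to given centered Gaussians N(0, A_i), and that the feasible set is the set of symmetric matrices with eigenvalues in [lo, hi]. The step-size rules, the bounds, and all function names (`projected_sgd`, `projected_svrg`, and the helpers) are hypothetical choices made for illustration.

```python
import numpy as np

def sym_sqrt(M):
    # Square root of a symmetric positive semidefinite matrix via eigendecomposition.
    w, V = np.linalg.eigh((M + M.T) / 2)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def w2_sq(X, A):
    # Squared L2-Wasserstein distance between N(0, X) and N(0, A).
    rA = sym_sqrt(A)
    return np.trace(X) + np.trace(A) - 2.0 * np.trace(sym_sqrt(rA @ X @ rA))

def grad_w2_sq(X, A):
    # Gradient in X: I - A^{1/2} (A^{1/2} X A^{1/2})^{-1/2} A^{1/2}.
    rA = sym_sqrt(A)
    w, V = np.linalg.eigh(rA @ X @ rA)
    return np.eye(len(X)) - rA @ ((V / np.sqrt(w)) @ V.T) @ rA

def objective(X, As):
    # Assumed least squares objective: average squared distance to the A_i.
    return np.mean([w2_sq(X, A) for A in As])

def project(X, lo, hi):
    # Projection onto symmetric matrices with eigenvalues in [lo, hi] (assumed feasible set).
    w, V = np.linalg.eigh((X + X.T) / 2)
    return (V * np.clip(w, lo, hi)) @ V.T

def projected_sgd(As, lo, hi, step0=1.0, epochs=50, seed=0):
    # Stochastic gradient with projection; the step size shrinks as step0 / epoch.
    rng = np.random.default_rng(seed)
    X = project(np.mean(As, axis=0), lo, hi)
    for epoch in range(1, epochs + 1):
        step = step0 / epoch
        for i in rng.permutation(len(As)):
            X = project(X - step * grad_w2_sq(X, As[i]), lo, hi)
    return X

def projected_svrg(As, lo, hi, step=1.0, epochs=50, seed=0):
    # Variance-reduced stochastic gradient with projection and a constant step size.
    rng = np.random.default_rng(seed)
    n = len(As)
    X = project(np.mean(As, axis=0), lo, hi)
    for _ in range(epochs):
        snap = X.copy()
        # Full gradient at the snapshot, recomputed once per epoch.
        full = np.mean([grad_w2_sq(snap, A) for A in As], axis=0)
        for _ in range(n):
            i = rng.integers(n)
            v = grad_w2_sq(X, As[i]) - grad_w2_sq(snap, As[i]) + full
            X = project(X - step * v, lo, hi)
    return X
```

A call such as `projected_svrg(As, lo=0.1, hi=100.0)` with a list `As` of symmetric positive definite arrays returns an approximate minimizer. The lower bound `lo` must be positive so that the inverse square root in the gradient stays well defined.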

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grants funded by the Ministry of Science, ICT & Future Planning (NRF-20151009350, NRF-2016R1-A5A1008055, NRF-2016R1D1A1B03934371 and NRF-2019R1F1A1057051).
