1 |
R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, Cambridge, MA: The MIT Press, Mar. 1998.
|
2 |
O. Chapelle and L. Li, "An empirical evaluation of Thompson sampling," in Proc. of Advances in Neural Information Processing Systems, pp. 2249-2257, 2011.
|
3 |
J. Komiyama, J. Honda, and H. Nakagawa, "Optimal Regret Analysis of Thompson Sampling in Stochastic Multi-armed Bandit Problem with Multiple Plays," in Proc. of the 32nd International Conference on Machine Learning, pp. 1152-1161, 2015.
|
4 |
M. Ali, S. Qaisar, M. Naeem, W. Ejaz, and N. Kvedaraite, "LTE-U WiFi HetNets: Enabling Spectrum Sharing for 5G/Beyond 5G Systems," IEEE Internet of Things Magazine, vol. 3, no. 4, pp. 60-65, Dec. 2020.
|
5 |
Y. Xing, J. Han, K. Xue, J. Liu, M. Pan, and P. Hong, "MPTCP Meets Big Data: Customizing Transmission Strategy for Various Data Flows," IEEE Network, vol. 34, no. 4, pp. 35-41, Jul./Aug. 2020.
|
6 |
B. C. Chung and H. Park, "Path selection algorithm for multi-path system based on deep Q learning," Journal of the Korea Institute of Information and Communication Engineering, vol. 25, no. 1, pp. 50-55, Jan. 2021.
|
7 |
M. S. Kim, J. Y. Lee, and B. C. Kim, "Design of MPTCP congestion control based on BW measurement for wireless networks," Journal of the Korea Institute of Information and Communication Engineering, vol. 21, no. 6, pp. 1127-1136, Jun. 2017.
|