1 |
J. Peters and S. Schaal, “Reinforcement learning of motor skills with policy gradients,” Neural networks, Vol. 21, No. 4, pp. 682-697, May 2008.
DOI
|
2 |
T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, "Continuous control with deep reinforcement learning," arXiv preprint arXiv:1509.02971, 2015.
|
3 |
R. S. Sutton and A. G. Barto, "Reinforcement learning: An introduction," Vol. 1, MIT press Cambridge, 1998.
|
4 |
D. Silver, G. Lever, N. Heess, T. Degris, D. Wierstra, and M. Riedmiller, "Deterministic policy gradient algorithms," In ICML, June 2014.
|
5 |
V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, et al, "Human-level control through deep reinforcement learning," Nature, Vol. 518, pp. 529-533, Feb. 2015.
DOI
|
6 |
C.-L. Hwang, H.-M.Wu, and C.-L. Shih, “Fuzzy sliding-mode underactuated control for autonomous dynamic balance of an electrical bicycle,” IEEE transactions on control systems technology, Vol. 17, No. 3, pp. 658-670, May 2009.
DOI
|
7 |
G. E. Uhlenbeck and L. S. Ornstein, “On the theory of the brownian motion,” Physical review, Vol. 36, No. 5, pp. 823-841, Sep. 1930.
DOI
|
8 |
D. P. Kingma and J. Ba. Adam, "A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.
|
9 |
S. Gu, T. Lillicrap, I. Sutskever, and S. Levine, "Continuous deep q-learning with model-based acceleration," In International Conference on Machine Learning, pp. 2829-2838, June 2016.
|
10 |
J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, "Proximal policy optimization algorithms," arXiv preprint arXiv:1707.06347, 2017.
|
11 |
J. Schulman, S. Levine, P. Abbeel, M. Jordan, and P. Moritz, "Trust region policy optimization," In International Conference on Machine Learning, pp. 1889-1897, 2015.
|
12 |
M. Lu and X. Li, "Deep reinforcement learning policy in Hex game system," 2018 Chinese Control And Decision Conference (CCDC), pp. 6623-6626, 2018.
|
13 |
E. Bejar and A. Moran, "Deep reinforcement learning based neuro-control for a two-dimensional magnetic positioning system," 2018 4th International Conference on Control, Automation and Robotics (ICCAR), pp. 268-273, 2018.
|
14 |
T. Yasuda and K. Ohkura, "Collective Behavior Acquisition of Real Robotic Swarms Using Deep Reinforcement Learning," 2018 Second IEEE International Conference on Robotic Computing (IRC), pp. 179-180, 2018.
|
15 |
L. Keo and M. Yamakita, "Controlling balancer and steering for bicycle stabilization," 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 4541-4546, Oct. 2009.
|
16 |
J. P. Meijaard, J. M. Papadopoulos, A. Ruina, and A. L. Schwab, "Linearized dynamics equations for the balance and steer of a bicycle: a benchmark and review," In Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, Vol. 463, No. 2084, pp. 1955-1982. The Royal Society, Aug. 2007.
DOI
|
17 |
A. Schwab, J. Meijaard, and J. Kooijman, "Some recent developments in bicycle dynamics," In Proceedings of the 12th World Congress in Mechanism and Machine Science, pp. 1-6, 2007.
|
18 |
J. Tan, Y. Gu, C. K. Liu, and G. Turk, “Learning bicycle stunts,” ACM Transactions on Graphics (TOG), Vol. 33, No. 4, pp. 1-16, 2014.
|
19 |
Google Nederland, "Introducing the self-driving bicycle in the netherlands," March, 2017.
|
20 |
J. Randlv and P. Alstrm, "Learning to drive a bicycle using reinforcement learning and shaping," Proceeding ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning, pp. 463-471, 1998.
|
21 |
L. P. Tuyen and T. Chung, "Controlling bicycle using deep deterministic policy gradient algorithm," In Ubiquitous Robots and Ambient Intelligence (URAI), 2017 14th International Conference on, pp. 413-417. IEEE, 2017.
|