Fig. 1. A3C Model Architecture
Fig. 2. Structure of Dataset Xt for Window Size n
Fig. 3. Portfolio Management Process Example
Fig. 4. Weight Distribution Ratio of Back-Test #3-2
Table 1. Example of Abnormal Data in the Collected BTC Data
Table 2. Data Ranges for Back-Test
Table 3. The Final PVVR Comparison between Models A3C-DPG, DPG, and Random
Table 4. The TE and IR Values of A3C-DPG Model based on DPG Model
References
- Nakamoto, Satoshi, Bitcoin: A Peer-to-Peer Electronic Cash System, Cryptography Mailing list at https://metzdowd.com, 2009.
- "GUNBOT - Crypto Trading Bot," GUNBOT, https://www.gunbot.com, 2018.
- "start [ProfitTrailer Wiki]", ProfitTrailer, https://wiki.profittrailer.com/doku.php?id=start, 2018.
- I. Kaastra and M. Boyd, "Designing a neural network for forecasting financial and economic time series," Neurocomputing, Vol. 10, No. 3, pp. 215-236, 1996. https://doi.org/10.1016/0925-2312(95)00039-9
- Candela, "Dataset shift in machine learning," London: MIT Press, 2009.
- Y. B. Kim, "Predicting Fluctuations in Cryptocurrency Transactions Based on User Comments and Replies," PLoS ONE, Vol. 11, No. 8, e0161197, 2016. https://doi.org/10.1371/journal.pone.0161197
- Sean McNally, "Predicting the Price of Bitcoin Using Machine Learning," 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing, pp. 339-343, Mar. 2018.
- R. Sutton and A. Barto, "Reinforcement Learning: an Introduction," MIT Press, 1998.
- Volodymyr Mnih, "Asynchronous Methods for Deep Reinforcement Learning," Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 2016. JMLR: W&CP volume 48.
- Arun Nair, "Massively Parallel Methods for Deep Reinforcement Learning," at Deep Learning Workshop, International Conference on Machine Learning, Lille, France, 2015.
- Zhengyao Jiang, "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem," In JMLR, 30 pages, 5 figures, 2017.
- Christopher JCH Watkins and Peter Dayan. "Q-Learning," Machine Learning, Vol. 8, No. 3-4, pp. 279-292, 1992. https://doi.org/10.1023/A:1022676722315
- Kai Arulkumaran, "A Brief Survey of Deep Reinforcement Learning," in IEEE Signal Processing Magazine Special Issue On Deep Learning For Image Understanding, 2017.
- Hado van Hasselt, "Deep Reinforcement Learning with Double Q-learning," Proceedings of 30th AAAI Conference on Artificial Intelligence (AAAI-16).
- David Silver, Guy Lever, Nicolas Heess, Thomas Degris, Daan Wierstra, and Martin Riedmiller, "Deterministic Policy Gradient Algorithms," Proceedings of the 31st International Conference on Machine Learning (ICML), pp. 387-395, 2014.
- K. Chepuri, T. Homem de Mello, "Solving the vehicle routing problem with stochastic demands using the cross entropy method," Annals of Operations Research, 2004.
- G. Alon, D. P. Kroese, T. Raviv, and R. Y. Rubinstein, "Application of the Cross-entropy method to the buffer allocation problem in a simulation-based environment," Annals of Operations Research, 2004.
- Gaivoronski, "Stochastic nonstationary optimization for finding universal portfolios," in Annals of Operations Research, Vol. 100, No. 1, pp. 165-188, 2000. https://doi.org/10.1023/A:1019271201970
- Agarwal, A., "Algorithms for portfolio management based on the Newton method," in ICML, New York, NY, USA, 2006.
- Bin Li, Peilin Zhao, Steven C. H. Hoi, and Vivekanand Gopalkrishnan. "Passive aggressive mean reversion strategy for portfolio selection," PSMR, Machine Learning, Vol. 87, No. 2, pp. 221-258, 2012. https://doi.org/10.1007/s10994-012-5281-z
- Seyed Taghi Akhavan Niaki and Saeid Hoseinzade. "Forecasting S&P 500 index using artificial neural networks and design of experiments," Journal of Industrial Engineering International, Vol. 9, No. 1, p.1, 2013. https://doi.org/10.1186/2251-712X-9-1
- Katia Sycara, K. Decker and Dajun Zeng, "Designing a Multi-Agent Portfolio Management System," Proceedings of the AAAI Workshop on Internet Information Systems, 1995.
- K. Sycara, A. Pannu, M. Williamson, Dajun Zeng, K. Decker, "Distributed intelligent agents," IEEE Expert, Vol. 11, Issue 6, Dec. 1996.
- Hiroshi Takahashi, "Analyzing the Effectiveness of Investment Strategies through Agent-based Modelling: Overconfident Investment Decision Making and Passive Investment Strategies," eKNOW, 6th International Conference, 2014.
- "API - Bithumb," Bithumb, https://www.bithumb.com/u1/US127, 2018.
- Mnih, Volodymyr, "Human-level control through deep reinforcement learning," Nature, Vol. 518, No. 7540, pp. 529-533, 2015. https://doi.org/10.1038/nature14236
- Mu Li, "Efficient Mini-batch Training for Stochastic Optimization," In 2014 ACM, 978-1-4503-2956-9, 2014.
- Edward Qian, "Active Risk And Information Ratio," Journal of Investment Management, Vol. 2, No. 3, pp. 1-15, 2004. https://doi.org/10.11648/j.jim.20130201.11