1 |
M. Sridharan and G. Tesauro, 'Multi-agent Q-learning and regression trees for automated pricing decisions,' Proc. 17th Int'l Conf. Machine Learning, Stanford, CA, 2000
|
2 |
S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed., Prentice-Hall, New Jersey, 1999, p. 625
|
3 |
R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press/Bradford Books, 1998, pp. 4-5, 26-27
|
4 |
G. Cybenko, R. Gray and K. Moizumi, 'Q-Learning: A Tutorial and Extensions,' Proc. Conf. Mathematics of Artificial Neural Networks, Oxford University, England, July, 1995
|
5 |
J. Hu and M. P. Wellman, 'Multiagent reinforcement learning: theoretical framework and an algorithm,' Proc. Int'l Conf. Machine Learning, 1998
|
6 |
G. Tesauro, 'Pricing in agent economies using neural networks and multi-agent Q-learning,' Proc. IJCAI-99 Workshop, Learning About, From and With Other Agents, Stockholm, Sweden, Aug. 1999
|
7 |
A. Greenwald, J. Kephart, and G. Tesauro, 'Strategic Pricebot Dynamics,' Proc. 1st ACM Conf. Electronic Commerce, Oct. 1999
|
8 |
A. Greenwald and J. O. Kephart, 'Shopbots and Pricebots,' Proc. Int'l Joint Conf. Artificial Intelligence, Stockholm, Sweden, 1999
|
9 |
C. J. C. H. Watkins, 'Learning from delayed rewards,' Ph.D. thesis, Cambridge University, 1989
|
10 |
G. Tesauro and J. O. Kephart, 'Pricing in agent economies using multi-agent Q-learning,' Proc. Workshop, Game Theoretic and Decision Theoretic Agents, London, England, July, 1999
|
11 |
G. J. Tesauro and J. O. Kephart, 'Foresight-based pricing algorithms in an economy of software agents,' Proc. ICE-98, 1998, pp. 37-44
|
12 |
T. M. Mitchell, Machine Learning, McGraw-Hill, 1997, pp. 378-379, 382
|