1 |
Barto, A. G., Mahadevan, S., 'Recent advances in hierarchical reinforcement learning,' Discrete Event Dynamic Systems, Vol.13, pp. 41-77, 2003
|
2 |
Fritzke, B., 'A growing neural gas network learns topologies,' In Proc. of the 7th Neural Information Processing Systems, pp. 625-632, 1995
|
3 |
Littman, M. L., Dean, T. L., Kaelbling, L. P., 'On the complexity of solving Markov decision problems,' Uncertainty in Artificial Intelligence, pp. 394-402, 1995
|
4 |
Dietterich, T. G., 'Hierarchical reinforcement learning with the MAXQ value function decomposition,' Journal of Artificial Intelligence Research, Vol.13, pp. 227-303, 2000
|
5 |
McGovern, A., Barto, A. G., 'Subgoal discovery for hierarchical reinforcement learning using learned policies,' In Proc. of the International Conference on Machine Learning, pp. 361-368, 2001
|
6 |
Jong, N. K., Stone, P., 'State abstraction discovery from irrelevant state variables,' In Proc. of the 19th International Joint Conference on Artificial Intelligence, pp. 752-757, 2005
|
7 |
da F. Costa, L., Rodrigues, F. A., Travieso, G., Boas, P. R. V., 'Characterization of complex networks: A survey of measurements,' 2005
|
8 |
Watkins, C. J., Dayan, P., 'Q-learning,' Machine Learning, Vol.8, pp. 279-292, 1992
|
9 |
Watts, D. J., Strogatz, S. H., 'Collective dynamics of 'small-world' networks,' Nature, Vol.393, pp. 440-442, 1998
|
10 |
Erdos, P., Renyi, A., 'On random graphs,' Publicationes Mathematicae (Debrecen), Vol.6, pp. 290-297, 1959
|
11 |
Simsek, O., Wolfe, A. P., Barto, A. G., 'Identifying useful subgoals in reinforcement learning by local graph partitioning,' In Proc. of the 22nd International Conference on Machine Learning, pp. 816-823, 2005
|
12 |
Adamic, L. A., Lukose, R. M., Puniyani, A. R., Huberman, B. A., 'Search in power-law networks,' Phys. Rev. E, Vol.64, pp. 46135-46143, 2001
|
13 |
Beleznay, F., Grobler, T., Szepesvari, C., 'Comparing value-function estimation algorithms in undiscounted problems,' 1999
|
14 |
Digney, B., 'Learning hierarchical control structure for multiple tasks and changing environments,' In Proc. of the 5th Conference on the Simulation of Adaptive Behavior, 1998
|
15 |
Kleinberg, J., 'The Small-World Phenomenon: An Algorithmic Perspective,' In Proc. of the 32nd ACM Symposium on Theory of Computing, pp. 163-170, 2000
|
16 |
Millan, J. del R., Posenato, D., Dedieu, E., 'Continuous-action Q-learning,' Machine Learning, Vol.49, pp. 241-265, 2002
|
17 |
Barabasi, A. L., Albert, R., 'Emergence of scaling in random networks,' Science, Vol.286, pp. 509-512, 1999
|
18 |
Pickett, M., Barto, A. G., 'PolicyBlocks: An algorithm for creating useful macro-actions in reinforcement learning,' In Proc. of the 19th International Conference on Machine Learning, pp. 506-513, 2002
|
19 |
Sutton, R. S., Precup, D., Singh, S. P., 'Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning,' Artificial Intelligence, Vol.112, pp. 181-211, 1999
|
20 |
Kaelbling, L. P., Littman, M. L., Moore, A. W., 'Reinforcement learning: A survey,' Journal of Artificial Intelligence Research, Vol.4, pp. 237-285, 1996
|