Funding
This work was supported by the National Research Foundation of Korea.
References
- M. L. Puterman, "Markov Decision Processes", John Wiley & Sons, 1994.
- L. P. Kaelbling, M. L. Littman, and A. R. Cassandra, "Planning and acting in partially observable stochastic domains", Artificial Intelligence, 101:99-134, 1998. https://doi.org/10.1016/S0004-3702(98)00023-X
- A. R. Cassandra, L. P. Kaelbling, and J. A. Kurien, "Acting under uncertainty: Discrete Bayesian models for mobile robot navigation", In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 1996.
- C. Papadimitriou, and J. N. Tsitsiklis, "The complexity of Markov decision processes", Mathematics of Operations Research, 12(3):441-450, 1987. https://doi.org/10.1287/moor.12.3.441
- E. J. Sondik, "The optimal control of partially observable Markov processes", PhD thesis, Stanford University, 1971.
- G. E. Monahan, "A survey of partially observable Markov decision processes: Theory, models and algorithms", Management Science, 28(1):1-16, 1982. https://doi.org/10.1287/mnsc.28.1.1
- W. Zhang, "Algorithms for partially observable Markov decision processes", PhD thesis, University of British Columbia, 1988.
- A. R. Cassandra, L. P. Kaelbling, and M. L. Littman, "Acting optimally in partially observable stochastic domains", In Proceedings of the 12th National Conference on Artificial Intelligence, 1994.
- N. L. Zhang, and W. Liu, "Planning in stochastic domains: Problem characteristics and approximation", Technical Report HKUST-CS96-31, Hong Kong University of Science and Technology, 1996.
- J. Pineau, G. Gordon, and S. Thrun, "Point-based value iteration: an anytime algorithm for POMDPs", In Proceedings of IJCAI, 2003.
- J. Pineau, G. Gordon, and S. Thrun, "Anytime point-based approximations for large POMDPs", Journal of Artificial Intelligence Research, 27:335-380, 2006.
- M. T. J. Spaan and N. Vlassis, "Perseus: Randomized point-based value iteration for POMDPs", Journal of Artificial Intelligence Research, 24:195-220, 2005.
- T. Smith, and R. Simmons, "Heuristic search value iteration for POMDPs", In Proceedings of UAI, 2004.
- T. Smith, and R. Simmons, "Point-based POMDP algorithms: improved analysis and implementation", In Proceedings of UAI, 2005.
- J. D. Williams and S. Young, "Partially observable Markov decision processes for spoken dialog systems", Computer Speech and Language, 21(2):393-422, 2007. https://doi.org/10.1016/j.csl.2006.06.008
- T. Lane, "A Decision Theoretic, Semi-Supervised Model for Intrusion Detection", In M. Maloof, ed., Machine learning and data mining for computer security: Methods and applications, Springer-Verlag, 2006.
- Q. Zhao, L. Tong, A. Swami, and Y. Chen, "Decentralized cognitive MAC for opportunistic spectrum access in ad hoc networks: A POMDP framework", IEEE Journal on Selected Areas in Communications, 25(3):589-600, 2007. https://doi.org/10.1109/JSAC.2007.070409
- J. Park, K.-E. Kim, and S. Jo, "A POMDP approach to P300-based brain-computer interfaces", In Proceedings of the ACM International Conference on Intelligent User Interfaces (IUI), 2010.
- N. Fraser, "Assessment of Interactive Systems", In Handbook of Standards and Resources for Spoken Language Systems, pages 564-614. Mouton de Gruyter, 1997.
- C. Boutilier and D. Poole, "Computing optimal policies for partially observable decision processes using compact representations", In Proceedings of AAAI, 1996.
- J. D. Williams and S. Young, "Scaling POMDPs for spoken dialog management", IEEE Transactions on Audio, Speech, and Language Processing, 15(7):2116-2129, 2007. https://doi.org/10.1109/TASL.2007.902050
- H. S. Sim, K.-E. Kim, J. H. Kim, D.-S. Chang, and M.-W. Koo, "Symbolic heuristic search value iteration for factored POMDPs", In Proceedings of AAAI, 2008.
- D. Kim, H. S. Sim, K.-E. Kim, J. H. Kim, H. Kim, and J. W. Sung, "Effects of user modeling on POMDP-based dialogue systems", In Proceedings of Interspeech, 2008.
- D. Kim, J. H. Kim, and K.-E. Kim, "Robust evaluation of POMDP-based dialogue systems", IEEE Transactions on Audio, Speech, and Language Processing, to be published.
- J. R. Wolpaw, N. Birbaumer, D. J. McFarland, G. Pfurtscheller, and T. M. Vaughan, "Brain-computer interfaces for communication and control", Clinical Neurophysiology, 113, 2002.
- R. Fazel-Rezai, "Human error in P300 speller paradigm for brain-computer interface", In Proceedings of the 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS), pp. 2516-2519, 2007.
- D. J. Krusienski, E. W. Sellers, D. J. McFarland, T. M. Vaughan, and J. R. Wolpaw, "Toward enhanced P300 speller performance", Journal of Neuroscience Methods, 167:15-21, 2008. https://doi.org/10.1016/j.jneumeth.2007.07.017
- J. Pineau, G. Gordon, and S. Thrun, "Policy-contingent abstraction for robust robot control", In Proceedings of UAI, 2003.
- A. P. Wolfe, "POMDP homomorphisms", In Proceedings of NIPS, 2006.
- K.-E. Kim, "Exploiting symmetries in POMDPs for point-based algorithms", In Proceedings of AAAI, 2008.
- S. Sanner and C. Boutilier, "Practical solution techniques for first-order MDPs", Artificial Intelligence, 173:748-788, 2009. https://doi.org/10.1016/j.artint.2008.11.003
- S. Sanner and K. Kersting, "Symbolic dynamic programming for first-order POMDPs", In Proceedings of AAAI, 2010.
- Y. Virin, G. Shani, S. E. Shimony, and R. I. Brafman, "Scaling up: Solving POMDPs through value based clustering", In Proceedings of AAAI, 2007.
- G. Shani, R. I. Brafman, and S. E. Shimony, "Forward search value iteration for POMDPs", In Proceedings of IJCAI, 2007.
- T. Jaakkola, S. P. Singh, and M. I. Jordan, "Reinforcement learning algorithm for partially observable Markov decision problems", In Proceedings of NIPS, 1995.
- J. Baxter and P. L. Bartlett, "Reinforcement learning in POMDPs via direct gradient ascent", In Proceedings of ICML, 2000.
- R. Jaulmes, J. Pineau, and D. Precup, "Active learning in partially observable Markov decision processes", In Proceedings of ECML, 2005.
- G. Shani, R. I. Brafman, and S. E. Shimony, "Model-based online learning of POMDPs", In Proceedings of ECML, 2005.
- D. Wierstra, and M. Wiering, "Utile distinction hidden Markov models", In Proceedings of ICML, 2004.
- S. Ross, J. Pineau, and B. Chaib-draa, "Bayes-adaptive POMDPs", In Proceedings of NIPS, 2007.
- C. Cai, X. Liao, and L. Carin, "Learning to explore and exploit in POMDPs", In Proceedings of NIPS, 2009.
- S. Russell, "Learning agents for uncertain environments", In Proceedings of COLT, 1998.
- A. Y. Ng and S. Russell, "Algorithms for inverse reinforcement learning", In Proceedings of ICML, 2000.
- D. Ramachandran and E. Amir, "Bayesian inverse reinforcement learning", In Proceedings of IJCAI, 2007.
- J. Choi and K.-E. Kim, "Inverse reinforcement learning in partially observable environments", In Proceedings of IJCAI, 2009.
- E. A. Hansen, "Finite-memory control of partially observable systems", PhD thesis, University of Massachusetts at Amherst, 1998.