A Heuristic Search Algorithm for Solving Partially-Observable, Non-Deterministic Planning Problems

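  • 김현식 (Department of Computer Science, Kyonggi University)
  • 박찬영 (Department of Computer Science, Kyonggi University)
  • 김인철 (Department of Computer Science, Kyonggi University)
  • Published: 2009.10.15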

Abstract

In this paper, we present HSCP, a new heuristic search algorithm for conditional (contingent) planning problems that involve both nondeterministic actions and partial observations. The algorithm repeats AND-OR search trials until it finds a complete solution graph. Unlike existing heuristic AND-OR search algorithms such as $AO^*$ and $LAO^*$, each HSCP trial concentrates on a single candidate solution subgraph and tries to expand it into a complete solution graph. Moreover, unlike real-time dynamic programming algorithms such as RTDP and LRTDP, an HSCP trial returns a solution as soon as one is available, rather than delaying until the estimated value of every state converges. As a result, HSCP can find a sub-optimal conditional plan very efficiently.

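This paper introduces HSCP, a new heuristic search algorithm for solving conditional planning problems that involve both incomplete perception and nondeterministic actions. The HSCP search algorithm repeats AND-OR search trials until one complete solution graph is obtained. Unlike the existing heuristic AND-OR search algorithms $AO^*$ and $LAO^*$, an AND-OR search trial in HSCP concentrates on expanding only a single candidate solution graph. Also, unlike the real-time dynamic programming algorithms RTDP and LRTDP, it obtains a solution right away instead of deferring until the value estimates of all states converge. The HSCP search algorithm therefore has the advantage of finding good-quality conditional plans very efficiently.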
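The trial-based idea in the abstract can be illustrated with a small sketch. This is not the HSCP algorithm itself, whose details the abstract does not give. It is a minimal toy under stated assumptions: states are plain hashable values standing in for belief states, every action costs 1, an action's value is its worst-case outcome, and the names `and_or_trials`, `OUTCOMES`, and `actions` are hypothetical. Each trial greedily grows one candidate policy and returns it as soon as it forms a complete solution graph, without waiting for the value estimates to converge.

```python
import math

def and_or_trials(s0, actions, outcomes, is_goal, h, max_trials=100):
    """Repeat greedy AND-OR trials until one candidate policy is complete.

    Assumes outcomes(s, a) is a non-empty collection for every applicable a.
    """
    V = {}  # learned cost estimates; unseen states fall back to h

    def value(s):
        return 0.0 if is_goal(s) else V.get(s, h(s))

    def reaches_goal(policy):
        # Backward fixpoint: every policy state needs some path to a goal.
        ok, changed = set(), True
        while changed:
            changed = False
            for s, a in policy.items():
                if s not in ok and any(is_goal(t) or t in ok
                                       for t in outcomes(s, a)):
                    ok.add(s)
                    changed = True
        return len(ok) == len(policy)

    for _ in range(max_trials):
        policy, stack, complete = {}, [s0], True
        while stack:
            s = stack.pop()
            if is_goal(s) or s in policy:
                continue
            # OR node: pick the action with the best worst-case estimate.
            best_a, best_q = None, math.inf
            for a in actions(s):
                q = 1.0 + max(value(t) for t in outcomes(s, a))
                if q < best_q:
                    best_a, best_q = a, q
            V[s] = best_q  # dead ends get inf and are avoided in later trials
            if best_a is None:
                complete = False
                continue
            policy[s] = best_a
            # AND node: every possible outcome must be solved as well.
            stack.extend(outcomes(s, best_a))
        if complete and reaches_goal(policy):
            return policy  # stop at the first complete solution graph
    return None

# Hypothetical toy domain: "risky" may lead to a dead end, "safe" never does.
OUTCOMES = {
    ("s0", "risky"): {"dead", "goal"},
    ("s0", "safe"): {"s1", "s2"},
    ("s1", "go"): {"goal"},
    ("s2", "go"): {"goal", "s1"},
}

def actions(s):
    return [a for (state, a) in OUTCOMES if state == s]

plan = and_or_trials("s0", actions, lambda s, a: OUTCOMES[(s, a)],
                     lambda s: s == "goal", lambda s: 0.0)
print(plan)
```

In this toy domain the first trial commits to `risky`, hits the dead end, and records an infinite estimate for it; the second trial then selects `safe` and completes a conditional plan covering both `s1` and `s2`. The resulting plan is not guaranteed to be optimal, which matches the sub-optimal but efficient trade-off the abstract describes.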
