You Are Now >> Home >> Resaerch >> Artificial Intelligence >> A Heuristic Search Planner for First-Order MDPs


A Heuristic Search Planner for First-Order MDPs



 

Abstract

We present a heuristic search algorithm for solving first-order Markov Decision Processes (FOMDPs). Our approach combines first-order state abstraction that avoids evaluating states individually, and heuristic search that avoids evaluating all states. Firstly, in contrast to existing systems, which start with propositionalizing the FOMDP and then perform state abstraction on its propositionalized version we apply state abstraction directly on the FOMDP avoiding propositionalization. This kind of abstraction is referred to as first-order state abstraction. Secondly, guided by an admissible heuristic, the search is restricted to those states that are reachable from the initial state. We demonstrate the usefulness of the above techniques for solving FOMDPs with a system, referred to as FluCaP (formerly, FCPlanner), that entered the probabilistic track of the 2004 International Planning Competition (IPC’2004) and demonstrated an advantage over other planners on the problems represented in first-order terms.

Introduction

Markov decision processes (MDPs) have been adopted as a representational and computational model for decision-theoretic planning problems in much recent work, e.g., by Barto, Bradtke, and Singh (1995). The basic solution techniques for MDPs rely on the dynamic programming (DP) principle (Boutilier, Dean, & Hanks, 1999). Unfortunately, classical dynamic programming algorithms require explicit enumeration of the state space that grows exponentially with the number of variables relevant to the planning domain. Therefore, these algorithms do not scale up to complex AI planning problems.

However, several methods that avoid explicit state enumeration have been developed recently. One technique, referred to as state abstraction, exploits the structure of the factored MDP representation to solve problems efficiently, circumventing explicit state space enumeration (Boutilier et al., 1999). Another technique, referred to as heuristic search, restricts the computation to states that are reachable from the initial state, e.g., RTDP by Barto et al. (1995), envelope DP by Dean, Kaelbling, Kirman, and Nicholson (1995) and LAO by Feng and Hansen (2002). One existing approach that combines both these techniques is the symbolic LAO algorithm by Feng and Hansen (2002) which performs heuristic search symbolically for factored MDPs. It exploits state abstraction, i.e., manipulates sets of states instead of individual states. More precisely, following the SPUDD approach by Hoey, St-Aubin, Hu, and Boutilier (1999), all MDP components, value functions, policies, and admissible heuristic functions are compactly represented using algebraic decision diagrams (ADDs). This allows computations of the LAO algorithm to be performed efficiently using ADDs.

Following ideas of symbolic LAO

, given an initial state, we use an admissible heuristic to restrict search only to those states that are reachable from the initial state. Moreover, we exploit state abstraction in order to avoid evaluating states individually. Thus, our work is very much in the spirit of symbolic LAO but extends it in an important way. Whereas the symbolic LAO algorithm starts with propositionalization of the FOMDP, and only after that performs state abstraction on its propositionalized version by means of propositional ADDs, we apply state abstraction directly on the structure of the FOMDP, avoiding propositionalization. This kind of abstraction is referred to as first-order state abstractio

 


Sponsored link:

Read Full Research
Research Person : Steffen H¨olldobler, Eldar Karabaev, Olga Skvortsova
Contact Person : Steffen H¨olldobler
sh@iccl.tu-dresden.de

Eldar Karabaev
eldar@iccl.tu-dresden.de

Olga Skvortsova
skvortsova@iccl.tu-dresden.de

International Center for Computational Logic Technische Universit¨at Dresden, Dresden, Germany
Year : 2006

Category: Artificial Intelligence
Related Research:
Related Product:
Related News:
Related Dictionay: