A Heuristic Search Planner for First-Order MDPs
Abstract
We present a heuristic search algorithm for solving first-order Markov Decision Processes (FOMDPs). Our approach combines first-order state abstraction that avoids evaluating states individually, and heuristic search that avoids evaluating all states. Firstly, in contrast to existing systems, which start with propositionalizing the FOMDP and then perform state abstraction on its propositionalized version we apply state abstraction directly on the FOMDP avoiding propositionalization. This kind of abstraction is referred to as first-order state abstraction. Secondly, guided by an admissible heuristic, the search is restricted to those states that are reachable from the initial state. We demonstrate the usefulness of the above techniques for solving FOMDPs with a system, referred to as FluCaP (formerly, FCPlanner), that entered the probabilistic track of the 2004 International Planning Competition (IPC’2004) and demonstrated an advantage over other planners on the problems represented in first-order terms.
Introduction
Markov decision processes (MDPs) have been adopted as a representational and computational model for decision-theoretic planning problems in much recent work, e.g., by Barto, Bradtke, and Singh (1995). The basic solution techniques for MDPs rely on the dynamic programming (DP) principle (Boutilier, Dean, & Hanks, 1999). Unfortunately, classical dynamic programming algorithms require explicit enumeration of the state space that grows exponentially with the number of variables relevant to the planning domain. Therefore, these algorithms do not scale up to complex AI planning problems.
However, several methods that avoid explicit state enumeration have been developed recently. One technique, referred to as state abstraction, exploits the structure of the factored MDP representation to solve problems efficiently, circumventing explicit state space enumeration (Boutilier et al., 1999). Another technique, referred to as heuristic search, restricts the computation to states that are reachable from the initial state, e.g., RTDP by Barto et al. (1995), envelope DP by Dean, Kaelbling, Kirman, and Nicholson (1995) and LAO by Feng and Hansen (2002). One existing approach that combines both these techniques is the symbolic LAO algorithm by Feng and Hansen (2002) which performs heuristic search symbolically for factored MDPs. It exploits state abstraction, i.e., manipulates sets of states instead of individual states. More precisely, following the SPUDD approach by Hoey, St-Aubin, Hu, and Boutilier (1999), all MDP components, value functions, policies, and admissible heuristic functions are compactly represented using algebraic decision diagrams (ADDs). This allows computations of the LAO algorithm to be performed efficiently using ADDs.
Following ideas of symbolic LAO
, given an initial state, we use an admissible heuristic to restrict search only to those states that are reachable from the initial state. Moreover, we exploit state abstraction in order to avoid evaluating states individually. Thus, our work is very much in the spirit of symbolic LAO but extends it in an important way. Whereas the symbolic LAO algorithm starts with propositionalization of the FOMDP, and only after that performs state abstraction on its propositionalized version by means of propositional ADDs, we apply state abstraction directly on the structure of the FOMDP, avoiding propositionalization. This kind of abstraction is referred to as first-order state abstractio
Sponsored link:
sh@iccl.tu-dresden.de
Eldar Karabaev
eldar@iccl.tu-dresden.de
Olga Skvortsova
skvortsova@iccl.tu-dresden.de
International Center for Computational Logic Technische Universit¨at Dresden, Dresden, Germany
Category: Artificial Intelligence
- 8-Valent Fuzzy Logic for Iris Recognition and Biometry
- A Visual Entity-Relationship Model for Constraint-Based
- Approximate Policy Iteration with a Policy Language Bias: Solving Relational Markov Decision Processe
- Asynchronous Partial Overlay: A New Algorithm for Solving Distributed Constraint Satisfaction Problems
- Breaking Instance-Independent Symmetries in Exact Graph Coloring

Artificial Intelligence