Learning Admissible Heuristics for A*: Theory and Practice

Ehsan Futuhi, Nathan R. Sturtevant

TL;DR

This paper poses heuristic learning as a constrained optimization problem and introduces Cross-Entropy Admissibility (CEA), a loss function that enforces admissibility during training. It also provides the first generalization guarantees for goal-dependent heuristics.

Abstract

Heuristic functions are central to the performance of search algorithms such as A*, where admissibility (the property of never overestimating the true shortest-path cost) guarantees solution optimality. Recent deep learning approaches often disregard admissibility and provide limited guarantees on generalization beyond the training data. This paper addresses both of these limitations. First, we pose heuristic learning as a constrained optimization problem and introduce Cross-Entropy Admissibility (CEA), a loss function that enforces admissibility during training. On the Rubik's Cube domain, this method yields near-admissible heuristics with significantly stronger guidance than compressed pattern database (PDB) heuristics. Theoretically, we study the sample complexity of learning heuristics. By leveraging PDB abstractions and the structural properties of graphs such as the Rubik's Cube, we tighten the bound on the number of training samples needed for A* to generalize. Replacing a general hypothesis class with a ReLU neural network gives bounds that depend primarily on the network's width and depth, rather than on graph size. Using the same network, we also provide the first generalization guarantees for goal-dependent heuristics.
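
To make the admissibility property concrete, the following is a minimal sketch on a hypothetical four-state graph. It is not the paper's CEA loss or its Rubik's Cube setup. The graph, the heuristic tables `H_GOOD` and `H_BAD`, and the function names are all assumptions chosen for illustration. The sketch computes the true costs-to-goal, checks whether a heuristic ever overestimates them, and runs a simple A* that stops at the first goal expansion.

```python
import heapq

# Hypothetical undirected toy graph: state -> list of (neighbor, edge cost).
GRAPH = {
    "A": [("B", 1), ("C", 2)],
    "B": [("A", 1), ("D", 3)],
    "C": [("A", 2), ("D", 1)],
    "D": [("B", 3), ("C", 1)],
}

def true_costs(graph, goal):
    """Dijkstra from the goal: true shortest-path cost to the goal for every state."""
    dist = {goal: 0}
    frontier = [(0, goal)]
    while frontier:
        d, s = heapq.heappop(frontier)
        if d > dist[s]:
            continue  # stale queue entry
        for t, w in graph[s]:
            nd = d + w
            if nd < dist.get(t, float("inf")):
                dist[t] = nd
                heapq.heappush(frontier, (nd, t))
    return dist

def is_admissible(h, h_star):
    """A heuristic is admissible if it never overestimates the true cost."""
    return all(h[s] <= h_star[s] for s in h_star)

def a_star(graph, start, goal, h):
    """A* with re-opening; returns the cost of the first path found to the goal."""
    g = {start: 0}
    frontier = [(h[start], start)]
    while frontier:
        f, s = heapq.heappop(frontier)
        if s == goal:
            return g[s]
        if f > g[s] + h[s]:
            continue  # stale queue entry
        for t, w in graph[s]:
            ng = g[s] + w
            if ng < g.get(t, float("inf")):
                g[t] = ng
                heapq.heappush(frontier, (ng + h[t], t))
    return None

# Illustrative heuristic tables (assumptions, not learned values).
H_GOOD = {"A": 2, "B": 1, "C": 1, "D": 0}   # never overestimates
H_BAD = {"A": 0, "B": 0, "C": 10, "D": 0}   # overestimates the cost from C

H_STAR = true_costs(GRAPH, "D")
COST_GOOD = a_star(GRAPH, "A", "D", H_GOOD)  # optimal cost
COST_BAD = a_star(GRAPH, "A", "D", H_BAD)    # may be suboptimal
```

On this toy graph the overestimate at `C` steers the search away from the cheaper route, so the inadmissible heuristic returns a more expensive path than the admissible one.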

Paper Structure

This paper contains 38 sections, 17 theorems, 46 equations, 9 figures, 5 tables, 1 algorithm.

Key Result

Proposition 1

Let $H > 0$, $\mathcal{H} \subseteq [0, H]^{\mathcal{Y}}$, and $\mathcal{D}$ be a probability distribution over $\mathcal{Y}$. Suppose we draw $\{y_1,\dots,y_N\} \sim \mathcal{D}^N$ i.i.d. Then, with probability at least $1-\delta$ over this random draw, the following holds for all $h \in \mathcal{H}$ ...

Figures (9)

  • Figure 1: A solved $3 \times 3$ Rubik’s Cube with each group of cubies shown separately: (a) center cubies, (b) corner cubies, and (c) edge cubies.
  • Figure 2: Generalization error vs. training size for the 8-corner PDB.
  • Figure 3: Comparison between the full state-space graph of the $3 \times 3$ Rubik’s Cube and the abstraction generated using only the eight corner cubies.
  • Figure 4: Neural Network structure.
  • Figure 5: The $(\eta,\beta)$ values used throughout the training for each PDB.
  • ...and 4 more figures

Theorems & Definitions (31)

  • Definition 1
  • Proposition 1
  • Lemma 1
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Definition 2: Neural networks [chengsample]
  • Theorem 5
  • Definition 3
  • ...and 21 more