A Training Data Recipe to Accelerate A* Search with Language Models

Devaansh Gupta; Boyang Li

TL;DR

This work empirically disentangles the requirements of the A* search algorithm from the requirements of the LLM to generalise on this task, and finds that they overlap: A* requires more accurate predictions on search nodes near the goal, and LLMs need the same set of nodes for effective generalisation.

Abstract

Combining Large Language Models (LLMs) with heuristic search algorithms like A* holds the promise of enhanced LLM reasoning and scalable inference. To accelerate training and reduce computational demands, we investigate the coreset selection problem for the training data of LLM heuristic learning. Few methods for learning heuristic functions consider the interaction between the search algorithm and the machine learning model. In this work, we empirically disentangle the requirements of the A* search algorithm from the requirements of the LLM to generalise on this task. Surprisingly, we find that these requirements overlap: A* requires more accurate predictions on search nodes near the goal, and LLMs need the same set of nodes for effective generalisation. With these insights, we derive a data-selection distribution for learning LLM-based heuristics. On three classical planning domains (maze navigation, Sokoban and sliding tile puzzles), our technique reduces the number of iterations required to find the solutions by up to 15x, with a wall-clock speed-up of search of up to 5x. The codebase is at https://github.com/devaansh100/a_star.
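
To make the role of the heuristic concrete, the following is a minimal sketch of A* on a small grid maze with a pluggable heuristic and an iteration counter. It is not the authors' implementation: the names `a_star`, `zero_heuristic`, `manhattan` and `GRID` are hypothetical, the Manhattan distance merely stands in for a learned LLM heuristic, and breaking ties in favour of nodes with a lower heuristic value is an assumption made for illustration.

```python
import heapq
from typing import Callable, List, Optional, Tuple

Node = Tuple[int, int]
Heuristic = Callable[[Node, Node], float]

# Hypothetical 5x5 maze: '.' is free space, '#' is a wall.
GRID: List[str] = [
    ".....",
    ".###.",
    ".....",
    ".###.",
    ".....",
]


def zero_heuristic(node: Node, goal: Node) -> float:
    """Uninformed baseline: A* degenerates to uniform-cost search."""
    return 0.0


def manhattan(node: Node, goal: Node) -> float:
    """Stand-in for a learned heuristic (assumption, not the paper's model)."""
    return abs(node[0] - goal[0]) + abs(node[1] - goal[1])


def a_star(grid: List[str], start: Node, goal: Node,
           heuristic: Heuristic) -> Tuple[Optional[int], int]:
    """Return (solution cost, number of node expansions)."""
    h0 = heuristic(start, goal)
    # Heap entries are (f, h, g, node); ties on f prefer lower h.
    open_heap = [(h0, h0, 0, start)]
    best_g = {start: 0}
    iterations = 0
    while open_heap:
        _, _, g, node = heapq.heappop(open_heap)
        if g > best_g[node]:
            continue  # stale entry superseded by a cheaper path
        iterations += 1
        if node == goal:
            return g, iterations
        row, col = node
        for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (row + d_row, col + d_col)
            in_bounds = 0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])
            if not in_bounds or grid[nxt[0]][nxt[1]] == "#":
                continue
            new_g = g + 1
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                h = heuristic(nxt, goal)
                heapq.heappush(open_heap, (new_g + h, h, new_g, nxt))
    return None, iterations


if __name__ == "__main__":
    for name, fn in (("zero", zero_heuristic), ("manhattan", manhattan)):
        cost, iters = a_star(GRID, (0, 0), (4, 4), fn)
        print(f"{name}: cost={cost}, iterations={iters}")
```

Both heuristics find a solution of the same cost on this maze, but the informed one expands fewer nodes. The iteration count returned here corresponds to the quantity the abstract reports reductions in, with the learned heuristic taking the place of `manhattan`.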

Paper Structure (51 sections, 7 equations, 5 figures, 11 tables, 2 algorithms)

Introduction
Related Works
Learning Heuristics for Planning
Machine Learning Techniques
Search-aware Techniques
Large Language Models in Search
Tree Creation by LLMs
LLMs with External Planners
Improving LLM-based Heuristics
Optimising Training Data
Coreset Selection
Preliminaries
A* Search
Selection
Expansion
...and 36 more sections

Figures (5)

Figure 1: Validation MAE of models trained on the Initial, Middle, End, and All splits, and their corresponding exclusion sets. A lower value shows better generalisation.
Figure 2: Puzzle representation and legend of a training puzzle from Sokoban.
Figure 3: Puzzle representation and legend of a training puzzle from the maze dataset.
Figure 4: Puzzle representation and legend of a training puzzle from the sliding tile puzzle (STP) dataset.
Figure 5: Prompt used while training the language model. {curly braces} denote a placeholder.
