Can LLMs plan paths with extra hints from solvers?

Erik Wu, Sayan Mitra

TL;DR

The results suggest that solver-generated feedback improves the LLMs' ability to solve moderately difficult problems, but harder problems remain out of reach.

Abstract

Large Language Models (LLMs) have shown remarkable capabilities in natural language processing, mathematical problem solving, and tasks related to program synthesis. However, their effectiveness in long-term planning and higher-order reasoning has been noted to be limited and fragile. This paper explores an approach for enhancing LLM performance in solving a classical robotic planning task by integrating solver-generated feedback. We explore four different strategies for providing feedback, including visual feedback; we also apply fine-tuning and evaluate the performance of three different LLMs across 10 standard and 100 additional randomly generated planning problems. Our results suggest that solver-generated feedback improves the LLMs' ability to solve moderately difficult problems, but harder problems remain out of reach. The study provides a detailed analysis of the effects of the different hinting strategies and the different planning tendencies of the evaluated LLMs.

Paper Structure (17 sections, 6 figures, 4 tables)

Figures (6)

  • Figure 1: 2D path planning problems with initial set (blue), goal (green), and obstacles (red).
  • Figure 2: Examples of different LLM solutions using collision hints are shown. Gemini and Claude failed to solve the Canyon problem with collision hints after 20 feedback iterations (every 4th path shown). GPT-4o successfully solved the problem with collision hints in 6 iterations (all paths shown). In both cases, darker paths represent higher iteration numbers, with the final path depicted in black.
  • Figure 3: Flowchart of prompt generation, LLM, and solver feedback loop used in our analysis.
  • Figure 4: Left: the blue points are free space hints, marking points that a correct path can pass through; they are generated by vertically slicing the space. Right: GPT-4o uses the free space hints to find a solution.
  • Figure 5: Examples of randomly generated 2D path planning problems.
  • ...and 1 more figures
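
The Figure 4 caption describes free space hints as points generated by vertically slicing the space. The following is a minimal sketch of one way such hints could be computed, not the authors' implementation. It assumes axis-aligned rectangular obstacles given as (xmin, ymin, xmax, ymax) and evenly spaced slices, and it returns the midpoint of each obstacle-free interval on a slice. The names `free_intervals` and `free_space_hints` and the data layout are hypothetical.

```python
from typing import List, Tuple

# Hypothetical formats (assumptions, not from the paper):
# a rectangle is (xmin, ymin, xmax, ymax); a hint is an (x, y) point.
Rect = Tuple[float, float, float, float]
Point = Tuple[float, float]


def free_intervals(x: float, obstacles: List[Rect],
                   ymin: float, ymax: float) -> List[Tuple[float, float]]:
    """Return the obstacle-free y-intervals on the vertical line at x."""
    # Collect the y-ranges blocked at this x, clipped to the workspace.
    blocked = sorted(
        (max(oy0, ymin), min(oy1, ymax))
        for ox0, oy0, ox1, oy1 in obstacles
        if ox0 <= x <= ox1 and oy0 < ymax and oy1 > ymin
    )
    free = []
    cursor = ymin  # lowest y not yet known to be blocked
    for lo, hi in blocked:
        if lo > cursor:
            free.append((cursor, lo))
        cursor = max(cursor, hi)
    if cursor < ymax:
        free.append((cursor, ymax))
    return free


def free_space_hints(obstacles: List[Rect], bounds: Rect,
                     n_slices: int) -> List[Point]:
    """Slice the workspace vertically and return one hint per free interval."""
    xmin, ymin, xmax, ymax = bounds
    hints = []
    for i in range(1, n_slices + 1):
        # Evenly spaced interior slice positions (an assumption).
        x = xmin + i * (xmax - xmin) / (n_slices + 1)
        for lo, hi in free_intervals(x, obstacles, ymin, ymax):
            hints.append((x, (lo + hi) / 2))
    return hints


if __name__ == "__main__":
    wall = [(1.5, 0.0, 2.5, 3.0)]  # one obstacle rising from the floor
    print(free_space_hints(wall, (0.0, 0.0, 4.0, 4.0), n_slices=3))
```

In a feedback loop like the one in Figure 3, points of this kind could be serialized into the prompt as candidate waypoints. How the paper actually formats or selects its hints is not stated in this excerpt.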