Bridging the Reasoning Gap: Small LLMs Can Plan with Generalised Strategies

Andrey Borro; Patricia J Riddle; Michael W Barley; Michael J Witbrock

TL;DR

The paper addresses the high cost and limited accessibility of reasoning-capable large language models by proposing two complementary methods for strengthening weaker models: (i) injecting a generalised problem-solving strategy generated by a stronger LLM, and (ii) applying iterative error correction to proposed solutions. On BlocksWorld planning and a Type 3 CRT math reasoning task, these methods let weaker models match or approach the performance of stronger baselines at a fraction of the cost, and including a strategy notably reduces the number of reasoning tokens used. Generalised strategies and error correction substantially improve success rates, and handwritten strategies can yield near-optimal performance, which indicates that LLM-generated strategies still fall short of human-written guidance. The work has practical implications for making sophisticated reasoning more affordable and accessible, and points to future work on extending the approach to more domains and integrating it with hierarchical planning for larger tasks.

Abstract

Recent advancements in the reasoning skills of Large Language Models (LLMs) demonstrate an increase in the ability of LLMs to solve simple planning tasks. However, as long as the driving force behind improved reasoning capability is the size and complexity of the model, the financial and computational costs associated with running them will also increase. This trend raises questions about continued accessibility and whether these improvements will increase at the same pace as models continue to grow in size and expense. We propose two approaches to enhance the reasoning ability of less resource-intensive LLMs. (1) Provide them with a generalised strategy for solving tasks within a given domain, generated by a more resource-intensive LLM. (2) Exploit their cost-effectiveness by iteratively prompting these models to correct errors in their proposed solutions. Our empirical results from planning and mathematical reasoning tasks demonstrate that these methods improve the performance of less resource-intensive LLMs to levels comparable with their more resource-intensive counterparts, at a fraction of the cost. Additionally, we show that the utilisation of generalised strategies in our experiments reduced the cost of the less resource-intensive model by nearly 30 percent on average.
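The two approaches can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified BlocksWorld in which a plan is a list of (block, destination) moves, an external validator that returns an error message, and a hypothetical `model` callable that maps a prompt to a plan. The names `validate`, `solve_with_feedback`, and `toy_model` are invented for illustration, and `toy_model` stands in for a real LLM call plus output parsing.

```python
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, str]   # block -> what it rests on ("table" or another block)
Move = Tuple[str, str]   # (block to move, destination)


def validate(state: State, goal: State, plan: List[Move]) -> Optional[str]:
    """Simulate the plan; return None on success, else an error message."""
    on = dict(state)
    for i, (block, dest) in enumerate(plan, 1):
        if block not in on:
            return f"step {i}: unknown block {block}"
        if block in on.values():  # something is stacked on top of it
            return f"step {i}: {block} is not clear"
        if dest != "table" and (dest == block or dest not in on or dest in on.values()):
            return f"step {i}: cannot place {block} on {dest}"
        on[block] = dest
    if on != goal:
        return "plan ends in a state that does not match the goal"
    return None


def solve_with_feedback(
    model: Callable[[str], List[Move]],  # hypothetical stand-in for the cheaper LLM
    strategy: str,                       # generalised strategy from a stronger LLM
    state: State,
    goal: State,
    max_rounds: int = 3,
) -> Tuple[Optional[List[Move]], int]:
    """Approach (1): prepend the strategy. Approach (2): re-prompt with the error."""
    prompt = f"Strategy:\n{strategy}\n\nInitial: {state}\nGoal: {goal}\nGive a plan."
    for attempt in range(1, max_rounds + 1):
        plan = model(prompt)
        error = validate(state, goal, plan)
        if error is None:
            return plan, attempt
        # Assumption: the validator's message is fed back verbatim.
        prompt += f"\n\nYour previous plan {plan} failed: {error}. Please correct it."
    return None, max_rounds


def toy_model(prompt: str) -> List[Move]:
    """Toy model: wrong on the first try, correct once it sees error feedback."""
    if "failed" in prompt:
        return [("B", "C"), ("A", "B")]
    return [("A", "B")]


if __name__ == "__main__":
    start = {"A": "table", "B": "A", "C": "table"}
    target = {"A": "B", "B": "C", "C": "table"}
    print(solve_with_feedback(toy_model, "Clear blocks before moving them.", start, target))
```

With the toy model above, the first plan is rejected because block A still has B on top of it, and the second attempt succeeds, so the call returns the corrected plan together with an attempt count of 2. Capping the number of rounds keeps the cost of repeated calls to the cheaper model bounded.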
