ToolPlanner: A Tool Augmented LLM for Multi Granularity Instructions with Path Planning and Feedback

Qinzhuo Wu; Wei Liu; Jian Luan; Bin Wang

TL;DR

This work constructs MGToolBench, a training dataset containing statement- and category-level instructions that better reflect real-world scenarios, and proposes ToolPlanner, a two-stage reinforcement learning framework that uses path planning and two feedback mechanisms to improve the LLM's task completion and instruction-following capabilities.

Abstract

Recently, tool-augmented LLMs have gained increasing attention. Given an instruction, tool-augmented LLMs can interact with various external tools in multiple rounds and provide a final answer. However, previous LLMs were trained on overly detailed instructions, which included API names or parameters, while real users would not explicitly mention these API details. This leads to a gap between trained LLMs and real-world scenarios. In addition, most works ignore whether the interaction process follows the instruction. To address these issues, we constructed a training dataset called MGToolBench, which contains statement- and category-level instructions to better reflect real-world scenarios. In addition, we propose ToolPlanner, a two-stage reinforcement learning framework that utilizes path planning and two feedback mechanisms to enhance the LLM's task completion and instruction-following capabilities. Experimental results show that ToolPlanner significantly improves the Match Rate, Pass Rate, and Win Rate by 26.8%, 20.2%, and 5.6%, respectively, compared to the SOTA model. Human evaluation verifies that the multi-granularity instructions can better align with users' usage habits. Our data and code will be released upon acceptance.

Paper Structure (46 sections, 5 equations, 9 figures, 27 tables)

Introduction
Related Work
Dataset Construction
Multi-Granularity Instruction Mechanism
MGToolBench Dataset
Models
Problem Definition
Stage 1: Supervised Finetuning
Tag Extraction
Solution Path Planning
Solution Tree Generation
Stage 2: Reinforcement Learning
Task Completion and Instruction Following Metrics
Pairing pairwise responses
Training
...and 31 more sections

Figures (9)

Figure 1: Several instructions and their granularity levels from real users, ToolBench, and MGToolBench. Real users tend to provide instructions at a higher level, such as Statement or Category, while ToolBench often consists of more detailed instructions at the API level.
Figure 2: Descriptions and examples of instructions at different granularity levels.
Figure 3: MGToolBench Dataset Pipeline.
Figure 4: (Top) The overview of our proposed ToolPlanner. (Bottom Left): An external tool pool with 6 candidate APIs. (Bottom Right): Results of 7 candidate solutions on our metrics.
Figure 5: Two solution trees and their pairwise responses for a tool-level instruction.
...and 4 more figures
