Confucius: Iterative Tool Learning from Introspection Feedback by Easy-to-Difficult Curriculum

Shen Gao, Zhengliang Shi, Minghang Zhu, Bowen Fang, Xin Xin, Pengjie Ren, Zhumin Chen, Jun Ma, Zhaochun Ren

TL;DR

The paper tackles the challenge of enabling LLMs to selectively and effectively use a large, real-world toolset. It introduces Confucius, a two-pronged framework combining a multi-stage easy-to-difficult curriculum with Iterative Self-instruct from Introspective Feedback (ISIF) to dynamically refine training data based on model introspection. Experimental results show Confucius outperforms both tuning-free and tuning-based baselines on seen and unseen toolsets, with solid generalization to different base models. The work advances practical, scalable tool-use in real-world applications by improving tool selection, handling complex tools, and maintaining performance as tool catalogs expand.

Abstract

Augmenting large language models (LLMs) with external tools has emerged as a promising approach to extending the capability of LLMs. Although some works employ open-source LLMs for the tool learning task, most of them are trained in a controlled environment in which LLMs only learn to execute human-provided tools. However, selecting proper tools from a large toolset is also a crucial ability for a tool learning model to be applied in real-world applications. Existing methods usually employ self-instruction methods directly to train the model, which ignores differences in tool complexity. In this paper, we propose Confucius, a novel tool learning framework that trains LLMs to use complicated tools in real-world scenarios. It contains two main phases: (1) we first propose a multi-stage learning method that teaches the LLM to use various tools through an easy-to-difficult curriculum; (2) we then propose Iterative Self-instruct from Introspective Feedback (ISIF), which dynamically constructs the dataset to improve the model's ability to use complicated tools. Extensive experiments in both controlled and real-world settings demonstrate the superiority of our tool learning framework in real-world application scenarios over both tuning-free (e.g., ChatGPT, Claude) and tuning-based (e.g., GPT4Tools) baselines.
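
The two phases named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the difficulty scores, the function names `split_into_stages` and `select_tools_for_augmentation`, and the rule "pick the highest-scoring tools" are all assumptions made for illustration, since this section does not specify how difficulty or introspective feedback is measured.

```python
from typing import Dict, List, Sequence


def split_into_stages(
    items: Sequence[str], difficulty: Dict[str, float], num_stages: int
) -> List[List[str]]:
    """Order items from easy to hard and cut them into up to num_stages chunks.

    `difficulty` is a hypothetical per-item score (lower means easier).
    """
    if not items or num_stages < 1:
        return []
    ordered = sorted(items, key=lambda name: difficulty[name])
    size = -(-len(ordered) // num_stages)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def select_tools_for_augmentation(
    feedback: Dict[str, float], k: int
) -> List[str]:
    """Pick the k tools with the highest feedback score.

    `feedback` is a hypothetical signal from the current model (for example,
    its loss on held-out instances per tool); higher means the model
    struggles more, so those tools get new training instances first.
    """
    return sorted(feedback, key=lambda name: feedback[name], reverse=True)[:k]


# Assumed toy data: tool names and scores are made up for the example.
tools = ["calculator", "weather", "sql", "browser"]
difficulty = {"calculator": 0.1, "weather": 0.3, "sql": 0.7, "browser": 0.9}
stages = split_into_stages(tools, difficulty, num_stages=2)

feedback = {"calculator": 1.2, "sql": 3.5, "browser": 2.8}
hard_tools = select_tools_for_augmentation(feedback, k=2)
```

In a full training loop, each stage would be used for fine-tuning in order, and after each epoch the feedback scores would be recomputed so that the next round of generated instances targets whichever tools the updated model still handles poorly. The random-sampling variant mentioned in the Figure 3 caption would correspond to replacing the feedback-based selection with a uniform draw over tools.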

Paper Structure

This paper contains 30 sections, 6 equations, 6 figures, and 10 tables.

Figures (6)

  • Figure 1: Comparison between the existing tuning-based tool learning methods and our Confucius. Instead of using a pre-constructed dataset, we propose an iterative data construction framework with multi-stage learning to train the tool-use model effectively.
  • Figure 2: The overall architecture of our framework, which consists of multi-stage learning and iterative self-instruct from introspective feedback. $\mathcal{M}^{i}$ denotes the target model trained at the $i$-th epoch, and $\mathcal{M}^{i+1}$ denotes the target model trained at the $(i+1)$-th epoch.
  • Figure 3: Comparison between Confucius with ISIF and a variant model that randomly samples tools to generate new instances without introspective feedback.
  • Figure 4: The qualitative analysis for update percentage.
  • Figure 5: The input and output length distribution of our initial dataset.
  • ...and 1 more figures