LLM With Tools: A Survey

Zhuocheng Shen

TL;DR

The paper surveys how large language models can be augmented with external tools, proposing a standardized paradigm that maps user instructions to intents, plans, execution, feedback, and adaptation. It reviews a broad research landscape—from fine-tuning with a purpose-built C* dataset and diverse data augmentation to in-context learning with and without retrieval and even LLM-driven tool creation—highlighting methods to improve tool use, reduce hallucinations, and expand capabilities. It delves into core challenges (timing, tool selection, reasoning robustness, efficiency, and generalization) and outlines future directions in scheduling, real-time optimization, continual learning, and pretraining tool-enabled LLMs. The inclusion of an experimental replication of Chameleon on ScienceQA demonstrates practical evaluation of tool-use frameworks and provides architectural insights into the Chameleon pipeline.

Abstract

The integration of tools in augmenting large language models presents a novel approach toward enhancing the efficiency and accuracy of these models in handling specific, complex tasks. This paper delves into the methodology, challenges, and developments in the realm of teaching LLMs to use external tools, thereby pushing the boundaries of their capabilities beyond pre-existing knowledge bases. We introduce a standardized paradigm for tool integration guided by a series of functions that map user instructions to actionable plans and their execution, emphasizing the significance of understanding user intent, tool selection, and dynamic plan adjustment. Our exploration reveals the various challenges encountered, such as tool invocation timing, selection accuracy, and the need for robust reasoning processes. In addressing these challenges, we investigate techniques within the context of fine-tuning and in-context learning paradigms, highlighting innovative approaches to ensure diversity, augment datasets, and improve generalization. Furthermore, we investigate a perspective on enabling LLMs to not only utilize but also autonomously create tools, which may redefine their role from mere tool users to tool creators. Finally, we reproduced Chameleon's results on ScienceQA and analyzed the code structure.
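
The paradigm described in the abstract (instruction to intent, intent to plan, plan to execution, then feedback and plan adjustment) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's implementation: the function names (infer_intent, make_plan, execute, adapt, run), the two toy tools, and the rule-based stand-ins for what would be LLM calls in a real system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry; the tool names and behaviors are illustrative only.
TOOLS: Dict[str, Callable[[str], str]] = {
    # Adds integers separated by "+"; raises ValueError on non-numeric input.
    "calculator": lambda expr: str(sum(int(t) for t in expr.split("+"))),
    # Returns the input unchanged; used as a fallback.
    "echo": lambda text: text,
}


@dataclass
class Step:
    tool: str       # name of the tool to invoke
    argument: str   # input passed to the tool


def infer_intent(instruction: str) -> str:
    """Map an instruction to a coarse intent (rule-based stand-in for an LLM)."""
    return "arithmetic" if "+" in instruction else "chat"


def make_plan(intent: str, instruction: str) -> List[Step]:
    """Select a tool for the intent and build a one-step plan."""
    tool = "calculator" if intent == "arithmetic" else "echo"
    return [Step(tool, instruction)]


def execute(step: Step) -> Tuple[bool, str]:
    """Run one step and return (success flag, result or error feedback)."""
    try:
        return True, TOOLS[step.tool](step.argument)
    except (KeyError, ValueError) as err:
        return False, f"{type(err).__name__}: {err}"


def adapt(step: Step, feedback: str) -> Step:
    """Adjust a failed step; this sketch always falls back to the echo tool."""
    return Step("echo", step.argument)


def run(instruction: str, max_retries: int = 1) -> str:
    """Intent -> plan -> execution, with feedback-driven adjustment on failure."""
    plan = make_plan(infer_intent(instruction), instruction)
    outputs = []
    for step in plan:
        ok, result = execute(step)
        retries = 0
        while not ok and retries < max_retries:
            step = adapt(step, result)
            ok, result = execute(step)
            retries += 1
        outputs.append(result)
    return " ".join(outputs)


print(run("2 + 3"))        # calculator succeeds
print(run("two + three"))  # calculator fails, plan adapts to echo
```

In this sketch, tool selection happens in make_plan and dynamic plan adjustment happens in adapt, which receives the execution feedback; a real system would replace both with model-driven decisions.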

Paper Structure

This paper contains 31 sections, 2 equations, 7 figures, 2 tables, 2 algorithms.

Figures (7)

  • Figure 1: Whole process of LLM using tools.
  • Figure 2: Diversity of w/o and w/ image content.
  • Figure 3: Example of MultiTool-CoT prompt.
  • Figure 4: Process of online planning.
  • Figure 5: Chameleon's state transition diagram.
  • ...and 2 more figures