Alignment for Efficient Tool Calling of Large Language Models

Hongshen Xu, Zihan Wang, Zichen Zhu, Lei Pan, Xingyu Chen, Lu Chen, Kai Yu

TL;DR

This work addresses the question of when LLMs should call external tools by modeling knowledge boundaries as a probabilistic, uncertain region rather than a binary known/unknown state. It introduces a multi-objective alignment framework that balances helpfulness with tool cost, supported by two knowledge-boundary estimation methods (consistency-based and absolute) and two training strategies (implicit and explicit modeling). Empirical results across calculator, retrieval-based QA, and complex reasoning tasks show substantial reductions in unnecessary tool usage while maintaining or improving accuracy, with explicit modeling offering flexible inference-time control. The framework advances practical tool intelligence by enabling dynamic, cost-aware tool invocation suitable for real-world deployment.

Abstract

Recent advancements in tool learning have enabled large language models (LLMs) to integrate external tools, enhancing their task performance by expanding their knowledge boundaries. However, relying on tools often introduces tradeoffs between performance, speed, and cost, with LLMs sometimes exhibiting overreliance and overconfidence in tool usage. This paper addresses the challenge of aligning LLMs with their knowledge boundaries to make more intelligent decisions about tool invocation. We propose a multi-objective alignment framework that combines probabilistic knowledge boundary estimation with dynamic decision-making, allowing LLMs to better assess when to invoke tools based on their confidence. Our framework includes two methods for knowledge boundary estimation, consistency-based and absolute estimation, and two training strategies for integrating these estimates into the model's decision-making process. Experimental results on various tool invocation scenarios demonstrate the effectiveness of our framework, showing significant improvements in tool efficiency by reducing unnecessary tool usage.
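To make the idea of consistency-based knowledge boundary estimation concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes that confidence is approximated by the share of sampled answers that agree with the most frequent answer, and that a tool is invoked when this share falls below a threshold. The function names, the string normalization, and the default threshold of 0.7 are all hypothetical; the abstract does not specify them.

```python
from collections import Counter
from typing import List


def consistency_confidence(answers: List[str]) -> float:
    """Return the fraction of sampled answers matching the most common one.

    Assumption: agreement among repeated samples is used as a proxy for how
    well the question sits inside the model's knowledge boundary.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    # Hypothetical normalization: ignore case and surrounding whitespace.
    normalized = [a.strip().lower() for a in answers]
    _, top_count = Counter(normalized).most_common(1)[0]
    return top_count / len(normalized)


def should_call_tool(answers: List[str], threshold: float = 0.7) -> bool:
    """Decide to invoke a tool when agreement is below the threshold.

    The threshold is an illustrative value, not one taken from the paper.
    """
    return consistency_confidence(answers) < threshold


if __name__ == "__main__":
    samples = ["42", "42 ", "41", "42"]
    print(consistency_confidence(samples))  # agreement among four samples
    print(should_call_tool(samples))        # decision at the default threshold
```

In this sketch, raising the threshold makes the model call tools more often, while lowering it favors direct answers, which mirrors the helpfulness versus tool-cost trade-off described above.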

Paper Structure

This paper contains 48 sections, 6 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Our method effectively enables LLMs to switch between answering independently and calling tools (upper part), thereby reducing the model's over-reliance and overconfidence in tools (lower part).
  • Figure 2: The overall pipeline of knowledge boundary modeling methods.
  • Figure 3: Trade-off between overconfidence and over-tool-reliance with different SFT data ratios.
  • Figure 4: Performance vs. inference time (seconds).
  • Figure 5: Effect of SFT Data Ratio on Utility. The ratio represents the proportion of training samples in which the model invokes a tool rather than answering directly.
  • ...and 2 more figures