ToolACE-R: Model-aware Iterative Training and Adaptive Refinement for Tool Learning
Xingshan Zeng, Weiwen Liu, Xu Huang, Zezhong Wang, Lingzhi Wang, Liangyou Li, Yasheng Wang, Lifeng Shang, Xin Jiang, Ruiming Tang, Qun Liu
TL;DR
ToolACE-R presents a model-aware iterative training framework that continuously aligns training data with a model's evolving tool-use capabilities, coupled with an adaptive self-refinement inference mechanism that autonomously decides when to stop refinements. The approach relies on a model-aware difficulty metric to curate training samples, along with self-refinement data augmentation that preserves stopping behavior. Empirical results on BFCL, ACEBench, API-Bank, and ToolAlpaca show ToolACE-R achieves competitive performance with strong open-source backbones and can surpass API-based models when combined with adaptive refinement, while demonstrating robustness across backbones and model sizes. The work highlights a scalable path to efficient tool learning, reducing reliance on external feedback and enabling effective tool invocation in resource-constrained settings. Overall, ToolACE-R advances tool learning by tightly integrating data selection, self-refinement, and adaptive inference to maximize LLM utility in tool-rich tasks.
Abstract
Tool learning, which allows Large Language Models (LLMs) to leverage external tools for solving complex user tasks, has emerged as a promising avenue for extending model capabilities. However, existing approaches primarily focus on data synthesis for fine-tuning LLMs to invoke tools effectively, largely ignoring how to fully elicit the model's potential. In this paper, we propose ToolACE-R, a novel framework that includes both model-aware iterative training and adaptive refinement for tool learning. ToolACE-R features a model-aware iterative training procedure that progressively adjusts training samples based on the model's evolving capabilities to maximize its potential. Additionally, it incorporates a self-refinement training corpus that emphasizes the LLM's ability to iteratively refine its tool calls, optimizing performance without requiring external feedback. Furthermore, we introduce an adaptive self-refinement mechanism for efficient test-time scaling, in which the trained model autonomously determines when to stop the iterative refinement process. We conduct extensive experiments across several benchmark datasets, showing that ToolACE-R achieves competitive performance compared to advanced API-based models. Tool invocation performance can be further improved efficiently through adaptive self-refinement. These results highlight the effectiveness and generalizability of ToolACE-R, offering a promising direction for more efficient and scalable tool learning.
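The adaptive self-refinement loop described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the model signals "stop" by returning its previous tool call unchanged and that a hypothetical `max_rounds` cap bounds the test-time cost; the names `adaptive_refine`, `refine_step`, and `toy_refine` are illustrative, and the toy function merely stands in for a trained LLM.

```python
from typing import Callable, List


def adaptive_refine(
    refine_step: Callable[[str, str], str],
    query: str,
    initial_call: str,
    max_rounds: int = 3,
) -> List[str]:
    """Refine a tool call until the model leaves it unchanged.

    Assumption: an unchanged output is treated as the model's own
    decision to stop; no external feedback is consulted.
    """
    history = [initial_call]
    for _ in range(max_rounds):
        candidate = refine_step(query, history[-1])
        if candidate == history[-1]:
            # The model kept its answer, so refinement ends here.
            break
        history.append(candidate)
    return history


def toy_refine(query: str, previous_call: str) -> str:
    """Hypothetical stand-in for the model: adds a missing argument once."""
    if "unit=" in previous_call:
        return previous_call
    return previous_call.rstrip(")") + ", unit='C')"


if __name__ == "__main__":
    calls = adaptive_refine(
        toy_refine,
        "What is the weather in Paris in Celsius?",
        "get_weather(city='Paris')",
    )
    for step, call in enumerate(calls):
        print(step, call)
```

In this sketch the number of model calls adapts to the input: an already-correct call costs one extra pass, while a flawed call is refined until it stabilizes or the cap is reached.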
