Don't Adapt Small Language Models for Tools; Adapt Tool Schemas to the Models

Jonggeun Lee, Woojung Song, Jongwook Han, Haesung Pyun, Yohan Jo

TL;DR

The paper tackles the problem that small language models struggle with tool use due to schema misalignment, where pretrained naming patterns cause plausible but incorrect tool names to be invoked. It introduces PA-Tool, a training-free pipeline that renames tool schema components to pretraining-aligned names using a three-stage process driven by a peakedness signal from contamination detection. Across MetaTool and RoTBench, PA-Tool yields up to 17 percentage-point gains and up to 80% reduction in schema-misalignment errors, bringing small models closer to closed-source baselines while remaining computationally efficient. The work demonstrates that schema-level interventions can unlock tool-use capabilities in resource-limited models, and it provides a practical, one-time schema-mapping approach that can complement supervised fine-tuning.

Abstract

Small language models (SLMs) offer significant computational advantages for tool-augmented AI systems, yet they struggle with tool-use tasks, particularly in selecting appropriate tools and identifying correct parameters. A common failure mode is schema misalignment: models hallucinate plausible but non-existent tool names that reflect naming conventions internalized during pretraining but absent from the provided tool schema. Rather than forcing models to adapt to arbitrary schemas, we propose adapting schemas to align with models' pretrained knowledge. We introduce PA-Tool (Pretraining-Aligned Tool Schema Generation), a training-free method that leverages peakedness, a signal from contamination detection that indicates pretraining familiarity, to automatically rename tool components. By generating multiple candidates and selecting those with the highest output concentration across samples, PA-Tool identifies pretraining-aligned naming patterns. Experiments on MetaTool and RoTBench show improvements of up to 17 percentage points, with schema misalignment errors reduced by 80%. PA-Tool enables small models to approach state-of-the-art performance while maintaining computational efficiency for adaptation to new tools without retraining. Our work demonstrates that schema-level interventions can unlock the tool-use potential of resource-efficient models by adapting schemas to models rather than models to schemas.
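The selection step described in the abstract (sample several candidate names, then keep the one around which the samples concentrate) can be illustrated with a minimal sketch. This is not the authors' implementation: the character-level similarity from Python's `difflib`, the definition of peakedness as the fraction of samples within a similarity threshold `alpha`, and the function names `similarity`, `peakedness`, and `select_name` are all assumptions made for illustration. The candidate list stands in for names that would be sampled from the target model.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]. A stand-in, not the paper's metric."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def peakedness(candidate: str, samples: list[str], alpha: float) -> float:
    """Assumed definition: fraction of samples at least alpha-similar to candidate."""
    if not samples:
        return 0.0
    close = sum(1 for s in samples if similarity(candidate, s) >= alpha)
    return close / len(samples)


def select_name(samples: list[str], alpha: float = 0.8) -> str:
    """Return the sampled name with the highest peakedness score."""
    if not samples:
        raise ValueError("need at least one sampled candidate")
    # dict.fromkeys removes duplicates while keeping first-seen order;
    # max() returns the first maximal element, so ties go to the earliest sample.
    unique = dict.fromkeys(samples)
    return max(unique, key=lambda c: peakedness(c, samples, alpha))


# Hypothetical names a model might propose for a weather-lookup tool.
samples = ["get_weather", "get_weather", "getWeather",
           "fetch_forecast", "weather_lookup"]
chosen = select_name(samples, alpha=0.8)
```

With these hypothetical samples, `chosen` is `"get_weather"`: three of the five samples fall within the threshold of it, while the two outliers each match only themselves. A higher `alpha` demands closer agreement between samples before they count toward the same peak.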

Paper Structure

This paper contains 31 sections, 4 equations, 6 figures, 6 tables, 1 algorithm.

Figures (6)

  • Figure 1: Effect of schema alignment on tool invocation. Top: Models learn tool documentation during pretraining. Bottom-Left: When schemas misalign with these internalized patterns, models generate plausible but non-existent tools. Bottom-Right: Schema alignment with pretrained knowledge prevents such errors.
  • Figure 2: Overview of our PA-Tool framework.
  • Figure 3: Error type distribution for Llama3.1-8B on MetaTool tool selection tasks.
  • Figure 4: Impact of hyperparameters on PA-Tool across different models. All results are averaged across four MetaTool subtasks. Top: Effect of the number of candidates ($N$). Middle: Effect of similarity threshold ($\alpha$). Bottom: Effect of sampling temperature ($t$).
  • Figure 5: Component name generation prompt for Stage (1) in Figure 2.
  • ...and 1 more figure