Dynamic Tool Dependency Retrieval for Efficient Function Calling
Bhrij Patel, Davide Belli, Amir Jalalirad, Maximilian Arnold, Aleksandr Ermolov, Bence Major
TL;DR
Dynamic Tool Dependency Retrieval (DTDR) tackles the challenge of selecting relevant tools for on-device LLM function calling by conditioning tool retrieval on both the user query and the evolving plan history. It introduces two lightweight variants, DTDR-C (clustering-based) and DTDR-L (learned linear), to infer a minimal, task-specific dependency subgraph and inject it into prompts. Across multiple datasets and model backbones, DTDR improves retrieval precision, function calling accuracy, and end-to-end success rates while reducing prompt length, demonstrating strong practical impact for memory- and latency-constrained agents. The work lays a foundation for robust on-device tool use and invites extensions to multimodal tools and evolving tool ecosystems.
Abstract
Function calling agents powered by Large Language Models (LLMs) select external tools to automate complex tasks. On-device agents typically use a retrieval module to select relevant tools, improving performance and reducing context length. However, existing retrieval methods rely on static and limited inputs, failing to capture multi-step tool dependencies and evolving task context. This limitation often introduces irrelevant tools that mislead the agent, degrading efficiency and accuracy. We propose Dynamic Tool Dependency Retrieval (DTDR), a lightweight retrieval method that conditions on both the initial query and the evolving execution context. DTDR models tool dependencies from function calling demonstrations, enabling adaptive retrieval as plans unfold. We benchmark DTDR against state-of-the-art retrieval methods across multiple datasets and LLM backbones, evaluating retrieval precision, downstream task accuracy, and computational efficiency. Additionally, we explore strategies to integrate retrieved tools into prompts. Our results show that dynamic tool retrieval improves function calling success rates by between $23\%$ and $104\%$ compared to state-of-the-art static retrievers.
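To make the idea of conditioning retrieval on both the query and the evolving plan concrete, the following is a minimal illustrative sketch, not the DTDR-C or DTDR-L method itself. It assumes a simple stand-in: tool dependencies are estimated as next-tool counts from demonstration plans, query relevance is plain token overlap, and the two are mixed with a hypothetical weight `alpha`. All tool names, descriptions, and function names are invented for illustration.

```python
from collections import Counter, defaultdict

START = "<start>"  # hypothetical marker for "no tool called yet"


def build_dependency_counts(demonstrations):
    """Count how often each tool directly follows another in demonstration plans."""
    follows = defaultdict(Counter)
    for plan in demonstrations:
        prev = START
        for tool in plan:
            follows[prev][tool] += 1
            prev = tool
    return follows


def query_overlap(query, description):
    """Fraction of query tokens that also appear in the tool description."""
    q = set(query.lower().split())
    d = set(description.lower().split())
    return len(q & d) / max(len(q), 1)


def retrieve_tools(query, history, tool_descriptions, follows, k=2, alpha=0.5):
    """Rank tools by a mix of query relevance and dependency on the last tool used."""
    prev = history[-1] if history else START
    counts = follows.get(prev, Counter())
    total = sum(counts.values())
    scores = {}
    for name, desc in tool_descriptions.items():
        dependency = counts[name] / total if total else 0.0
        scores[name] = alpha * query_overlap(query, desc) + (1 - alpha) * dependency
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda n: (-scores[n], n))[:k]


# Toy data (assumed, not from the paper).
demos = [
    ["search_flights", "book_flight", "send_email"],
    ["search_flights", "book_flight"],
    ["search_hotels", "book_hotel", "send_email"],
]
tools = {
    "search_flights": "search flights between two cities",
    "book_flight": "book a flight ticket",
    "search_hotels": "search hotels in a city",
    "book_hotel": "book a hotel room",
    "send_email": "send an email message",
}
query = "book a flight to Paris and email me the confirmation"
follows = build_dependency_counts(demos)

for history in ([], ["search_flights"], ["search_flights", "book_flight"]):
    print(history, "->", retrieve_tools(query, history, tools, follows))
```

In this toy setup the retrieved set changes as the plan history grows, even though the query stays fixed, which is the behavior a static, query-only retriever cannot provide.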
