In-N-Out: A Parameter-Level API Graph Dataset for Tool Agents

Seungkyu Lee, Nalim Kim, Yohan Jo

TL;DR

This work addresses tool agents' need for accurate API dependency information by introducing In-N-Out, an expert-annotated dataset of parameter-level API graphs built from real-world benchmarks. The authors show that fine-tuning LLMs on In-N-Out improves graph construction and generalizes to unseen APIs, and that the resulting graphs yield substantial gains in tool retrieval and structured multi-tool query generation, with automatically generated graphs capturing most of the benefit. Results across API graph construction, retrieval, and subset generation, together with end-to-end tau-bench evaluations, highlight the practical value of explicit API graphs for building robust, scalable tool agents. The released dataset and code provide a resource for advancing graph-based reasoning over real-world API ecosystems.

Abstract

Tool agents--LLM-based systems that interact with external APIs--offer a way to execute real-world tasks. However, as tasks become increasingly complex, these agents struggle to identify and call the correct APIs in the proper order. To tackle this problem, we investigate converting API documentation into a structured API graph that captures API dependencies and leveraging it for multi-tool queries that require compositional API calls. To support this, we introduce In-N-Out, the first expert-annotated dataset of API graphs built from two real-world API benchmarks and their documentation. Using In-N-Out significantly improves performance on both tool retrieval and multi-tool query generation, nearly doubling that of LLMs using documentation alone. Moreover, graphs generated by models fine-tuned on In-N-Out close 90% of this gap, showing that our dataset helps models learn to comprehend API documentation and parameter relationships. Our findings highlight the promise of using explicit API graphs for tool agents and the utility of In-N-Out as a valuable resource. We release our dataset and code at https://github.com/holi-lab/In-N-Out-API-Graph.
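To make the idea of a parameter-level API graph concrete, here is a minimal sketch of how such a graph could drive prerequisite lookup for tool retrieval. It is an illustration under stated assumptions, not the authors' implementation: `FollowArtist` comes from the paper's Figure 1, while `SearchArtist`, `Login`, the parameter names, and the edge list are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical edges: (source API, output parameter) -> (target API, input parameter).
# Only FollowArtist is named in the paper; the other APIs and parameters are made up.
EDGES = [
    (("SearchArtist", "artist_id"), ("FollowArtist", "artist_id")),
    (("Login", "access_token"), ("FollowArtist", "access_token")),
    (("Login", "access_token"), ("SearchArtist", "access_token")),
]


def build_producers(edges):
    """Map each target API to the set of APIs that can supply one of its inputs."""
    producers = defaultdict(set)
    for (src_api, _out_param), (dst_api, _in_param) in edges:
        producers[dst_api].add(src_api)
    return producers


def prerequisites(target, edges):
    """Return all APIs reachable backwards from `target`, in breadth-first order."""
    producers = build_producers(edges)
    seen, order, queue = {target}, [], deque([target])
    while queue:
        api = queue.popleft()
        # Sort for a deterministic visiting order.
        for src in sorted(producers[api]):
            if src not in seen:
                seen.add(src)
                order.append(src)
                queue.append(src)
    return order


print(prerequisites("FollowArtist", EDGES))
```

Walking the edges backwards from the target API surfaces the APIs whose outputs can fill its inputs, which is the kind of lookup that documentation alone leaves implicit.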

Paper Structure

This paper contains 38 sections, 3 equations, 6 figures, 11 tables.

Figures (6)

  • Figure 1: Illustration of how parameter-level API graphs support (a) Tool Retrieval, where the agent identifies prerequisite APIs needed to supply inputs for a target API (e.g., FollowArtist), and (b) Multi-Tool Query Generation, where the agent selects interdependent APIs to construct coherent multi-tool queries.
  • Figure 2: Construction process of the In-N-Out dataset: (1) refine API documentation, (2) filter candidate parameter pairs (rule-based, semantic, context-aware), (3) annotate edges by compatibility and naturalness.
  • Figure 3: Gold API Graphs for (a) NESTful and (b) AppWorld datasets. Nodes with the same color belong to the same domain.
  • Figure 4: Confusion matrices of edge classification results for the (a) NESTful and (b) AppWorld datasets.
  • Figure 5: Predefined structural patterns (Chain, Fork, Collider) illustrated for 3-, 4-, and 5-API cases.
  • ...and 1 more figure
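The Chain, Fork, and Collider patterns named in Figure 5 can be illustrated for the 3-API case with a small sketch. It assumes the names follow their usual directed-graph meaning (chain: A→B→C; fork: one API feeding two others; collider: two APIs feeding one); the paper's exact definitions may differ, and the function name and node labels are hypothetical.

```python
from collections import Counter


def classify_triple(edges):
    """Classify a 3-node, 2-edge directed graph as 'chain', 'fork', or 'collider'.

    `edges` is a list of (source_api, target_api) pairs. Anything that is not
    exactly two non-loop edges over three distinct nodes is reported as 'other'.
    """
    nodes = {n for edge in edges for n in edge}
    has_self_loop = any(src == dst for src, dst in edges)
    if len(edges) != 2 or len(nodes) != 3 or has_self_loop:
        return "other"
    out_deg = Counter(src for src, _ in edges)
    in_deg = Counter(dst for _, dst in edges)
    if max(out_deg.values()) == 2:
        return "fork"      # one API supplies inputs to two others
    if max(in_deg.values()) == 2:
        return "collider"  # two APIs supply inputs to the same API
    return "chain"         # output of A feeds B, output of B feeds C


print(classify_triple([("A", "B"), ("B", "C")]))
print(classify_triple([("A", "B"), ("A", "C")]))
print(classify_triple([("A", "C"), ("B", "C")]))
```

With two edges over three distinct nodes, the edges share exactly one node, so checking the maximum in- and out-degree is enough to tell the three patterns apart.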