Factored Agents: Decoupling In-Context Learning and Memorization for Robust Tool Use

Nicholas Roth, Christopher Hidey, Lucas Spangher, William F. Arnold, Chang Ye, Nick Masiewicki, Jinoo Baek, Peter Grabowski, Eugene Ie

TL;DR

The paper tackles the fragility of monolithic agent architectures in dynamic tool-use environments by proposing a factored agent design that separates memorization from in-context learning. It implements a two-model system: a large planner that handles high-level reasoning and prompt adaptation, and a smaller memorizer that stores tool formats and outputs, connected via a structured hand-off workflow. Through extensive experiments on TauBench and in-context learning benchmarks (BBH and GSM8k), the authors show that decoupling these roles improves planning accuracy and error resilience, while revealing trade-offs between memory and adaptation. The results suggest that factored agents can serve as a robust, scalable foundation for next-generation agentic AI, with potential integration with retrieval, prompting, and planning enhancements.

Abstract

In this paper, we propose a novel factored agent architecture designed to overcome the limitations of traditional single-agent systems in agentic AI. Our approach decomposes the agent into two specialized components: (1) a large language model (LLM) that serves as a high-level planner and in-context learner, which may use dynamically available information in user prompts, and (2) a smaller language model that acts as a memorizer of tool format and output. This decoupling addresses prevalent issues in monolithic designs, including malformed, missing, and hallucinated API fields, as well as suboptimal planning in dynamic environments. Empirical evaluations demonstrate that our factored architecture significantly improves planning accuracy and error resilience, while elucidating the inherent trade-off between in-context learning and static memorization. These findings suggest that a factored approach is a promising pathway for developing more robust and adaptable agentic AI systems.
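
The planner/memorizer split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `FactoredAgent`, the stub models, the `get_order` tool, and its schema are all hypothetical, chosen only to show how a planner's abstract plan might be handed to a memorizer that emits the concrete tool call, with a check for missing or hallucinated fields.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical tool schema: tool name -> required argument names.
TOOL_SCHEMAS = {"get_order": ["order_id"]}


def validate(call: dict) -> None:
    """Reject unknown tools and missing or hallucinated argument fields."""
    name = call.get("name")
    if name not in TOOL_SCHEMAS:
        raise ValueError(f"unknown tool: {name!r}")
    expected = set(TOOL_SCHEMAS[name])
    actual = set(call.get("arguments", {}))
    if actual != expected:
        raise ValueError(f"bad fields: expected {expected}, got {actual}")


@dataclass
class FactoredAgent:
    # Large model (assumed interface): user prompt -> abstract plan.
    planner: Callable[[str], Dict[str, str]]
    # Small model (assumed interface): abstract plan -> serialized tool call.
    memorizer: Callable[[Dict[str, str]], str]

    def step(self, user_prompt: str) -> dict:
        plan = self.planner(user_prompt)   # high-level reasoning
        raw_call = self.memorizer(plan)    # tool-format recall
        call = json.loads(raw_call)        # fails on malformed output
        validate(call)
        return call


def stub_planner(prompt: str) -> Dict[str, str]:
    # Stand-in for an LLM: take the order id that follows '#'.
    order_id = prompt.split("#")[-1].rstrip("?.! ")
    return {"tool": "get_order", "order_id": order_id}


def stub_memorizer(plan: Dict[str, str]) -> str:
    # Stand-in for a fine-tuned small model that knows the call format.
    return json.dumps(
        {"name": plan["tool"], "arguments": {"order_id": plan["order_id"]}}
    )


agent = FactoredAgent(planner=stub_planner, memorizer=stub_memorizer)
print(agent.step("Where is order #A17?"))
```

In this sketch the planner never sees the tool's wire format and the memorizer never sees the raw user prompt, which is one way to read the decoupling the abstract describes. A real system would replace both stubs with model calls.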

Paper Structure

This paper contains 21 sections, 2 figures, 1 table.

Figures (2)

  • Figure 1: Representation of the factored agent model, showing the flow of information from users and the hand-offs between the various models.
  • Figure 2: Comparison of the DeepSeek-R1-Distill-Qwen base model with the tool-use model on two benchmarks, BigBenchHard and GSM8k. The x-axis shows the number of shots provided in the few-shot prompt at test time, and the y-axis shows the proportion of scored answers that are an exact match.
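
The exact-match proportion on the y-axis of Figure 2 can be sketched as follows. The function name `exact_match_rate` is hypothetical, and comparing answers by string equality after stripping surrounding whitespace is an assumption; the paper's own scoring rules may normalize answers differently.

```python
from typing import Sequence


def exact_match_rate(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that equal their reference exactly.

    Assumption: answers match if they are identical after stripping
    leading and trailing whitespace.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


# Two of the three answers match their references after stripping.
print(exact_match_rate(["42", " 7 ", "no"], ["42", "7", "yes"]))
```

Computing this rate once per shot count, for the base model and for the tool-use model, would give the two curves that the caption describes.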