Learning to Share: Selective Memory for Efficient Parallel Agentic Systems

Joseph Fioresi, Parth Parag Kulkarni, Ashmal Vayani, Song Wang, Mubarak Shah

TL;DR

Learning to Share introduces a global memory bank and a lightweight memory controller to selectively admit intermediate steps across parallel agent teams, reducing redundant computation without sacrificing task performance. The approach uses stepwise reinforcement learning with usage-aware reward shaping to learn which steps are globally useful, balancing memory growth and utility. Empirical results on GAIA and AssistantBench show substantial wall-clock time reductions and improved or comparable task success across backbones, with ablations confirming the value of selective admission over naive sharing. The work demonstrates that learned memory admission is an effective strategy to improve the efficiency of parallel agentic systems in long-horizon, tool-intensive tasks.

Abstract

Agentic systems solve complex tasks by coordinating multiple agents that iteratively reason, invoke tools, and exchange intermediate results. To improve robustness and solution quality, recent approaches deploy multiple agent teams running in parallel to explore diverse reasoning trajectories. However, parallel execution comes at a significant computational cost: when different teams independently reason about similar sub-problems or execute analogous steps, they repeatedly perform substantial overlapping computation. To address this cost, we propose Learning to Share (LTS), a learned shared-memory mechanism for parallel agentic frameworks that enables selective cross-team information reuse while controlling context growth. LTS introduces a global memory bank accessible to all teams and a lightweight controller that decides whether each intermediate agent step should be added to memory. The controller is trained using stepwise reinforcement learning with usage-aware credit assignment, allowing it to identify information that is globally useful across parallel executions. Experiments on the AssistantBench and GAIA benchmarks show that LTS significantly reduces overall runtime while matching or improving task performance compared to memory-free parallel baselines, demonstrating that learned memory admission is an effective strategy for improving the efficiency of parallel agentic systems. Project page: https://joefioresi718.github.io/LTS_webpage/
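
As a rough illustration of the admission mechanism described in the abstract, the sketch below models a shared memory bank whose writes are gated by a controller. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `Step`, `SharedMemoryBank`, `offer`, `lookup`, and `toy_controller` are hypothetical, and the hand-written heuristic only stands in for the learned controller, which the paper trains with reinforcement learning.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    """One intermediate agent step (hypothetical structure)."""
    team_id: int
    summary: str  # short description of the step, used as the memory key
    output: str   # the agent's result, stored as the memory value


@dataclass
class SharedMemoryBank:
    """Key-value store shared by all teams; writes pass through a controller."""
    # Admission policy: (task query, existing keys, candidate step) -> store or not.
    admit: Callable[[str, List[str], Step], bool]
    entries: Dict[str, str] = field(default_factory=dict)

    def offer(self, task: str, step: Step) -> bool:
        """Ask the controller for a binary decision; store the step if admitted."""
        if self.admit(task, list(self.entries), step):
            self.entries[step.summary] = step.output
            return True
        return False

    def lookup(self, key: str) -> Optional[str]:
        """Return the stored output for a key, or None if it was never admitted."""
        return self.entries.get(key)


def toy_controller(task: str, keys: List[str], step: Step) -> bool:
    # Assumption: a stand-in heuristic, NOT the learned controller.
    # Admit a step only if its output is non-empty and its key is new.
    return bool(step.output.strip()) and step.summary not in keys


bank = SharedMemoryBank(admit=toy_controller)
bank.offer("example task", Step(0, "web search: opening hours", "9am to 5pm"))
# A second team producing the same step summary is not stored again.
bank.offer("example task", Step(1, "web search: opening hours", "9am to 5pm"))
print(bank.lookup("web search: opening hours"))
```

Passing the admission policy in as a callable keeps the store independent of how the decision is made, so a learned model could replace the heuristic without changing the bank itself.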

Paper Structure (38 sections, 12 equations, 7 figures, 4 tables)

Figures (7)

  • Figure 1: Shared memory reduces redundant computation in parallel agentic execution. Comparison of parallel agent teams solving a long-horizon task without (top) and with (bottom) shared memory. (a) Without shared memory, teams independently repeat overlapping intermediate steps (e.g., web search, table parsing, code writing), and errors in one branch propagate additional retries, increasing overall latency. (b) With a shared memory bank, teams reuse previously discovered intermediate results, avoiding redundant work and reducing error overhead. As a result, the system converges in fewer total steps and lower wall-clock time.
  • Figure 2: Learning to Share: selective shared memory for parallel agentic systems. (a) Parallel agent teams execute independently while interacting with a central Shared Memory Bank. After each agent step, a learned Memory Controller evaluates the intermediate result and selectively admits high-utility information into shared memory as a key-value pair (step summary, agent output) or discards it. Teams may query stored keys to reuse previously discovered results to reduce redundant computation (shown only for team 3, but all teams follow the same memory retrieval). (b) The memory controller receives embeddings of the task query, existing memory keys, and the current step (agent input, output, and summary) for context. These are projected into a shared token space and processed by a lightweight controller LLM, which emits a single binary decision indicating whether the step should be stored. Selective admission maintains a high-quality shared memory while accelerating convergence.
  • Figure 3: Cumulative distribution of wall-clock completion times on AssistantBench. Our LTS shared-memory approach shifts the runtime distribution left relative to memory-free M1-Parallel, indicating faster completion for a larger fraction of tasks. Selectively sharing intermediate results reduces redundant computation and lowers overall latency.
  • Figure S1: Cumulative distribution of wall-clock completion times on AssistantBench for shared-memory variants. All shared-memory variants shift the runtime distribution left relative to memory-free M1-Parallel, indicating reduced wall-clock latency due to cross-team reuse of intermediate results. Alternate admission strategies achieve larger runtime gains but exhibit lower task accuracy.
  • Figure S2: Runtime and step-count distributions for shared-memory variants on AssistantBench. Top row shows histograms of completion time per task, and bottom row shows the corresponding number of execution steps. Bin counts determined by Freedman–Diaconis rule. While all memory-enabled variants shift the runtime distribution toward shorter completion times, naive admission increases variance in step count and occasionally induces longer executions. In contrast, LTS achieves a consistent leftward shift in runtime while maintaining compact step-count distributions, indicating efficient reuse of intermediate results without introducing noisy or redundant steps.
  • ...and 2 more figures
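
The Figure S2 caption says its histogram bin counts follow the Freedman–Diaconis rule, which sets the bin width to 2 · IQR / n^(1/3) for n samples. The sketch below shows one way to compute it with the Python standard library. The function name `freedman_diaconis_bins` and the sample runtimes are illustrative assumptions, and quartile conventions differ between tools, so the paper's exact bin counts may not match.

```python
import math
import statistics
from typing import List, Tuple


def freedman_diaconis_bins(data: List[float]) -> Tuple[float, int]:
    """Return (bin width, bin count) under the Freedman-Diaconis rule."""
    n = len(data)
    # Quartiles by linear interpolation over the observed data range.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    width = 2.0 * (q3 - q1) / n ** (1.0 / 3.0)
    if width <= 0:
        # Degenerate case (zero IQR): fall back to a single bin.
        return 0.0, 1
    count = math.ceil((max(data) - min(data)) / width)
    return width, max(count, 1)


# Made-up runtimes in seconds, for illustration only.
runtimes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 20.0]
width, count = freedman_diaconis_bins(runtimes)
print(f"bin width = {width:.2f}, bins = {count}")
```

Because the width depends on the interquartile range rather than the full range, a single slow outlier such as the 20-second run adds bins at the tail without coarsening the bins for the bulk of the tasks.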