MemCollab: Cross-Agent Memory Collaboration via Contrastive Trajectory Distillation

Yurui Chang, Yiran Wu, Qingyun Wu, Lu Lin

Abstract

Large language model (LLM)-based agents rely on memory mechanisms to reuse knowledge from past problem-solving experiences. Existing approaches typically construct memory in a per-agent manner, tightly coupling stored knowledge to a single model's reasoning style. In modern deployments with heterogeneous agents, a natural question arises: can a single memory system be shared across different models? We find that naively transferring memory between agents often degrades performance, as such memory entangles task-relevant knowledge with agent-specific biases. To address this challenge, we propose MemCollab, a collaborative memory framework that constructs agent-agnostic memory by contrasting reasoning trajectories generated by different agents on the same task. This contrastive process distills abstract reasoning constraints that capture shared task-level invariants while suppressing agent-specific artifacts. We further introduce a task-aware retrieval mechanism that conditions memory access on task category, ensuring that only relevant constraints are used at inference time. Experiments on mathematical reasoning and code generation benchmarks demonstrate that MemCollab consistently improves both accuracy and inference-time efficiency across diverse agents, including cross-model-family settings. Our results show that the collaboratively constructed memory can function as a shared reasoning resource for diverse LLM-based agents.
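As a rough illustration of the task-aware retrieval idea described in the abstract, the following is a minimal sketch and not the authors' implementation. The names `MemoryEntry`, `jaccard`, and `retrieve` are hypothetical, the category labels are made up, and token-overlap similarity is only a stand-in for whatever relevance scoring the paper actually uses. The one point it tries to show is that memory access is first filtered by task category, so constraints from unrelated categories never reach the agent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryEntry:
    category: str    # hypothetical task-category label, e.g. "algebra"
    constraint: str  # abstract reasoning constraint distilled from a trajectory pair


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; an assumed stand-in scorer, not the paper's."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def retrieve(memory, task_text, task_category, k=2):
    """Return up to k constraints from the matching category, most similar first."""
    # Step 1: condition on task category so unrelated constraints are excluded.
    candidates = [m for m in memory if m.category == task_category]
    # Step 2: rank the remaining constraints by similarity to the task text.
    ranked = sorted(
        candidates,
        key=lambda m: jaccard(m.constraint, task_text),
        reverse=True,
    )
    return [m.constraint for m in ranked[:k]]


# Toy shared memory; the entries are invented for illustration only.
memory = [
    MemoryEntry("algebra", "check every candidate root in the original equation"),
    MemoryEntry("algebra", "state domain restrictions before simplifying"),
    MemoryEntry("geometry", "verify the triangle inequality before using side lengths"),
]

selected = retrieve(memory, "solve the equation and check each root", "algebra", k=1)
all_algebra = retrieve(memory, "solve the equation and check each root", "algebra", k=5)
```

In this sketch the retrieved constraints would be prepended to the agent's prompt at inference time. Because the stored constraints are meant to be agent-agnostic, the same `memory` list could in principle serve agents built on different models.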

Paper Structure (29 sections, 12 equations, 8 figures, 7 tables, 2 algorithms)

Figures (8)

  • Figure 1: Accuracy on the MATH500 dataset with memory from different sources.
  • Figure 2: Framework of MemCollab. Left: Using different agents to generate trajectory pairs for the same task. Middle: Contrasting the trajectory pairs to construct the memory. Right: Augmenting inference-time generation with retrieved memory.
  • Figure 3: Case study comparing trajectories with and without the extracted memory guidance. Left: The contrasted trajectory pairs for the same question and the corresponding extracted memory guidance. Right: Changes in the agent's output after incorporating the memory guidance. Note: all Key step and Issue entries are summarized; the full output is in Appendix \ref{appendix:full_traj}.
  • Figure 4: Ablation on the number of retrieved memories
  • Figure 5: Task-category similarity of error-type distributions on MATH500.
  • ...and 3 more figures