Fed-SE: Federated Self-Evolution for Privacy-Constrained Multi-Environment LLM Agents
Xiang Chen, Yuling Shi, Qizhen Lan, Yuchao Qiu, Xiaodong Gu
TL;DR
Fed-SE tackles privacy-constrained, cross-environment evolution of LLM agents by combining local, trajectory-filtered self-improvement with global, low-rank aggregation of adapter updates. By freezing the base model and updating only lightweight LoRA adapters, it preserves general reasoning ability while specializing to each environment; aggregating in a low-rank subspace mitigates negative transfer across clients. Experiments across five heterogeneous environments show that average task success rates improve by approximately 18% over federated baselines, with strong gains on long-horizon, reasoning-heavy tasks such as Maze. The work demonstrates a practical, communication-efficient pathway to scalable, privacy-preserving continual learning for distributed LLM agents.
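To make the frozen-base-plus-adapter idea concrete, the following is a minimal sketch in plain Python. It assumes the standard LoRA formulation, in which the adapted weight is W + scale * (B A) with B and A low-rank factors; the helper names (`matmul`, `effective_weight`) and the toy matrices are illustrative assumptions and are not taken from the Fed-SE implementation.

```python
from typing import List

Matrix = List[List[float]]


def matmul(x: Matrix, y: Matrix) -> Matrix:
    """Plain-Python matrix product of x (n x k) and y (k x m)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def effective_weight(base: Matrix, lora_b: Matrix, lora_a: Matrix,
                     scale: float = 1.0) -> Matrix:
    """Return base + scale * (lora_b @ lora_a) without modifying base.

    Only lora_b and lora_a would be trained and communicated;
    the base weight stays frozen on every client.
    """
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(base, delta)]


# Toy example (hypothetical values): a 2x2 frozen weight and a rank-1 adapter.
base = [[1.0, 0.0], [0.0, 1.0]]  # frozen base weight
lora_b = [[1.0], [2.0]]          # 2x1 factor
lora_a = [[0.5, 0.0]]            # 1x2 factor
adapted = effective_weight(base, lora_b, lora_a)
```

In this toy case the adapter stores 4 numbers instead of a full 2x2 update; the saving becomes substantial at realistic layer sizes, which is where the communication efficiency of exchanging only adapters comes from.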
Abstract
LLM agents are widely deployed in complex interactive tasks, yet privacy constraints often preclude centralized optimization and co-evolution across dynamic environments. While Federated Learning (FL) has proven effective on static datasets, its extension to the open-ended self-evolution of agents remains underexplored. Directly applying standard FL is challenging: heterogeneous tasks and sparse, trajectory-level rewards introduce severe gradient conflicts, destabilizing the global optimization process. To bridge this gap, we propose Fed-SE, a Federated Self-Evolution framework for LLM agents. Fed-SE establishes a local-evolution, global-aggregation paradigm. Locally, agents employ parameter-efficient fine-tuning on filtered, high-return trajectories to achieve stable gradient updates. Globally, Fed-SE aggregates updates within a low-rank subspace that disentangles environment-specific dynamics, effectively reducing negative transfer across clients. Experiments across five heterogeneous environments demonstrate that Fed-SE improves average task success rates by approximately 18% over federated baselines, validating its effectiveness for robust cross-environment knowledge transfer under privacy-constrained deployments.
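The two stages described in the abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: the return threshold `min_return`, the `Trajectory` record, and the use of a weighted average over adapter factors (a FedAvg-style stand-in for the paper's low-rank subspace aggregation) are all hypothetical choices made for the example. Note that averaging the factors separately is not equivalent to averaging their products, so a real system may aggregate differently.

```python
from dataclasses import dataclass
from typing import Dict, List

Matrix = List[List[float]]


@dataclass
class Trajectory:
    steps: List[str]      # hypothetical record of agent actions
    total_return: float   # trajectory-level reward


def filter_trajectories(trajectories: List[Trajectory],
                        min_return: float) -> List[Trajectory]:
    """Local stage: keep only trajectories whose return meets the threshold."""
    return [t for t in trajectories if t.total_return >= min_return]


def average_adapters(client_adapters: List[Dict[str, Matrix]],
                     client_weights: List[float]) -> Dict[str, Matrix]:
    """Global stage (assumed): weighted average of each adapter matrix.

    Only adapter matrices are exchanged; raw trajectories never leave a client.
    """
    total = sum(client_weights)
    averaged: Dict[str, Matrix] = {}
    for name, reference in client_adapters[0].items():
        rows, cols = len(reference), len(reference[0])
        averaged[name] = [
            [sum(w * adapters[name][i][j]
                 for w, adapters in zip(client_weights, client_adapters)) / total
             for j in range(cols)]
            for i in range(rows)
        ]
    return averaged


# Toy usage with made-up data: one success and one failure on a client.
local_data = [Trajectory(["left", "up"], 1.0), Trajectory(["down"], 0.0)]
kept = filter_trajectories(local_data, min_return=1.0)

# Two clients, weighted by how many filtered trajectories each contributed.
global_adapter = average_adapters(
    [{"lora_a": [[1.0, 2.0]]}, {"lora_a": [[3.0, 4.0]]}],
    client_weights=[1.0, 3.0],
)
```

Weighting clients by their number of retained trajectories is one plausible choice; uniform weights would be another, and the abstract does not say which the paper uses.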
