Tapas Are Free! Training-Free Adaptation of Programmatic Agents via LLM-Guided Program Synthesis in Dynamic Environments
Jinwei Hu, Yi Dong, Youcheng Sun, Xiaowei Huang
TL;DR
This work tackles the challenge of adapting autonomous agents in safety-critical, dynamic environments without retraining. It introduces TAPA, a training-free framework that uses LLMs to moderate the symbolic action space: a meta-agent operates over interpretable logical primitives, and the LLM synthesizes a modular program for each primitive. Using multi-scenario program pools, shadow validation, and a provenance chain stored in a RAG knowledge base, TAPA adapts rapidly and interpretably in cyber defense and swarm formation control. It reaches 77.7% network uptime in DDoS defense and preserves swarm consensus under adversarial and environmental disturbances. The results suggest a paradigm shift from policy retraining to dynamic action adaptation, enabling scalable reliability and safety in evolving operational contexts.
Abstract
Autonomous agents in safety-critical applications must continuously adapt to dynamic conditions without compromising performance and reliability. This work introduces TAPA (Training-free Adaptation of Programmatic Agents), a novel framework that positions large language models (LLMs) as intelligent moderators of the symbolic action space. Unlike prior programmatic agents, which typically generate a monolithic policy program or rely on fixed symbolic action sets, TAPA synthesizes and adapts modular programs for individual high-level actions, referred to as logical primitives. By decoupling strategic intent from execution, TAPA enables meta-agents to operate over an abstract, interpretable action space while the LLM dynamically generates, composes, and refines symbolic programs tailored to each primitive. Extensive experiments across cybersecurity and swarm intelligence domains validate TAPA's effectiveness. In autonomous DDoS defense scenarios, TAPA achieves 77.7% network uptime while maintaining near-perfect detection accuracy in unknown dynamic environments. In swarm formation control under environmental and adversarial disturbances, TAPA consistently preserves consensus at runtime where baseline methods fail. This work promotes a paradigm shift for autonomous system design in evolving environments, from policy adaptation to dynamic action adaptation.
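The decoupling of strategic intent from execution can be pictured with a minimal Python sketch. This is not the authors' implementation. `ProgramPool`, `shadow_score`, `adapt`, the `rate_limit` primitive, and the toy scoring rule are all hypothetical names introduced for illustration. The LLM is replaced by a stub callable, and the provenance chain and RAG knowledge base are omitted. The sketch assumes that a candidate program replaces the active one only if it scores strictly higher on replayed observations, which is one plausible reading of "shadow validation".

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Observation = Dict[str, float]
Action = Dict[str, float]
Program = Callable[[Observation], Action]  # concrete program behind one primitive


@dataclass
class ProgramPool:
    """Hypothetical pool: candidate programs and the active one, per primitive."""
    candidates: Dict[str, List[Program]] = field(default_factory=dict)
    active: Dict[str, Program] = field(default_factory=dict)

    def register(self, primitive: str, program: Program) -> None:
        self.candidates.setdefault(primitive, []).append(program)
        self.active.setdefault(primitive, program)  # first program becomes active

    def execute(self, primitive: str, obs: Observation) -> Action:
        # The meta-agent only names the primitive; the pool supplies the program.
        return self.active[primitive](obs)


def shadow_score(program: Program, replay: List[Observation],
                 score: Callable[[Observation, Action], float]) -> float:
    """Evaluate a program on recorded observations without acting on them."""
    return sum(score(obs, program(obs)) for obs in replay)


def adapt(pool: ProgramPool, primitive: str,
          synthesize: Callable[[str], Program],
          replay: List[Observation],
          score: Callable[[Observation, Action], float]) -> bool:
    """Ask the synthesizer (an LLM stand-in) for a candidate; swap it in if better."""
    candidate = synthesize(primitive)
    pool.candidates.setdefault(primitive, []).append(candidate)
    incumbent = pool.active[primitive]
    if shadow_score(candidate, replay, score) > shadow_score(incumbent, replay, score):
        pool.active[primitive] = candidate
        return True
    return False


# Toy scenario (assumed, not from the paper): choose a traffic rate limit.
def fixed_limit(obs: Observation) -> Action:
    return {"limit": 100.0}


def adaptive_limit(obs: Observation) -> Action:
    return {"limit": 2.0 * obs["legit_rate"]}


def score(obs: Observation, action: Action) -> float:
    # Good limits admit legitimate traffic and sit below the attack rate.
    return 1.0 if obs["legit_rate"] <= action["limit"] < obs["attack_rate"] else 0.0


replay = [
    {"legit_rate": 50.0, "attack_rate": 400.0},
    {"legit_rate": 150.0, "attack_rate": 900.0},
]

pool = ProgramPool()
pool.register("rate_limit", fixed_limit)
swapped = adapt(pool, "rate_limit", lambda primitive: adaptive_limit, replay, score)
```

In this sketch the caller of `pool.execute("rate_limit", obs)` never changes when the underlying program is swapped, which is the sense in which adaptation happens at the action level rather than the policy level.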
