Behavior Trees Enable Structured Programming of Language Model Agents
Richard Kelley
TL;DR
Language-model agents are powerful but brittle in real-world deployments. This work proposes behavior trees (BTs) as a unifying, modular framework for structuring and composing language-enabled agents. It introduces Dendron, a Python library that integrates causal language model and vision-language model actions with a blackboard for data sharing, enabling safe, interpretable, and edge-friendly agent architectures. Through three case studies (a chat agent, a robot visual inspection agent, and a safety-focused BT defense against prompt-based attacks), the paper demonstrates the modularity, reusability, and practical safety guarantees afforded by behavior-tree orchestration. The findings indicate that structured programming with BTs can harness modern transformers while mitigating hallucinations, multimodal integration challenges, and information leakage, supporting scalable, trustworthy language-model agents in dynamic environments.
Abstract
Language models trained on internet-scale data sets have shown an impressive ability to solve problems in Natural Language Processing and Computer Vision. However, experience is showing that these models are frequently brittle in unexpected ways, and require significant scaffolding to ensure that they operate correctly in the larger systems that comprise "language-model agents." In this paper, we argue that behavior trees provide a unifying framework for combining language models with classical AI and traditional programming. We introduce Dendron, a Python library for programming language model agents using behavior trees. We demonstrate the approach embodied by Dendron in three case studies: building a chat agent, a camera-based infrastructure inspection agent for use on a mobile robot or vehicle, and an agent that has been built to satisfy safety constraints that it did not receive through instruction tuning or RLHF.
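To make the central idea concrete, the following is a minimal sketch of a behavior tree that wraps a language-model call in a classical safety check. It is written in plain Python and is not Dendron's API: the names `Action`, `Sequence`, `Fallback`, `fake_lm`, and `run` are hypothetical, the language model is replaced by a stub that echoes its prompt, and the keyword filter stands in for whatever safety condition an application might require. The `RUNNING` status found in most behavior tree formulations is omitted for brevity.

```python
from enum import Enum
from typing import Callable, Dict, List


class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"


# The blackboard is a shared dictionary that nodes read from and write to.
Blackboard = Dict[str, object]


class Action:
    """Leaf node: wraps a function that reads/writes the blackboard."""

    def __init__(self, name: str, fn: Callable[[Blackboard], Status]):
        self.name = name
        self.fn = fn

    def tick(self, bb: Blackboard) -> Status:
        return self.fn(bb)


class Sequence:
    """Ticks children in order; fails as soon as one child fails."""

    def __init__(self, children: List):
        self.children = children

    def tick(self, bb: Blackboard) -> Status:
        for child in self.children:
            if child.tick(bb) is Status.FAILURE:
                return Status.FAILURE
        return Status.SUCCESS


class Fallback:
    """Ticks children in order; succeeds as soon as one child succeeds."""

    def __init__(self, children: List):
        self.children = children

    def tick(self, bb: Blackboard) -> Status:
        for child in self.children:
            if child.tick(bb) is Status.SUCCESS:
                return Status.SUCCESS
        return Status.FAILURE


def fake_lm(prompt: str) -> str:
    """Hypothetical stand-in for a language model: echoes the prompt."""
    return f"You said: {prompt}"


def generate(bb: Blackboard) -> Status:
    bb["reply"] = fake_lm(str(bb["prompt"]))
    return Status.SUCCESS


def is_safe(bb: Blackboard) -> Status:
    # Classical (non-learned) check; the banned keyword is an assumption.
    leaked = "password" in str(bb["reply"]).lower()
    return Status.FAILURE if leaked else Status.SUCCESS


def refuse(bb: Blackboard) -> Status:
    bb["reply"] = "Sorry, I can't help with that."
    return Status.SUCCESS


# Try "generate, then verify"; if that sequence fails, fall back to refusing.
tree = Fallback([
    Sequence([Action("generate", generate), Action("is_safe", is_safe)]),
    Action("refuse", refuse),
])


def run(prompt: str) -> str:
    bb: Blackboard = {"prompt": prompt}
    tree.tick(bb)
    return str(bb["reply"])
```

In this sketch the safety behavior comes from the tree structure rather than from the model: the `Fallback` node overwrites any reply that fails the check, regardless of what the model generated.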
