Watchdogs and Oracles: Runtime Verification Meets Large Language Models for Autonomous Systems
Angelo Ferrando
TL;DR
The paper addresses safety and trust in autonomous systems with learning-enabled components, where traditional formal methods struggle with incomplete models and dynamic environments, while large language models (LLMs) excel at language tasks but remain error-prone and lack formal guarantees. It proposes a symbiotic framework in which runtime verification (RV) enforces safety around LLM-driven autonomy, and LLMs in turn assist RV with specification capture, anticipatory (predictive) reasoning, and handling uncertainty. The contributions are a concrete vision and architecture for LLM-assisted specification, predictive RV, and RV-as-guardrail for LLMs, together with a discussion of certification challenges and a roadmap for future work. The work aims to enable dependable autonomy in high-stakes domains by providing dynamic runtime assurance and more auditable, domain-specific certification evidence.
Abstract
Assuring the safety and trustworthiness of autonomous systems is particularly difficult when learning-enabled components and open environments are involved. Formal methods provide strong guarantees but depend on complete models and static assumptions. Runtime verification (RV) complements them by monitoring executions at run time and, in its predictive variants, by anticipating potential violations. Large language models (LLMs), meanwhile, excel at translating natural language into formal artefacts and recognising patterns in data, yet they remain error-prone and lack formal guarantees. This vision paper argues for a symbiotic integration of RV and LLMs. RV can serve as a guardrail for LLM-driven autonomy, while LLMs can extend RV by assisting specification capture, supporting anticipatory reasoning, and helping to handle uncertainty. We outline how this mutual reinforcement differs from existing surveys and roadmaps, discuss challenges and certification implications, and identify future research directions towards dependable autonomy.
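To make the "RV as a guardrail" idea concrete, the following is a minimal sketch, not the paper's architecture. It assumes a hypothetical robot with a single distance sensor, a hypothetical safety invariant ("never move forward when an obstacle is closer than a threshold"), and a stub function standing in for an LLM-based planner. All names (`State`, `SafetyMonitor`, `guarded_step`, `stub_planner`) are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class State:
    # Hypothetical sensor reading, in metres.
    distance_to_obstacle: float


@dataclass
class SafetyMonitor:
    """Checks one invariant: never move forward when an obstacle
    is closer than min_distance (an assumed example property)."""
    min_distance: float
    violations: int = 0  # number of proposed actions that were blocked

    def permits(self, state: State, action: str) -> bool:
        unsafe = action == "forward" and state.distance_to_obstacle < self.min_distance
        if unsafe:
            self.violations += 1
        return not unsafe


def guarded_step(
    monitor: SafetyMonitor,
    state: State,
    propose: Callable[[State], str],
    fallback: str = "stop",
) -> str:
    """Ask the planner for an action; execute it only if the monitor permits it."""
    action = propose(state)
    return action if monitor.permits(state, action) else fallback


def stub_planner(state: State) -> str:
    # Stand-in for an LLM call: always proposes to move forward.
    return "forward"


monitor = SafetyMonitor(min_distance=1.0)
trace = [State(5.0), State(2.0), State(0.5)]
executed = [guarded_step(monitor, s, stub_planner) for s in trace]
print(executed, monitor.violations)
```

In this sketch the planner is never trusted directly: every proposed action passes through the monitor, and blocked proposals are counted so they can later serve as audit evidence.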
