Osprey: Production-Ready Agentic AI for Safety-Critical Control Systems
Thorsten Hellert, João Montenegro, Antonin Sulc
TL;DR
The paper addresses the safe deployment of agentic AI in large-scale, safety-critical facilities by introducing Osprey, a plan-first orchestration framework. Osprey combines a four-layer architecture (plan-first orchestration, capability classification, connector abstractions, and a safety-enforced execution layer) with MCP-backed tooling and containerized execution to support auditable workflows suited to production use. The key contributions are the design principles, the orchestration and integration strategies, and two case studies: a production deployment at the Advanced Light Source (ALS) and a control-assistant tutorial. Together these demonstrate safe, scalable AI-assisted control. The work matters because it enables repeatable, transparent, and safety-compliant AI automation in complex facilities, and the approach is applicable beyond accelerators to other safety-critical domains.
Abstract
Operating large-scale scientific facilities requires coordinating diverse subsystems, translating operator intent into precise hardware actions, and maintaining strict safety oversight. Language model-driven agents offer a natural interface for these tasks, but most existing approaches are not yet reliable or safe enough for production use. In this paper, we introduce Osprey, a framework for using agentic AI in large, safety-critical facility operations. Osprey is built around the needs of control rooms and addresses these challenges in four ways. First, it uses a plan-first orchestrator that generates complete execution plans, including all dependencies, for human review before any hardware is touched. Second, a coordination layer manages complex data flows, keeps data types consistent, and automatically downsamples large datasets when needed. Third, a classifier dynamically selects only the tools required for a given task, keeping prompts compact as facilities add capabilities. Fourth, connector abstractions and deployment patterns work across different control systems and are ready for day-to-day use. We demonstrate the framework through two case studies: a control-assistant tutorial showing semantic channel mapping and historical data integration, and a production deployment at the Advanced Light Source, where Osprey manages real-time operations across hundreds of thousands of control channels. These results establish Osprey as a production-ready framework for deploying agentic AI in complex, safety-critical environments.
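To make the plan-first idea concrete, the following is a minimal sketch of how a complete plan with explicit dependencies might be ordered and gated behind human approval before any step runs. It is an illustration only, not Osprey's actual API: the `Step` fields, `execution_order`, `run_plan`, and the `approve` and `execute` callbacks are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Step:
    """One planned action. All field names are hypothetical, not Osprey's schema."""
    key: str
    capability: str
    depends_on: list[str] = field(default_factory=list)
    writes_hardware: bool = False


def execution_order(plan: list[Step]) -> list[str]:
    """Return step keys in dependency order.

    Raises ValueError for a dependency on an unknown step and
    graphlib.CycleError if the plan contains a cycle.
    """
    known = {s.key for s in plan}
    for s in plan:
        missing = set(s.depends_on) - known
        if missing:
            raise ValueError(f"step {s.key!r} depends on unknown steps {sorted(missing)}")
    sorter = TopologicalSorter({s.key: set(s.depends_on) for s in plan})
    return list(sorter.static_order())


def run_plan(plan, approve, execute):
    """Execute a plan only after a reviewer has approved the whole plan.

    `approve` receives the full ordered plan before anything runs;
    `execute` is then called once per step, in dependency order.
    """
    by_key = {s.key: s for s in plan}
    ordered = [by_key[k] for k in execution_order(plan)]
    if not approve(ordered):
        return []  # rejected: no step is executed
    return [execute(step) for step in ordered]


# Hypothetical three-step plan: look up channels, read history, then write.
plan = [
    Step("set_value", "channel_write", depends_on=["read_history"], writes_hardware=True),
    Step("find_channels", "channel_finding"),
    Step("read_history", "archiver_read", depends_on=["find_channels"]),
]
```

The point of the design is that the reviewer sees every step, including any that write to hardware, before the first one executes; a rejected plan has no side effects.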
