Osprey: Production-Ready Agentic AI for Safety-Critical Control Systems

Thorsten Hellert, João Montenegro, Antonin Sulc

TL;DR

The paper addresses the safe deployment of agentic AI in large-scale, safety-critical facilities by introducing Osprey, a plan-first orchestration framework. Osprey combines a four-layer architecture (plan-first orchestration, capability classification, connector abstractions, and a safety-enforced execution layer) with MCP-backed tooling and containerized execution, so that workflows are auditable and suitable for production. The key contributions are the design principles, the detailed orchestration and integration strategies, and two case studies (a production deployment at the ALS and a tutorial) that demonstrate safe, scalable AI-assisted control. The work matters because it enables repeatable, transparent, and safety-compliant AI automation in complex facilities, and the approach applies beyond accelerators to other safety-critical domains.

Abstract

Operating large-scale scientific facilities requires coordinating diverse subsystems, translating operator intent into precise hardware actions, and maintaining strict safety oversight. Language model-driven agents offer a natural interface for these tasks, but most existing approaches are not yet reliable or safe enough for production use. In this paper, we introduce Osprey, a framework for using agentic AI in large, safety-critical facility operations. Osprey is built around the needs of control rooms and addresses these challenges in four ways. First, it uses a plan-first orchestrator that generates complete execution plans, including all dependencies, for human review before any hardware is touched. Second, a coordination layer manages complex data flows, keeps data types consistent, and automatically downsamples large datasets when needed. Third, a classifier dynamically selects only the tools required for a given task, keeping prompts compact as facilities add capabilities. Fourth, connector abstractions and deployment patterns work across different control systems and are ready for day-to-day use. We demonstrate the framework through two case studies: a control-assistant tutorial showing semantic channel mapping and historical data integration, and a production deployment at the Advanced Light Source, where Osprey manages real-time operations across hundreds of thousands of control channels. These results establish Osprey as a production-ready framework for deploying agentic AI in complex, safety-critical environments.
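
The plan-first idea in the abstract (generate the complete plan, then hold anything that touches hardware for human review) can be illustrated with a minimal Python sketch. This is not Osprey's API: `PlanStep`, `flagged_steps`, `review_plan`, and the write patterns are hypothetical names chosen for illustration, and the regular expressions simply assume that hardware writes appear as EPICS-style `caput(...)` or `.put(...)` calls in generated code.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns for "this code writes to the control system".
# A real facility would define its own; these are assumptions for the sketch.
WRITE_PATTERNS = [re.compile(r"\bcaput\s*\("), re.compile(r"\.put\s*\(")]


@dataclass
class PlanStep:
    key: str                    # unique step name within the plan
    capability: str             # capability expected to run the step
    code: str = ""              # code or action text associated with the step
    depends_on: list = field(default_factory=list)  # keys of prerequisite steps


def flagged_steps(steps):
    """Return the keys of steps whose code matches any write pattern."""
    return [s.key for s in steps if any(p.search(s.code) for p in WRITE_PATTERNS)]


def review_plan(steps, approve):
    """Gate a complete plan before execution.

    `approve` is a callback standing in for the operator: it receives the
    flagged step keys and returns True to allow execution.
    """
    flagged = flagged_steps(steps)
    if flagged and not approve(flagged):
        return "rejected"
    return "ready"  # execution itself is out of scope for this sketch
```

In this sketch a read-only plan passes straight through, while a plan containing a flagged write step proceeds only if the approval callback returns `True`. The whole plan is checked before any step runs, which is the property the abstract emphasizes.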

Paper Structure

This paper contains 20 sections and 6 figures.

Figures (6)

  • Figure 1: Osprey provides agentic orchestration with human-in-the-loop safety review, translating natural language requests into approved, isolated execution on facility control systems.
  • Figure 2: Detailed workflow of the Osprey orchestration layer. Multi-turn conversational inputs and facility-specific data sources (channel databases, archiver systems, operational memory) are transformed into structured task descriptions with resolved control system context. The classifier dynamically identifies relevant capabilities from the available set (channel finding, data retrieval, machine operations, etc.) and passes selected capabilities to the orchestrator. The orchestrator generates a complete execution plan with explicit dependencies and safety annotations, which undergoes pattern detection to identify control system write operations. Plans requiring hardware interaction pause for operator approval before execution. The agent then executes each step with context tracking, artifact management, and containerized isolation. The capabilities illustrated represent the ALS Accelerator Assistant deployment managing hundreds of thousands of control channels; the architecture supports facility-specific capability sets through the registry system.
  • Figure 3: Jupyter Notebook view in OpenWebUI, exposing the Python script that produced the plotted archiver data and allowing users to inspect or rerun any Python-based execution.
  • Figure 4: Execution plan generated by Osprey for the request "Give me a time series and a correlation plot of all horizontal BPM positions over the last 24 hours." The natural language query is first converted to a structured task through task extraction. After capability classification, the orchestrator produces a five-step plan to parse the time range, resolve channel addresses, retrieve historical data, generate correlation plots, and deliver the final operator-facing response.
  • Figure 5: Example output from the Control Assistant tutorial for the request "Give me a time series and a correlation plot of all horizontal BPM positions over the last 24 hours." The framework resolves the relevant channels, retrieves historical data via the mock archiver, and executes generated Python code in an isolated Jupyter environment to produce the plots.
  • ...and 1 more figure
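
The five-step plan described in the Figure 4 caption can be written down as a small dependency graph, sketched below in Python. The step names and the dependency edges are assumptions made for illustration: the caption lists the steps but not how they depend on one another, and `execution_order` is a hypothetical helper, not part of Osprey. The ordering uses the standard-library `graphlib.TopologicalSorter` (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Each key is a step; each value lists the steps it is assumed to depend on.
PLAN = {
    "parse_time_range": [],
    "resolve_channels": [],
    "retrieve_history": ["parse_time_range", "resolve_channels"],
    "make_plots": ["retrieve_history"],
    "respond": ["make_plots"],
}


def execution_order(plan):
    """Return an order in which every step runs after its dependencies.

    Raises ValueError if a step depends on a name that is not in the plan;
    graphlib raises CycleError if the dependencies are circular.
    """
    known = set(plan)
    unknown = {dep for deps in plan.values() for dep in deps} - known
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")
    return list(TopologicalSorter(plan).static_order())
```

Calling `execution_order(PLAN)` yields one valid ordering in which the two independent steps (time-range parsing and channel resolution) come before data retrieval, and the operator-facing response comes last. Making dependencies explicit in this way is what lets a reviewer see the full plan before anything runs.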