ARGOS: Automated Functional Safety Requirement Synthesis for Embodied AI via Attribute-Guided Combinatorial Reasoning

Dongsheng Chen; Yuxuan Li; Yi Lin; Guanhua Chen; Jiaxin Zhang; Xiangyu Zhao; Lei Ma; Xin Yao; Xuetao Wei

TL;DR

This work tackles the scalability and grounding challenges of functional safety in Embodied AI by introducing ARGOS, a two-stage framework that connects open-ended user instructions to physical risk attributes and regulation-aligned safety requirements. Stage I grounds semantic entities in fine-grained physical attributes and uses combinatorial reasoning over those attributes to discover long-tail hazards. Stage II constrains hazard reasoning with safety standards such as ISO 13482 and with the robot's hardware capabilities to synthesize testable Functional Safety Requirements (FSRs). Experiments show that ARGOS outperforms baselines in hazard discovery quality, long-tail risk coverage, and FSR generation, with results that hold across LLM backbones and align well with human evaluation. By shifting from semantic label-to-label mappings to attribute-based deduction and constrained synthesis, ARGOS offers a scalable, physically grounded path toward safe industrial deployment of Embodied AI.

Abstract

Ensuring functional safety is essential for the deployment of Embodied AI in complex open-world environments. However, traditional Hazard Analysis and Risk Assessment (HARA) methods struggle to scale in this domain. HARA relies on enumerating risks for finite, pre-defined function lists, whereas Embodied AI operates on open-ended natural language instructions, which gives rise to combinatorial interaction risks. Although Large Language Models (LLMs) have emerged as a promising solution to this scalability challenge, they often lack physical grounding and yield semantically superficial, incoherent hazard descriptions. To overcome these limitations, we propose a new framework, ARGOS (AttRibute-Guided cOmbinatorial reaSoning), which bridges the gap between open-ended user instructions and concrete physical attributes. By dynamically decomposing entities from instructions into these fine-grained attributes, ARGOS grounds LLM reasoning in causal risk factors to generate physically plausible hazard scenarios. It then instantiates abstract safety standards, such as ISO 13482, into context-specific Functional Safety Requirements (FSRs) by integrating these scenarios with robot capabilities. Extensive experiments validate that ARGOS produces high-quality FSRs and outperforms baselines in identifying long-tail risks. Overall, this work paves the way for systematic and grounded functional safety requirement generation, a critical step toward the safe industrial deployment of Embodied AI.
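To make the Stage I idea concrete, the following is a minimal sketch of attribute-guided combinatorial reasoning, not the authors' implementation. It assumes a hand-written attribute table and a small rule table standing in for the LLM-driven decomposition and hazard inference described above; the names `ATTRIBUTES`, `RISK_RULES`, and `discover_hazards`, and every entry in the tables, are hypothetical.

```python
from itertools import combinations

# Hypothetical attribute table. In ARGOS, attributes are decomposed
# dynamically from the instruction; here they are hard-coded for illustration.
ATTRIBUTES = {
    "knife": {"sharp", "rigid"},
    "kettle": {"hot", "liquid-filled"},
    "child": {"human", "vulnerable"},
    "power strip": {"electrified"},
}

# Hypothetical rules mapping an unordered attribute pair to a hazard type.
# This lookup is a stand-in for LLM reasoning over causal risk factors.
RISK_RULES = {
    frozenset({"sharp", "human"}): "laceration",
    frozenset({"hot", "human"}): "burn",
    frozenset({"liquid-filled", "electrified"}): "short circuit",
}


def discover_hazards(entities):
    """Return (entity_a, attr_a, entity_b, attr_b, hazard) tuples.

    Every pair of entities is expanded into pairs of attributes, so a
    hazard is found through what the objects are like physically rather
    than through their semantic labels.
    """
    hazards = []
    for a, b in combinations(entities, 2):
        # Unknown entities have no attributes and therefore yield nothing.
        for attr_a in sorted(ATTRIBUTES.get(a, ())):
            for attr_b in sorted(ATTRIBUTES.get(b, ())):
                hazard = RISK_RULES.get(frozenset({attr_a, attr_b}))
                if hazard is not None:
                    hazards.append((a, attr_a, b, attr_b, hazard))
    return hazards


if __name__ == "__main__":
    scene = ["knife", "kettle", "child", "power strip"]
    for record in discover_hazards(scene):
        print(record)
```

In this toy scene the kettle and the power strip are unrelated at the label level, yet the pair surfaces as a candidate hazard because of the `liquid-filled` and `electrified` attributes. That is the kind of long-tail interaction the attribute-level view is meant to expose.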

Paper Structure (38 sections, 4 equations, 4 figures, 5 tables)

Introduction
Related Work
Safety Engineering in the Open World
From Software Requirements to Embodied AI
Automated Hazard Analysis
Our Framework: ARGOS
Stage I: Attribute-Guided Hazard Discovery
Semantic Parsing and Attribute Injection
Combinatorial Hazard Inference
Stage II: Scenario-Anchored Requirement Synthesis
Regulatory Alignment & Constraint Injection
Physics-Aware Requirement Generation
Experiments
Experimental Setup
Baselines
...and 23 more sections

Figures (4)

Figure 1: The proposed ARGOS framework: A two-stage pipeline for automated FSR synthesis. Stage I focuses on decomposing semantic entities into physical attributes for combinatorial hazard discovery, while Stage II aligns these hazards with regulatory standards and hardware constraints to generate requirements.
Figure 2: Statistical analysis of generation quality. The violin plot illustrates the score density across methods.
Figure 3: Qualitative Analysis: Semantic Diversity and Evaluation Alignment.
Figure 4: Visualizations for the GPT-4o Backbone. (a) The violin plot confirms that our method maintains a "top-heavy" high-quality distribution even on the stronger GPT-4o model, whereas baselines still exhibit long-tail risks. (b) The t-SNE projection shows that our method (red) covers a distinct and broader semantic space compared to the baselines, consistent with the findings on DeepSeek-V3.2.
