Graphormer-Guided Task Planning: Beyond Static Rules with LLM Safety Perception

Wanjing Huang, Tongjie Pan, Yalan Ye

TL;DR

This paper addresses unsafe open-world robotic task planning by integrating a Graphormer-based risk perception module with LLM-driven planning. It introduces a dynamic spatio-semantic safety graph to detect hazards online and trigger adaptive task replanning. Experiments in AI2-THOR show higher risk detection recall, more timely safety notices, and improved task adaptability over static-rule and LLM-only baselines. The work demonstrates a scalable framework for proactive safety in robotics, with code available at the provided GitHub link.

Abstract

Recent advancements in large language models (LLMs) have expanded their role in robotic task planning. However, while LLMs have been explored for generating feasible task sequences, their ability to ensure safe task execution remains underdeveloped. Existing methods struggle with structured risk perception, making them inadequate for safety-critical applications where low-latency hazard adaptation is required. To address this limitation, we propose a Graphormer-enhanced risk-aware task planning framework that combines LLM-based decision-making with structured safety modeling. Our approach constructs a dynamic spatio-semantic safety graph, capturing spatial and contextual risk factors to enable online hazard detection and adaptive task refinement. Unlike existing methods that rely on predefined safety constraints, our framework introduces a context-aware risk perception module that continuously refines safety predictions based on real-time task execution. This enables a more flexible and scalable approach to robotic planning, allowing for adaptive safety compliance beyond static rules. To validate our framework, we conduct experiments in the AI2-THOR environment. The experimental results show that, in continuous environments, our framework improves risk detection accuracy, the raising of safety notices, and task adaptability compared to static rule-based and LLM-only baselines. Our project is available at https://github.com/hwj20/GGTP

Paper Structure

This paper contains 18 sections, 9 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Comparison between the static rule-based method and our approach. LTL-based safety enforcement relies on predefined constraints, which inherently lack spatial perception and fail to capture unenumerated risks. As a result, when a robot executes a task such as retrieving ingredients from the refrigerator, it is unable to recognize a child approaching a high-temperature oven. Consequently, LTL transmits an incorrect "safe" signal to the LLM planner, leading to a failure in proactive hazard avoidance. In contrast, our method models spatial-semantic relationships, enabling real-time risk assessment and adaptive task modification.
  • Figure 2: Graphormer Overview. Our framework integrates Graphormer-based risk modeling with LLM-driven task planning to enable instantaneous safety adaptation. The system constructs a context-aware spatio-semantic safety graph from environmental observations, where high-risk interactions are identified using attention-weighted edge representations. The Graphormer module then automatically translates the dangerous edges into natural language.
  • Figure 3: Precision-Recall analysis of our method. Due to the severe class imbalance in our dataset, random guessing yields near-zero precision. In contrast, by setting a decision threshold of $0.21$, our method achieves a precision of 30% while maintaining a recall above 90%. For example, in a dataset of 10,000 edges of which only 100 are hazardous, the model would flag 300 edges as potentially dangerous and capture 90 of the 100 truly hazardous edges (90/300 = 30% precision, 90/100 = 90% recall). This trade-off prioritizes detecting critical risks while keeping false alarms limited.
  • Figure 4: Stages of a complete cooking task in AI2-THOR (FloorPlan2) from the perspective of the executing agent. The task includes picking up ingredients and placing them into a pan. The first stage, "HandleSafetyIssue" for the "Baby", is not natively supported in AI2-THOR; however, we implement a script-based check to determine whether the model actively resolves safety issues.
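
The arithmetic behind the Figure 3 example can be checked with a short script. This is only a sketch of the standard precision and recall definitions applied to the counts quoted in the caption; the function and variable names are illustrative assumptions and are not taken from the paper or its repository. The random-guess baseline uses the general fact that a classifier flagging edges at random has expected precision equal to the share of hazardous edges.

```python
from fractions import Fraction


def precision_recall(true_positives, flagged, actual_positives):
    """Return (precision, recall) as exact fractions.

    precision = true positives / all edges flagged as hazardous
    recall    = true positives / all edges that are truly hazardous
    """
    precision = Fraction(true_positives, flagged)
    recall = Fraction(true_positives, actual_positives)
    return precision, recall


# Counts quoted in the Figure 3 caption (illustrative dataset).
total_edges = 10_000
hazardous_edges = 100
flagged_edges = 300
true_positives = 90

precision, recall = precision_recall(true_positives, flagged_edges, hazardous_edges)

# Derived quantities that the caption does not state explicitly.
false_alarms = flagged_edges - true_positives      # flagged but actually safe
missed_hazards = hazardous_edges - true_positives  # hazardous but not flagged

# Expected precision of random guessing equals the hazardous-edge base rate.
random_precision = Fraction(hazardous_edges, total_edges)

print(f"precision        = {float(precision):.2f}")
print(f"recall           = {float(recall):.2f}")
print(f"false alarms     = {false_alarms}")
print(f"missed hazards   = {missed_hazards}")
print(f"random precision = {float(random_precision):.2f}")
```

Under these counts, 210 of the 300 flagged edges are false alarms and 10 hazardous edges are missed, while precision is thirty times the 1% base rate that random guessing would achieve.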