Safety-aware Causal Representation for Trustworthy Offline Reinforcement Learning in Autonomous Driving

Haohong Lin; Wenhao Ding; Zuxin Liu; Yaru Niu; Jiacheng Zhu; Yuming Niu; Ding Zhao

TL;DR

Empirical evidence indicates that FUSION significantly enhances the safety and generalizability of autonomous driving agents, even in challenging and unseen environments, and ablation studies reveal noticeable improvements from integrating causal representation into the offline safe RL algorithm.

Abstract

In autonomous driving, offline reinforcement learning (RL) approaches are notably effective at solving sequential decision-making problems from offline datasets. However, maintaining safety across diverse safety-critical scenarios remains a significant challenge because long-tailed and unforeseen scenarios are absent from offline datasets. In this paper, we introduce the saFety-aware strUctured Scenario representatION (FUSION), a pioneering representation learning method for offline RL that facilitates learning a generalizable end-to-end driving policy by leveraging structured scenario information. FUSION exploits the causal relationships among the decomposed reward, cost, state, and action spaces, constructing a framework for structured sequential reasoning in dynamic traffic environments. We conduct extensive evaluations in two typical real-world settings of distribution shift in autonomous vehicles, demonstrating a good balance between safety cost and utility reward compared to current state-of-the-art safe RL and imitation learning (IL) baselines. Empirical evidence across various driving scenarios shows that FUSION significantly enhances the safety and generalizability of autonomous driving agents, even in challenging and unseen environments. Furthermore, our ablation studies reveal noticeable improvements from integrating causal representation into the offline safe RL algorithm. Our code implementation is available at: https://sites.google.com/view/safe-fusion/.

Paper Structure (12 sections, 7 equations, 6 figures, 2 tables, 2 algorithms)

Introduction
Related Works
Problem Formulation
Methodology
Causal Ensemble World Model Learning
Safety-aware Bisimulation Learning
Experiments
Experiment Setup
Evaluation Environment
Baselines
Results and Analysis
Conclusions

Figures (6)

Figure 1: Diagram depicting offline-to-online generalization via a modular reasoning framework. The agent learns a causal abstraction from offline demonstration trajectories and then applies it to different environmental components during online deployment. The distribution shift between the offline datasets and the online environment can lead to unsatisfactory safety or efficiency in driving performance. The abstracted representation enables learning agile agents that handle unseen scenarios in a zero-shot manner while enhancing safety and efficiency.
Figure 2: Overview of the safety-aware structured scenario representation (FUSION) framework. The left diagram shows a safety-aware decision transformer that performs sequential decision-making based on temporal context. The right diagram shows the general form of the graphical model in the Causal Ensemble World Model (CEWM) and Policy Learning modules of FUSION, where the connections between timesteps are determined by the attention weights in the causal transformer. Nodes in a later timestep depend on their parent nodes in previous timesteps.
Figure 3: Safety-aware bisimulation metrics with the distribution distance in transition dynamics, rewards, and safety cost.
Figure 4: Bird's-eye-view (BEV) and third-person-view (3PV) images of two case studies in roundabouts. In the first case, a vehicle merges in from normal traffic, and the ego vehicle controlled by FUSION decelerates to keep its distance from the front vehicle. In the second case, an adversarial driver tries to cut in from the wrong side of the roundabout exit, and FUSION manages to yield to it safely.
Figure 5: Comparison of FUSION with baselines on different road configurations. A larger extent along each axis of the radar plot indicates safer performance on the corresponding safety metric. (AR: Arrival, NS: Not speeding, IT: In-time, CF: Collision-free, SL: Stay in-lane.)
...and 1 more figure

Theorems & Definitions (5)

Definition 1
Definition 2: Factorizable State Space
Definition 3
Definition 4: Safety-aware Bisimulation Relation
Definition 5: Safety-aware Bisimulation Metrics
