GENESIS-RL: GEnerating Natural Edge-cases with Systematic Integration of Safety considerations and Reinforcement Learning

Hsin-Jung Yang; Joe Beck; Md Zahid Hasan; Ekin Beyazit; Subhadeep Chakraborty; Tichakorn Wongpiromsarn; Soumik Sarkar

Abstract

In the rapidly evolving field of autonomous systems, the safety and reliability of system components are fundamental requirements. These components are often vulnerable to complex and unforeseen environments, making natural edge-case generation essential for enhancing system resilience. This paper presents GENESIS-RL, a novel framework that leverages system-level safety considerations and reinforcement learning techniques to systematically generate naturalistic edge cases. By simulating challenging conditions that mimic real-world situations, our framework aims to rigorously test the entire system's safety and reliability. Although demonstrated within an autonomous driving application, our methodology is adaptable across diverse autonomous systems. Our experimental validation, conducted on a high-fidelity simulator, underscores the overall effectiveness of this framework.

Paper Structure (35 sections, 2 equations, 3 figures)

Introduction
Related Works
Background
Deep Reinforcement Learning
Rulebook
Methodology
DRL Problem Formulation
State space
Action space
Reward
GENESIS-RL Framework
DRL agent
Initial scene generator
Simulator
System
...and 20 more sections

Figures (3)

Figure 1: Architectural overview of the proposed framework. At each step $t$, the DRL agent observes a state $s_t$ (1) and executes an action $a_t$ (2). The simulator then updates the simulated world accordingly and creates an updated frame. The updated frame (3) is then processed by the system to generate vehicle control signals (4). The control signals are then applied to the simulated world to update the vehicle trajectory (5). The reward calculator evaluates the performance by comparing the ego vehicle's trajectory against the rulebook and also computes the learning module loss, issuing a scalar reward $r_t$ (6) that guides the DRL agent's learning process.
Figure 2: Violation scores and the sum of the minimum following distance deficit (based on RSS) across three testing scenarios: the system operating under sunny weather, under random weather, and under the GENESIS-RL policy. The results presented are averages over 50 runs with randomly selected initial scenes.
Figure 3: Examples of system failure modes based on vehicle telemetry. (a) Non-detection collision: the system crashes into the vehicle in front at full or high speed because it never detects the other vehicle. (b) Intermittent detection collision: intermittent detection prevents the ego vehicle from stopping in time, leading to a lower-speed collision. (c) Delayed detection collision: the system detects the lead car too late to stop in time. In the tests depicted in this figure, the ego vehicle avoided collisions in both the sunny and random weather scenarios and failed only under the conditions generated by GENESIS-RL.
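
The six-step interaction loop described in the Figure 1 caption can be illustrated with a minimal sketch. Everything below is a toy stand-in and an assumption for illustration, not the paper's implementation: `ToySimulator`, `system_under_test`, `reward`, the fog parameter, the stationary lead car, and the threshold `SAFE_GAP_M` are all hypothetical. The DRL agent is replaced by fixed policies, no learning takes place, and the learning module loss term mentioned in the caption is omitted.

```python
from dataclasses import dataclass

SAFE_GAP_M = 10.0  # hypothetical rulebook threshold, not a value from the paper


@dataclass
class ToySimulator:
    """Hypothetical simulator: one weather parameter and a stationary lead car."""
    fog: float = 0.0         # environment parameter set by the agent, in [0, 1]
    gap_m: float = 50.0      # distance from the ego vehicle to the lead car
    ego_speed: float = 10.0  # ego speed in m/s
    collided: bool = False

    def observe(self):
        return (self.fog, self.gap_m, self.ego_speed)

    def apply_env_action(self, delta_fog):
        # The agent perturbs the environment; the returned dict plays the role of a frame.
        self.fog = min(1.0, max(0.0, self.fog + delta_fog))
        return {"fog": self.fog, "gap_m": self.gap_m}

    def apply_control(self, brake, dt=0.5):
        # Braking removes 4 m/s^2 of speed; the gap shrinks by the distance travelled.
        if brake:
            self.ego_speed = max(0.0, self.ego_speed - 4.0 * dt)
        self.gap_m = max(0.0, self.gap_m - self.ego_speed * dt)
        self.collided = self.gap_m == 0.0


def system_under_test(frame):
    """Toy perception and control: brake only if the lead car is within a
    detection range that shrinks as fog increases."""
    detection_range = 40.0 * (1.0 - frame["fog"])
    return frame["gap_m"] < detection_range


def reward(sim):
    """Toy violation score: how far the gap falls below the safe threshold."""
    return max(0.0, SAFE_GAP_M - sim.gap_m)


def run_episode(policy, steps=20):
    sim = ToySimulator()
    total = 0.0
    for _ in range(steps):
        s_t = sim.observe()                # (1) agent observes the state
        a_t = policy(s_t)                  # (2) agent picks an environment action
        frame = sim.apply_env_action(a_t)  # (3) simulator produces an updated frame
        brake = system_under_test(frame)   # (4) system outputs a control signal
        sim.apply_control(brake)           # (5) control updates the ego trajectory
        total += reward(sim)               # (6) scalar reward from the rule check
        if sim.collided:
            break
    return total, sim.collided


def clear_policy(state):
    return 0.0   # leave the weather untouched


def foggy_policy(state):
    return 0.25  # thicken the fog at every step


if __name__ == "__main__":
    print("clear:", run_episode(clear_policy))
    print("foggy:", run_episode(foggy_policy))
```

In this toy setting the fixed foggy policy degrades detection enough to cause a collision, while the clear-weather policy incurs no violation. A trained agent would be expected to discover such conditions by maximizing the reward rather than having them hard-coded.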

Theorems & Definitions (2)

Remark
Remark
