Table of Contents
Fetching ...

Model Checking for Reinforcement Learning in Autonomous Driving: One Can Do More Than You Think!

Rong Gu

TL;DR

This paper proposes using formal methods to model AD systems and demonstrates how model checking (MC) can be used in RL for AD and shows different facets where MC can strengthen RL.

Abstract

Most reinforcement learning (RL) platforms use high-level programming languages, such as OpenAI Gymnasium using Python. These frameworks provide various API and benchmarks for testing RL algorithms in different domains, such as autonomous driving (AD) and robotics. These platforms often emphasise the design of RL algorithms and the training performance but neglect the correctness of models and reward functions, which can be crucial for the successful application of RL. This paper proposes using formal methods to model AD systems and demonstrates how model checking (MC) can be used in RL for AD. Most studies combining MC and RL focus on safety, such as safety shields. However, this paper shows different facets where MC can strengthen RL. First, an MC-based model pre-analysis can reveal bugs with respect to sensor accuracy and learning step size. This step serves as a preparation of RL, which saves time if bugs exist and deepens users' understanding of the target system. Second, reward automata can benefit the design of reward functions and greatly improve learning performance especially when the learning objectives are multiple. All these findings are supported by experiments.

Model Checking for Reinforcement Learning in Autonomous Driving: One Can Do More Than You Think!

TL;DR

This paper proposes using formal methods to model AD systems and demonstrates how model checking (MC) can be used in RL for AD and shows different facets where MC can strengthen RL.

Abstract

Most reinforcement learning (RL) platforms use high-level programming languages, such as OpenAI Gymnasium using Python. These frameworks provide various API and benchmarks for testing RL algorithms in different domains, such as autonomous driving (AD) and robotics. These platforms often emphasise the design of RL algorithms and the training performance but neglect the correctness of models and reward functions, which can be crucial for the successful application of RL. This paper proposes using formal methods to model AD systems and demonstrates how model checking (MC) can be used in RL for AD. Most studies combining MC and RL focus on safety, such as safety shields. However, this paper shows different facets where MC can strengthen RL. First, an MC-based model pre-analysis can reveal bugs with respect to sensor accuracy and learning step size. This step serves as a preparation of RL, which saves time if bugs exist and deepens users' understanding of the target system. Second, reward automata can benefit the design of reward functions and greatly improve learning performance especially when the learning objectives are multiple. All these findings are supported by experiments.

Paper Structure

This paper contains 18 sections, 11 equations, 7 figures, 3 tables, 1 algorithm.

Figures (7)

  • Figure 1: CommonUppRoad, a platform for model-checking-based AD motion planning, verification, and visualisation. Figure adopted from the literature gu2024commonupproad.
  • Figure 2: UPPAAL model templates of AD vehicles. In (a), (b), and (c), s, d, and p are broadcast synchronisation channels, sense(), can(), and act() are C-like code functions, a, c, and actID are integers, and P, C1, C2, D, U, and MACT are constant integers, and t is a clock. In (d), variables x, y, v, acc, and head are hybrid clocks, and iJerk and iTurn are integers, equations like v'==acc define the derivatives of the hybrid clocks, and i2d is a C-like code function.
  • Figure 3: Obstacle
  • Figure 4: An illustration of potential problems when running RL without pre-analysis.
  • Figure 5: An example of reward automata. Hybrid clocks sa, co, and pr represent the rewards for safety, comfort, and progress, respectively.
  • ...and 2 more figures