Table of Contents
Fetching ...

Fault-Tolerant Design and Multi-Objective Model Checking for Real-Time Deep Reinforcement Learning Systems

Guoxin Su, Thomas Robinson, Hoa Khanh Dam, Li Liu, David S. Rosenblum

Abstract

Deep reinforcement learning (DRL) has emerged as a powerful paradigm for solving complex decision-making problems. However, DRL-based systems still face significant dependability challenges particularly in real-time environments due to the simulation-to-reality gap, out-of-distribution observations, and the critical impact of latency. Latency-induced faults, in particular, can lead to unsafe or unstable behaviour, yet existing fault-tolerance approaches to DRL systems lack formal methods to rigorously analyse and optimise performance and safety simultaneously in real-time settings. To address this, we propose a formal framework for designing and analysing real-time switching mechanisms between DRL agents and alternative controllers. Our approach leverages Timed Automata (TAs) for explicit switch logic design, which is then syntactically converted to a Markov Decision Process (MDP) for formal analysis. We develop a novel convex query technique for multi-objective model checking, enabling the optimisation of soft performance objectives while ensuring hard safety constraints for MDPs. Furthermore, we present MOPMC, a GPU-accelerated software tool implementing this technique, demonstrating superior scalability in both model size and objective numbers.

Fault-Tolerant Design and Multi-Objective Model Checking for Real-Time Deep Reinforcement Learning Systems

Abstract

Deep reinforcement learning (DRL) has emerged as a powerful paradigm for solving complex decision-making problems. However, DRL-based systems still face significant dependability challenges particularly in real-time environments due to the simulation-to-reality gap, out-of-distribution observations, and the critical impact of latency. Latency-induced faults, in particular, can lead to unsafe or unstable behaviour, yet existing fault-tolerance approaches to DRL systems lack formal methods to rigorously analyse and optimise performance and safety simultaneously in real-time settings. To address this, we propose a formal framework for designing and analysing real-time switching mechanisms between DRL agents and alternative controllers. Our approach leverages Timed Automata (TAs) for explicit switch logic design, which is then syntactically converted to a Markov Decision Process (MDP) for formal analysis. We develop a novel convex query technique for multi-objective model checking, enabling the optimisation of soft performance objectives while ensuring hard safety constraints for MDPs. Furthermore, we present MOPMC, a GPU-accelerated software tool implementing this technique, demonstrating superior scalability in both model size and objective numbers.
Paper Structure (42 sections, 7 theorems, 7 equations, 9 figures, 5 tables, 1 algorithm)

This paper contains 42 sections, 7 theorems, 7 equations, 9 figures, 5 tables, 1 algorithm.

Key Result

Lemma 4.1

Let $\Psi\subseteq \mathbb{R}^n$ be a convex set of points. For any $\boldsymbol{v}\not\in \Psi$, there is a weight vector $\boldsymbol{w}$ such that $\boldsymbol{w}\cdot\boldsymbol{v}> \boldsymbol{w}\cdot\boldsymbol{x}$ for all $\boldsymbol{x}\in \Psi$. We say that $\boldsymbol{w}$separates$\boldsy

Figures (9)

  • Figure 1: DRL Agent for Autonomous Driving
  • Figure 2: Real-time Switch Architecture
  • Figure 3: Overview of the Switch Design Approach
  • Figure 4: $\mathcal{A}_{\mathit{sw}0}$: Base Model of a Real-time Switch
  • Figure 5: TAs for Synchronised Composition with $\mathcal{A}_{\mathit{SW}0}$
  • ...and 4 more figures

Theorems & Definitions (14)

  • Definition 3.1: Timed automata
  • Definition 3.3: Synchronised composition
  • Definition 3.4: Markov decision process
  • Definition 3.5: Scheduler
  • Definition 3.6: Reward function
  • Definition 3.7: Total reward
  • Lemma 4.1: Separating and supporting hyperplane theorems boyd2004convex
  • Lemma 4.2: Polytope for multi-objective MDP Etessami2008Forejt2011
  • Definition 4.3: Convex query
  • Theorem 4.4: Correctness of Algorithm \ref{['alg:main-convex-query']}
  • ...and 4 more