Multi-agent deep reinforcement learning with centralized training and decentralized execution for transportation infrastructure management

M. Saifullah, K. G. Papakonstantinou, C. P. Andriotis, S. M. Stoffels

TL;DR

The paper tackles life-cycle inspection and maintenance for large transportation networks under uncertainty by casting the problem as a constrained POMDP and solving it with a scalable multi-agent DRL approach. It introduces DDMAC-CTDE, a centralized-training, decentralized-execution framework that assigns one agent per component and uses a centralized critic to guide learning, enabling near-optimal cross-asset decisions. The authors demonstrate substantial cost savings over Condition-Based Maintenance (CBM) and Virginia Department of Transportation (VDOT) baseline policies on a detailed Hampton Roads network, while meeting hard budget and soft performance constraints. The work advances practical, constraint-aware DRL for infrastructure management and provides a comprehensive modeling environment that links pavements and bridges through the Critical Condition Index (CCI) and International Roughness Index (IRI), gamma-process deterioration, and network-wide risk metrics.

Abstract

We present a multi-agent Deep Reinforcement Learning (DRL) framework for managing large transportation infrastructure systems over their life-cycle. Life-cycle management of such engineering systems is a computationally intensive task, requiring sequential inspection and maintenance decisions that reduce long-term risks and costs while dealing with uncertainties and constraints that lie in high-dimensional spaces. To date, static age- or condition-based maintenance methods and risk-based or periodic inspection plans have mostly addressed this class of optimization problems. However, such approaches often suffer from optimality, scalability, and uncertainty-handling limitations. The optimization problem in this work is cast in the framework of constrained Partially Observable Markov Decision Processes (POMDPs), which provides a comprehensive mathematical basis for stochastic sequential decision settings with observation uncertainties, risk considerations, and limited resources. To address very large state and action spaces, a Deep Decentralized Multi-agent Actor-Critic (DDMAC) DRL method with Centralized Training and Decentralized Execution (CTDE), termed DDMAC-CTDE, is developed. The performance strengths of the DDMAC-CTDE method are demonstrated in a representative and realistic example application of an existing transportation network in Virginia, USA. The network includes several bridge and pavement components with nonstationary degradation, agency-imposed constraints, and traffic delay and risk considerations. Compared to traditional management policies for transportation networks, the proposed DDMAC-CTDE method vastly outperforms its counterparts. Overall, the proposed algorithmic framework provides near-optimal solutions for transportation infrastructure management under real-world constraints and complexities.
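
As a rough illustration of what "observation uncertainties" mean in a POMDP setting, the sketch below performs one standard Bayesian belief update for a single discrete-state component. It is not taken from the paper: the three condition states, the transition matrix `T`, the inspection-accuracy matrix `O`, and the function name `belief_update` are all hypothetical values chosen for illustration.

```python
def belief_update(belief, transition, observation_model, obs):
    """One Bayesian belief update for a discrete-state POMDP component.

    belief[i]               : P(state = i) before the action
    transition[i][j]        : P(next state = j | state = i) for the chosen action
    observation_model[j][o] : P(observation = o | next state = j)
    """
    n = len(belief)
    # Prediction step: propagate the belief through the transition model.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # Correction step: weight each state by the likelihood of the observation.
    unnormalized = [predicted[j] * observation_model[j][obs] for j in range(n)]
    evidence = sum(unnormalized)
    if evidence == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / evidence for u in unnormalized]


# Hypothetical 3-state component (0 = good, 1 = fair, 2 = poor), do-nothing action.
T = [[0.8, 0.2, 0.0],
     [0.0, 0.7, 0.3],
     [0.0, 0.0, 1.0]]
# Hypothetical inspection accuracy: rows are true states, columns are readings.
O = [[0.9, 0.1, 0.0],
     [0.1, 0.8, 0.1],
     [0.0, 0.1, 0.9]]

b0 = [1.0, 0.0, 0.0]                   # component known to start in "good"
b1 = belief_update(b0, T, O, obs=1)    # the inspection reports "fair"
print(b1)
```

In a multi-agent arrangement such as the one described above, each component would carry a belief vector of this kind, and the policy would act on beliefs rather than on the unobserved true states.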

Paper Structure

This paper contains 40 sections, 34 equations, 12 figures, 32 tables, 3 algorithms.

Figures (12)

  • Figure 1: Constrained Deep Decentralized Multi-agent Actor Critic (DDMAC) with Centralized Training Decentralized Execution (CTDE) architecture.
  • Figure 2: Modeled mean CCI for different levels of traffic.
  • Figure 3: (a) Fitted gamma model, (b) Scatter plot for CCI corresponding to traffic level A.
  • Figure 4: Transition probabilities for Traffic level A, with (a) starting state = 6, (b) starting state = 5, smoothed over time.
  • Figure 5: Transition probabilities in time, moving from state 9 (left) and 8 (right) to lower states.
  • ...and 7 more figures
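
The captions above refer to a fitted gamma model for pavement condition. The following is a minimal sketch of how a nonstationary gamma process generates a monotone deterioration path, assuming a power-law shape function `a * t**c` and a constant scale. The parameter values, the function name `simulate_gamma_process`, and the mapping to a condition index are hypothetical and are not the paper's fitted values.

```python
import random


def simulate_gamma_process(n_steps, a, c, scale, rng):
    """Sample cumulative deterioration with independent gamma increments.

    The shape function is assumed to be a * t**c, so the increment over
    [t, t + 1] has shape a * ((t + 1)**c - t**c) and the given scale.
    """
    level = 0.0
    path = [level]
    for t in range(n_steps):
        d_shape = a * ((t + 1) ** c - t ** c)
        # Gamma increments are non-negative, so the path never decreases.
        level += rng.gammavariate(d_shape, scale)
        path.append(level)
    return path


rng = random.Random(0)  # fixed seed so the sketch is reproducible
path = simulate_gamma_process(20, a=1.5, c=1.2, scale=2.0, rng=rng)

# Hypothetical mapping to a 0-100 condition index that starts at 100.
condition = [max(0.0, 100.0 - d) for d in path]
print(condition[:5])
```

With `c = 1` the increments are identically distributed (a stationary process); values of `c` above 1 make expected deterioration accelerate with age, which is one simple way to represent nonstationary degradation.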