Maintenance Strategies for Sewer Pipes with Multi-State Degradation and Deep Reinforcement Learning

Lisandro A. Jimenez-Roa; Thiago D. Simão; Zaharah Bukhsh; Tiedo Tinga; Hajo Molegraaf; Nils Jansen; Marielle Stoelinga

TL;DR

This work addresses maintenance policy optimization for sewer pipes under multi-state degradation by integrating Multi-State Degradation Models (MSDM) with Deep Reinforcement Learning (DRL). The authors formulate a prognostics-informed Markov Decision Process and train PPO-based agents in MSDM-driven environments, evaluating them against traditional heuristics in a Dutch case study (Breda, 25k+ pipes). Key findings show that DRL policies, particularly those trained with Gompertz-based MSDMs, adapt to pipe age and surpass condition-based, scheduled, and reactive maintenance in cost efficiency while maintaining lower degradation levels. The study demonstrates the practical potential of DRL in PHM for long-lived civil infrastructure and highlights the value of incorporating prognostic outputs into the RL state for improved decision-making. Proposed future work covers partial observability, system-level expansion, and broader algorithmic comparisons, with the goal of more robust and explainable maintenance strategies.

Abstract

Large-scale infrastructure systems are crucial for societal welfare, and their effective management requires strategic forecasting and intervention methods that account for various complexities. Our study addresses two challenges within the Prognostics and Health Management (PHM) framework applied to sewer assets: modeling pipe degradation across severity levels and developing effective maintenance policies. We employ Multi-State Degradation Models (MSDM) to represent the stochastic degradation process in sewer pipes and use Deep Reinforcement Learning (DRL) to devise maintenance strategies. A case study of a Dutch sewer network exemplifies our methodology. Our findings demonstrate the model's effectiveness in generating intelligent, cost-saving maintenance strategies that surpass heuristics. The learned policy adapts its management strategy to the pipe's age, opting for a passive approach for newer pipes and transitioning to active strategies for older ones to prevent failures and reduce costs. This research highlights DRL's potential in optimizing maintenance policies. Future research will aim to improve the model by incorporating partial observability, exploring various reinforcement learning algorithms, and extending this methodology to comprehensive infrastructure management.
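To make the idea of a multi-state degradation model concrete, the following is a minimal sketch, not the authors' implementation. It assumes a strictly sequential chain of severity states (each state can only move to the next one), a Gompertz hazard $h(t) = a\,e^{bt}$ for every transition, and hypothetical parameter values; the function names are illustrative. The paper's IHTMC structure and fitted parameters may differ.

```python
import math
import random


def gompertz_cum_hazard(t, a, b):
    """Cumulative Gompertz hazard H(t) = (a / b) * (exp(b * t) - 1)."""
    return (a / b) * (math.exp(b * t) - 1.0)


def step_probability(t, dt, a, b):
    """Probability of leaving the current state during [t, t + dt]."""
    delta = gompertz_cum_hazard(t + dt, a, b) - gompertz_cum_hazard(t, a, b)
    return 1.0 - math.exp(-delta)


def simulate_pipe(params, horizon, dt=1.0, rng=None):
    """Simulate one severity trajectory over pipe age.

    params[k] = (a, b) are hypothetical Gompertz parameters for the
    assumed k -> k+1 transition; the last state is absorbing.
    """
    if rng is None:
        rng = random.Random()
    state, t, path = 0, 0.0, [0]
    while t < horizon:
        if state < len(params):
            a, b = params[state]
            if rng.random() < step_probability(t, dt, a, b):
                state += 1
        t += dt
        path.append(state)
    return path


def state_probabilities(params, horizon, n_runs=2000, seed=0):
    """Monte Carlo estimate of the state distribution at age `horizon`."""
    rng = random.Random(seed)
    counts = [0] * (len(params) + 1)
    for _ in range(n_runs):
        counts[simulate_pipe(params, horizon, rng=rng)[-1]] += 1
    return [c / n_runs for c in counts]


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    demo_params = [(0.01, 0.05), (0.005, 0.06), (0.002, 0.07)]
    print(state_probabilities(demo_params, horizon=50))
```

Because the hazard depends on pipe age, the transition probability per step grows as the pipe gets older. This is the property that lets an age-aware policy stay passive for new pipes and intervene more often for old ones.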

Paper Structure (36 sections, 21 equations, 7 figures, 8 tables)

Introduction
Contributions.
Paper outline.
Related work.
Technical background
Multi-state degradation model for sewer pipes
IHTMC.
Pipe-element degradation model.
Parametrization of IHTMC.
Markov Decision Process
Deep Reinforcement Learning
Methodology
Multi-state degradation models
Case study
Parametrization
...and 21 more sections

Figures (7)

Figure 1: Markov chain structure for IHTMC.
Figure 2: Methodology overview for sewer pipe maintenance policy optimization using Deep Reinforcement Learning and Multi-State Degradation models.
Figure 3: Probability $S_k(t)$ of being in state $k \in \Omega$ at pipe age $t$, using three hazard functions modeled via Exponential, Gompertz, and Weibull probability density functions. The Turnbull non-parametric estimator indicates the ground truth. The gray circles indicate the frequency based on the inspection data set.
Figure 4: Environment for maintenance policy optimization of a sewer pipe via Deep Reinforcement Learning, considering degradation along the pipe length.
Figure 5: Behavior of policies over an episode for a new pipe, showing the health vector over the pipe age and actions per policy: (a) Agent-G, (b) Agent-E, (c) Condition-based Maintenance (CBM), and (d) Scheduled Maintenance (SchM).
...and 2 more figures
