A reinforcement learning agent for maintenance of deteriorating systems with increasingly imperfect repairs

Alberto Pliego Marugán, Jesús M. Pinar-Pérez, Fausto Pedro García Márquez

TL;DR

This work addresses maintenance optimization for deteriorating systems by coupling a continuous gamma degradation process with a history-dependent, increasingly imperfect repair model and a reinforcement learning agent based on Double Deep Q-Networks (DDQN). The agent operates in a continuous degradation state space without a predefined preventive threshold and learns maintenance policies from interactions framed as a Markov decision process, balancing repair, replacement, and downtime costs. Key contributions include the gamma degradation model with increasingly imperfect repairs, a DDQN-based policy that accommodates continuous states and discrete actions, and case studies showing substantial long-run cost reductions compared with conventional condition-based maintenance (CBM) policies. The approach is particularly relevant for Industry 4.0 contexts, offering flexible, parameter-sensitive maintenance decisions, though it is not intended for safety-critical systems where failures have catastrophic consequences.
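As a rough illustration of what distinguishes a Double Deep Q-Network from a plain DQN, the sketch below computes the standard Double DQN bootstrap target for a single transition: the online network selects the next action and the target network evaluates it. It is not taken from the paper. The function name `double_dqn_target`, the discount value, and the use of a negative cost as the reward are all assumptions made for illustration.

```python
def double_dqn_target(reward, next_q_online, next_q_target, discount=0.99, done=False):
    """Standard Double DQN target for one transition (illustrative sketch).

    reward        -- immediate reward, e.g. the negative of a maintenance cost (assumption)
    next_q_online -- Q-values of the next state from the online network, one per action
    next_q_target -- Q-values of the next state from the target network, one per action
    """
    if done:
        # Terminal transition: nothing to bootstrap from.
        return reward
    # Action selection uses the online network...
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    # ...while the value of that action comes from the target network.
    return reward + discount * next_q_target[best_action]


# Three hypothetical discrete actions: wait, repair, replace.
target = double_dqn_target(-1.0, [0.2, 0.9, 0.1], [0.5, 0.3, 0.8], discount=0.5)
```

Separating selection from evaluation in this way is what reduces the overestimation bias of the single-network max operator.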

Abstract

Efficient maintenance has always been essential for the successful application of engineering systems. However, the challenges to be overcome in the implementation of Industry 4.0 necessitate new paradigms of maintenance optimization. Machine learning techniques are increasingly used in engineering and maintenance, with reinforcement learning being one of the most promising. In this paper, we propose a gamma degradation process together with a novel maintenance model in which repairs are increasingly imperfect, i.e., the beneficial effect of system repairs decreases as more repairs are performed, reflecting the degradation behavior of real-world systems. To generate maintenance policies for this system, we developed a reinforcement-learning-based agent using a Double Deep Q-Network architecture. This agent presents two important advantages: it works without a predefined preventive threshold, and it can operate in a continuous degradation state space. Our agent learns to behave in different scenarios, showing great flexibility. In addition, we performed an analysis of how changes in the main parameters of the environment affect the maintenance policy proposed by the agent. The proposed approach is demonstrated to be appropriate and to significantly improve long-run cost as compared with other common maintenance strategies.
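To make the idea of a gamma degradation process with increasingly imperfect repairs concrete, here is a minimal toy environment. It is a sketch under stated assumptions, not the authors' model. The class `DegradingSystem`, the geometric decay of repair effectiveness, and every parameter value (`shape`, `scale`, `failure_level`, `base_restore`, `decay`) are hypothetical choices, since the abstract does not give the functional form.

```python
import random


class DegradingSystem:
    """Toy environment: gamma-distributed wear plus repairs that lose effectiveness."""

    def __init__(self, shape=2.0, scale=0.5, failure_level=10.0,
                 base_restore=0.8, decay=0.7, seed=0):
        self.rng = random.Random(seed)
        self.shape, self.scale = shape, scale
        self.failure_level = failure_level
        self.base_restore, self.decay = base_restore, decay
        self.level, self.repairs = 0.0, 0

    def restore_fraction(self):
        # Fraction of degradation the next repair removes.
        # Assumption: it shrinks geometrically with the number of past repairs.
        return self.base_restore * self.decay ** self.repairs

    def step(self, action):
        """Apply 'wait', 'repair' or 'replace', then one wear increment.

        Returns (degradation level, failed flag).
        """
        if action == "replace":
            # Replacement renews the system and resets the repair history.
            self.level, self.repairs = 0.0, 0
        elif action == "repair":
            self.level *= 1.0 - self.restore_fraction()
            self.repairs += 1
        # Gamma increments are positive, so wear never decreases on its own.
        self.level += self.rng.gammavariate(self.shape, self.scale)
        return self.level, self.level >= self.failure_level


env = DegradingSystem(seed=1)
for action in ["wait", "wait", "repair", "wait", "repair", "replace"]:
    level, failed = env.step(action)
    print(f"{action:8s} level={level:6.3f} failed={failed}")
```

Because the degradation level is a continuous quantity and the repair count carries history, an agent observing both has the information needed to decide when a further repair is no longer worth its cost compared with a replacement.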

Paper Structure

This paper contains 14 sections, 18 equations, 9 figures, 4 tables.

Figures (9)

  • Figure 1: General RL structure. Adapted from Sutton and Barto (2018)
  • Figure 2: Double Deep Q-Network architecture
  • Figure 3: An example of the degradation process. (green circle: preventive repair; red circle: corrective replacement; orange circle: preventive replacement)
  • Figure 4: RL-based maintenance for all case studies
  • Figure 5: Amount of maintenance actions
  • ...and 4 more figures