Digital Twin-enabled Multi-generation Control Co-Design with Deep Reinforcement Learning

Ying-Kuan Tsai, Vispi Karkaria, Yi-Ping Chen, Wei Chen

TL;DR

This paper tackles robust design under uncertainty by integrating Digital Twins with Deep Reinforcement Learning into a multi-generation Control Co-Design framework. It develops a simultaneous CCD formulation that jointly optimizes the physical design $\mathbf{p}$ and the policy parameters $\boldsymbol{\theta}$ within a DT-enabled loop, using DRL (PPO) and automatic differentiation to enable end-to-end learning across generations. Key innovations include embedding the physical design into the learning inputs, initializing with invariant-set-guided samples, and employing quantile regression to model and penalize environmental uncertainty, so that designs and controllers improve progressively as more real-world data accrue. The active suspension case shows that Gen-2 designs achieve smoother trajectories, lower control effort, and reduced variability, illustrating the practical impact for safety-critical, uncertain environments.

Abstract

Control Co-Design (CCD) integrates physical and control system design to improve the performance of dynamic and autonomous systems. Despite advances in uncertainty-aware CCD methods, real-world uncertainties remain highly unpredictable. Multi-generation design addresses this challenge by considering the full lifecycle of a product: data collected from each generation informs the design of subsequent generations, enabling progressive improvements in robustness and efficiency. Digital Twin (DT) technology further strengthens this paradigm by creating virtual representations that evolve over the lifecycle through real-time sensing, model updating, and adaptive re-optimization. This paper presents a DT-enabled CCD framework that integrates Deep Reinforcement Learning (DRL) to jointly optimize the physical design and the controller. DRL accelerates real-time decision-making by allowing controllers to continuously learn from data and adapt to uncertain environments. Extending this approach, the framework employs a multi-generation paradigm in which each cycle of deployment, operation, and redesign uses collected data to refine DT models, improve uncertainty quantification through quantile regression, and inform next-generation designs of both physical components and controllers. The framework is demonstrated on an active suspension system, where DT-enabled learning from road conditions and driving behaviors yields smoother and more stable control trajectories. Results show that the method significantly enhances dynamic performance, robustness, and efficiency. Contributions of this work include: (1) extending CCD into a lifecycle-oriented multi-generation framework, (2) leveraging DTs for continuous model updating and informed design, and (3) employing DRL to accelerate adaptive real-time decision-making.
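The abstract names quantile regression as the tool for uncertainty quantification but does not spell out the loss. As a minimal sketch, assuming the standard pinball (quantile) loss and not the authors' actual model, the following shows how a quantile level `tau` turns into an asymmetric penalty. The function names and the sample data are hypothetical and chosen only for illustration.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average pinball (quantile) loss for a quantile level tau in (0, 1)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


def empirical_quantile(samples, tau):
    """Constant predictor that minimizes the pinball loss over the samples.

    The loss is piecewise linear in the prediction, so a minimizer can
    always be found among the observed sample values.
    """
    return min(samples, key=lambda q: pinball_loss(samples, [q] * len(samples), tau))


# Hypothetical disturbance magnitudes collected during one generation.
disturbances = [float(i) for i in range(1, 10)]
median_estimate = empirical_quantile(disturbances, 0.5)
upper_estimate = empirical_quantile(disturbances, 0.85)
```

A high quantile level such as 0.85 pushes the estimate toward the upper tail of the observed disturbances, which is the kind of conservative bound a robust design would penalize against.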

Paper Structure

This paper contains 19 sections, 19 equations, 13 figures, 2 tables.

Figures (13)

  • Figure 1: CCD formulations for feedback policy-based systems: (a) simultaneous and (b) nested (bi-level), where $\boldsymbol{\pi}_{\boldsymbol{\theta}}$ represents a feedback policy which is a function of states $\mathbf{x}$ and system parameters $\mathbf{p}\in\mathcal{P}$ with policy parameters $\boldsymbol{\theta}\in\mathbf{\Theta}$.
  • Figure 2: Diagram of reinforcement learning (RL), modified from Sutton and Barto (1998).
  • Figure 3: Proposed multi-generation Control Co-Design (CCD) framework for Digital Twin (DT) systems. (a) Overview of the CCD process across generations. (b) Offline initialization using Latin Hypercube Sampling (LHS) and optimal control theory to train initial policy $\boldsymbol{\pi}_0$ and value function $V_0$. (c) First CCD optimization using gradient-based DRL with auto-differentiation. (d) Online deployment and data collection in Generation-1, where real-time feedback is used to update the digital model and policy. (e) Second CCD optimization using updated models to improve performance in subsequent generations.
  • Figure 4: Visualization of $\mathcal{X}_{fea}$ for the illustrative example at different values of $p$. The pink area in each plot represents the feasible state region $\mathcal{X}_{fea}$ for the specified value of $p$, which expands as $p$ increases. This trend reflects improved controllability of the system due to the larger entries in matrix $\mathbf{B}$, which allow a broader set of states to be driven to the origin under the state and control input constraints.
  • Figure 5: Flow chart of the proposed DRL-based CCD optimization using the PPO algorithm. Unlike conventional PPO, this framework integrates the physical system parameter $\mathbf{p}$ as part of the input space and enables gradient-based updates of both the control policy and the physical design. This is achieved by treating the environment dynamics as differentiable and enabling backpropagation through the environment to the loss function.
  • ...and 8 more figures
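
The caption of Figure 5 describes backpropagating through differentiable environment dynamics so that the physical design and the policy are updated together. The sketch below illustrates that idea on a toy problem and is not the paper's PPO implementation: it assumes a hypothetical scalar system $x_{t+1} = a x_t + p u_t$ with a linear feedback gain `theta`, an assumed quadratic cost, and a small forward-mode automatic differentiation class (`Dual`) standing in for a deep learning framework's reverse-mode backpropagation. All names and constants are illustrative assumptions.

```python
class Dual:
    """Forward-mode AD number: a value plus its derivative w.r.t. one input."""

    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __neg__(self):
        return Dual(-self.val, -self.dot)

    def __sub__(self, other):
        return self + (-Dual.lift(other))

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule.
        return Dual(self.val * other.val, self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def rollout_cost(p, theta, x0=1.0, a=0.9, steps=20, r=0.1, c=0.05):
    """Closed-loop cost of x_{t+1} = a*x_t + p*u_t under u_t = -theta*x_t."""
    x = Dual.lift(x0)
    cost = c * p * p                 # assumed penalty on the physical design
    for _ in range(steps):
        u = -(theta * x)             # linear feedback policy
        cost = cost + x * x + r * u * u
        x = a * x + p * u            # differentiable "environment" step
    return cost


def grad(p, theta):
    """Return (cost, dJ/dp, dJ/dtheta) using one forward pass per input."""
    wrt_p = rollout_cost(Dual(p, 1.0), Dual(theta, 0.0))
    wrt_theta = rollout_cost(Dual(p, 0.0), Dual(theta, 1.0))
    return wrt_p.val, wrt_p.dot, wrt_theta.dot


def co_design(p=0.5, theta=0.5, lr=0.01, iters=200):
    """Simultaneous gradient descent on the design p and the policy gain theta."""
    for _ in range(iters):
        _, g_p, g_theta = grad(p, theta)
        p, theta = p - lr * g_p, theta - lr * g_theta
    return p, theta


initial_cost = rollout_cost(0.5, 0.5).val
p_opt, theta_opt = co_design()
final_cost = rollout_cost(p_opt, theta_opt).val
```

Because the rollout is built only from differentiable operations, the same cost supplies gradients for both `p` and `theta`, which is what makes a simultaneous update possible. A practical implementation would typically use reverse-mode differentiation, which scales better when the policy has many parameters.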