Digital Twin-enabled Multi-generation Control Co-Design with Deep Reinforcement Learning
Ying-Kuan Tsai, Vispi Karkaria, Yi-Ping Chen, Wei Chen
TL;DR
This paper tackles robust design under uncertainty by integrating Digital Twins (DTs) with Deep Reinforcement Learning (DRL) in a multi-generation Control Co-Design (CCD) framework. It develops a simultaneous CCD formulation that jointly optimizes the physical design $p$ and the policy parameters $\boldsymbol{\theta}$ within a DT-enabled loop, using DRL (Proximal Policy Optimization, PPO) and automatic differentiation to enable end-to-end learning across generations. Key innovations include embedding the physical design in the learning inputs, initializing with invariant-set-guided samples, and employing quantile regression to model and penalize environmental uncertainty, so that designs and controllers improve progressively as more real-world data accrue. In the active suspension case study, the second-generation (Gen-2) designs achieve smoother trajectories, lower control effort, and reduced variability, illustrating the practical value for safety-critical systems in uncertain environments.
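As a rough illustration of what "simultaneous" means here, a generic CCD problem can be written as below. This is a textbook-style sketch, not necessarily the paper's exact formulation: only $p$ and $\boldsymbol{\theta}$ come from the summary above, while the state $x_t$, control $u_t$, disturbance $w_t$, dynamics $f$, stage cost $\ell$, policy $\pi_{\boldsymbol{\theta}}$, design bounds $p_{\min}, p_{\max}$, and horizon $T$ are assumed notation.

$$
\min_{p,\,\boldsymbol{\theta}} \; \mathbb{E}_{w}\left[\sum_{t=0}^{T-1} \ell(x_t, u_t, p)\right]
\quad \text{s.t.} \quad
x_{t+1} = f(x_t, u_t, w_t; p), \quad
u_t = \pi_{\boldsymbol{\theta}}(x_t, p), \quad
p_{\min} \le p \le p_{\max}.
$$

Feeding $p$ to the policy alongside the state corresponds to embedding the physical design in the learning inputs, so a single policy can be evaluated across candidate designs while both sets of variables are updated together.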
Abstract
Control Co-Design (CCD) integrates physical and control system design to improve the performance of dynamic and autonomous systems. Despite advances in uncertainty-aware CCD methods, real-world uncertainties remain highly unpredictable. Multi-generation design addresses this challenge by considering the full lifecycle of a product: data collected from each generation inform the design of subsequent generations, enabling progressive improvements in robustness and efficiency. Digital Twin (DT) technology further strengthens this paradigm by creating virtual representations that evolve over the lifecycle through real-time sensing, model updating, and adaptive re-optimization. This paper presents a DT-enabled CCD framework that integrates Deep Reinforcement Learning (DRL) to jointly optimize the physical design and the controller. DRL accelerates real-time decision-making by allowing controllers to learn continuously from data and adapt to uncertain environments. The framework extends this approach with a multi-generation paradigm, in which each cycle of deployment, operation, and redesign uses the collected data to refine the DT models, improve uncertainty quantification through quantile regression, and inform the next-generation design of both physical components and controllers. The framework is demonstrated on an active suspension system, where DT-enabled learning from road conditions and driving behaviors yields smoother and more stable control trajectories. Results show that the method enhances dynamic performance, robustness, and efficiency. The contributions of this work are: (1) extending CCD into a lifecycle-oriented multi-generation framework, (2) leveraging DTs for continuous model updating and informed design, and (3) employing DRL to accelerate adaptive real-time decision-making.
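To make the quantile-regression ingredient concrete, the sketch below implements the standard pinball (quantile) loss in plain Python and uses it to pick an empirical 90% quantile. It is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the constant predictor, and the disturbance values are all hypothetical, and how the paper turns the estimated quantiles into a penalty is not described in this abstract.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average pinball (quantile) loss for a quantile level tau in (0, 1)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        err = y - q
        # Under-prediction (err >= 0) is weighted by tau,
        # over-prediction (err < 0) by (1 - tau).
        total += tau * err if err >= 0 else (tau - 1.0) * err
    return total / len(y_true)


def fit_constant_quantile(samples, tau):
    """Return the sample that minimizes pinball loss as a constant predictor."""
    n = len(samples)
    return min(samples, key=lambda c: pinball_loss(samples, [c] * n, tau))


# Hypothetical disturbance magnitudes (illustrative numbers only).
disturbances = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
q90 = fit_constant_quantile(disturbances, 0.9)
print("estimated 0.9-quantile:", q90)
```

The asymmetric weighting is what makes the minimizer a quantile rather than a mean: with `tau = 0.9`, under-predicting a large disturbance costs nine times as much as over-predicting by the same amount. In a learned model the constant predictor would be replaced by a function of the operating conditions, trained with the same loss.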
