Residual Control for Fast Recovery from Dynamics Shifts

Nethmi Jayasinghe; Diana Gontero; Francesco Migliarba; Spencer T. Brown; Vinod K. Sangwan; Mark C. Hersam; Amit Ranjan Trivedi

TL;DR

Across mid-episode perturbations including actuator degradation, mass variation, and contact changes, the proposed method consistently reduces recovery time relative to frozen and online-adaptation baselines while maintaining near-nominal steady-state performance.

Abstract

Robotic systems operating in real-world environments inevitably encounter unobserved dynamics shifts during continuous execution, including changes in actuation, mass distribution, or contact conditions. When such shifts occur mid-episode, even locally stabilizing learned policies can experience substantial transient performance degradation. While input-to-state stability guarantees bounded state deviation, it does not ensure rapid restoration of task-level performance. We address inference-time recovery under frozen policy parameters by casting adaptation as constrained disturbance shaping around a nominal stabilizing controller. We propose a stability-aligned residual control architecture in which a reinforcement learning policy trained under nominal dynamics remains fixed at deployment, and adaptation occurs exclusively through a bounded additive residual channel. A Stability Alignment Gate (SAG) regulates corrective authority through magnitude constraints, directional coherence with the nominal action, performance-conditioned activation, and adaptive gain modulation. These mechanisms preserve the nominal closed-loop structure while enabling rapid compensation for unobserved dynamics shifts without retraining or privileged disturbance information. Across mid-episode perturbations including actuator degradation, mass variation, and contact changes, the proposed method consistently reduces recovery time relative to frozen and online-adaptation baselines while maintaining near-nominal steady-state performance. Recovery time is reduced by 87% on the Go1 quadruped, 48% on the Cassie biped, 30% on the H1 humanoid, and 20% on the Scout wheeled platform, on average across evaluated conditions, relative to a frozen SAC policy.
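To make the four gating mechanisms named in the abstract concrete, the following is a minimal Python sketch of how a gate of this kind could be composed. It is not the authors' implementation: the function name, the parameters (max_norm, perf_threshold, gain_slope), their default values, the L2-ball clipping, the projection-based coherence rule, and the linear gain schedule are all assumptions chosen for illustration.

```python
import math

def stability_alignment_gate(nominal, residual, perf_ratio,
                             max_norm=0.5, perf_threshold=0.9, gain_slope=5.0):
    """Combine a frozen policy's action with a gated residual (illustrative only).

    All parameter names and default values are hypothetical.
    perf_ratio is current performance divided by nominal performance.
    """
    # Performance-conditioned activation: no correction while near nominal.
    deficit = perf_threshold - perf_ratio
    if deficit <= 0.0:
        return list(nominal)

    # Directional coherence (assumed rule): drop the residual component
    # that points against the nominal action.
    dot = sum(n * r for n, r in zip(nominal, residual))
    nominal_sq = sum(n * n for n in nominal)
    if dot < 0.0 and nominal_sq > 0.0:
        scale = dot / nominal_sq
        residual = [r - scale * n for r, n in zip(residual, nominal)]

    # Magnitude constraint: clip the residual to an L2 ball of radius max_norm.
    norm = math.sqrt(sum(r * r for r in residual))
    if norm > max_norm:
        residual = [r * max_norm / norm for r in residual]

    # Adaptive gain modulation: authority grows with the deficit, capped at 1.
    gain = min(1.0, gain_slope * deficit)
    return [n + gain * r for n, r in zip(nominal, residual)]
```

Under these assumptions, calling the function with nominal [1.0, 0.0], residual [-1.0, 1.0], and perf_ratio 0.5 removes the opposing component, clips the remainder to norm 0.5, and returns [1.0, 0.5]. With perf_ratio 0.95 the gate stays closed and the nominal action passes through unchanged. In every case the final action differs from the nominal one by at most max_norm, which is the sense in which the residual channel is bounded in this sketch.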

Paper Structure (13 sections, 21 equations, 5 figures, 3 tables)

Introduction
Relation to Prior Work
Methodology
Problem Formulation and Control Objective
Cerebellar-Inspired Principles for Residual Control
Stability Alignment and Authority Regulation
Experimental Setup
Results
Quantitative Comparison on Unitree Go1
Scaling Across Fault Severities
Cross-Platform Evaluation
Ablation Study
Conclusion

Figures (5)

Figure 1: Overview of the proposed cerebellar-inspired residual control architecture. A frozen RL policy provides nominal control, while a parallel cerebellar residual controller generates per-joint residual actions via microzone-based pathways. Tracking-error-driven learning updates dual-timescale residual heads online for rapid adaptation. A Stability Alignment Gate (SAG) constrains residual magnitude and directional alignment before combining the residual with the nominal policy to produce the final action $a_t$.
Figure 2: Performance under mid-episode perturbations on the Go1 quadruped across increasing fault severities. (a) Friction increase: Recovery AUC (↑). (b) Actuator degradation: Steady-State Ratio (↑). (c) Mass increase: Time-to-Recovery (TTR-50, ↓). Averaged over 30 trials per condition. The proposed method achieves faster recovery while preserving competitive steady-state performance across perturbation types.
Figure 3: Normalized reward traces following mid-episode mild actuator degradation (scaling factor 0.80) on the Go1 quadruped. The fault is injected at timestep 500.
Figure 4: Cassie and H1 evaluation under mid-episode perturbations. (a,b) Cassie under friction and mass increase. (c,d) H1 under friction and mass increase. (e-h) Steady-state stability ratio (SSR; higher is better) versus fault severity. (i-l) Recovery time (TTR-50; lower is better) for the same conditions. The proposed stability-aligned residual controller consistently reduces recovery time while maintaining near-nominal steady-state performance across bipedal and humanoid platforms.
Figure 5: Scout platform evaluation under mid-episode perturbations. (a,b) Friction decrease and mass increase scenarios. (c,d) Steady-state stability ratio (SSR; higher is better) versus fault severity. (e,f) Recovery time (TTR-50; lower is better) for the same conditions. The proposed stability-aligned residual controller improves recovery speed while preserving steady-state stability across perturbations.
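
The captions report TTR-50 and SSR without defining them. The Python sketch below shows one plausible way such metrics could be computed from a per-step reward trace. The definitions are assumptions, not the paper's: TTR-50 is read here as the number of steps after the fault until the reward stays at or above 50% of its pre-fault mean, and SSR as the mean reward at the end of the episode divided by the pre-fault mean. The function names, window sizes, and the requirement of a positive pre-fault mean are hypothetical.

```python
def pre_fault_mean(rewards, fault_step, baseline_window=100):
    """Mean reward over the window just before the fault (assumed baseline)."""
    pre = rewards[max(0, fault_step - baseline_window):fault_step]
    return sum(pre) / len(pre)

def time_to_recovery(rewards, fault_step, fraction=0.5, baseline_window=100):
    """Steps after the fault until reward stays >= fraction * baseline.

    Returns 0 if the reward never drops below the target, and None if it
    is still below the target at the final step (no recovery observed).
    """
    target = fraction * pre_fault_mean(rewards, fault_step, baseline_window)
    below = [t for t in range(fault_step, len(rewards)) if rewards[t] < target]
    if not below:
        return 0
    last_below = below[-1]
    if last_below == len(rewards) - 1:
        return None
    return last_below + 1 - fault_step

def steady_state_ratio(rewards, fault_step, steady_window=100, baseline_window=100):
    """Mean of the final steady_window post-fault rewards over the pre-fault mean."""
    post = rewards[fault_step:]
    tail = post[-steady_window:]
    return (sum(tail) / len(tail)) / pre_fault_mean(rewards, fault_step, baseline_window)
```

For a trace that holds at 1.0 for ten steps, drops to 0.2 at the fault, and climbs through 0.3, 0.4, 0.6 to 0.9, this reading gives a recovery time of 3 steps and, with a two-step steady-state window, a ratio of about 0.9. Under these assumed definitions a lower recovery time and a ratio closer to 1 correspond to the "lower is better" and "higher is better" directions marked in the captions.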
