ADARL: Adaptive Low-Rank Structures for Robust Policy Learning under Uncertainty

Chenliang Li; Junyu Leng; Jiaxiang Li; Youbang Sun; Shixiang Chen; Shahin Shahrampour; Alfredo Garcia

TL;DR

Empirical results demonstrate that adaptive low-rank policy representations provide an efficient and principled alternative for robust RL under model uncertainty.

Abstract

Robust reinforcement learning (Robust RL) seeks to handle epistemic uncertainty in environment dynamics, but existing approaches often rely on nested min-max optimization, which is computationally expensive and yields overly conservative policies. We propose Adaptive Rank Representation (AdaRL), a bi-level optimization framework that improves robustness by aligning policy complexity with the intrinsic dimension of the task. At the lower level, AdaRL performs policy optimization under fixed-rank constraints with dynamics sampled from a Wasserstein ball around a centroid model. At the upper level, it adaptively adjusts the rank to balance the bias-variance trade-off, projecting policy parameters onto a low-rank manifold. This design avoids solving adversarial worst-case dynamics while ensuring robustness without over-parameterization. Empirical results on MuJoCo continuous control benchmarks demonstrate that AdaRL not only consistently outperforms fixed-rank baselines (e.g., SAC) and state-of-the-art robust RL methods (e.g., RNAC, Parseval), but also converges toward the intrinsic rank of the underlying tasks. These results highlight that adaptive low-rank policy representations provide an efficient and principled alternative for robust RL under model uncertainty.
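
The abstract does not say how policy parameters are projected onto a low-rank manifold. As a minimal sketch, assuming the projection is the standard truncated SVD (the closest rank-r matrix in Frobenius norm), one layer's weight matrix could be projected as follows. The function name `project_to_rank`, the matrix shape, and the rank value are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import numpy as np


def project_to_rank(weight: np.ndarray, rank: int) -> np.ndarray:
    """Return the closest rank-`rank` matrix to `weight` in Frobenius norm.

    Assumption: truncated SVD stands in for the paper's projection step,
    which the abstract does not specify.
    """
    if not 1 <= rank <= min(weight.shape):
        raise ValueError("rank must be between 1 and min(weight.shape)")
    # Thin SVD: weight = u @ diag(s) @ vt, singular values sorted descending.
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    # Keep only the top-`rank` singular triplets.
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


# Hypothetical policy-layer weights, for illustration only.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))
W_r = project_to_rank(W, rank=4)
```

Under this assumption, the upper level of the bi-level scheme would vary the `rank` argument, while the lower level would optimize the policy with the rank held fixed.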
