Stable Deep Reinforcement Learning via Isotropic Gaussian Representations
Ali Saheb, Johan Obando-Ceron, Aaron Courville, Pouya Bashivan, Pablo Samuel Castro
TL;DR
This work analyzes how non-stationarity in deep reinforcement learning destabilizes training and degrades representations. It advocates isotropic Gaussian representations, enforced via Sketched Isotropic Gaussian Regularization (SIGReg), as a principled prior that yields stable tracking of drifting targets and maximizes entropy under a fixed variance budget. Theoretical analysis shows that isotropy provides uniform contraction across directions, while Gaussian tails minimize drift variance. Empirical results on CIFAR-10 shifts, Atari (PQN and PPO), and Isaac Gym show improved stability, reduced neuron dormancy, and higher performance. Overall, shaping representation geometry is presented as a robust, lightweight way to stabilize learning in non-stationary, online RL settings, applicable across algorithms and domains.
Abstract
Deep reinforcement learning systems often suffer from unstable training dynamics due to non-stationarity, where learning objectives and data distributions evolve over time. We show that under non-stationary targets, isotropic Gaussian embeddings are provably advantageous. In particular, they induce stable tracking of time-varying targets for linear readouts, achieve maximal entropy under a fixed variance budget, and encourage balanced use of all representational dimensions. Together, these properties make agents more adaptive and stable. Building on this insight, we propose using Sketched Isotropic Gaussian Regularization (SIGReg) to shape representations toward an isotropic Gaussian distribution during training. We demonstrate empirically, across a variety of domains, that this simple and computationally inexpensive method improves performance under non-stationarity while reducing representation collapse, neuron dormancy, and training instability.
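Two of the claims above can be illustrated numerically. The sketch below is not the authors' implementation. It first checks the maximum-entropy claim for Gaussians with the same total variance (trace of the covariance), using the standard formula for Gaussian differential entropy. It then defines `sliced_gaussianity_penalty`, a hypothetical and simplified stand-in for a sketched regularizer: embeddings are projected onto random unit directions and the empirical characteristic function of each 1-D projection is compared with that of a standard normal. The function names, the number of directions, and the frequency grid are all assumptions made for illustration.

```python
import numpy as np


def gaussian_entropy(cov):
    """Differential entropy (in nats) of N(0, cov): 0.5 * log det(2*pi*e*cov)."""
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance must be positive definite")
    d = cov.shape[0]
    return 0.5 * (d * np.log(2.0 * np.pi * np.e) + logdet)


def sliced_gaussianity_penalty(z, num_directions=32, seed=0):
    """Hypothetical simplified penalty (an assumption, not the paper's SIGReg).

    Projects embeddings z of shape (n, d) onto random unit directions and
    returns the mean squared gap between the empirical characteristic
    function of each projection and exp(-t^2 / 2), the characteristic
    function of a standard normal.
    """
    rng = np.random.default_rng(seed)
    dirs = rng.standard_normal((z.shape[1], num_directions))
    dirs /= np.linalg.norm(dirs, axis=0, keepdims=True)  # unit-norm columns
    proj = z @ dirs                                      # shape (n, k)
    t = np.linspace(-3.0, 3.0, 25)                       # frequency grid (assumed)
    # Empirical characteristic function, averaged over samples: shape (k, T).
    ecf = np.exp(1j * proj[:, :, None] * t).mean(axis=0)
    target = np.exp(-0.5 * t**2)
    return float(np.mean(np.abs(ecf - target) ** 2))


if __name__ == "__main__":
    # Same variance budget (trace = 4), different geometry.
    cov_iso = np.eye(4)
    cov_aniso = np.diag([3.0, 0.5, 0.3, 0.2])
    print("entropy, isotropic  :", gaussian_entropy(cov_iso))
    print("entropy, anisotropic:", gaussian_entropy(cov_aniso))

    rng = np.random.default_rng(1)
    z_iso = rng.standard_normal((2000, 8))
    # Collapsed embeddings: all variance concentrated in one dimension.
    z_collapsed = np.zeros((2000, 8))
    z_collapsed[:, 0] = np.sqrt(8.0) * rng.standard_normal(2000)
    print("penalty, isotropic:", sliced_gaussianity_penalty(z_iso))
    print("penalty, collapsed:", sliced_gaussianity_penalty(z_collapsed))
```

Under these assumptions, the isotropic covariance has the higher entropy, and the penalty is smaller for isotropic Gaussian embeddings than for collapsed ones with the same total variance. In a training loop, a differentiable version of such a penalty would be added to the RL loss with a small weight; that weight and the exact test statistic are left to the paper.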
