D2RL: Deep Dense Architectures in Reinforcement Learning
Samarth Sinha, Homanga Bharadhwaj, Aravind Srinivas, Animesh Garg
TL;DR
This work identifies a core bottleneck in deep reinforcement learning (DRL): naively deepened MLPs propagate information poorly because of the data processing inequality (DPI), which hurts both sample efficiency and optimization. It proposes D2RL, a deep dense architecture for reinforcement learning that concatenates the network input to every hidden layer, so that networks can be made deeper while preserving information, for both policy and value function representations. Across a wide range of manipulation and locomotion tasks and multiple off-policy algorithms (e.g., SAC, TD3), D2RL yields significant gains in sample efficiency and asymptotic performance, and it outperforms ResNet-style alternatives. The results suggest that architectural inductive biases, specifically dense connectivity, are a practical lever for improving DRL in robotics, with broad applicability and publicly available baseline code.
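The dense connectivity described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `init_dense_mlp` and `dense_mlp_forward`, the ReLU activation, the He-style initialization, the layer sizes, and the choice to feed the output layer from the last hidden activation alone are all assumptions made for illustration.

```python
import numpy as np

def init_dense_mlp(rng, in_dim, hidden_dim, out_dim, num_hidden):
    """Create weights for an MLP whose hidden layers also receive the raw input.

    Hypothetical helper: names, sizes, and initialization are illustrative.
    """
    params = []
    layer_in = in_dim  # the first hidden layer sees only the input
    for _ in range(num_hidden):
        w = rng.standard_normal((layer_in, hidden_dim)) * np.sqrt(2.0 / layer_in)
        params.append((w, np.zeros(hidden_dim)))
        layer_in = hidden_dim + in_dim  # later layers see [hidden, input]
    # Assumption: the linear output layer reads the last hidden activation only.
    w_out = rng.standard_normal((hidden_dim, out_dim)) * np.sqrt(2.0 / hidden_dim)
    params.append((w_out, np.zeros(out_dim)))
    return params

def dense_mlp_forward(params, x):
    """Forward pass: re-concatenate the input x before every hidden layer after the first."""
    h = x
    for i, (w, b) in enumerate(params[:-1]):
        if i > 0:
            h = np.concatenate([h, x], axis=-1)  # dense connection to the input
        h = np.maximum(h @ w + b, 0.0)  # ReLU (assumed activation)
    w_out, b_out = params[-1]
    return h @ w_out + b_out

# Usage: a batch of 32 inputs of dimension 17, four hidden layers of width 256.
rng = np.random.default_rng(0)
params = init_dense_mlp(rng, in_dim=17, hidden_dim=256, out_dim=6, num_hidden=4)
batch = rng.standard_normal((32, 17))
outputs = dense_mlp_forward(params, batch)
```

In this sketch, every hidden layer after the first has input width `hidden_dim + in_dim`, so the raw input stays directly available at each depth instead of reaching later layers only through earlier transformations. For a value function, `x` would presumably be the concatenated state-action pair rather than the state alone.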
Abstract
While improvements in deep learning architectures have played a crucial role in advancing the state of supervised and unsupervised learning in computer vision and natural language processing, neural network architecture choices for reinforcement learning remain relatively under-explored. We take inspiration from successful architectural choices in computer vision and generative modelling, and investigate the use of deeper networks and dense connections for reinforcement learning on a variety of simulated robotic learning benchmark environments. Our findings reveal that current methods benefit significantly from dense connections and deeper networks, across a suite of manipulation and locomotion tasks, for both proprioceptive and image-based observations. We hope that our results can serve as a strong baseline and further motivate future research into neural network architectures for reinforcement learning. The project website with code is available at https://sites.google.com/view/d2rl/home.
