CDRL: A Reinforcement Learning Framework Inspired by Cerebellar Circuits and Dendritic Computational Strategies
Sibo Zhang, Rui Jing, Liangfu Lv, Jian Zhang, Yunliang Zang
TL;DR
CDRL introduces a cerebellum-inspired reinforcement learning architecture that integrates large expansion, sparse connectivity, sparse activation, and dendritic modulation into a double deep Q-network (DDQN) framework. The input state is mapped onto a large population of granule cells (GrCs) through sparse mossy fiber (MF) connections, and the resulting activity is gated by dendritic modulation. In a high-dimensional, noisy RL setting, demonstrated on a custom Pong-like task, this design improves sample efficiency, noise robustness, and generalization. A systematic sensitivity analysis shows that the architecture can be tuned for performance under parameter constraints, underscoring cerebellar structural priors as valuable inductive biases for RL. The work suggests a path toward more robust, data-efficient RL that draws on biologically grounded architectural priors rather than on optimization strategies alone.
Abstract
Reinforcement learning (RL) has achieved notable performance in high-dimensional sequential decision-making tasks, yet remains limited by low sample efficiency, sensitivity to noise, and weak generalization under partial observability. Most existing approaches address these issues primarily through optimization strategies, while the role of architectural priors in shaping representation learning and decision dynamics is less explored. Inspired by structural principles of the cerebellum, we propose a biologically grounded RL architecture that incorporates large expansion, sparse connectivity, sparse activation, and dendritic-level modulation. Experiments on noisy, high-dimensional RL benchmarks show that both the cerebellar architecture and dendritic modulation consistently improve sample efficiency, robustness, and generalization compared to conventional designs. Sensitivity analysis of architectural parameters suggests that cerebellum-inspired structures can offer optimized performance for RL with constrained model parameters. Overall, our work underscores the value of cerebellar structural priors as effective inductive biases for RL.
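To make the four structural ingredients concrete, the following is a minimal NumPy sketch of one possible feature layer, not the implementation used in this work. It assumes a fixed random fan-in per granule cell for sparse connectivity, a top-k rule for sparse activation, and a multiplicative sigmoid gate as a stand-in for dendritic modulation. All function names, layer sizes, and parameter values are hypothetical and chosen only for illustration; the downstream Q-network is omitted.

```python
import numpy as np

def make_sparse_mask(n_in, n_grc, fan_in, rng):
    """Binary mask: each granule cell (row) samples `fan_in` distinct inputs."""
    mask = np.zeros((n_grc, n_in))
    for j in range(n_grc):
        idx = rng.choice(n_in, size=fan_in, replace=False)
        mask[j, idx] = 1.0
    return mask

def k_winners(x, k):
    """Keep the k largest entries of x and zero the rest (assumed sparsity rule)."""
    out = np.zeros_like(x)
    top = np.argpartition(x, -k)[-k:]
    out[top] = x[top]
    return out

def cerebellar_features(state, W, mask, G, k):
    """Expansion -> sparse connectivity -> gating -> sparse activation."""
    # Sparse input-to-granule projection followed by a ReLU.
    drive = np.maximum((W * mask) @ state, 0.0)
    # Assumed form of dendritic modulation: a sigmoid gate in (0, 1).
    gate = 1.0 / (1.0 + np.exp(-(G @ state)))
    return k_winners(drive * gate, k)

# Hypothetical sizes: 16 inputs expanded to 256 granule cells,
# 4 inputs per cell, roughly 10% of cells active.
rng = np.random.default_rng(0)
n_in, n_grc, fan_in, k = 16, 256, 4, 26
mask = make_sparse_mask(n_in, n_grc, fan_in, rng)
W = rng.normal(size=(n_grc, n_in))   # projection weights (masked when used)
G = rng.normal(size=(n_grc, n_in))   # gating weights
state = rng.normal(size=n_in)
features = cerebellar_features(state, W, mask, G, k)
```

In a sketch like this, `features` would replace the raw state as input to the Q-network, and the expansion ratio, `fan_in`, and `k` are the kind of architectural parameters a sensitivity analysis could vary.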
