Fisher-Guided Selective Forgetting: Mitigating The Primacy Bias in Deep Reinforcement Learning
Massimiliano Falzari, Matthia Sabatelli
TL;DR
Primacy bias (PB) in deep reinforcement learning (DRL) causes early experiences to disproportionately shape learning, hindering later adaptation. The authors introduce Fisher-Guided Selective Forgetting (FGSF), which uses the Fisher Information Matrix (FIM) to identify PB dynamics and applies targeted, stochastic weight perturbations as a form of selective unlearning. Across DeepMind Control Suite tasks, FGSF improves final performance and stability relative to Soft Actor-Critic (SAC) and reset baselines, is robust to the replay ratio, and reveals that the critic is more PB-prone than the actor. By combining information geometry with machine unlearning, the work demonstrates a practical, geometry-aware approach to bias mitigation that may apply to DRL more broadly. FGSF incurs modest computational overhead and requires hyperparameter tuning, but it points to a promising direction for bias-aware, data-efficient reinforcement learning and for transferring information-geometric ideas to the study of learning dynamics.
Abstract
Deep Reinforcement Learning (DRL) systems tend to overfit to early experiences, a phenomenon known as the primacy bias (PB). This bias can severely hinder learning efficiency and final performance, particularly in complex environments. This paper presents a comprehensive investigation of PB through the lens of the Fisher Information Matrix (FIM). We develop a framework that characterizes PB through distinct patterns in the FIM trace, identifying critical memorization and reorganization phases during learning. Building on this understanding, we propose Fisher-Guided Selective Forgetting (FGSF), a novel method that leverages the geometric structure of the parameter space to selectively modify network weights, preventing early experiences from dominating the learning process. Empirical results across DeepMind Control Suite (DMC) environments show that FGSF consistently outperforms baselines, particularly in complex tasks. We analyze the differing impacts of PB on the actor and critic networks, the role of replay ratios in exacerbating the effect, and the effectiveness of even simple noise injection methods. Our findings provide a deeper understanding of PB and practical mitigation strategies, offering a FIM-based geometric perspective for advancing DRL.
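
To make the two ingredients named in the abstract concrete, the following is a minimal sketch of (1) tracking the FIM trace and (2) perturbing weights with Fisher-dependent noise. It is not the authors' implementation. It assumes a toy linear-Gaussian model rather than an actor or critic network, a diagonal empirical Fisher, and a perturbation rule with per-weight noise standard deviation proportional to (F_ii + eps)^(-1/4), a scaling borrowed from Fisher-based unlearning work; the abstract does not state the rule FGSF uses. All function names and the `scale` and `eps` parameters are hypothetical.

```python
import random


def per_sample_grad(w, x, y):
    """Gradient of log N(y | w.x, 1) with respect to w for one sample."""
    residual = y - sum(wi * xi for wi, xi in zip(w, x))
    return [residual * xi for xi in x]


def empirical_fisher_diag(w, batch):
    """Diagonal of the empirical Fisher: mean of squared per-sample gradients."""
    diag = [0.0] * len(w)
    for x, y in batch:
        for i, gi in enumerate(per_sample_grad(w, x, y)):
            diag[i] += gi * gi
    return [d / len(batch) for d in diag]


def fisher_trace(diag):
    """Trace of the (diagonal) Fisher estimate; the quantity to monitor over training."""
    return sum(diag)


def fisher_scaled_perturbation(w, diag, scale=0.01, eps=1e-3, rng=random):
    """Add zero-mean Gaussian noise to each weight.

    ASSUMPTION: std = scale * (F_ii + eps) ** -0.25, so weights the data
    constrains weakly (low Fisher) receive more noise. Illustrative only.
    """
    return [wi + rng.gauss(0.0, scale * (fi + eps) ** -0.25)
            for wi, fi in zip(w, diag)]


# Toy data: the third feature is always zero, so its Fisher entry is zero.
rng = random.Random(0)
true_w = [2.0, -1.0, 0.0]
batch = []
for _ in range(256):
    x = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), 0.0]
    y = sum(wi * xi for wi, xi in zip(true_w, x)) + rng.gauss(0.0, 1.0)
    batch.append((x, y))

w = [0.0, 0.0, 0.0]
diag = empirical_fisher_diag(w, batch)
trace = fisher_trace(diag)
new_w = fisher_scaled_perturbation(w, diag, rng=rng)
print("Fisher diagonal:", diag)
print("Fisher trace:", trace)
print("Perturbed weights:", new_w)
```

In a DRL setting one would compute the same quantities from per-sample gradients of the actor or critic loss on replay-buffer batches and log the trace over training. The diagonal approximation is used here only because it keeps the cost linear in the number of parameters.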
