CaMU: Disentangling Causal Effects in Deep Model Unlearning
Shaofei Shen, Chenhao Zhang, Alina Bialkowski, Weitong Chen, Miao Xu
TL;DR
This paper addresses a central challenge in deep model unlearning: because the information from forgetting data and remaining data is intertwined, removing the forgetting data can inadvertently degrade performance on the remaining data. It introduces CaMU, a causal framework that constructs data-, representation-, and output-level causal graphs to disentangle the effects of the two kinds of data. CaMU uses counterfactual data to erase the causal influence of the forgetting samples while preserving the causal effects of the remaining data, and it is optimized with KL-divergence and cross-entropy losses on a joint dataset. Experiments on MNIST and CIFAR datasets show that CaMU improves remaining-data performance, minimizes forgetting leakage, and behaves stably under relearning, with ablations supporting the contribution of the proposed causal components. Overall, CaMU is presented as a first step toward causality-informed unlearning, aimed at more reliable and privacy-preserving deep-model updates.
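To make the "KL-divergence and cross-entropy losses" in the summary above concrete, the following is a minimal NumPy sketch of one way such a combined objective could look. It is not the authors' implementation. The function names, the direction of the KL term, the `weight` parameter, and the choice to pass precomputed logits are all assumptions made for illustration.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def kl_divergence(p, q, eps=1e-12):
    """Mean KL(p || q) over the rows of two probability matrices."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=1)))


def cross_entropy(probs, labels, eps=1e-12):
    """Mean negative log-likelihood of integer class labels."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(np.clip(picked, eps, 1.0))))


def unlearning_objective(forget_logits, counterfactual_logits,
                         remain_logits, remain_labels, weight=1.0):
    """Hypothetical combined loss (an assumption, not the paper's formula).

    - KL term: pulls the model's outputs on forgetting samples toward
      its outputs on their counterfactual counterparts.
    - CE term: keeps predictions on the remaining data correct.
    """
    kl = kl_divergence(softmax(counterfactual_logits), softmax(forget_logits))
    ce = cross_entropy(softmax(remain_logits), remain_labels)
    return kl + weight * ce
```

In this sketch the objective approaches zero when the model treats forgetting samples like their counterfactuals and still classifies the remaining data correctly. In practice the logits would come from a trainable network and the loss would be minimized with a gradient-based optimizer.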
Abstract
Machine unlearning requires removing the information of forgetting data while keeping the necessary information of remaining data. Despite recent advances in this area, existing methods mainly focus on removing the forgetting data and do not consider the negative impact this can have on the information of the remaining data, resulting in significant performance degradation after data removal. Although some methods try to repair the performance on remaining data after removal, the forgotten information can return after the repair. This issue stems from the intricate intertwining of the forgetting and remaining data. Without adequately differentiating the influence of these two kinds of data on the model, existing algorithms risk either inadequate removal of the forgetting data or unnecessary loss of valuable information from the remaining data. To address this shortcoming, the present study undertakes a causal analysis of unlearning and introduces a novel framework termed Causal Machine Unlearning (CaMU). This framework intervenes on the information of the remaining data to disentangle the causal effects of the forgetting data and the remaining data. CaMU then eliminates the causal impact associated with the forgetting data while preserving the causal relevance of the remaining data. Comprehensive empirical results on various datasets and models suggest that CaMU enhances performance on the remaining data and effectively minimizes the influence of the forgetting data. Notably, this work is the first to interpret deep model unlearning tasks from the perspective of causality and to provide a solution based on causal analysis, which opens up new possibilities for future research in deep model unlearning.
