Machine Unlearning: Linear Filtration for Logit-based Classifiers
Thomas Baumhauer, Pascal Schöttle, Matthias Zeppelzauer
TL;DR
This work addresses the data-deletion challenge that privacy regulations pose for machine learning, focusing on class-wide deletion in logit-based classifiers. It introduces linear filtration, a fast method for weak unlearning that can be absorbed into the final layer: a filtration linearly transforms the classifier’s weight matrix to erase the influence of a deleted class while preserving performance on the remaining classes. The authors formalize weak unlearning and evaluate it adversarially with a binary attacker that operates on pre-softmax outputs. They report that normalization filtration significantly reduces leakage, as shown by improved indistinguishability of the seen and not-seen distributions, and that it mitigates model-inversion reconstructions of the deleted class. The method remains a shallow, black-box technique, leaving room for deeper integration and stronger guarantees, but it offers a practical complement to existing unlearning frameworks in scenarios where class ownership and post-hoc sanitization are required.
Abstract
Recently enacted legislation grants individuals certain rights to decide in what fashion their personal data may be used, and in particular a "right to be forgotten". This poses a challenge to machine learning: how to proceed when an individual retracts permission to use data which has been part of the training process of a model? From this question emerges the field of machine unlearning, which could be broadly described as the investigation of how to "delete training data from models". Our work complements this direction of research for the specific setting of class-wide deletion requests for classification models (e.g. deep neural networks). As a first step, we propose linear filtration as an intuitive, computationally efficient sanitization method. Our experiments demonstrate benefits in an adversarial setting over naive deletion schemes.
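
The claim that a linear filtration can be absorbed into the final layer follows from linearity: if the logits are z = Wx + b and the filtration is a matrix F, then Fz = (FW)x + Fb, so the filtered classifier is again a single linear layer. The sketch below illustrates only this absorption step. The function names, the shapes, and the random data are hypothetical, and the filter used here is a naive one that simply drops the deleted class's logit (assumed to be the kind of baseline the abstract contrasts against); it is not the authors' normalization filtration.

```python
import numpy as np

def absorb_filtration(W, b, F):
    """Fold a linear filter F into the final layer: F(Wx + b) = (FW)x + Fb."""
    return F @ W, F @ b

def naive_filtration(num_classes, deleted):
    """Hypothetical naive filter: identity with the deleted class's row removed."""
    return np.delete(np.eye(num_classes), deleted, axis=0)

# Illustrative final layer: 4 classes, 8 input features (random, not real data).
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
b = rng.normal(size=4)
x = rng.normal(size=8)

F = naive_filtration(num_classes=4, deleted=2)
W_f, b_f = absorb_filtration(W, b, F)

logits = W @ x + b          # original pre-softmax outputs
filtered = W_f @ x + b_f    # outputs of the layer with F absorbed
```

Because the filter is absorbed into the weights and bias, applying it requires no retraining and adds no cost at inference time; any other filtration matrix F of compatible shape could be substituted in the same way.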
