Efficient Availability Attacks against Supervised and Contrastive Learning Simultaneously
Yihan Wang, Yifan Zhu, Xiao-Shan Gao
TL;DR
This work tackles data protection by designing availability attacks that degrade model performance under both supervised learning (SL) and contrastive learning (CL). It shows that CL-focused poisoning is not sufficient for SL protection, and proposes two attacks, AUE and AAP, which embed contrastive-like augmentations into supervised poisoning so that the protected data is unlearnable under both paradigms. The methods achieve state-of-the-art worst-case unlearnability across diverse datasets (including high-resolution ones), are significantly more efficient than CL-based attacks, and generalize across architectures and CL variants. The results suggest the approach is practical for real-world data protection, offering a scalable defense against data abusers who exploit both supervised and self-supervised learning.
Abstract
Availability attacks can prevent the unauthorized use of private data and commercial datasets by adding imperceptible noise that turns the data into unlearnable examples before release. Ideally, the resulting unlearnability prevents algorithms from training usable models. When supervised learning (SL) algorithms fail, a malicious data collector may resort to contrastive learning (CL) algorithms to bypass the protection. Through evaluation, we find that most existing methods are unable to achieve both supervised and contrastive unlearnability, which poses risks to data protection. Unlike recent methods based on contrastive error minimization, we employ contrastive-like data augmentations in supervised error-minimization or error-maximization frameworks to obtain attacks effective for both SL and CL. Our proposed AUE and AAP attacks achieve state-of-the-art worst-case unlearnability across SL and CL algorithms at lower computational cost, showing promise for real-world applications.
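The core idea in the abstract, supervised error minimization computed on strongly augmented inputs, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a frozen, hand-written linear classifier (W), a 4-value "image", a random flip plus intensity scaling as a stand-in for contrastive-style augmentations, and illustrative values for the noise budget EPS and step size ALPHA. All names are hypothetical.

```python
import math
import random

EPS = 8 / 255      # L-infinity noise budget (assumed value)
ALPHA = EPS / 4    # step size of each signed update (assumed value)

# Hypothetical frozen linear classifier: 2 classes, 4 "pixels".
W = [[0.5, -0.2, 0.1, 0.3],
     [-0.3, 0.4, 0.2, -0.1]]

def sample_aug(rng):
    """Stand-in for strong augmentation: (flip?, intensity scale)."""
    return rng.random() < 0.5, rng.uniform(0.6, 1.4)

def apply_aug(z, aug):
    flip, scale = aug
    z = z[::-1] if flip else z
    return [scale * v for v in z]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def loss_and_grad(x, delta, label, aug):
    """Cross-entropy on the augmented, perturbed input and its gradient w.r.t. delta."""
    flip, scale = aug
    a = apply_aug([xi + di for xi, di in zip(x, delta)], aug)
    probs = softmax([sum(w * v for w, v in zip(row, a)) for row in W])
    loss = -math.log(probs[label])
    # Gradient w.r.t. the augmented input: sum_c (p_c - y_c) * W[c][j].
    grad_a = [sum((probs[c] - (c == label)) * W[c][j] for c in range(len(W)))
              for j in range(len(a))]
    # Back through the augmentation: the flip is its own inverse, scaling multiplies.
    grad = [scale * g for g in (grad_a[::-1] if flip else grad_a)]
    return loss, grad

def error_minimizing_noise(x, label, steps=10, n_aug=4, seed=0):
    rng = random.Random(seed)
    delta = [0.0] * len(x)
    for _ in range(steps):
        # Average the gradient over several random augmentations.
        grads = [loss_and_grad(x, delta, label, sample_aug(rng))[1]
                 for _ in range(n_aug)]
        mean_grad = [sum(g) / n_aug for g in zip(*grads)]
        # Signed descent step (minimize the loss), then clip to the EPS budget.
        delta = [max(-EPS, min(EPS, d - ALPHA * ((g > 0) - (g < 0))))
                 for d, g in zip(delta, mean_grad)]
    return delta

def mean_loss(x, delta, label, n=200, seed=1):
    """Average loss over a fixed set of sampled augmentations."""
    rng = random.Random(seed)
    return sum(loss_and_grad(x, delta, label, sample_aug(rng))[0]
               for _ in range(n)) / n

x, label = [0.2, 0.7, 0.5, 0.1], 0
delta = error_minimizing_noise(x, label)
print("mean loss, clean:    ", mean_loss(x, [0.0] * len(x), label))
print("mean loss, perturbed:", mean_loss(x, delta, label))
```

In this toy setting the perturbed input has a lower augmented loss than the clean one while the noise stays within EPS. Replacing the descent step with an ascent step (adding ALPHA times the gradient sign instead of subtracting it) would give the corresponding error-maximizing variant of the same sketch.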
