Towards Adversarially Robust Dataset Distillation by Curvature Regularization
Eric Xue, Yijiang Li, Haoyang Liu, Peiran Wang, Yifan Shen, Haohan Wang
TL;DR
The paper tackles the challenge of embedding adversarial robustness into dataset distillation (DD) without incurring the heavy cost of traditional adversarial training. It shows theoretically that the adversarial loss on distilled data is upper-bounded by a term governed by the curvature of the loss surface, and it proposes GUARD, a curvature-regularization strategy that flattens the loss landscape during distillation. GUARD is integrated into existing DD methods (e.g., SRe2L, DC) and improves robustness with minimal overhead on ImageNette, Tiny ImageNet, and ImageNet-1K, often improving clean accuracy as a byproduct. The work provides robustness guarantees relative to the real data distribution and shows that the approach transfers to multiple distillation methods, suggesting practical value for efficient, robust DD in large-scale vision tasks.
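To make the link between curvature and adversarial loss concrete, the following is a generic second-order sketch, not the paper's exact bound. The symbols are assumptions introduced here for illustration: $\ell$ is the loss as a function of the input $x$, $\delta$ is a perturbation with $\|\delta\|_2 \le \epsilon$, $H(x)$ is the Hessian of $\ell$ with respect to $x$, and $\lambda_{\max}$ is its largest eigenvalue, assumed non-negative.

```latex
% Second-order Taylor expansion of the loss around the clean input x
\ell(x + \delta) \approx \ell(x) + \nabla \ell(x)^{\top} \delta
    + \tfrac{1}{2}\, \delta^{\top} H(x)\, \delta

% Maximizing over the perturbation budget \|\delta\|_2 \le \epsilon
\max_{\|\delta\|_2 \le \epsilon} \ell(x + \delta)
    \;\lesssim\; \ell(x) + \epsilon\, \|\nabla \ell(x)\|_2
    + \tfrac{1}{2}\, \epsilon^{2}\, \lambda_{\max}\big(H(x)\big)
```

Under this approximation, reducing $\lambda_{\max}(H(x))$, i.e. flattening the loss landscape, tightens the bound on the worst-case loss, which is the intuition behind regularizing curvature.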
Abstract
Dataset distillation (DD) allows datasets to be distilled to fractions of their original size while preserving their rich distributional information, so that models trained on the distilled datasets can achieve comparable accuracy at a significantly lower computational cost. Recent research in this area has focused on improving the accuracy of models trained on distilled datasets. In this paper, we explore a new perspective on DD: we study how to embed adversarial robustness in distilled datasets, so that models trained on these datasets maintain high accuracy while acquiring better adversarial robustness. We propose a new method that achieves this goal by incorporating curvature regularization into the distillation process, with much less computational overhead than standard adversarial training. Extensive experiments suggest that our method not only outperforms standard adversarial training on both accuracy and robustness with less computational overhead, but is also capable of generating robust distilled datasets that can withstand various adversarial attacks. Our implementation is available at: https://github.com/yumozi/GUARD.
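As a rough illustration of what a curvature regularizer can look like, here is a minimal pure-Python sketch. It is not the GUARD implementation (see the repository above for that). It assumes a toy logistic loss with a fixed weight vector, and it approximates curvature by the finite difference of input gradients along the normalized gradient direction; the function names, the step size `h`, and the weight `lam` are all hypothetical choices made for this example.

```python
import math


def dot(a, b):
    """Inner product of two equal-length lists of floats."""
    return sum(ai * bi for ai, bi in zip(a, b))


def logistic_loss(w, x, y):
    """Toy loss log(1 + exp(-y * w.x)), computed in a numerically stable way."""
    m = y * dot(w, x)
    return max(-m, 0.0) + math.log1p(math.exp(-abs(m)))


def input_gradient(w, x, y):
    """Analytic gradient of the logistic loss with respect to the input x."""
    m = y * dot(w, x)
    # s = sigmoid(-m), branching on the sign of m to avoid overflow
    if m >= 0:
        e = math.exp(-m)
        s = e / (1.0 + e)
    else:
        s = 1.0 / (1.0 + math.exp(m))
    return [-y * s * wi for wi in w]


def curvature_penalty(w, x, y, h=1.0):
    """Finite-difference curvature proxy: ||grad(x + h*z) - grad(x)||^2,
    where z is the unit vector along the input gradient (an assumption)."""
    g = input_gradient(w, x, y)
    norm = math.sqrt(dot(g, g))
    if norm == 0.0:
        return 0.0  # no gradient direction to probe, so no penalty
    z = [gi / norm for gi in g]
    x_shift = [xi + h * zi for xi, zi in zip(x, z)]
    g_shift = input_gradient(w, x_shift, y)
    diff = [a - b for a, b in zip(g_shift, g)]
    return dot(diff, diff)


def regularized_loss(w, x, y, lam=1.0, h=1.0):
    """Clean loss plus a curvature penalty weighted by lam (hypothetical)."""
    return logistic_loss(w, x, y) + lam * curvature_penalty(w, x, y, h)


if __name__ == "__main__":
    w, x, y = [1.0, -2.0], [0.5, 0.25], 1
    print("clean loss:       ", logistic_loss(w, x, y))
    print("curvature penalty:", curvature_penalty(w, x, y))
    print("regularized loss: ", regularized_loss(w, x, y, lam=0.1))
```

The penalty needs only one extra gradient evaluation per example, which hints at why this kind of regularizer can be cheaper than multi-step adversarial training. In an actual distillation pipeline the gradients would come from automatic differentiation through a neural network rather than a closed-form expression.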
