Foundation Model-oriented Robustness: Robust Image Model Evaluation with Pretrained Models
Peiyan Zhang, Haoyang Liu, Chaozhuo Li, Xing Xie, Sunghun Kim, Haohan Wang
TL;DR
This paper tackles the misalignment between fixed benchmarks and real-world robustness by proposing a dynamic evaluation framework that treats a zoo of foundation models as surrogate oracles. It introduces a counterfactual-generation method, guided by a foundation-model ensemble, that perturbs images while preserving the underlying image-label structure, and it defines Foundation Model-oriented Robustness (FMR) to quantify robustness relative to the oracle. The authors run experiments on standard and robust vision models over MNIST, CIFAR-10, and ImageNet, showing that transformer-based architectures and certain perturbation strategies yield higher FMR, while some existing robustness methods falter under dynamic evaluation. They also analyze biases, zero-shot limitations, and the transferability of the generated perturbations, arguing that dynamic, foundation-model-driven evaluation gives a more credible and actionable picture of model robustness and offers guidance for future improvements.
Abstract
Machine learning has demonstrated remarkable performance over finite datasets, yet whether scores on fixed benchmarks sufficiently indicate a model's performance in the real world is still under discussion. In reality, an ideal robust model will probably behave similarly to the oracle (e.g., human users), so a good evaluation protocol is probably one that evaluates a model's behavior in comparison to the oracle. In this paper, we introduce a new robustness measurement that directly measures an image classification model's performance compared with a surrogate oracle (i.e., a foundation model). In addition, we design a simple method that carries the evaluation beyond the scope of the benchmarks. Our method extends image datasets with new samples that are sufficiently perturbed to be distinct from those in the original sets, but that remain bounded within the same image-label structure the original test image represents, as constrained by a foundation model pretrained on a large number of samples. As a result, our method offers a new way to evaluate models' robustness, free of the limitations of fixed benchmarks or constrained perturbations, although scoped by the power of the oracle. In addition to the evaluation results, we also leverage our generated data to understand the behaviors of the model and our new evaluation strategies.
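
To make the idea of oracle-relative evaluation concrete, the following is a minimal sketch of one possible reading of it, not the paper's exact FMR definition, which is not given in this section. It assumes that a perturbed sample counts toward the score only if the surrogate oracle still assigns it the original label, and that the score is the evaluated model's accuracy on those retained samples. The names `oracle_relative_score`, `model_predict`, and `oracle_predict` are hypothetical, and the toy integer "images" stand in for real perturbed images.

```python
from typing import Callable, Hashable, Sequence


def oracle_relative_score(
    model_predict: Callable[[object], Hashable],
    oracle_predict: Callable[[object], Hashable],
    perturbed_samples: Sequence[object],
    labels: Sequence[Hashable],
) -> float:
    """Accuracy of the model on perturbed samples that the oracle still
    maps to the original label (an assumed, simplified scoring rule)."""
    kept = 0      # perturbed samples the oracle considers label-preserving
    correct = 0   # of those, how many the evaluated model gets right
    for x, y in zip(perturbed_samples, labels):
        if oracle_predict(x) != y:
            # The oracle no longer sees the original label, so the sample
            # is treated as outside the image-label structure and skipped.
            continue
        kept += 1
        if model_predict(x) == y:
            correct += 1
    if kept == 0:
        raise ValueError("the oracle rejected every perturbed sample")
    return correct / kept


# Toy stand-ins: integers play the role of perturbed images.
samples = [1, 2, 3, 4, 5]
labels = [1, 0, 1, 0, 0]          # original labels of the unperturbed images


def toy_oracle(x: int) -> int:
    return x % 2                  # hypothetical surrogate oracle


def toy_model(x: int) -> int:
    return 1 if x <= 3 else 0     # hypothetical model under evaluation


score = oracle_relative_score(toy_model, toy_oracle, samples, labels)
print(score)  # 0.75: sample 5 is rejected by the oracle, 3 of the other 4 are correct
```

In this sketch the oracle acts purely as a filter on which perturbations are admissible, which mirrors the statement that the evaluation is "scoped by the power of the oracle": a weak oracle would admit or reject the wrong samples and distort the score.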
