Dual Risk Minimization: Towards Next-Level Robustness in Fine-tuning Zero-Shot Models

Kaican Li; Weiyan Xie; Yongxiang Huang; Didan Deng; Lanqing Hong; Zhenguo Li; Ricardo Silva; Nevin L. Zhang

TL;DR

Fine-tuning zero-shot foundation models often degrades robustness to distribution shifts because the model overfits to non-core features. The paper introduces Dual Risk Minimization (DRM), which couples empirical risk minimization (ERM) with worst-case risk minimization (WRM). It uses LLM-generated concept descriptions to build a proxy for the worst-case risk, which makes the optimization tractable and preserves core features while maintaining good average performance. DRM achieves new state-of-the-art out-of-distribution (OOD) results across ImageNet variants, iWildCam, and FMoW, with sizable gains both with and without WiSE-FT, at the cost of modest computational overhead. The combination of dual prompts, robust p_c(y|x) estimation, and dual-inference blending yields practical, scalable robustness improvements for fine-tuning zero-shot CLIP models.

Abstract

Fine-tuning foundation models often compromises their robustness to distribution shifts. To remedy this, most robust fine-tuning methods aim to preserve the pre-trained features. However, not all pre-trained features are robust, and those methods are largely indifferent to which ones to preserve. We propose dual risk minimization (DRM), which combines empirical risk minimization with worst-case risk minimization, to better preserve the core features of downstream tasks. In particular, we utilize core-feature descriptions generated by LLMs to induce core-based zero-shot predictions, which then serve as proxies to estimate the worst-case risk. DRM balances two crucial aspects of model robustness, expected performance and worst-case performance, establishing a new state of the art on various real-world benchmarks. DRM significantly improves the out-of-distribution performance of CLIP ViT-L/14@336 on ImageNet (75.9 to 77.1), WILDS-iWildCam (47.1 to 51.8), and WILDS-FMoW (50.7 to 53.1), opening up new avenues for robust fine-tuning. Our code is available at https://github.com/vaynexie/DRM.
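
The idea of combining an empirical risk with a worst-case risk proxy can be illustrated with a small sketch. This is not the authors' implementation. It assumes that the worst-case term is approximated by the cross-entropy between the model's prediction and a soft target obtained from core-based zero-shot predictions, and that the two terms are mixed with a scalar weight. The names `dual_risk_loss`, `core_probs`, and `lam`, as well as the toy numbers, are hypothetical and chosen only for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, pred_probs, eps=1e-12):
    """Cross-entropy H(target, pred) between two discrete distributions."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, pred_probs))

def dual_risk_loss(model_logits, label, core_probs, lam=1.0):
    """Per-example loss: empirical term plus lam times a worst-case proxy term.

    Assumption: the worst-case risk is approximated by matching the model's
    prediction to core-based zero-shot probabilities (core_probs).
    """
    pred = softmax(model_logits)
    # Empirical term: standard cross-entropy against the hard label.
    one_hot = [1.0 if k == label else 0.0 for k in range(len(pred))]
    empirical = cross_entropy(one_hot, pred)
    # Proxy worst-case term: cross-entropy against the soft core-based target.
    worst_case_proxy = cross_entropy(core_probs, pred)
    return empirical + lam * worst_case_proxy

# Toy example with three classes (all values are made up).
logits = [2.0, 0.5, -1.0]       # fine-tuned model's logits for one image
core_probs = [0.7, 0.2, 0.1]    # hypothetical core-based zero-shot prediction
loss = dual_risk_loss(logits, label=0, core_probs=core_probs, lam=0.5)
```

Setting `lam=0.0` in this sketch recovers plain cross-entropy fine-tuning, while larger values pull the fine-tuned predictions toward the core-based target. How the weight is chosen and how the core-based probabilities are estimated in practice is described in the paper and its repository.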
