
The Geometry of Robustness: Optimizing Loss Landscape Curvature and Feature Manifold Alignment for Robust Finetuning of Vision-Language Models

Shivang Chopra, Shaunak Halbe, Chengyue Huan, Brisa Maneechotesuwan, Zsolt Kira

Abstract

Fine-tuning approaches for Vision-Language Models (VLMs) face a critical three-way trade-off between In-Distribution (ID) accuracy, Out-of-Distribution (OOD) generalization, and adversarial robustness. Existing robust fine-tuning strategies resolve at most two axes of this trade-off: generalization-preserving methods retain ID/OOD performance but leave models vulnerable to adversarial attacks, while adversarial training improves robustness to targeted attacks but degrades ID/OOD accuracy. Our key insight is that this trade-off stems from two geometric failures: sharp, anisotropic minima in parameter space and unstable feature representations that deform under perturbation. To address both, we propose GRACE (Gram-aligned Robustness via Adaptive Curvature Estimation), a unified fine-tuning framework that jointly regularizes parameter-space curvature and enforces feature-space invariance for VLMs. Grounded in Robust PAC-Bayes theory, GRACE employs adaptive weight perturbations scaled by local curvature to promote flatter minima, combined with a feature-alignment loss that maintains representation consistency across clean, adversarial, and OOD inputs. On ImageNet fine-tuning of CLIP models, GRACE simultaneously improves ID accuracy by 10.8% and adversarial accuracy by 13.5%, while maintaining 57.0% OOD accuracy (vs. the 57.4% zero-shot baseline). Geometric analysis confirms that GRACE converges to flatter minima without feature distortion across distribution shifts, providing a principled step toward generalized robustness in foundation VLMs.
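
To make the two mechanisms concrete, here is a minimal PyTorch-style sketch of one training step, assuming a squared-gradient curvature proxy, per-layer perturbation scaling, and a cosine-similarity stand-in for the feature-alignment term. None of these names or design choices are taken from the paper; this is an illustration of the idea, not the released algorithm.

```python
import torch
import torch.nn.functional as F

def grace_style_step(model, optimizer, x_clean, x_adv, y, rho=5e-4, lam=0.5):
    """One illustrative step: curvature-scaled weight perturbation
    (SAM/AWP-flavored) plus a feature-alignment penalty. Hypothetical
    sketch, not the authors' implementation."""
    params = [p for p in model.parameters() if p.requires_grad]

    # (1) Curvature proxy: per-parameter squared gradients on the clean batch
    #     (a diagonal-Hessian-style estimate from mini-batch gradients).
    grads = torch.autograd.grad(F.cross_entropy(model(x_clean), y), params)
    curv = [g.detach().pow(2) for g in grads]

    # (2) Per-layer curvature score; sharper layers get a larger
    #     perturbation radius, i.e. more smoothing where the loss is steep.
    h = torch.stack([c.mean() for c in curv])
    h = h / (h.mean() + 1e-12)
    eps = []
    with torch.no_grad():
        for p, g, s in zip(params, grads, h):
            e = rho * s * g / (g.norm() + 1e-12)
            p.add_(e)
            eps.append(e)

    # (3) Task loss at the perturbed weights, plus alignment between clean
    #     and adversarial outputs (a stand-in for the Gram-volume loss).
    out_clean, out_adv = model(x_clean), model(x_adv)
    align = 1.0 - F.cosine_similarity(out_clean, out_adv, dim=-1).mean()
    loss = F.cross_entropy(out_clean, y) + lam * align
    optimizer.zero_grad()
    loss.backward()

    # (4) Remove the perturbation, then apply the optimizer update.
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.sub_(e)
    optimizer.step()
    return float(loss)
```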


Paper Structure

This paper contains 93 sections, 4 theorems, 55 equations, 11 figures, 14 tables, and 1 algorithm.

Key Result

Theorem 3.1

Under the paper's smoothness and feature assumptions, with probability at least $1-\delta$ over the training set, the OOD risk is bounded by the empirical ID risk $\hat{R}_{\text{ID}}$ plus the $\mathcal{H}\Delta\mathcal{H}$-divergence $d_{\mathcal{H}\Delta\mathcal{H}}$ between the ID and OOD distributions, the ideal joint error $\lambda^*$, and a PAC-Bayes complexity term.
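
For orientation, bounds of this type typically combine a Ben-David-style domain-adaptation decomposition with a PAC-Bayes complexity term. The form below is an illustrative reconstruction under that assumption, with a posterior $Q$ and prior $P$ over weights and $n$ training samples; it is not copied from the paper.

```latex
% Illustrative shape of a robust PAC-Bayes / domain-adaptation bound;
% the complexity term is a canonical assumption, not the paper's statement.
\[
  \mathbb{E}_{\theta \sim Q}\!\left[ R_{\mathrm{OOD}}(\theta) \right]
  \;\le\;
  \mathbb{E}_{\theta \sim Q}\!\left[ \hat{R}_{\mathrm{ID}}(\theta) \right]
  + \tfrac{1}{2}\, d_{\mathcal{H}\Delta\mathcal{H}}\!\left(\mathcal{D}_{\mathrm{ID}},
      \mathcal{D}_{\mathrm{OOD}}\right)
  + \lambda^{*}
  + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln(n/\delta)}{2(n-1)}}
\]
```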

Figures (11)

  • Figure 1: The VLM robustness three-way tradeoff. Existing robust fine-tuning strategies resolve at most two of {ID, OOD, adversarial} robustness simultaneously, leaving a gap in generalized robustness. GRACE is designed to close this gap.
  • Figure 2: (a) Feature Distribution Analysis: 3D projection of image features for in-distribution ($f_{\text{ID}}$), OOD ($f_{\text{OOD}}$), and PGD adversarial inputs ($f_{\text{Adv}}$) of the same class. (b) Loss Landscape Analysis: 3D/2D loss slices around the converged solutions for each method, using shared perturbation directions.
  • Figure 3: Low-rank adaptation and perturbation in GRACE. Frozen pretrained weights $W(\theta_0)$ (blue) are updated only through low-rank LoRA adapters (orange). LAR-AWP injects additional low-rank perturbations (red) in the same subspace.
  • Figure 4: LAR-AWP rank curriculum. A diagonal rank mask controls the effective perturbation rank per layer. Curvature estimates $h_W$ (from mini-batch gradients) are used to assign higher perturbation ranks to sharper layers, focusing smoothing where the loss landscape is steepest (see the rank-curriculum sketch after this list).
  • Figure 5: Gram-volume feature alignment. For each input, GRACE compares clean, adversarial, and LAR-AWP-perturbed image embeddings via a small Gram matrix. The Gram-volume loss encourages these three vectors to remain close to each other (low volume) while preserving separation across different classes (see the Gram-volume sketch after this list).
  • ...and 6 more figures
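
Figures 3 and 4 suggest a concrete recipe: estimate per-layer curvature from mini-batch gradients, convert it to a per-layer perturbation rank, and realize the perturbation through rank-masked low-rank factors. A hedged sketch follows; the linear allocation rule, factor shapes, and function names are illustrative assumptions, not details confirmed by the paper.

```python
import torch

def assign_perturb_ranks(grad_sq_per_layer, r_min=1, r_max=8):
    """Map per-layer curvature proxies (mean squared gradients, standing in
    for the paper's h_W) to perturbation ranks: sharper layers get higher
    ranks. The linear allocation rule is a hypothetical choice."""
    h = torch.tensor([float(g.mean()) for g in grad_sq_per_layer])
    h = (h - h.min()) / (h.max() - h.min() + 1e-12)  # normalize to [0, 1]
    return [int(r) for r in (r_min + h * (r_max - r_min)).round()]

def masked_low_rank_perturbation(A, B, rank, rho=1e-3):
    """Rank-masked perturbation dW = rho * A @ diag(m) @ B, where A is
    (d_out, R), B is (R, d_in), and m zeroes all but the first `rank`
    components -- so the perturbation lives in the same low-rank subspace
    as a LoRA update, with its effective rank set by the curriculum."""
    mask = torch.zeros(A.shape[1], device=A.device)
    mask[:rank] = 1.0
    return rho * (A * mask) @ B  # broadcasting implements A @ diag(mask)
```

For example, `ranks = assign_perturb_ranks([p.grad.pow(2) for p in lora_params])` (with `lora_params` a hypothetical list of adapter weights) would refresh the curriculum from the current mini-batch gradients.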
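
Figure 5's Gram-volume idea can also be written down directly: stack the three views of each input, form their $3 \times 3$ Gram matrix, and penalize the volume they span. A minimal sketch, assuming L2-normalized embeddings and a small diagonal stabilizer (both my choices, not confirmed details):

```python
import torch
import torch.nn.functional as F

def gram_volume_loss(f_clean, f_adv, f_awp, eps=1e-6):
    """Volume of the parallelepiped spanned by three views of each input.

    Each f_* is (batch, dim). Stack the clean, adversarial, and
    LAR-AWP-perturbed embeddings into V of shape (batch, 3, dim), form
    the Gram matrix G = V V^T, and take sqrt(det(G)): near zero when the
    three views coincide, growing as they drift apart. Normalization and
    the eps stabilizer are assumed details, not the paper's exact loss."""
    views = torch.stack(
        [F.normalize(f, dim=-1) for f in (f_clean, f_adv, f_awp)], dim=1)
    gram = views @ views.transpose(1, 2)                  # (batch, 3, 3)
    gram = gram + eps * torch.eye(3, device=gram.device)  # keep det > 0
    return torch.linalg.det(gram).clamp_min(0).sqrt().mean()
```

This term only pulls the three views of the same input together; the cross-class separation mentioned in the caption would come from the task or contrastive loss it is combined with.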

Theorems & Definitions (4)

  • Theorem 3.1: Robust PAC-Bayes Bound
  • Lemma 3.2: Feature-Space Domain Discrepancy
  • Lemma C.1: Hessian Boundedness
  • Lemma C.2: Taylor Remainder