Minimizing Embedding Distortion for Robust Out-of-Distribution Performance

Tom Shaked, Yuval Goldman, Oran Shayer

TL;DR

This work introduces a "similarity loss" that can be incorporated into the fine-tuning process of any task. By limiting how far the fine-tuned embeddings drift from the pre-trained embeddings, it significantly improves out-of-distribution (OOD) performance while maintaining strong in-distribution (ID) performance.

Abstract

Foundational models, trained on vast and diverse datasets, have demonstrated remarkable capabilities in generalizing across different domains and distributions for various zero-shot tasks. Our work addresses the challenge of retaining these powerful generalization capabilities when adapting foundational models to specific downstream tasks through fine-tuning. To this end, we introduce a novel approach we call "similarity loss", which can be incorporated into the fine-tuning process of any task. By minimizing the distortion of fine-tuned embeddings from the pre-trained embeddings, our method strikes a balance between task-specific adaptation and preserving broad generalization abilities. We evaluate our approach on two diverse tasks: image classification on satellite imagery and face recognition, focusing on open-class and domain shift scenarios to assess out-of-distribution (OOD) performance. We demonstrate that this approach significantly improves OOD performance while maintaining strong in-distribution (ID) performance.
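
The abstract describes the similarity loss only as minimizing the distortion of fine-tuned embeddings relative to pre-trained embeddings, so the exact formulation is not given here. The following is a minimal sketch under stated assumptions: distortion is measured as cosine distance between each sample's fine-tuned and pre-trained embedding, and the term is added to the task loss with a hypothetical `weight` hyperparameter. The function names are illustrative and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_loss(finetuned, pretrained):
    """Mean cosine distance over a batch of embedding pairs.

    Assumed form: the source only states that embedding distortion
    is minimized, not which distance is used.
    """
    distances = [
        1.0 - cosine_similarity(f, p) for f, p in zip(finetuned, pretrained)
    ]
    return sum(distances) / len(distances)


def total_loss(task_loss, finetuned, pretrained, weight=1.0):
    """Task loss plus the weighted similarity term (weight is hypothetical)."""
    return task_loss + weight * similarity_loss(finetuned, pretrained)
```

In this sketch the pre-trained embeddings would come from a frozen copy of the original model, so only the fine-tuned encoder is pulled toward them. A larger `weight` favors preserving the pre-trained representation, and a smaller one favors task-specific adaptation.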

Paper Structure

This paper contains 19 sections, 3 equations, 2 figures, 2 tables, and 1 algorithm.

Figures (2)

  • Figure 1: Images sampled from the ID and OOD datasets for face recognition (top: domain shift) and image classification (bottom: unseen classes; for example, the rightmost image is labeled "solar panel", a class not present in the training set).
  • Figure 2: Image embeddings of the OOD EuroSAT dataset, color-coded by class. (A) Model trained with similarity loss (avg. cluster variance: 1.87e-04). (B) Model trained without similarity loss (avg. cluster variance: 4.00e-04).
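
The Figure 2 caption reports an average cluster variance for each model but does not define the statistic. One plausible reading, given here as an assumption rather than the paper's definition, is the mean squared distance of each embedding to its class centroid, averaged over classes. Under that reading, a lower value means tighter per-class clusters, which matches the lower figure reported for the model trained with similarity loss. The function name is illustrative.

```python
from collections import defaultdict


def average_cluster_variance(embeddings, labels):
    """Average over classes of the mean squared distance to the class centroid.

    Assumed definition: the figure caption does not specify how the
    variance is computed.
    """
    clusters = defaultdict(list)
    for vector, label in zip(embeddings, labels):
        clusters[label].append(vector)

    per_class = []
    for vectors in clusters.values():
        dim = len(vectors[0])
        centroid = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
        squared_distances = [
            sum((v[i] - centroid[i]) ** 2 for i in range(dim)) for v in vectors
        ]
        per_class.append(sum(squared_distances) / len(vectors))

    return sum(per_class) / len(per_class)
```

For example, two classes where one has points at distance 1 from its centroid and the other is a single repeated point give per-class variances of 1.0 and 0.0, so the average is 0.5.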