MoP-CLIP: A Mixture of Prompt-Tuned CLIP Models for Domain Incremental Learning

Julien Nicolas, Florent Chiaroni, Imtiaz Ziko, Ola Ahmad, Christian Desrosiers, Jose Dolz

TL;DR

MoP-CLIP is a novel domain incremental learning (DIL) approach based on a mixture of prompt-tuned CLIP models. It generalizes the S-Prompting paradigm to handle both in-distribution and out-of-distribution data at inference, and it outperforms state-of-the-art DIL methods in out-of-distribution scenarios.

Abstract

Despite recent progress in incremental learning, addressing catastrophic forgetting under distributional drift remains an open and important problem. While state-of-the-art domain incremental learning (DIL) methods perform satisfactorily within known domains, their performance degrades substantially in the presence of novel domains. This limitation hampers their generalizability and restricts their scalability to more realistic settings where train and test data are drawn from different distributions. To address these limitations, we present a novel DIL approach based on a mixture of prompt-tuned CLIP models (MoP-CLIP), which generalizes the paradigm of S-Prompting to handle both in-distribution and out-of-distribution (OOD) data at inference. In particular, at the training stage we model the feature distribution of every class in each domain, learning individual text and visual prompts to adapt to a given domain. At inference, the learned distributions allow us to identify whether a given test sample belongs to a known domain, in which case the correct prompt is selected for the classification task, or comes from an unseen domain, in which case a mixture of the prompt-tuned CLIP models is used. Our empirical evaluation reveals the poor performance of existing DIL methods under domain shift, and suggests that the proposed MoP-CLIP performs competitively in the standard DIL settings while outperforming state-of-the-art methods in OOD scenarios. These results demonstrate the superiority of MoP-CLIP, offering a robust and general solution to the problem of domain incremental learning.
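
The inference logic described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`class_prototypes`, `route`, `predict`), the Euclidean distance, the fixed `threshold`, and the softmax weighting over negative distances are all assumptions made for illustration, and the paper's actual distance measure and mixture weights may differ. Each per-domain prompt-tuned model is stood in for by a plain callable that returns class probabilities.

```python
import numpy as np


def class_prototypes(features, labels):
    """Mean feature vector of each class in one domain, shape (n_classes, d)."""
    classes = np.unique(labels)
    return np.stack([features[labels == c].mean(axis=0) for c in classes])


def route(x, prototypes_per_domain, threshold):
    """Return one mixture weight per known domain for a test feature x.

    Assumption: a sample counts as in-distribution when its distance to the
    nearest class prototype of some domain is at most `threshold`.
    """
    # Distance from x to the closest class prototype of each domain.
    dists = np.array(
        [np.linalg.norm(p - x, axis=1).min() for p in prototypes_per_domain]
    )
    if dists.min() <= threshold:
        # Known domain: use only the prompt of the closest domain.
        weights = np.zeros_like(dists)
        weights[dists.argmin()] = 1.0
    else:
        # Unseen domain: mix all domains, closer domains weigh more
        # (hypothetical weighting: softmax over negative distances).
        logits = -dists
        logits = logits - logits.max()
        weights = np.exp(logits) / np.exp(logits).sum()
    return weights


def predict(x, prototypes_per_domain, domain_models, threshold):
    """Weighted combination of the per-domain class probabilities."""
    weights = route(x, prototypes_per_domain, threshold)
    probs = np.stack([model(x) for model in domain_models])
    return weights @ probs
```

With two known domains, a sample close to a prototype of the first domain is routed to that domain's model alone, whereas a sample far from every prototype receives a blend of both models' predictions.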

Paper Structure (18 sections, 7 equations, 5 figures, 5 tables, 1 algorithm)

Figures (5)

  • Figure 1: Performance degradation in the presence of domain shift between adaptation and testing samples, which shows that state-of-the-art DIL approaches do not generalize well. We employ S-Prompts (Wang et al., 2022) as a use case. The red line represents the performance on each test domain when all domains have been seen by the model. In contrast, the blue dotted line shows the performance of the same model when the test domain remains unknown, highlighting the performance degradation under distributional shift.
  • Figure 2: Proposed generalization scenario for domain incremental learning. Standard problem (left): only in-domain examples are encountered at test time. Addressed problem (right): both in-domain and out-of-domain examples are presented at test time.
  • Figure 3: Overview of MoP-CLIP. The training phase (left): class-wise prototypes are identified from in-distribution domains. Inference (middle and right): domain selection and ensembling (Mixture of Prompts), respectively, for in-distribution and out-of-distribution samples. For simplicity, we depict the pipeline for 2 classes (Real vs Fake). However, the procedure for multiple classes (e.g., DomainNet or CoRE50) is exactly the same.
  • Figure 4: k-Means or class prototypes as domain centroids? Ablation study that demonstrates the benefits of using class prototypes (our approach) rather than k-Means prototypes, as in S-Prompts (Wang et al., 2022).
  • Figure 5: A controllable trade-off between in-domain and out-of-domain prediction performance. Impact of the threshold $q$ (defined in the inference section) on the accuracy, evaluated on CDDB-Hard.