Revisiting Few-Shot Learning from a Causal Perspective

Guoliang Lin; Yongheng Xu; Hanjiang Lai; Jian Yin

TL;DR

This work reframes metric-based few-shot learning (FSL) through causal inference, specifically front-door adjustment, which aims to identify the causal effect of inputs on labels in the presence of unobserved confounders. It shows that canonical methods such as Matching Networks, Prototypical Networks, and CLIP/Tip-Adapter correspond to particular forms of front-door adjustment, and it proposes two causal methods (an ensemble approach and a stochastic mapping strategy) that incorporate diverse representations. Experiments on 10 datasets report consistent improvements over strong baselines and existing prompt-based methods, including gains on ImageNet, and the results hold across multiple CLIP/BLIP backbones. The work thus combines causal reasoning with representation diversity in FSL, with applications to vision-language tasks.

Abstract

Few-shot learning with the $N$-way $K$-shot scheme is an open challenge in machine learning. Many metric-based approaches have been proposed to tackle this problem, e.g., Matching Networks and CLIP-Adapter. Although these approaches have shown significant progress, the mechanism behind their success has not been well explored. In this paper, we interpret these metric-based few-shot learning methods via a causal mechanism. We show that the existing approaches can be viewed as specific forms of front-door adjustment, which alleviates the effect of spurious correlations and thus learns the causal relationship. This causal interpretation provides a new perspective for understanding these existing metric-based methods. Further, based on this causal interpretation, we introduce two simple causal methods for metric-based few-shot learning, which consider not only the relationship between examples but also the diversity of representations. Experimental results demonstrate the superiority of our proposed methods in few-shot classification on various benchmark datasets. Code is available at https://github.com/lingl1024/causalFewShot.
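
To make the front-door adjustment mentioned in the abstract concrete, the following is a minimal numerical sketch and not the authors' implementation. It assumes a toy model with binary variables $U$ (unobserved confounder), $X$ (example), $Z$ (representation), and $Y$ (label), where $U$ influences $X$ and $Y$, and $X$ influences $Y$ only through $Z$. All probability tables are made-up values. The sketch evaluates the standard front-door formula $P(y \mid do(x)) = \sum_z P(z \mid x) \sum_{x'} P(y \mid x', z) P(x')$ using only quantities that do not involve $U$, and compares it with the true interventional value and with the naive conditional $P(y \mid x)$.

```python
import numpy as np

# Hypothetical toy model; every number below is an illustrative assumption.
p_u = np.array([0.6, 0.4])                 # P(U)
p_x_given_u = np.array([[0.8, 0.2],        # P(X | U=0)
                        [0.3, 0.7]])       # P(X | U=1)
p_z_given_x = np.array([[0.9, 0.1],        # P(Z | X=0)
                        [0.2, 0.8]])       # P(Z | X=1)
p_y1_given_zu = np.array([[0.1, 0.5],      # P(Y=1 | Z=0, U=u)
                          [0.6, 0.9]])     # P(Y=1 | Z=1, U=u)

# Full joint P(U, X, Z, Y=1), indexed [u, x, z].
joint_y1 = (p_u[:, None, None]
            * p_x_given_u[:, :, None]
            * p_z_given_x[None, :, :]
            * p_y1_given_zu.T[:, None, :])

# Observational quantities: U is summed out, as it would be in real data.
p_x = p_u @ p_x_given_u                    # P(X)
p_xz = p_x[:, None] * p_z_given_x          # P(X, Z)
p_xzy1 = joint_y1.sum(axis=0)              # P(X, Z, Y=1)
p_y1_given_xz = p_xzy1 / p_xz              # P(Y=1 | X, Z)

def front_door(x):
    """P(Y=1 | do(X=x)) from observational quantities only."""
    inner = (p_y1_given_xz * p_x[:, None]).sum(axis=0)   # sum over x'
    return float(p_z_given_x[x] @ inner)                 # sum over z

def ground_truth(x):
    """P(Y=1 | do(X=x)) computed with access to the confounder U."""
    return float(p_z_given_x[x] @ (p_y1_given_zu @ p_u))

def naive(x):
    """Confounded conditional P(Y=1 | X=x)."""
    return float(p_xzy1[x].sum() / p_x[x])

for x in (0, 1):
    print(x, front_door(x), ground_truth(x), naive(x))
```

In this toy setting the front-door estimate matches the interventional value for both values of $X$, while the naive conditional does not, because it still carries the influence of $U$.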

Paper Structure (31 sections, 21 equations, 6 figures, 7 tables)

Introduction
Related Work
Metric-Based Few-Shot Learning
Causal Inference
Pre-trained Vision-Language Models
Causal Interpretation
Background: Front-Door Adjustment
Interpretation for Metric-Based FSL Methods
Interpretation for Matching Networks
Interpretation for Prototypical Networks
Interpretation for CLIP/Tip-Adapter
Two Causal Methods for FSL
Ensemble Method
Tip-Adapter-F
...and 16 more sections

Figures (6)

Figure 1: (a) Unobserved confounders, e.g., "taking pictures of dogs in the grass", can mislead the model during training into learning a spurious correlation: the model tends to associate grass with the label dog. This is incorrect for test examples of birds that are also photographed in the grass. (b) The causal graph used to interpret existing few-shot learning methods as removing confounding factors via front-door adjustment. $U$: unobserved confounder, $X$: example, $Z$: representation of the example, $Y$: label.
Figure 2: Illustration of front-door adjustment. $U$: unobserved confounder, $X$: example, $Z$: representation of the example, $Y$: label.
Figure 3: The pipeline of our ensemble method and its corresponding causal interpretation in Eq. (\ref{approx}). For a given test example $x$, we first obtain its representations $z_1$ and $z_2$ from diverse visual encoders. We then perform intermediate predictions via different models. The final prediction is the linear combination of intermediate predictions. Different parts of the pipeline and their corresponding causal interpretations are separated by dotted lines. Note that we only update representations ${\textbf{F}}_{train}$ as learnable parameters during training.
Figure 4: Comparison between the deterministic mapping and stochastic mapping.
Figure 5: Accuracy (%) of different models with various shots on various datasets. (Best viewed in color.)
...and 1 more figures
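
The ensemble pipeline described in the Figure 3 caption (two representations $z_1$ and $z_2$, one intermediate prediction per representation, then a linear combination) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the paper's model: the similarity-weighted vote, the `temperature` value, the mixing weights, and the synthetic "encoder" features are all hypothetical, and no training of the cached features $\mathbf{F}_{train}$ is performed here.

```python
import numpy as np

def l2_normalize(a):
    """Scale vectors along the last axis to unit length."""
    return a / np.linalg.norm(a, axis=-1, keepdims=True)

def intermediate_prediction(z, f_train, labels_onehot, temperature=0.1):
    """Similarity-weighted vote over support features (an assumed form)."""
    sims = l2_normalize(f_train) @ l2_normalize(z)   # cosine similarity per support example
    weights = np.exp(sims / temperature)
    weights /= weights.sum()                         # normalise to a distribution
    return weights @ labels_onehot                   # class probabilities

def ensemble_prediction(zs, f_trains, labels_onehot, mix):
    """Linear combination of the per-representation predictions."""
    preds = [intermediate_prediction(z, f, labels_onehot)
             for z, f in zip(zs, f_trains)]
    return sum(w * p for w, p in zip(mix, preds))

# Synthetic 2-way 3-shot episode with two hypothetical encoders (dims 8 and 16).
rng = np.random.default_rng(0)
labels = np.array([0, 0, 0, 1, 1, 1])
labels_onehot = np.eye(2)[labels]

def fake_features(class_ids, dim):
    """Stand-in for an encoder: a one-hot class direction plus small noise."""
    return np.eye(dim)[class_ids] + 0.05 * rng.standard_normal((len(class_ids), dim))

f_trains = [fake_features(labels, 8), fake_features(labels, 16)]
zs = [fake_features(np.array([1]), 8)[0],            # test example from class 1
      fake_features(np.array([1]), 16)[0]]

pred = ensemble_prediction(zs, f_trains, labels_onehot, mix=[0.5, 0.5])
print(pred, pred.argmax())
```

Because each intermediate prediction is a probability vector and the mixing weights sum to one, the combined output is again a probability vector over the $N$ classes.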
