Bayesian Evidential Learning for Few-Shot Classification

Xiongkun Linghu; Yan Bai; Yihang Lou; Shengsen Wu; Jinze Li; Jianzhong He; Tao Bai

TL;DR

Bayesian Evidential Learning (BEL) tackles uncertainty in few-shot classification (FSC) by modeling class probabilities with a Dirichlet distribution, which decouples uncertainty modeling from metric learning. A Bayesian evidence fusion theorem combines prior evidence from a fixed pre-trained network with evidence learned during meta-training, yielding posterior Dirichlet parameters $\boldsymbol{\alpha} = \eta\boldsymbol{\alpha}^P + \boldsymbol{\alpha}^M$ and an expected probability $\hat p_k = \alpha_k / S$, where $S = \sum_k \alpha_k$ is the sum of the Dirichlet parameters. The method uses a smooth, KL-regularized Bayesian risk loss to form posterior opinions, which produces adaptive gradients that reflect uncertainty. Empirical results across five FSC benchmarks show improved accuracy and calibration when BEL is plugged into existing metric-based FSC methods, without extra computation cost compared to prior Bayesian approaches.
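The fusion rule and expected probability stated above can be illustrated with a short NumPy sketch. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names `fuse_evidence` and `expected_probability` are hypothetical, the Dirichlet parameters are passed in as plain arrays rather than produced by networks, and the example values are made up for the arithmetic.

```python
import numpy as np


def fuse_evidence(alpha_prior, alpha_meta, eta):
    """Fuse Dirichlet parameters as alpha = eta * alpha^P + alpha^M.

    alpha_prior: parameters from the pre-trained network (alpha^P)
    alpha_meta:  parameters from the meta-trained network (alpha^M)
    eta:         scalar weight on the prior evidence
    (Hypothetical helper; names are not from the paper.)
    """
    alpha_prior = np.asarray(alpha_prior, dtype=float)
    alpha_meta = np.asarray(alpha_meta, dtype=float)
    return eta * alpha_prior + alpha_meta


def expected_probability(alpha):
    """Mean of a Dirichlet: p_k = alpha_k / S with S = sum_k alpha_k."""
    alpha = np.asarray(alpha, dtype=float)
    strength = alpha.sum(axis=-1, keepdims=True)  # S
    return alpha / strength


if __name__ == "__main__":
    # Illustrative values only: a 3-way episode, one query sample.
    alpha = fuse_evidence([2.0, 1.0, 1.0], [6.0, 1.0, 1.0], eta=0.5)
    print(alpha)                        # fused parameters: 7.0, 1.5, 1.5
    print(expected_probability(alpha))  # expected probabilities: 0.7, 0.15, 0.15
```

With these made-up inputs the fused parameters sum to $S = 10$, so the expected probabilities are 0.7, 0.15 and 0.15. A smaller $\eta$ shrinks the contribution of the prior evidence relative to the meta-trained evidence.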

Abstract

Few-Shot Classification (FSC) aims to generalize from base classes to novel classes given very limited labeled samples, which is an important step on the path toward human-like machine learning. State-of-the-art solutions learn a good metric and representation space in which to compute the distance between samples. Despite their promising accuracy, effectively modeling uncertainty for metric-based FSC methods remains a challenge. To model uncertainty, we place a distribution over class probability based on the theory of evidence. As a result, uncertainty modeling and metric learning can be decoupled. To reduce the uncertainty of classification, we propose a Bayesian evidence fusion theorem. Given observed samples, the network learns to produce posterior distribution parameters from the prior parameters produced by the pre-trained network. Detailed gradient analysis shows that our method provides a smooth optimization target and can capture the uncertainty. The proposed method is agnostic to metric learning strategies and can be implemented as a plug-and-play module. We integrate our method into several recent FSC methods and demonstrate improved accuracy and uncertainty quantification on standard FSC benchmarks.
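Uncertainty quantification in this setting is commonly reported as Expected Calibration Error (ECE), which the figure list on this page also mentions. The sketch below shows one widely used equal-width-binning definition of ECE. It is an assumption that this matches the paper's evaluation protocol: the function name `expected_calibration_error`, the default of 10 bins, and the toy inputs are all illustrative choices, not details taken from the source.

```python
import numpy as np


def expected_calibration_error(probs, labels, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - confidence| per bin.

    probs:  (N, K) array of predicted class probabilities
    labels: (N,) array of integer class labels
    n_bins: number of confidence bins (assumed default, not from the paper)
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    confidences = probs.max(axis=1)        # confidence of the top prediction
    predictions = probs.argmax(axis=1)
    correct = (predictions == labels).astype(float)

    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lower, upper in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lower) & (confidences <= upper)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap     # weight by fraction of samples in bin
    return ece


if __name__ == "__main__":
    # Toy data: two confident correct predictions, two less confident ones
    # of which one is wrong.
    toy_probs = [[0.9, 0.1], [0.9, 0.1], [0.6, 0.4], [0.6, 0.4]]
    toy_labels = [0, 0, 0, 1]
    print(expected_calibration_error(toy_probs, toy_labels))
```

In the toy data, the 0.9-confidence bin has accuracy 1.0 and the 0.6-confidence bin has accuracy 0.5, so each bin contributes a gap of 0.1 weighted by half the samples, for an ECE of about 0.1. The class probabilities fed to such a metric could be the expected probabilities of the posterior Dirichlet.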

Paper Structure (22 sections, 1 theorem, 17 equations, 5 figures, 7 tables)

Introduction
Related work
Method
Preliminary
The Theory of Evidence
Bayesian Evidence Fusion for Few-shot Classification
Learning to Form Posterior Opinions
Gradient Analysis
Experiments
Training Setup and evaluation
Uncertainty Quantification through Calibration
Results
Ablation analysis
More Results
More analysis of Bayesian Evidential Learning
...and 7 more sections

Key Result

Theorem 1

Given the prior Dirichlet distribution $p(\mathbf{z}|\bm{\beta})=\mathrm{Dir}(\mathbf{z}|\bm{\beta})$ and distribution parameters $\bm{\gamma}$ collected from observed samples (here we regard $\bm{\gamma}$ as a random vector rather than a deterministic parameter vector), then the posterior distri

Figures (5)

Figure 1: Samples of simplex
Figure 2: The framework of Bayesian Evidential Learning. In the pre-training stage, the network is trained on the merged dataset of base classes. The weighted evidence $\eta \cdot \bm{\alpha}^P$ provides prior evidence with relatively higher uncertainty. Given the prior evidence, the meta-trained network learns to form posterior evidence $\eta\cdot\bm{\alpha}^P+\bm{\alpha}^M$. $f_{pre}$: pre-trained network; $f_{meta}$: meta-trained network.
Figure 3: Effect of $\lambda$ and $\eta$ on Accuracy (%)
Figure 4: Effect of $\lambda$ and $\eta$ on ECE (%)
Figure 5: Novel class generalization

Theorems & Definitions (3)

Theorem 1
Proof 1
Definition 1
