Dirichlet-based Per-Sample Weighting by Transition Matrix for Noisy Label Learning

HeeSun Bae; Seungjae Shin; Byeonghu Na; Il-Chul Moon

TL;DR

This work argues that how the transition matrix is utilized matters, and proposes RENT (a REsampling method with Noise Transition matrix), a resampling-based utilization method that consistently outperforms existing transition matrix utilization methods, including reweighting, on various benchmark datasets.

Abstract

For learning with noisy labels, the transition matrix, which explicitly models the relation between the noisy label distribution and the clean label distribution, has been utilized to achieve statistical consistency of either the classifier or the risk. Previous research has focused more on how to estimate this transition matrix well than on how to utilize it. We argue that good utilization of the transition matrix is crucial and suggest a new utilization method based on resampling, coined RENT. Specifically, we first demonstrate that current utilization methods can have potential limitations in implementation. As an extension of Reweighting, we suggest the Dirichlet distribution-based per-sample Weight Sampling (DWS) framework, and compare reweighting and resampling under the DWS framework. Building on the analyses from DWS, we propose RENT, a REsampling method with Noise Transition matrix. Empirically, RENT consistently outperforms existing transition matrix utilization methods, including reweighting, on various benchmark datasets. Our code is available at https://github.com/BaeHeeSun/RENT.
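As a rough illustration of what "models the relation between the noisy label distribution and the clean label distribution" means, the sketch below applies a transition matrix to a clean-label posterior. The convention `T[j, k] = p(noisy label = k | clean label = j)`, the 3-class matrix, the function name `noisy_posterior`, and the example posterior are all assumptions made for this illustration, not values or code from the paper.

```python
import numpy as np

# Hypothetical 3-class transition matrix (assumed convention):
# T[j, k] = p(noisy label = k | clean label = j), so each row sums to 1.
# This example encodes symmetric noise with a 20% total flip rate.
T = np.array([
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
])

def noisy_posterior(clean_posterior, T):
    """Map p(clean label | x) to p(noisy label | x) under the matrix T."""
    return clean_posterior @ T

# Illustrative classifier output for a single input (made-up numbers).
clean = np.array([0.9, 0.05, 0.05])
noisy = noisy_posterior(clean, T)
print(noisy)  # still a probability vector, but flatter than `clean`
```

Under this convention, a classifier that outputs the clean posterior can be compared against noisy labels by first passing its output through `T`; reweighting and resampling are alternative ways of using the same matrix.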

Paper Structure (61 sections, 2 theorems, 16 equations, 20 figures, 15 tables, 2 algorithms)

Introduction
Transition Matrix for Learning with Noisy Label
Problem Definition: Learning with Noisy Label
Transition Matrix for Learning with Noisy Label
Utilizing Transition Matrix for Learning with Noisy Label
DWS: Dirichlet-based Per-sample Weight Sampling
Dirichlet-based Weight Sampling
Analyzing DWS for Learning with Noisy Label
$V(w^{j}_i)$ and $V(R_{l,\text{DWS}}^{emp})$
Distance from the true weight
Noise injection of $R_{l,\text{DWS}}^{emp}$
Comparison to Previous Work
RENT: REsample from Noise Transition
Experiment
Implementation
...and 46 more sections

Key Result

Proposition 3.1

If $\boldsymbol{\mu}^*$ is accessible, $R_{l,\text{RENT}}^{emp}$ is statistically consistent with $R_l$ (proof in the appendix).
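The intuition behind comparing resampling with reweighting can be sketched numerically: if per-sample weights are given, averaging the unweighted losses of samples drawn with those probabilities approaches the weighted sum of losses as the number of draws grows. This is only a toy check of matching expectations, not the paper's proof. The losses, the weights `mu`, and the function names are hypothetical, and how the weights are obtained from the transition matrix is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up per-sample losses and normalized per-sample weights (sum to 1).
losses = np.array([0.2, 1.5, 0.7, 3.0, 0.1])
mu = np.array([0.30, 0.05, 0.25, 0.02, 0.38])

def reweighted_risk(losses, mu):
    """Reweighting: every sample contributes, scaled by its weight."""
    return float(np.sum(mu * losses))

def resampled_risk(losses, mu, n_draws, rng):
    """Resampling: draw indices with probability mu, average unweighted losses."""
    idx = rng.choice(len(losses), size=n_draws, replace=True, p=mu)
    return float(losses[idx].mean())

rw = reweighted_risk(losses, mu)
rs = resampled_risk(losses, mu, n_draws=200_000, rng=rng)
print(rw, rs)  # the two estimates should be close for a large number of draws
```

With few draws the resampled estimate is noisier than the reweighted one, because low-weight samples are often left out entirely; that is the behavior the caption of Figure 1 describes as simulating a refined dataset.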

Figures (20)

Figure 1: Dirichlet distribution-based per-sample Weight Sampling with shape parameter $\alpha$ and mean vector $\boldsymbol{\mu}$. The images at the vertices of the yellow triangles represent data instances. The blocks above the images show the true class and the noisy label. The sides show example implementations of a sampled $\boldsymbol{w}$: $\boldsymbol{w}^{(1)}$ assigns weights to all data (Reweighting), while $\boldsymbol{w}^{(2)}$ simulates resampling a refined dataset (RENT).
Figure 2: Density plot of $\text{Dir}(\alpha\boldsymbol{\mu})$ for different $\alpha$. $\boldsymbol{\mu}$ is set to $[0.7, 0.2, 0.1]$ for this illustration. The star ($\star$) denotes the mean ($\boldsymbol{\mu}$), which is invariant to $\alpha$. Yellow denotes lower density; the density increases progressively toward violet.
Figure 3: Test accuracy for various $\alpha$ on CIFAR10. The star ($\star$) marks RENT and the cross (x) marks RW.
Figure 4: Test accuracies over various $\sigma$ for CIFAR10. RW+$\epsilon$ denotes the integration of RW and the label perturbation technique.
Figure 5: Histogram of $w_i$ for RENT on CIFAR10, with Cycle used for $T$ estimation. Blue and orange represent samples with clean and noisy labels, respectively. The vertical dotted line denotes $1/B$.
...and 15 more figures
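The behavior described in the Figure 2 caption can be checked with a short simulation: sampling $\boldsymbol{w} \sim \text{Dir}(\alpha\boldsymbol{\mu})$ keeps the mean at $\boldsymbol{\mu}$ for every $\alpha$, while the per-coordinate variance follows the standard Dirichlet formula $\mu_i(1-\mu_i)/(\alpha+1)$ and shrinks as $\alpha$ grows. The sketch below uses the $\boldsymbol{\mu}$ from the caption; the $\alpha$ values, sample count, and the function name `sample_weights` are chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([0.7, 0.2, 0.1])  # mean vector from the Figure 2 caption

def sample_weights(alpha, mu, n_samples, rng):
    """Draw weight vectors w ~ Dir(alpha * mu), one per row."""
    return rng.dirichlet(alpha * mu, size=n_samples)

for alpha in (1.0, 10.0, 1000.0):  # illustrative values, not from the paper
    w = sample_weights(alpha, mu, 100_000, rng)
    emp_mean = w.mean(axis=0)                 # stays near mu for every alpha
    emp_var = w.var(axis=0)                   # shrinks as alpha grows
    theory_var = mu * (1 - mu) / (alpha + 1)  # Dirichlet variance with sum(mu) = 1
    print(alpha, emp_mean.round(3), emp_var.round(4), theory_var.round(4))
```

Large $\alpha$ concentrates the sampled weights around $\boldsymbol{\mu}$, so every sample receives roughly its mean weight; small $\alpha$ spreads the draws toward the corners of the simplex, where most of the mass falls on a few samples.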

Theorems & Definitions (4)

Proposition 3.1
Remark C.1
Proposition D.1
proof
