Correcting Noisy Multilabel Predictions: Modeling Label Noise through Latent Space Shifts

Weipeng Huang, Qin Li, Yang Xiao, Cheng Qiao, Tie Cai, Junwei Liang, Neil J. Hurley, Guangyuan Piao

TL;DR

This work tackles noisy labels in multilabel classification by reframing label noise as a latent-space shift and introducing LSNPC, a Bayesian deep-generative post-processing method. LSNPC uses a latent variable $\boldsymbol{z} \thicksim \text{Normal}(\boldsymbol{0}, \boldsymbol{I})$ and a shifted latent $\boldsymbol{\tilde{z}} \big| \boldsymbol{z} \thicksim \text{Student}(g_{\psi}(\boldsymbol{z}), \boldsymbol{I}, \nu_0)$ to generate true and noisy labels through shared decoders, enabling correction of pre-trained predictions. The approach supports unsupervised, supervised, and semi-supervised learning via a variational auto-encoder with specialized posteriors and a correction function $\boldsymbol{y}^*=\boldsymbol{C}(\boldsymbol{x}; h)$ estimated by Monte Carlo, and it provides theoretical KL-bound guarantees comparing the latent and observed posteriors. Empirically, LSNPC yields consistent improvements over strong baselines across VOC07, VOC12, COCO, and Tomato datasets, particularly under higher noise rates, and ablation studies validate the efficacy of the latent-shift design and the choice of Student versus Normal proposals. Overall, LSNPC offers a robust, post-processing remedy for noisy multilabel predictions with practical appeal for real-world noisy-data deployments.
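The latent-shift generative story summarized above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the shift function `g_psi`, the single-layer sigmoid `decode`, and all dimensions are hypothetical stand-ins, and the final averaging step is a plain Monte Carlo estimate over prior samples rather than the paper's correction function $\boldsymbol{C}(\boldsymbol{x}; h)$. The Student draw uses the standard construction of a multivariate Student variable as a Gaussian divided by the square root of a scaled chi-square variable.

```python
import numpy as np


def sample_latents(rng, g_psi, dim, nu0, n):
    """Draw n pairs (z, z_tilde) with
    z ~ Normal(0, I) and z_tilde | z ~ Student(g_psi(z), I, nu0)."""
    z = rng.standard_normal((n, dim))
    # Multivariate Student-t with identity scale: Gaussian noise divided by
    # sqrt(chi2(nu0) / nu0), one chi-square draw shared per sample.
    eps = rng.standard_normal((n, dim))
    u = rng.chisquare(nu0, size=(n, 1))
    z_tilde = g_psi(z) + eps / np.sqrt(u / nu0)
    return z, z_tilde


def decode(latent, W, b):
    """Hypothetical shared decoder: linear layer + sigmoid, giving one
    Bernoulli probability per label (assumption, not from the paper)."""
    return 1.0 / (1.0 + np.exp(-(latent @ W + b)))


rng = np.random.default_rng(0)
dim, n_labels, nu0 = 4, 3, 5.0           # illustrative sizes only
W = rng.standard_normal((dim, n_labels))  # decoder weights shared by both paths
b = np.zeros(n_labels)


def g_psi(z):
    """Hypothetical shift function; the paper learns this mapping."""
    return z + 0.5


z, z_tilde = sample_latents(rng, g_psi, dim, nu0, n=1000)

# The same decoder maps the clean latent to true-label probabilities and the
# shifted latent to noisy-label probabilities.
p_true = decode(z, W, b)
p_noisy = decode(z_tilde, W, b)
y_true = rng.random(p_true.shape) < p_true
y_noisy = rng.random(p_noisy.shape) < p_noisy

# Monte Carlo estimate of the marginal probability of each true label.
mc_true = p_true.mean(axis=0)
```

Sharing `W` and `b` between the two decoding paths is what ties the noisy labels to the true ones: in this sketch, all label noise comes from the move from `z` to `z_tilde`.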

Abstract

Noise in data appears to be inevitable in most real-world machine learning applications and can cause severe overfitting problems. Not only can data features contain noise, but labels are also prone to noise due to human input. In this paper, rather than noisy label learning in multiclass classification, we focus on the less explored area of noisy label learning for multilabel classification. Specifically, we investigate the post-correction of predictions generated from classifiers learned with noisy labels. The reasons are two-fold. Firstly, this approach can work directly with the trained models to save computational resources. Secondly, it can be applied on top of other noisy label correction techniques to achieve further improvements. To handle this problem, we appeal to deep generative approaches that are capable of uncertainty estimation. Our model posits that label noise arises from a stochastic shift in the latent variable, providing a more robust and beneficial means for noisy learning. We develop both unsupervised and semi-supervised learning methods for our model. The extensive empirical study presents solid evidence that our approach consistently improves the independent models and performs better than a number of existing methods across various noisy label settings. Moreover, a comprehensive empirical analysis of the proposed method is carried out to validate its robustness, including sensitivity analysis and an ablation study, among other elements.

Paper Structure

This paper contains 28 sections, 9 theorems, 60 equations, 5 figures, 4 tables, 1 algorithm.

Key Result

Theorem 1

Minimizing the objective in Eq. (unsup_obj_summarized) is equivalent to minimizing an upper bound on the expected KL-divergence between the marginalized true posterior on $\mathbf{z}$ and the proposed marginalized one. That is, for a given data point $\mathbf{x}$, we obtain

Figures (5)

  • Figure 1: The generation process of noisily labelled data. A gray background indicates that the variable is observed.
  • Figure 2: Training analysis of extension of NPC. (a) Extension of NPC on VOC07. The base model is chosen to be $\hbox{MLP}_r$. The noise setting is Pair with 0.3 NR. (b) Extension of NPC on Tomato. The base model is chosen to be ADDGCN. The noise setting is Sym with 0.5 NR.
  • Figure 3: Unsupervised and semi-supervised learning results for VOC12 and COCO. The standard deviations are presented as the error bars.
  • Figure 4: Empirical results of the sensitivity analysis. (a) Sensitivity analysis of choices over $\nu_0$ and $\nu$ on VOC07. The base model is chosen to be HLC. The noise setting is Pair with a rate of 0.4. (b) Sensitivity analysis of choices over $\nu_0$ and $\nu$ on Tomato. The base model is chosen to be $\hbox{MLP}_v$. The noise setting is Sym with a rate of 0.3.
  • Figure 5: GradCAM examples on VOC07 with $\hbox{MLP}_{r}$ and LSNPC using $\hbox{MLP}_{r}$ as the pre-trained classifier. Ground truth labels are shown at the top of each column. Labels highlighted in blue were missed by $\hbox{MLP}_{r}$ and labels highlighted in red were misclassified by it; both were then corrected by LSNPC.

Theorems & Definitions (17)

  • Theorem 1
  • Theorem 2
  • proof
  • proof
  • Theorem 3
  • Corollary 1
  • proof
  • Lemma 1
  • proof
  • Proposition 1
  • ...and 7 more