ANNE: Adaptive Nearest Neighbors and Eigenvector-based Sample Selection for Robust Learning with Noisy Labels

Filipe R. Cordeiro, Gustavo Carneiro

TL;DR

This paper introduces the Adaptive Nearest Neighbors and Eigenvector-based (ANNE) sample selection methodology, a novel approach that integrates loss-based sampling with the feature-based sampling methods FINE and Adaptive KNN to optimize performance across a wide range of noise rate scenarios.

Abstract

An important stage of most state-of-the-art (SOTA) noisy-label learning methods is a sample selection procedure that classifies samples from the noisy-label training set into noisy-label or clean-label subsets. Sample selection typically follows one of two approaches: loss-based sampling, where high-loss samples are considered to have noisy labels, or feature-based sampling, where samples from the same class tend to cluster together in the feature space and noisy-label samples are identified as anomalies within those clusters. Empirically, loss-based sampling is robust to a wide range of noise rates, while feature-based sampling tends to work effectively in particular scenarios, e.g., filtering of noisy instances via their eigenvectors (FINE) is more robust at low noise rates, and K nearest neighbor (KNN) sampling better mitigates high-noise-rate problems. This paper introduces the Adaptive Nearest Neighbors and Eigenvector-based (ANNE) sample selection methodology, a novel approach that integrates loss-based sampling with the feature-based sampling methods FINE and Adaptive KNN to optimize performance across a wide range of noise rate scenarios. ANNE achieves this integration by first partitioning the training set into high-loss and low-loss subsets using loss-based sampling. Then, within the low-loss subset, sample selection is performed using FINE, while the high-loss subset uses Adaptive KNN. We integrate ANNE into the SOTA noisy-label learning method SSR+ and test it on CIFAR-10/-100 (with symmetric, asymmetric and instance-dependent noise), Webvision and ANIMAL-10, where our method shows better accuracy than the SOTA in most experiments, with a competitive training time.
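The routing described in the abstract (split by loss, then apply a different selector to each subset) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `route_by_loss`, the fixed `loss_threshold`, and the two selector callables are hypothetical stand-ins. The abstract only says the split uses loss-based sampling, so the plain threshold here is an assumption, and the callables stand in for FINE and Adaptive KNN.

```python
import numpy as np

def route_by_loss(losses, loss_threshold, select_low_loss, select_high_loss):
    """Return a boolean mask marking the samples kept as clean-label.

    Hypothetical sketch: a fixed threshold replaces whatever loss-based
    split the paper actually uses.
    """
    losses = np.asarray(losses, dtype=float)
    low_idx = np.flatnonzero(losses <= loss_threshold)   # low-loss subset
    high_idx = np.flatnonzero(losses > loss_threshold)   # high-loss subset

    clean = np.zeros(len(losses), dtype=bool)
    # Each selector receives sample indices and returns one boolean per index.
    clean[low_idx] = select_low_loss(low_idx)      # stand-in for FINE
    clean[high_idx] = select_high_loss(high_idx)   # stand-in for Adaptive KNN
    return clean
```

The point of the design is that the two subsets never share a selector: whichever method is plugged in for the low-loss subset only ever sees low-loss samples, and likewise for the high-loss subset.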

Paper Structure

This paper contains 18 sections, 5 equations, 6 figures, 9 tables, 1 algorithm.

Figures (6)

  • Figure 1: Mean classification accuracy, precision (P), and recall (R) for detecting clean and noisy-label samples with SSR+ (using KNN) [ssr] and FINE [fine], over the last 10 epochs (out of 300) on CIFAR-100 with symmetric noise rates of 20% (a) and 80% (b). The top row shows the P and R of the clean or noisy-label sample detection, together with the CIFAR-100 classification accuracy for SSR+ and FINE. The bottom row shows the P and R of the clean or noisy-label classification for the High Confidence Set (HCS) and Low Confidence Set (LCS), formed with the small-loss selection from DivideMix [dividemix]. On the top row, note that FINE works better (higher accuracy, P and R) in the low-noise-rate scenario, while SSR+ is better in the high-noise-rate scenario (higher accuracy and P, but comparable R). On the bottom row, FINE shows better sample selection results for the HCS samples of both datasets, while SSR+ is better for the LCS samples from both datasets.
  • Figure 2: Our ANNE sample selection strategy starts with a sample relabeling approach [ssr], followed by a sample selection stage that first divides the training set into high-confidence ($\mathcal{D}_{HCS}$) and low-confidence ($\mathcal{D}_{LCS}$) subsets, according to the classification probability of the samples. Then, the samples from $\mathcal{D}_{LCS}$ are divided into clean or noisy-label using an adaptive KNN approach, while samples from $\mathcal{D}_{HCS}$ are divided into clean or noisy-label using an eigen-decomposition technique. As shown in the rightmost figure, our proposed adaptive KNN automatically changes the range used to find the $K$ nearest neighbors, depending on the density of the sample in the feature space.
  • Figure 3: Our AKNN strategy automatically defines the K value of the K-nearest neighbour based on local density. For each sample $i$, we estimate the number of nearest neighbours $K_i$ based on the number of neighbours with cosine similarity above threshold $\omega_i$. If a low density region is identified (i.e. $K_i < K^{min}$) the threshold $\omega_i$ is reduced by $\Delta$. This process continues until the condition $K_i \ge K^{min}$ is achieved. For samples that initially are already in high-density regions, this process is skipped.
  • Figure 4: Evaluation of SSR+ [ssr] using either the original KNN (with $K \in \{5, 10, 30, 50, 100, 200\}$) or our proposed AKNN on CIFAR-100 with 50% symmetric noise: (a) average $K$-value through training epochs; (b) test accuracy results during training.
  • Figure 5: Evaluation of sample selection size, clean label selection accuracy (clean rate) and accuracy on CIFAR-100 with 40% instance-dependent noise. The graph compares the performance of ANNE (with parameter $\gamma_e \in \{ 0.1,0.3 \}$) with SSR+ [ssr] using the sample selection strategies of KNN (with parameter $K \in \{ 50,100,200,300 \}$), FINE (with parameter $\gamma_e = 0.3$), Small Loss, and AKNN (with parameter $\gamma_e \in \{ 0.8,0.9 \}$). The label on each marker refers to the size of the subset from the training set classified as clean label.
  • ...and 1 more figure
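The threshold-relaxation loop described in the Figure 3 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `adaptive_k`, the default values of `omega_init`, `delta` and `k_min`, and the lower bound of -1 on the threshold (added so the loop always terminates) are all hypothetical. It only computes the per-sample neighbour count $K_i$ and final threshold $\omega_i$, and does not perform the subsequent clean/noisy decision.

```python
import numpy as np

def adaptive_k(features, omega_init=0.99, delta=0.01, k_min=3):
    """Return per-sample neighbour counts K_i and final thresholds omega_i.

    Assumed defaults; the caption does not state the values the paper uses.
    """
    features = np.asarray(features, dtype=float)
    # L2-normalise so that a dot product equals cosine similarity.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    z = features / np.clip(norms, 1e-12, None)
    sim = z @ z.T
    np.fill_diagonal(sim, -np.inf)  # a sample is not its own neighbour

    n = len(z)
    k = np.zeros(n, dtype=int)
    omega = np.full(n, omega_init, dtype=float)
    for i in range(n):
        k[i] = int(np.sum(sim[i] > omega[i]))
        # Low-density region: lower the threshold by delta until at least
        # k_min neighbours qualify (or the threshold passes -1, the
        # smallest possible cosine similarity).
        while k[i] < k_min and omega[i] > -1.0:
            omega[i] -= delta
            k[i] = int(np.sum(sim[i] > omega[i]))
    return k, omega
```

Samples that already have at least `k_min` neighbours above the initial threshold never enter the loop, which matches the caption's note that the process is skipped in high-density regions.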