Selfish Evolution: Making Discoveries in Extreme Label Noise with the Help of Overfitting Dynamics

Nima Sedaghat, Tanawan Chatchadanoraset, Colin Orion Chandler, Ashish Mahabal, Maryam Eslami

TL;DR

This work addresses the challenge of scarce and noisy labels in scientific domains by introducing Selfish Evolution, a method that uses per-sample overfitting dynamics to reveal noise patterns and recover true labels. It constructs evolution cubes $\mathcal{E}_i=\{y_i^t\}_{t=1}^T$ by recording the model output $y_i^t$ at each of $T$ steps while the model overfits to sample $i$, and trains a secondary Evolution-to-Label (E2L) mapper $g(\mathcal{E};\phi)$ to predict corrected labels. The procedure can optionally be run in a closed loop to gradually denoise the dataset. The approach makes no assumptions about the state of the network at the point of intervention and is designed to work with image-like labels. It is demonstrated on supernova detection in DESC/DC2 data, where $817$ missed events were recovered, and on MNIST, where label correction under high noise is validated against competitive baselines. The combination of evolution-based label correction and closed-loop refinement is of practical use for domain-specific labeling bottlenecks and has potential for adaptation to other imaging tasks.

Abstract

Motivated by the scarcity of proper labels in an astrophysical application, we have developed a novel technique, called Selfish Evolution, which allows for the detection and correction of corrupted labels in a weakly supervised fashion. Unlike methods based on early stopping, we let the model train on the noisy dataset. Only then do we intervene and allow the model to overfit to individual samples. The "evolution" of the model during this process reveals patterns with enough information about the noisiness of the label, as well as its correct version. We train a secondary network on these spatiotemporal "evolution cubes" to correct potentially corrupted labels. We incorporate the technique in a closed-loop fashion, allowing for automatic convergence towards a mostly clean dataset, without presumptions about the state of the network in which we intervene. We evaluate on the main task of the Supernova-hunting dataset but also demonstrate efficiency on the more standard MNIST dataset.
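
The per-sample overfitting step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy two-parameter linear model with scalar labels and plain gradient descent on squared error, whereas the paper works with image-like labels and spatiotemporal cubes. The names `predict`, `sgd_step`, and `evolution_trace` are hypothetical. The sketch only shows the idea of recording the outputs $y_i^t$ for $t=1,\dots,T$ while a copy of the trained model overfits to a single sample.

```python
def predict(params, x):
    # Toy model (an assumption for illustration): y = w * x + b.
    w, b = params
    return w * x + b


def sgd_step(params, x, y, lr):
    # One gradient-descent step on the squared error of a single (x, y) pair.
    w, b = params
    err = predict(params, x) - y
    return (w - lr * 2 * err * x, b - lr * 2 * err)


def evolution_trace(params, x, y_label, steps=20, lr=0.05):
    """Overfit a local copy of the model to one sample and record its output
    after every step. The shared parameters are left untouched because the
    parameter tuple is immutable and only the local name is rebound."""
    local = params
    trace = []
    for _ in range(steps):
        local = sgd_step(local, x, y_label, lr)
        trace.append(predict(local, x))
    return trace


# Pretend the model was already trained on the noisy dataset and learned y = 2x.
trained = (2.0, 0.0)

clean = evolution_trace(trained, x=1.0, y_label=2.0)  # label agrees with the model
noisy = evolution_trace(trained, x=1.0, y_label=5.0)  # corrupted label
```

In this toy setting the trace for the consistent label stays flat, while the trace for the corrupted label starts near the model's original prediction and moves toward the corrupted value. That difference in shape is the kind of signal a secondary mapper could learn from.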

Paper Structure

This paper contains 17 sections, 10 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Capturing the evolution of the model output during individual overfitting processes results in data volumes encapsulating a good amount of information about the presence of label noise, and potentially the noise-free label. Here we depict an exemplar "evolution cube" for a special approach to the task of supernova detection in which labels are 2D images. The third axis, representing the evolution steps, is aligned with the (dis)appearing objects in the above illustration.
  • Figure 2: Illustration of various stages of a complete super-epoch. At step 1 (bottom left), the main model is trained on the training subset $\mathcal{D}$ with the original noised labels. In step 2 (top left), individual samples from the gold subset $\mathcal{G}$ are used to train the model to generate evolution cubes. During step 3 (top right), the E2L model is trained from scratch to learn to map evolution cubes of this super-epoch to clean labels. Finally, at step 4 (dashed blue arrow), the main subset $\mathcal{D}$ is passed through the main and E2L models to give a cleaned-up version of the labels -- evolution cubes are generated on the fly.
  • Figure 3: Image-based redefinition of the task of supernova detection. On the left, two images of the same region of the sky are passed to the network, and the output is defined as an image of the same size, containing only the reconstructed desired object (sedaghat2018effective).
  • Figure 4: Results of denoising on one exemplar pair of inputs. The top row is the full image crop, while in the second row, we zoom in to have a clearer view of the target object. "noised target" is the blank target we have trained the primary network on. "denoised target" is the output of our algorithm, where the correct truth label is recovered.
  • Figure 5: Exemplar illustration of a down-sampled, unrolled, evolution cube. The first half (top row) is the first half of the evolution, where the network tries to overfit the support batch. In the second half, overfitting happens towards the single noised target. The race between the two overfitting schemes reveals subtle information about the clean label, which is exploited by our E2L model later on.
  • ...and 2 more figures
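
The four stages in the Figure 2 caption can be sketched as control flow. This is a hedged outline under stated assumptions, not the paper's code: `run_super_epoch` and its three callables (`train_main`, `make_cube`, `fit_e2l`) are hypothetical placeholders, and the sketch assumes each gold sample carries both a noisy and a clean label.

```python
def run_super_epoch(main_model, noisy_set, gold_set,
                    train_main, make_cube, fit_e2l):
    """One super-epoch following the four steps of the Figure 2 caption.

    noisy_set: list of (x, noisy_label) pairs, the subset D.
    gold_set:  list of (x, noisy_label, clean_label) triples, the subset G
               (holding both labels per gold sample is an assumption).
    The three callables are hypothetical placeholders supplied by the caller.
    """
    # Step 1: train the main model on D with the original noisy labels.
    main_model = train_main(main_model, noisy_set)

    # Step 2: overfit to individual gold samples to generate evolution cubes.
    gold_cubes = [make_cube(main_model, x, y_noisy)
                  for x, y_noisy, _ in gold_set]
    gold_clean = [y_clean for _, _, y_clean in gold_set]

    # Step 3: fit a fresh E2L mapper on (cube, clean label) pairs.
    e2l = fit_e2l(gold_cubes, gold_clean)

    # Step 4: relabel D, generating each cube on the fly.
    cleaned = [(x, e2l(make_cube(main_model, x, y))) for x, y in noisy_set]
    return main_model, cleaned
```

A closed loop would call `run_super_epoch` repeatedly, feeding the returned `cleaned` list back in as the next `noisy_set`. How many iterations to run and when to stop are left open here.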