cVIL: Class-Centric Visual Interactive Labeling

Matthias Matt, Matthias Zeppelzauer, Manuela Waldner

TL;DR

This work tackles the challenge of annotating large unlabeled image collections where class overlap and data scale hinder traditional instance-centric labeling and active learning. It proposes cVIL, a class-centric visual interactive labeling framework built on three property measures: $\text{Min-Margin} = 1 - (p_{\max} - p_{\text{second max}})$, $\text{Eccentricity} = \| (x - m) \odot \mathrm{Var}^{-1/2} \|_2$, and $\text{Disagreement} = \frac{1}{|N_k(x)|} \sum_{y \in N_k(x)} \mathrm{JSD}(p(x), p(y))$ with $k=20$. These measures are coupled with KDE-based class distributions to support both instance labeling and batch labeling. In simulations, cVIL with batch labeling can outperform traditional active learning; in a user study, the class-centric interface yields higher labeling accuracy and user preference than an instance-centric baseline, especially on complex datasets. Retraining is fast, and updates propagate to the visualizations to keep users informed. The work contributes a scalable, interactive labeling workflow, provides evidence for the effectiveness of class-centric uncertainty measures, and outlines future directions for optimizing measurement strategies and evaluating on larger, more diverse data regimes. Overall, cVIL offers a practical approach to QA4ML-ready labeling for large-scale image datasets with significant class overlap.
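
To make the three property measures concrete, the following is a minimal pure-Python sketch of the formulas as stated above, not the authors' implementation. Several details are assumptions made for illustration: $m$ and $Var$ are read as a per-dimension class mean and variance, $N_k(x)$ as the $k$ nearest neighbors by Euclidean distance in the embedding space, and the Jensen-Shannon divergence uses base-2 logarithms. All function names are hypothetical.

```python
import math


def min_margin(probs):
    """Min-Margin = 1 - (p_max - p_second_max); higher means more uncertain."""
    top, second = sorted(probs, reverse=True)[:2]
    return 1.0 - (top - second)


def eccentricity(x, mean, var):
    """L2 norm of (x - m) scaled element-wise by Var^(-1/2).

    Assumption: `mean` and `var` are per-dimension statistics of one class.
    """
    return math.sqrt(sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var)))


def _kl(p, q):
    # Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0.
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(p, q):
    """Jensen-Shannon divergence (base 2 is an assumption, giving a range of [0, 1])."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * _kl(p, m) + 0.5 * _kl(q, m)


def disagreement(index, embeddings, probs, k=20):
    """Mean JSD between the prediction for one instance and those of its k nearest neighbors.

    Assumption: neighbors are found by brute-force Euclidean distance.
    """
    x = embeddings[index]
    others = [j for j in range(len(embeddings)) if j != index]
    others.sort(key=lambda j: math.dist(x, embeddings[j]))
    neighbors = others[:k]
    if not neighbors:
        return 0.0
    return sum(jsd(probs[index], probs[j]) for j in neighbors) / len(neighbors)
```

For example, `min_margin([0.5, 0.5])` is `1.0` (maximally uncertain between two classes), and `jsd(p, p)` is `0.0` for any distribution `p`. The brute-force neighbor search is quadratic in the number of instances; for large collections a spatial index would be the more natural choice.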

Abstract

We present cVIL, a class-centric approach to visual interactive labeling, which facilitates human annotation of large and complex image data sets. cVIL uses different property measures to support instance labeling of difficult instances and batch labeling to quickly label easy instances. Simulated experiments reveal that cVIL with batch labeling can outperform traditional labeling approaches based on active learning. In a user study, cVIL led to better accuracy and higher user preference compared to a traditional instance-based visual interactive labeling approach based on 2D scatterplots.

Paper Structure (10 sections, 6 figures)

Figures (6)

  • Figure 1: t-SNE projection of high dimensional embeddings found by representation learning, used for the iVIL approach in our user study.
  • Figure 2: cVIL with Min-Margin is on par with AL for instance labeling (left). For batch labeling, cVIL outperforms AL independently of the property measure (right).
  • Figure 3: Final accuracy.
  • Figure 4: Completion time.
  • Figure 5: Final accuracy by number of instance labels.
  • ...and 1 more figure