Detecting Systematic Weaknesses in Vision Models along Predefined Human-Understandable Dimensions

Sujan Sai Gannamaneni, Rohil Prakash Rao, Michael Mock, Maram Akila, Stefan Wrobel

TL;DR

This work presents a Systematic Weakness Detector (SWD) that identifies human-understandable, safety-relevant weaknesses in vision models by generating semantic metadata with foundation models (e.g., CLIP) and performing a noise-aware slice discovery via SliceLine. A Bayesian framework corrects for labeling noise in the metadata, producing corrected slice errors that enable more accurate identification of weak data slices aligned with predefined ODD dimensions. Across synthetic and real-world vision tasks (including CelebA and pedestrian detection datasets), SWD-3 (the noise-corrected fusion) reliably recovers ground-truth weaknesses and yields more actionable insights than metadata-free SOTA methods like DOMINO, Spotlight, and SVM-FD. The approach supports safety arguments and targeted data acquisition for retraining, contributing to trustworthy AI in safety-critical domains like autonomous driving and surveillance.

Abstract

Slice discovery methods (SDMs) are prominent algorithms for finding systematic weaknesses in DNNs. They identify the top-k semantically coherent slices (subsets) of data on which a DNN-under-test performs poorly. To be directly useful, slices should be aligned with human-understandable and relevant dimensions, which are defined, for example, by safety and domain experts as part of the operational design domain (ODD). While SDMs can be applied effectively to structured data, their application to image data is complicated by the lack of semantic metadata. To address these issues, we present an algorithm that combines foundation models for zero-shot image classification, used to generate semantic metadata, with combinatorial search methods that find systematic weaknesses in images. In contrast to existing approaches, ours identifies weak slices that are in line with predefined human-understandable dimensions. Because the algorithm relies on foundation models, its intermediate and final results may not always be exact. Therefore, we include an approach to address the impact of noisy metadata. We validate our algorithm on both synthetic and real-world datasets, demonstrating its ability to recover human-understandable systematic weaknesses. Furthermore, using our approach, we identify systematic weaknesses of multiple pre-trained and publicly available state-of-the-art computer vision DNNs.

Paper Structure

This paper contains 23 sections, 23 equations, 14 figures, 9 tables, 1 algorithm.

Figures (14)

  • Figure 1: Our algorithm for finding systematic weaknesses of CV models. Given a model, a test dataset, and an ODD description for the objects we are interested in, we build a database of object-level performance and metadata in a structured format. Weak slice discovery methods are then applied to this database to identify top-k weak slices of the model.
  • Figure 2: Based on labeling quality, we divide the generated datasets into (i) good quality (left), (ii) medium quality (middle), and (iii) bad quality (right). For each of the three cases, we show the spread of error in GT ($p(e|\mathcal{S})$), Observed ($p(e|\mathcal{C})$), and Corrected ($p(e|\mathcal{S})$). The second row shows the corresponding precision and recall of SWD-1, SWD-2, and SWD-3. Precision and recall in this figure evaluate weak slice recovery and are not related to labeling quality. The legend for both rows is shown in the left-hand figures.
  • Figure 3: Left: The hierarchy level-0 (Ridnik et al., 2021) predictions of the pre-trained ViT-B-16 model on the full CelebA dataset, converted into a binary classification problem. While the majority of the predictions are correct, there is a non-trivial subset of images with systematic errors due to label overlap issues. Right: Top-1 weak slice, identified by SWD-3, of a ViT-B-16 classification model trained on ImageNet21k and evaluated on the full CelebA dataset. The statistics provide a quantitative evaluation of the entire slice. For qualitative evaluation, we provide some sample images from the slice.
  • Figure 4: Faster R-CNN
  • Figure 5: SeTR
  • ...and 9 more figures
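
The pipeline in Figure 1 ends with a slice search over a structured table of object-level metadata and errors. The following is a minimal sketch of that last step, assuming a brute-force enumeration rather than SliceLine itself. The function name `top_k_weak_slices`, the attributes `clothing` and `time`, and the toy rows are hypothetical and chosen only for illustration; in the paper's setting the metadata columns would come from zero-shot classification and may be noisy.

```python
from collections import defaultdict
from itertools import combinations


def top_k_weak_slices(rows, attributes, k=3, max_level=2, min_support=2):
    """Rank slices (conjunctions of attribute=value pairs) by error rate.

    rows: list of dicts holding one value per attribute plus an
          'error' flag (1 = the model failed on this object, 0 = correct).
    Brute-force stand-in for a slice finder; not the SliceLine algorithm.
    """
    stats = defaultdict(lambda: [0, 0])  # slice -> [n_samples, n_errors]
    for row in rows:
        for level in range(1, max_level + 1):
            for attrs in combinations(attributes, level):
                key = tuple((a, row[a]) for a in attrs)
                stats[key][0] += 1
                stats[key][1] += row["error"]

    # Keep only slices with enough samples, then score by error rate.
    scored = [
        (errors / n, n, key)
        for key, (n, errors) in stats.items()
        if n >= min_support
    ]
    # Highest error rate first; ties broken by larger support, then by name.
    scored.sort(key=lambda item: (-item[0], -item[1], item[2]))
    return scored[:k]


# Hypothetical object-level table: metadata columns plus per-object error.
rows = [
    {"clothing": "dark", "time": "night", "error": 1},
    {"clothing": "dark", "time": "night", "error": 1},
    {"clothing": "dark", "time": "day", "error": 0},
    {"clothing": "bright", "time": "night", "error": 0},
    {"clothing": "bright", "time": "day", "error": 0},
    {"clothing": "bright", "time": "day", "error": 0},
]

weak = top_k_weak_slices(rows, ["clothing", "time"])
for error_rate, support, slice_def in weak:
    print(f"{error_rate:.2f}  n={support}  {dict(slice_def)}")
```

Exhaustive enumeration is only practical for a handful of attributes; a dedicated method such as SliceLine prunes the search space, and this sketch omits the noise correction described in the TL;DR.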