Error Slice Discovery via Manifold Compactness

Han Yu, Hao Zou, Jiashuo Liu, Renzhe Xu, Yue He, Xingxuan Zhang, Peng Cui

TL;DR

This work tackles the challenge of identifying semantically coherent error slices without relying on predefined metadata. It introduces manifold compactness as a geometry-based coherence metric and develops MCSD, an optimization framework that jointly maximizes risk reduction and coherence via a quadratic program on a kNN-manifold. Through benchmark (Dcbench) results and diverse case studies (CelebA, CheXpert, BDD100K, CivilComments), MCSD consistently yields more coherent error slices than prior methods and demonstrates applicability across vision and language tasks. The approach offers a practical, metadata-free pathway to understand and improve model behavior in subpopulations.

Abstract

Despite the strong performance of deep learning models in many areas, they still make mistakes and underperform on certain subsets of data, i.e., error slices. Given a trained model, it is important to identify its semantically coherent error slices that are easy to interpret; this is referred to as the error slice discovery problem. However, there is no proper metric of slice coherence that does not rely on extra information such as predefined slice labels. Current evaluation of slice coherence requires access to predefined slices formulated from metadata such as attributes or subclasses. Its validity relies heavily on the quality and abundance of that metadata, so some possible patterns could be ignored. Moreover, current algorithms cannot directly incorporate a coherence constraint into their optimization objective due to the absence of an explicit coherence metric, which could hinder their effectiveness. In this paper, we propose manifold compactness, a coherence metric that does not rely on extra information and instead incorporates the geometry of the data into its design; experiments on typical datasets empirically validate the metric. We then develop Manifold Compactness based error Slice Discovery (MCSD), a novel algorithm that directly treats risk and coherence as the optimization objective and can be flexibly applied to models for various tasks. Extensive experiments on the benchmark and case studies on other typical datasets demonstrate the superiority of MCSD.
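
To make the idea of a geometry-based coherence score concrete, the following is a minimal sketch of one way such a score could be computed on a kNN graph. It is not the paper's Definition 3.1, which is not reproduced on this page. The proxy used here (within-slice kNN edges per slice member), the function names `knn_adjacency` and `graph_compactness`, the synthetic data, and the choice `k=5` are all assumptions made for illustration.

```python
import numpy as np

def knn_adjacency(X, k):
    """Symmetric 0/1 adjacency matrix of a Euclidean k-nearest-neighbor graph."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbor
    nbrs = np.argsort(d2, axis=1)[:, :k]
    A = np.zeros((n, n), dtype=int)
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1
    return np.maximum(A, A.T)  # keep an edge if either endpoint selected it

def graph_compactness(A, members):
    """Assumed proxy (not the paper's definition): within-slice kNN edge
    endpoints divided by slice size, i.e. the average in-slice degree."""
    idx = np.asarray(members)
    if idx.size == 0:
        return 0.0
    return A[np.ix_(idx, idx)].sum() / idx.size

rng = np.random.default_rng(0)
# Two well-separated synthetic clusters standing in for model embeddings.
X = np.vstack([
    rng.normal(0.0, 0.1, size=(30, 2)),
    rng.normal(5.0, 0.1, size=(30, 2)),
])
A = knn_adjacency(X, k=5)

coherent_slice = np.arange(0, 30)  # one whole cluster
mixed_slice = np.concatenate([np.arange(0, 15), np.arange(30, 45)])  # half of each

coherent_score = graph_compactness(A, coherent_slice)
mixed_score = graph_compactness(A, mixed_slice)
print("coherent slice:", coherent_score)
print("mixed slice:   ", mixed_score)
```

In this toy setup a slice that covers one whole cluster keeps all of its neighborhood edges, while a slice of the same size split across both clusters loses many of them, so the proxy ranks the first slice as more coherent. A score of this kind needs only the embeddings, not slice labels or metadata.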

Paper Structure

This paper contains 40 sections, 3 theorems, 5 equations, 25 figures, 11 tables, 1 algorithm.

Key Result

Lemma A.1

For the inequality constraint $\sum_{i=1}^n w_i \leq \alpha n$ in the quadratic program (eq:qp), equality can be attained by the solution of that program.

Figures (25)

  • Figure 1: Illustration of MCSD. Blue points are correctly classified by the given trained model, while red ones are wrongly classified. The trained model achieves good overall accuracy but exhibits a high error in a certain slice.
  • Figure 2: Category "Blond Hair" of CelebA. Visualization of t-SNE and UMAP (manifold-based dimension reduction) shows much clearer clustering structures than that of PCA (mainly preserving Euclidean distances between data points). Thus it could be better to measure coherence in the metric space of a manifold than using metrics directly calculated in Euclidean space.
  • Figure 3: Percentage increase in manifold compactness ("Comp.") and percentage decrease in variance ("Var.") from coarse-grained slices to fine-grained ones in CelebA. Manifold compactness always increases from coarse-grained to fine-grained slices. In some cases, however, variance fails to decrease from coarse-grained to fine-grained slices as expected; these cases are marked with red arrows. This suggests that manifold compactness captures semantic coherence better than variance does.
  • Figure 4: Images randomly sampled from slices of CelebA. Left five columns are results of the category "Blond Hair". Right five columns are results of the category "Not Blond Hair". We can see that MCSD is capable of finding error slices that are more coherent than others.
  • Figure 5: Images randomly sampled from slices of CheXpert. Left five columns are results of the category "Ill". Right five columns are results of the category "Healthy". We can see that MCSD is capable of finding error slices that are more coherent than others.
  • ...and 20 more figures

Theorems & Definitions (4)

  • Definition 3.1: Manifold Compactness
  • Lemma A.1
  • Proposition A.2
  • Proposition A.3