A Priori Generalizability Estimate for a CNN

Cito Balsells, Beatrice Riviere, David Fuentes

TL;DR

This work estimates CNN generalizability by expressing the network as a per-input matrix $A[x]$ and computing its truncated SVD using the adjoint $A^T[x]$, which yields a low-rank approximation of the CNN. It introduces the Adjoint CNN $\mathcal{G}_\theta$ and two projection-based metrics, the Right Projection Ratio $\mathrm{RPR}$ and the Left Projection Ratio $\mathrm{LPR}$, which measure how well an input (for $\mathrm{RPR}$) or a label (for $\mathrm{LPR}$) is captured by the computed right or left singular vectors, and hence how close it lies to the CNN's nullspace or range. The approach is evaluated on MNIST classification and BraTS brain tumor segmentation: both ratios reveal class imbalance on MNIST, and $\mathrm{RPR}$, which requires only unlabeled data, correlates with segmentation performance. These results suggest a practical diagnostic tool for sample-level performance estimation, with potential use in active learning and bias detection, grounded in a theoretical link between adjoint operators and CNN structure.
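To make the idea of a truncated SVD computed through the adjoint concrete, here is a minimal sketch rather than the authors' implementation. It assumes a tiny bias-free two-layer ReLU network with dense weights `W1` and `W2` (hypothetical stand-ins for convolutional layers), so that the per-input matrix is $A[x] = W_2\,\mathrm{diag}(\text{mask}(x))\,W_1$. The helper names `make_operator` and `truncated_svd`, and the choice of subspace iteration as the solver, are illustrative assumptions.

```python
import numpy as np

def make_operator(W1, W2, x):
    """Products with A[x] = W2 @ diag(mask) @ W1 and with its transpose."""
    mask = (W1 @ x > 0).astype(float)  # ReLU activation pattern frozen at x

    def matvec(v):   # forward product A[x] v
        return W2 @ (mask * (W1 @ v))

    def rmatvec(u):  # adjoint product A[x]^T u
        return W1.T @ (mask * (W2.T @ u))

    return matvec, rmatvec

def apply_columns(op, M):
    """Apply a vector operator to each column of M."""
    return np.column_stack([op(M[:, j]) for j in range(M.shape[1])])

def truncated_svd(matvec, rmatvec, n_in, k, n_iter=100, seed=0):
    """Approximate top-k singular triplets using only A v and A^T u products."""
    rng = np.random.default_rng(seed)
    V, _ = np.linalg.qr(rng.standard_normal((n_in, k)))
    for _ in range(n_iter):  # subspace iteration on A^T A
        U, _ = np.linalg.qr(apply_columns(matvec, V))
        V, _ = np.linalg.qr(apply_columns(rmatvec, U))
    # Rayleigh-Ritz step: small dense SVD of A restricted to span(V)
    B = apply_columns(matvec, V)
    Ub, s, Wt = np.linalg.svd(B, full_matrices=False)
    return Ub, s, V @ Wt.T

# Hypothetical toy sizes: 12 inputs, 16 hidden units, 6 outputs.
rng = np.random.default_rng(1)
W1 = rng.standard_normal((16, 12))
W2 = rng.standard_normal((6, 16))
x = rng.standard_normal(12)

matvec, rmatvec = make_operator(W1, W2, x)
U, s, V = truncated_svd(matvec, rmatvec, n_in=12, k=3)
```

Because the solver only ever calls `matvec` and `rmatvec`, the dense matrix $A[x]$ is never formed; in a real CNN the adjoint product would presumably come from an adjoint network or automatic differentiation rather than explicit weight transposes.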

Abstract

We formulate truncated singular value decompositions of entire convolutional neural networks. We demonstrate the computed left and right singular vectors are useful in identifying which images the convolutional neural network is likely to perform poorly on. To create this diagnostic tool, we define two metrics: the Right Projection Ratio and the Left Projection Ratio. The Right (Left) Projection Ratio evaluates the fidelity of the projection of an image (label) onto the computed right (left) singular vectors. We observe that both ratios are able to identify the presence of class imbalance for an image classification problem. Additionally, the Right Projection Ratio, which only requires unlabeled data, is found to be correlated to the model's performance when applied to image segmentation. This suggests the Right Projection Ratio could be a useful metric to estimate how likely the model is to perform well on a sample.
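The two ratios can be illustrated with a short sketch. The abstract does not state the formulas, so the definition used here is an assumption: each ratio is taken to be the norm of the projection onto the top-$k$ singular vectors divided by the norm of the original vector. The function name `projection_ratio` and the random matrix `A`, standing in for the per-input matrix of a CNN, are hypothetical.

```python
import numpy as np

def projection_ratio(basis, vec):
    """Fraction of vec's norm captured by the orthonormal columns of basis.

    Assumed definition: ||basis^T vec|| / ||vec||, a value in [0, 1].
    """
    coeffs = basis.T @ vec
    return float(np.linalg.norm(coeffs) / np.linalg.norm(vec))

# Hypothetical stand-in for the per-input matrix: 9 inputs, 4 outputs.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 9))
U, s, Vt = np.linalg.svd(A)  # full SVD: U is 4x4, Vt is 9x9

k = 2
V_k = Vt[:k].T   # top-k right singular vectors (input space)
U_k = U[:, :k]   # top-k left singular vectors (output space)

x_null = Vt[-1]               # lies in the nullspace of A (rank <= 4 < 9)
x_mixed = V_k[:, 0] + x_null  # half of the energy is in the nullspace

rpr_null = projection_ratio(V_k, x_null)    # input invisible to A
rpr_mixed = projection_ratio(V_k, x_mixed)  # partially captured input
lpr_output = projection_ratio(U_k, A @ x_mixed)  # output in span of U_k
```

Under this assumed definition, a ratio near 0 means the vector is almost orthogonal to the retained singular vectors, which matches the reading in the figure captions that a low RPR places an image close to the nullspace.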

Paper Structure

This paper contains 10 sections, 3 theorems, 29 equations, 12 figures, 2 tables.

Key Result

Lemma 1

The sum-pooling operator $\Pi_\downarrow: \mathbb{R}^{m\times n\times d}\rightarrow\mathbb{R}^{\frac{m}{2}\times \frac{n}{2} \times \frac{d}{2}}$ can be expressed as an input-independent matrix $A_\downarrow \in\mathbb{R}^{\frac{mnd}{8}\times mnd}$. Let $e_q\in\mathbb{R}^{mnd}$ represent the $q$th c where,
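The dimensions in Lemma 1 can be checked numerically. The sketch below is an illustration, not the construction from the paper: it assumes non-overlapping $2\times2\times2$ windows (consistent with each dimension being halved) and a row-major (C-order) flattening of the volume. The function names `sum_pool_matrix` and `sum_pool_direct` are hypothetical.

```python
import numpy as np

def sum_pool_matrix(m, n, d):
    """Matrix of 2x2x2 sum pooling on a C-order flattened (m, n, d) array."""
    if m % 2 or n % 2 or d % 2:
        raise ValueError("m, n and d must all be even")
    out_shape = (m // 2, n // 2, d // 2)
    A = np.zeros((m * n * d // 8, m * n * d))
    for i in range(m):
        for j in range(n):
            for l in range(d):
                col = np.ravel_multi_index((i, j, l), (m, n, d))
                row = np.ravel_multi_index((i // 2, j // 2, l // 2), out_shape)
                A[row, col] = 1.0  # voxel (i, j, l) contributes to its window
    return A

def sum_pool_direct(x):
    """Reference 2x2x2 sum pooling computed by reshaping and summing."""
    m, n, d = x.shape
    return x.reshape(m // 2, 2, n // 2, 2, d // 2, 2).sum(axis=(1, 3, 5))

# Small example volume with even side lengths.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 2, 6))
A_down = sum_pool_matrix(*x.shape)

pooled_by_matrix = A_down @ x.ravel()
pooled_directly = sum_pool_direct(x).ravel()
```

The matrix depends only on the shape $(m, n, d)$ and not on the values of `x`, which is what makes it input independent. Its transpose copies each pooled value back to the eight voxels of its window.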

Figures (12)

  • Figure 1: Digit-specific accuracy of the rank-$k$ CNN trained on the balanced (left) and unbalanced (right) dataset.
  • Figure 2: Boxplot showing how the RPR changes for each digit as the rank increases. Using the RPRs from the balanced model as a baseline, we observe that digit $1$ deviates the most for the unbalanced dataset. Its significantly lower RPR can be interpreted as images of the digit $1$ lying closer to the nullspace of the CNN. Additionally, the RPR for digit $9$, which was generally above that of digit $8$ in the balanced experiment, drops below it in the unbalanced experiment. By evaluating proximity to the nullspace with the RPR, we can identify digits $\{1,9\}$ as candidates for poor model performance in the unbalanced experiment.
  • Figure 3: Boxplot showing how the LPR changes for each digit as the rank increases. When the training set is unbalanced, digits $\{1,9\}$ consistently have the lowest LPRs. Referring back to Section \ref{methods:implementation_solver}, we recognize that this means these digits are not well represented in the range of the CNN. This offers improved interpretability for why the CNN trained on the unbalanced dataset performed worse than the CNN trained on the balanced dataset.
  • Figure 4: The RPR is calculated using the 10 right singular vectors corresponding to the 10 largest singular values. The 'CNN Dice Score' is the Dice score from the Tensor Representation of the CNN. The test set contains 369 volumes.
  • Figure 5: Left: the input image. Right: the projection of the image onto the first 10 singular vectors.
  • ...and 7 more figures

Theorems & Definitions (4)

  • Definition 1: Region
  • Lemma 1
  • Lemma 2
  • Lemma 3