Debugging and Runtime Analysis of Neural Networks with VLMs (A Case Study)

Boyue Caroline Hu, Divya Gopinath, Corina S. Pasareanu, Nina Narodytska, Ravi Mangal, Susmit Jha

TL;DR

The paper tackles the challenge of debugging vision neural networks by leveraging Vision-Language Models to render internal representations as human-understandable concepts via semantic heatmaps. It introduces a suite of heatmap-based tools (single-input, ground-truth, output-label, differential, and binarized) and a fault-localization pipeline that uses a CLIP-based oracle to distinguish encoder vs head errors, complemented by a lightweight runtime defect detector based on heatmap similarity. The approach is validated on a ResNet18 trained on the RIVAL10 dataset, revealing predominantly encoder-related vulnerabilities for adversarial inputs and mixed encoder/head contributions for misclassifications, along with substantial runtime detection accuracy for both adversarial and non-adversarial defects. Overall, the work provides a semantically grounded, annotation-free framework that supports debugging, robustness analysis, and potential targeted repair and testing of vision models, with implications for safety-critical deployments.

Abstract

Debugging Deep Neural Networks (DNNs), particularly vision models, is very challenging due to the complex and opaque decision-making processes in these networks. In this paper, we explore multi-modal Vision-Language Models (VLMs), such as CLIP, to automatically interpret the opaque representation space of vision models using natural language. This, in turn, enables a semantic analysis of model behavior using human-understandable concepts, without requiring costly human annotations. Key to our approach is the notion of semantic heatmaps, which succinctly capture the statistical properties of DNNs in terms of the concepts discovered with the VLM and are computed offline using a held-out dataset. We show the utility of semantic heatmaps for fault localization -- an essential step in debugging -- in vision models. Our proposed technique helps localize the fault in the network (encoder vs head) and also highlights the responsible high-level concepts by leveraging novel differential heatmaps, which summarize the semantic differences between the correct and incorrect behavior of the analyzed DNN. We further propose a lightweight runtime analysis to detect and filter out defects at runtime, thus improving the reliability of the analyzed DNNs. The runtime analysis works by measuring and comparing the similarity between the heatmap computed for a new (unseen) input and the heatmaps computed a priori for correct vs incorrect DNN behavior. We consider two types of defects: misclassifications and vulnerabilities to adversarial attacks. We demonstrate the debugging and runtime analysis on a case study involving a complex ResNet-based classifier trained on the RIVAL10 dataset.
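The runtime analysis described in the abstract compares the heatmap of a new input against reference heatmaps for correct and incorrect behavior. The sketch below is a minimal illustration of that idea, not the authors' implementation. It assumes that a heatmap can be represented as a 2-D array of per-predicate values and that cosine similarity is an acceptable similarity measure; the abstract specifies neither. The function names `cosine_similarity` and `flag_defect` are hypothetical.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two heatmaps, flattened to vectors.

    Assumption: cosine similarity stands in for whatever similarity
    measure the paper actually uses.
    """
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def flag_defect(input_heatmap, correct_heatmap, incorrect_heatmap):
    """Return True if the input looks more like incorrect behavior.

    The two reference heatmaps are assumed to have been computed
    offline on a held-out dataset.
    """
    sim_correct = cosine_similarity(input_heatmap, correct_heatmap)
    sim_incorrect = cosine_similarity(input_heatmap, incorrect_heatmap)
    return sim_incorrect > sim_correct


# Toy reference heatmaps (made-up values, for illustration only).
correct_ref = [[0.9, 0.1], [0.8, 0.2]]
incorrect_ref = [[0.1, 0.9], [0.2, 0.8]]

likely_ok = [[0.85, 0.15], [0.7, 0.3]]
likely_defect = [[0.2, 0.7], [0.1, 0.9]]

print(flag_defect(likely_ok, correct_ref, incorrect_ref))
print(flag_defect(likely_defect, correct_ref, incorrect_ref))
```

A detector of this shape needs only two similarity computations per input, which is consistent with the "lightweight" framing; in practice a decision margin or per-class reference heatmaps might be preferable to the bare comparison shown here.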

Paper Structure

This paper contains 26 sections, 2 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Architecture of the vision model ResNet18. The left side shows the encoder-head decomposition of ResNet18, and the right side shows an alternative decomposition.
  • Figure 2: Leveraging VLMs for debugging and run-time analysis of vision models.
  • Figure 3: Ground-truth summary and Differential heatmaps for truck images. Red outlines indicate relevant predicates for truck.
  • Figure 4: Examples of misclassified inputs of ResNet18, error localized to the encoder (encoder error) and the head (head error).
  • Figure 5: Heatmaps for identifying robust and non-robust features against Projected Gradient Descent (PGD) attacks (both $l_{\infty}$ and $l_2$). All heatmaps consider images with ground truth truck. The black outline marks robust strength predicates, for which the absolute difference in satisfaction probability is $\leq 0.05$. The red outline indicates relevant strength predicates for class truck.
  • ...and 2 more figures
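
The robustness criterion in the Figure 5 caption (absolute difference in satisfaction probability at most 0.05) can be sketched as follows. This is an illustrative assumption rather than the paper's code: it treats a differential heatmap as the element-wise difference of two heatmaps of satisfaction probabilities, one for clean inputs and one for attacked inputs. The function names and the toy values are hypothetical.

```python
import numpy as np


def differential_heatmap(heatmap_a, heatmap_b):
    """Element-wise difference of two heatmaps.

    Assumption: both heatmaps hold satisfaction probabilities for the
    same predicates in the same layout.
    """
    return np.asarray(heatmap_a, dtype=float) - np.asarray(heatmap_b, dtype=float)


def robust_mask(clean, attacked, tol=0.05):
    """Boolean mask of predicates whose probability shifts by at most tol."""
    return np.abs(differential_heatmap(clean, attacked)) <= tol


# Toy satisfaction probabilities (made-up values, for illustration only).
clean = [[0.90, 0.80], [0.10, 0.50]]
attacked = [[0.88, 0.30], [0.12, 0.49]]

mask = robust_mask(clean, attacked)
print(mask)  # True marks predicates that stay stable under the attack
```

Entries whose difference sits exactly at the threshold may fall on either side because of floating-point rounding, so a small extra tolerance may be needed when reproducing outlines like those in the figure.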