Understanding the Dependence of Perception Model Competency on Regions in an Image

Sara Pohland, Claire Tomlin

TL;DR

This paper explores five novel methods for identifying regions in an input image that contribute to low model competency: image cropping, segment masking, pixel perturbation, competency gradients, and reconstruction loss.

Abstract

While deep neural network (DNN)-based perception models are useful for many applications, these models are black boxes and their outputs are not yet well understood. To confidently enable a real-world, decision-making system to utilize such a perception model without human intervention, we must enable the system to reason about the perception model's level of competency and respond appropriately when the model is incompetent. In order for the system to make an intelligent decision about the appropriate action when the model is incompetent, it would be useful for the system to understand why the model is incompetent. We explore five novel methods for identifying regions in the input image contributing to low model competency, which we refer to as image cropping, segment masking, pixel perturbation, competency gradients, and reconstruction loss. We assess the ability of these five methods to identify unfamiliar objects, recognize regions associated with unseen classes, and identify unexplored areas in an environment. We find that the competency gradients and reconstruction loss methods show great promise in identifying regions associated with low model competency, particularly when aspects of the image that are unfamiliar to the perception model are causing this reduction in competency. Both of these methods boast low computation times and high levels of accuracy in detecting image regions that are unfamiliar to the model, allowing them to provide potential utility in decision-making pipelines. The code for reproducing our methods and results is available on GitHub: https://github.com/sarapohland/explainable-competency.

Paper Structure

This paper contains 28 sections, 4 equations, 13 figures, and 6 tables.

Figures (13)

  • Figure 1: Image cropping approach for identifying low competency regions. We partition the image into grid cells, crop the image around each grid cell to obtain a new image, $X^i_j$, and compute the competency score, $C^i_j$, for each cropped image. Regions with lower scores are associated with lower levels of competency and are said to contribute more to the overall low model competency, resulting in a higher dependency score, $d^i_j$.
  • Figure 2: Segment masking approach for identifying low competency regions. We begin by segmenting the image using the Felzenszwalb segmentation algorithm (Felzenszwalb and Huttenlocher, 2004). For each segment determined by this algorithm, we mask out the rest of the image to obtain a new image, $X_i$, and compute the competency score, $C_i$, of that masked image. Segments with lower corresponding competency scores are said to contribute more to the overall low model competency for that image, resulting in a higher dependency score, $d_i$.
  • Figure 3: Pixel perturbation approach for identifying low competency regions. We begin by segmenting the image using the Felzenszwalb algorithm (Felzenszwalb and Huttenlocher, 2004). We then perturb each segment determined by this algorithm to obtain a new image, $X_i$, and compute the competency score, $C_i$, for each of these perturbed images. If the competency score rises above the original image's score, $C$, after perturbing a given region, that region is believed to contribute more to the overall low model competency, resulting in a higher dependency score, $d_i$.
  • Figure 4: Competency gradients approach for identifying low competency regions. We begin by computing the partial derivative of the overall competency score with respect to each of the pixel values in the input image. We then calculate the average derivative over segmented regions in the image, using the Felzenszwalb algorithm (Felzenszwalb and Huttenlocher, 2004).
  • Figure 5: Reconstruction loss approach for identifying low competency regions. We begin by segmenting the image using the Felzenszwalb algorithm (Felzenszwalb and Huttenlocher, 2004), then generate masked images for each segment. Given an image with a segment masked out using ones, an autoencoder aims to predict the pixels of the original image. The difference between the original image and the predicted image is the reconstruction loss. If the reconstruction loss is high for a given image segment, this segment is believed to be unfamiliar to the perception model, and the overall low model competency is said to be more dependent on this region in the image, resulting in a higher dependency score, $d_i$.
  • ...and 8 more figures
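
The image cropping and segment masking procedures described in the captions for Figures 1 and 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_competency` is a hypothetical stand-in for the competency estimator, the mapping $d = 1 - C$ from competency to dependency is assumed (the captions only say that lower competency yields higher dependency), crops are taken exactly on the grid cells, and the segment label map is passed in precomputed (it could come, for example, from a Felzenszwalb segmentation).

```python
import numpy as np


def toy_competency(image):
    """Hypothetical stand-in for a competency estimator; returns a score in [0, 1].

    Bright pixels play the role of "unfamiliar" content here. A real system
    would query the perception model's competency estimator instead.
    """
    return float(1.0 - np.clip(image.mean(), 0.0, 1.0))


def cropping_dependency(image, competency_fn, rows, cols):
    """Image cropping: score each grid cell by the competency of its crop."""
    height, width = image.shape[:2]
    row_edges = np.linspace(0, height, rows + 1).astype(int)
    col_edges = np.linspace(0, width, cols + 1).astype(int)
    dependency = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            crop = image[row_edges[i]:row_edges[i + 1],
                         col_edges[j]:col_edges[j + 1]]
            # Assumed mapping: lower competency C gives higher dependency 1 - C.
            dependency[i, j] = 1.0 - competency_fn(crop)
    return dependency


def masking_dependency(image, segments, competency_fn, fill_value=0.0):
    """Segment masking: keep one segment, mask out the rest, score the result."""
    dependency = np.zeros(image.shape[:2])
    for label in np.unique(segments):
        inside = segments == label
        # Start from a fully masked image, then restore only this segment.
        masked = np.full_like(image, fill_value)
        masked[inside] = image[inside]
        # Every pixel of the segment receives the segment's dependency score.
        dependency[inside] = 1.0 - competency_fn(masked)
    return dependency


# Toy example: a dark 4x4 image with a bright ("unfamiliar") lower-right block.
image = np.zeros((4, 4))
image[2:, 2:] = 1.0
segments = np.zeros((4, 4), dtype=int)
segments[2:, 2:] = 1

grid_scores = cropping_dependency(image, toy_competency, rows=2, cols=2)
segment_scores = masking_dependency(image, segments, toy_competency)
```

In this toy setup both maps assign their highest dependency to the lower-right block, which is the region the stand-in estimator treats as unfamiliar. The pixel perturbation approach in Figure 3 would follow the same loop structure, perturbing each segment instead of masking its complement and comparing each $C_i$ against the original score $C$.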