Generating visual explanations from deep networks using implicit neural representations
Michal Byra, Henrik Skibbe
TL;DR
This work introduces implicit neural representations (INRs) as a framework for generating visual explanations of deep models. By conditioning coordinate-based INRs on an area parameter, the authors reformulate extremal perturbations to produce smooth, area-constrained attribution masks, and they extend this with an iterative method that generates multiple non-overlapping explanations for the same image. Empirical results on ImageNet-S50 and PASCAL VOC show that the INR-based approach can achieve competitive precision and masks that behave more smoothly with respect to the area constraint than those of traditional perturbation methods. The results also reveal that a model may rely on the surrounding context as well as on the appearance of the object. The study highlights the versatility of INRs for explainability, at the cost of longer training time and reduced stability, and points to future directions such as richer conditioning and joint optimization with other vision tasks.
Abstract
Explaining deep learning models in a way that humans can easily understand is essential for responsible artificial intelligence applications. Attribution methods constitute an important area of explainable deep learning. The attribution problem involves finding the parts of the network's input that are most responsible for the model's output. In this work, we demonstrate that implicit neural representations (INRs) constitute a good framework for generating visual explanations. First, we utilize coordinate-based implicit networks to reformulate and extend the extremal perturbations technique and generate attribution masks. Experimental results confirm the usefulness of our method. For instance, by properly conditioning the implicit network, we obtain attribution masks that are well-behaved with respect to the imposed area constraints. Second, we present an iterative INR-based method that can be used to generate multiple non-overlapping attribution masks for the same image. We show that a deep learning model may associate the image label both with the appearance of the object of interest and with areas and textures that usually accompany the object. Our study demonstrates that implicit networks are well-suited for the generation of attribution masks and can provide interesting insights into the behavior of deep learning models.
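
To make the idea of an area-conditioned, coordinate-based mask more concrete, the following is a minimal sketch in NumPy and not the authors' implementation. It assumes a tiny, untrained two-layer network with sine activations that maps (x, y, area) to a mask value. It also assumes a sorted-mask area penalty in the spirit of extremal perturbations, a zero baseline for the perturbed image, and a simple multiplicative rule for keeping a new mask away from earlier ones. All function names (`make_coords`, `init_inr`, `inr_mask`, `area_penalty`, `perturb`, `exclude_previous`) and all hyperparameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def make_coords(h, w):
    # Pixel coordinates normalized to [-1, 1], one (x, y) row per pixel.
    ys, xs = np.meshgrid(np.linspace(-1, 1, h), np.linspace(-1, 1, w), indexing="ij")
    return np.stack([xs.ravel(), ys.ravel()], axis=1)

def init_inr(rng, hidden=32):
    # Hypothetical two-layer network; the input is (x, y, area).
    return {
        "W1": rng.normal(0.0, 1.0, (3, hidden)), "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 1.0, (hidden, 1)), "b2": np.zeros(1),
    }

def inr_mask(params, coords, area, h, w):
    # Condition on the target area by appending it to every coordinate.
    a = np.full((coords.shape[0], 1), area)
    inp = np.concatenate([coords, a], axis=1)
    hid = np.sin(inp @ params["W1"] + params["b1"])
    logits = hid @ params["W2"] + params["b2"]
    return (1.0 / (1.0 + np.exp(-logits))).reshape(h, w)  # values in (0, 1)

def area_penalty(mask, area):
    # Assumed penalty: compare sorted mask values with a step template
    # holding a fraction `area` of ones (zero for an exact binary mask).
    v = np.sort(mask.ravel())
    template = np.zeros(v.size)
    k = int(round(area * v.size))
    template[v.size - k:] = 1.0
    return float(np.mean((v - template) ** 2))

def perturb(image, mask, baseline):
    # Keep the masked region of the image, replace the rest with the baseline.
    m = mask[..., None]
    return m * image + (1.0 - m) * baseline

def exclude_previous(mask, previous_masks):
    # Assumed rule for non-overlapping masks: suppress already explained pixels.
    for prev in previous_masks:
        mask = mask * (1.0 - prev)
    return mask

rng = np.random.default_rng(0)
h, w = 16, 16
coords = make_coords(h, w)
params = init_inr(rng)
image = rng.random((h, w, 3))
mask = inr_mask(params, coords, area=0.25, h=h, w=w)
perturbed = perturb(image, mask, baseline=np.zeros_like(image))
penalty = area_penalty(mask, 0.25)
```

In an actual attribution setting, the network weights would be optimized so that the classifier score on `perturbed` stays high while `penalty` stays low. That training loop is omitted here because it depends on the classifier and the optimizer, neither of which is specified in this summary.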
