Restyling Unsupervised Concept Based Interpretable Networks with Generative Models

Jayneel Parekh, Quentin Bouniot, Pavlo Mozharovskyi, Alasdair Newson, Florence d'Alché-Buc

TL;DR

VisCoIN addresses the interpretability gap in unsupervised concept-based networks by tying concept activations to the latent space of a pretrained generator through a learned concept translator $\Omega$, enabling high-quality, viewable reconstructions and interactive visualization of concepts. The approach jointly optimizes an interpretable prediction network with reconstruction-guided losses and regularizers to produce sparse, diverse concepts, while enforcing a viewability property via $G$ and $\Omega$. Evaluation on CelebA-HQ, CUB-200, and Stanford Cars shows VisCoIN achieving competitive predictive performance while delivering improved faithfulness and consistency in concept visualization compared to prior unsupervised CoINs, with stronger grounding of visual changes through latent traversals. The framework opens avenues for robust interpretation in large-scale vision tasks and can be extended to supervised CoINs or multimodal generative models, offering practical impact for trustworthy AI deployment.

Abstract

Developing inherently interpretable models for prediction has gained prominence in recent years. A subclass of these models, wherein the interpretable network relies on learning high-level concepts, is valued because of the closeness of concept representations to human communication. However, the visualization and understanding of the learnt unsupervised dictionary of concepts encounter major limitations, especially for large-scale images. We propose here a novel method that relies on mapping the concept features to the latent space of a pretrained generative model. The use of a generative model enables high-quality visualization, and lays out an intuitive and interactive procedure for better interpretation of the learnt concepts by imputing concept activations and visualizing generated modifications. Furthermore, leveraging pretrained generative models has the additional advantage of making the training of the system more efficient. We quantitatively ascertain the efficacy of our method in terms of accuracy of the interpretable prediction network, fidelity of reconstruction, as well as faithfulness and consistency of learnt concepts. The experiments are conducted on multiple image recognition benchmarks for large-scale images. Project page available at https://jayneelparekh.github.io/VisCoIN_project_page/

Paper Structure (54 sections, 6 equations, 13 figures, 21 tables)

This paper contains 54 sections, 6 equations, 13 figures, 21 tables.

Figures (13)

  • Figure 1: Comparison of the generated images obtained for the same learnt concept ("Yellow-colored head") using FLINT visualization [flint] and our proposed VisCoIN visualization (in red boxes). Using our concept translator, which maps the concept representation space to the latent space of a generative model, we can visualize each concept at different activation values, allowing for more granular and interactive interpretation. Visual modifications are manually indicated by red boxes.
  • Figure 2: Left: Overview of a standard CoIN system $g$, which makes a prediction $g(x)$ from extracted concepts $\Phi(x)$. Right: Design of our unsupervised concept-based interpretable network VisCoIN, leveraging a pretrained generative model $G$ for visualization and a pretrained classifier $f$. Purple blocks denote trainable subnetworks.
  • Figure 3: Visualization for a given image $x$ and concept function $\phi_k$. By imputing a higher activation for $\phi_k(x)$ in $\Phi(x)$ (by a factor $\lambda = 4$ in the figure), and comparing the obtained visualization to the original reconstruction $\tilde{x}$ (obtained with the untouched $\Phi(x)$), we interpret information encoded by $\phi_k$ about image $x$.
  • Figure 4: Qualitative examples obtained for different concepts and classes on the (a)-(b) CUB-200, (c)-(d) CelebA-HQ, and (e)-(f) Stanford-Cars datasets. In each subfigure, the first column corresponds to maximally activated samples $x$ for class-concept pairs with high relevance ($r_{k,c} > 0.5$), the second column to the reconstructed image obtained with the original $\Phi(x)$, and the third column to the image obtained by imputing $4\times \phi_k(x)$ in $\Phi(x)$. Red boxes are manually added to indicate key regions of modification in the generated images.
  • Figure 5: Qualitative results of VisCoIN with ProgressiveGAN [karras2018progressive] on the CelebA-HQ dataset.
  • ...and 8 more figures
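
The imputation procedure described in the captions of Figures 3 and 4 (scale one concept activation $\phi_k(x)$ by a factor $\lambda$, then compare the generated image with the original reconstruction $\tilde{x}$) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the concept translator $\Omega$ and generator $G$ are replaced by small random maps (`omega`, `generator`), the dimensions are arbitrary, and the helper `impute_concept` is a hypothetical name, not part of the authors' code.

```python
import numpy as np

def impute_concept(phi_x, k, lam=4.0):
    """Return a copy of Phi(x) whose k-th activation is scaled by lam."""
    imputed = phi_x.copy()
    imputed[k] = lam * phi_x[k]
    return imputed

# Hypothetical toy stand-ins for the trained networks (NOT the real models).
rng = np.random.default_rng(0)
K, latent_dim, n_pixels = 8, 16, 32          # arbitrary illustrative sizes
W_omega = rng.normal(scale=0.1, size=(latent_dim, K))
W_g = rng.normal(scale=0.1, size=(n_pixels, latent_dim))

def omega(phi):
    """Toy concept translator: concept activations -> generator latent code."""
    return W_omega @ phi

def generator(w):
    """Toy generator: latent code -> flattened 'image' with values in (-1, 1)."""
    return np.tanh(W_g @ w)

phi_x = np.abs(rng.normal(size=K))           # stand-in for Phi(x), non-negative
k = 3                                        # index of the concept to inspect

x_tilde = generator(omega(phi_x))                            # reconstruction
x_imputed = generator(omega(impute_concept(phi_x, k, 4.0)))  # lambda = 4

# Pixels that change most are where concept k is "visible" in this toy setup.
diff = np.abs(x_imputed - x_tilde)
print("most affected pixel index:", int(diff.argmax()))
```

In this sketch only the $k$-th entry of $\Phi(x)$ is modified, so any difference between `x_imputed` and `x_tilde` is attributable to that single concept, which is the comparison the captions describe.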