Restyling Unsupervised Concept Based Interpretable Networks with Generative Models
Jayneel Parekh, Quentin Bouniot, Pavlo Mozharovskyi, Alasdair Newson, Florence d'Alché-Buc
TL;DR
VisCoIN addresses the interpretability gap in unsupervised concept-based networks by tying concept activations to the latent space of a pretrained generator through a learned concept translator $\\Omega$, enabling high-quality, viewable reconstructions and interactive visualization of concepts. The approach jointly optimizes an interpretable prediction network with reconstruction-guided losses and regularizers to produce sparse, diverse concepts, while enforcing a viewability property via $G$ and $\\Omega$. Evaluation on CelebA-HQ, CUB-200, and Stanford Cars shows VisCoIN achieving competitive predictive performance while delivering improved faithfulness and consistency in concept visualization compared to prior unsupervised CoINs, with stronger groundings of visual changes through latent traversals. The framework opens avenues for robust interpretation in large-scale vision tasks and can be extended to supervised CoINs or multimodal generative models, offering practical impact for trustworthy AI deployment.
Abstract
Developing inherently interpretable models for prediction has gained prominence in recent years. A subclass of these models, wherein the interpretable network relies on learning high-level concepts, are valued because of closeness of concept representations to human communication. However, the visualization and understanding of the learnt unsupervised dictionary of concepts encounters major limitations, especially for large-scale images. We propose here a novel method that relies on mapping the concept features to the latent space of a pretrained generative model. The use of a generative model enables high quality visualization, and lays out an intuitive and interactive procedure for better interpretation of the learnt concepts by imputing concept activations and visualizing generated modifications. Furthermore, leveraging pretrained generative models has the additional advantage of making the training of the system more efficient. We quantitatively ascertain the efficacy of our method in terms of accuracy of the interpretable prediction network, fidelity of reconstruction, as well as faithfulness and consistency of learnt concepts. The experiments are conducted on multiple image recognition benchmarks for large-scale images. Project page available at https://jayneelparekh.github.io/VisCoIN_project_page/
