This Probably Looks Exactly Like That: An Invertible Prototypical Network
Zachariah Carmichael, Timothy Redgrave, Daniel Gonzalez Cedre, Walter J. Scheirer
TL;DR
ProtoFlow addresses the semantic gap in prototypical networks by learning prototypes as distributions over the latent space of an invertible normalizing flow. Class prototypes are modeled as a Gaussian mixture in latent space, and the exact inverse $f^{-1}$ maps them back to data space, which yields faithful data-space visualizations and calibrated uncertainty estimates while keeping predictive accuracy competitive. The approach achieves state-of-the-art joint generative-predictive performance across diverse datasets and offers richer interpretability through prototype distributions, heatmaps, and prototypical parts, supported by a diversity loss and prototype pruning. These properties make it a practical step toward more trustworthy, interpretable models for vision tasks and beyond.
Abstract
We combine concept-based neural networks with generative, flow-based classifiers into a novel, intrinsically explainable, exactly invertible approach to supervised learning. Prototypical neural networks, a type of concept-based neural network, represent an exciting way forward in realizing human-comprehensible machine learning without concept annotations, but a human-machine semantic gap continues to haunt current approaches. We find that reliance on indirect interpretation functions for prototypical explanations imposes a severe limit on prototypes' informative power. From this, we posit that invertibly learning prototypes as distributions over the latent space provides more robust, expressive, and interpretable modeling. We propose one such model, called ProtoFlow, by composing a normalizing flow with Gaussian mixture models. ProtoFlow (1) sets a new state-of-the-art in joint generative and predictive modeling and (2) achieves predictive performance comparable to existing prototypical neural networks while enabling richer interpretation.
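To make the composition of a normalizing flow with Gaussian mixture models concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a toy elementwise affine map in place of a deep flow, diagonal-covariance mixtures, and a uniform class prior; the names `AffineFlow`, `class_log_likelihoods`, and `predict_proba` are hypothetical. It shows two ideas: classification by comparing per-class mixture likelihoods in latent space, and prototype visualization by pushing mixture means through the exact inverse $f^{-1}$.

```python
import numpy as np

def logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

class AffineFlow:
    """Toy invertible map z = (x - shift) / scale (assumed stand-in for a deep flow)."""
    def __init__(self, shift, log_scale):
        self.shift = np.asarray(shift, dtype=float)
        self.log_scale = np.asarray(log_scale, dtype=float)

    def forward(self, x):
        z = (x - self.shift) * np.exp(-self.log_scale)
        log_det = -np.sum(self.log_scale)  # log|det dz/dx|, identical for every x
        return z, log_det

    def inverse(self, z):
        # Exact inverse f^{-1}: maps latent points back to data space.
        return z * np.exp(self.log_scale) + self.shift

def class_log_likelihoods(z, means, log_sigmas, mix_logits):
    """log p(z | c) under per-class diagonal Gaussian mixtures.
    z: (N, D); means, log_sigmas: (C, K, D); mix_logits: (C, K). Returns (N, C)."""
    diff = z[:, None, None, :] - means[None]                      # (N, C, K, D)
    log_norm = -0.5 * np.sum(
        (diff / np.exp(log_sigmas)[None]) ** 2
        + 2.0 * log_sigmas[None]
        + np.log(2.0 * np.pi),
        axis=-1,
    )                                                             # (N, C, K)
    log_pi = mix_logits - logsumexp(mix_logits, axis=-1)[:, None]  # (C, K)
    return logsumexp(log_norm + log_pi[None], axis=-1)            # (N, C)

def predict_proba(x, flow, means, log_sigmas, mix_logits):
    # The log-determinant is shared by all classes, so it cancels in p(c | x).
    z, _ = flow.forward(x)
    ll = class_log_likelihoods(z, means, log_sigmas, mix_logits)  # uniform prior assumed
    return np.exp(ll - logsumexp(ll, axis=-1)[:, None])

# Illustrative parameters: C=2 classes, K=2 prototypes per class, D=2 dimensions.
flow = AffineFlow(shift=[1.0, -1.0], log_scale=[0.5, -0.5])
means = np.array([[[-2.0, 0.0], [-2.0, 2.0]],
                  [[2.0, 0.0], [2.0, -2.0]]])
log_sigmas = np.zeros_like(means)
mix_logits = np.zeros((2, 2))

# Prototype means decoded into data space, then classified.
prototypes_x = flow.inverse(means.reshape(-1, 2))
probs = predict_proba(prototypes_x, flow, means, log_sigmas, mix_logits)
print(probs.argmax(axis=1))
```

Because the map is exactly invertible, decoding a prototype and re-encoding it recovers the same latent point, which is what makes the data-space view of a prototype faithful rather than an approximation. In a trained model the flow and mixture parameters would be learned jointly; here they are fixed by hand for illustration.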
