Rashomon Sets for Prototypical-Part Networks: Editing Interpretable Models in Real-Time

Jon Donnelly, Zhicheng Guo, Alina Jade Barnett, Hayden McTavish, Chaofan Chen, Cynthia Rudin

TL;DR

Proto-RSet is a practical framework for editing interpretable ProtoPNets in real time by precomputing a Rashomon set of near-optimal models. It fixes the backbone and prototypes and locally approximates the set of near-optimal last-layer weights as an ellipsoid around the optimal weights, which enables fast sampling and constraint enforcement, such as removing or requiring specific prototypes. Across multiple datasets and backbones, Proto-RSet yields models that meet user constraints while preserving or improving accuracy, and user studies show that clinicians and non-experts can rapidly refine models while keeping performance guarantees with respect to the training set. This approach shifts model debugging from slow retraining to interactive, explainable model-space exploration, with real-world impact in high-stakes domains such as skin cancer classification and bias removal in bird identification.
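
The ellipsoid idea can be illustrated with a minimal sketch. It assumes, as an illustration and not as the paper's stated procedure, that the last-layer loss is approximated by a second-order expansion around the optimum, so the near-optimal set is {w : 0.5 (w - w*)^T H (w - w*) <= epsilon} for a positive-definite matrix H. The names `sample_ellipsoid`, `in_rashomon_set`, `hessian`, and `epsilon` are hypothetical.

```python
import numpy as np


def sample_ellipsoid(w_star, hessian, epsilon, n_samples, rng):
    """Draw points uniformly from {w : 0.5 (w - w*)^T H (w - w*) <= epsilon}.

    Assumption: `hessian` is symmetric positive definite (illustrative only).
    """
    d = w_star.shape[0]
    # Uniform samples in the unit ball: random direction scaled by radius u^(1/d).
    directions = rng.standard_normal((n_samples, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = rng.uniform(size=(n_samples, 1)) ** (1.0 / d)
    ball = directions * radii
    # Map the ball onto the ellipsoid: w = w* + sqrt(2 eps) * L^{-T} z, with H = L L^T.
    chol = np.linalg.cholesky(hessian)
    offsets = np.sqrt(2.0 * epsilon) * np.linalg.solve(chol.T, ball.T).T
    return w_star + offsets


def in_rashomon_set(w, w_star, hessian, epsilon):
    """Check the approximate loss increase against the tolerance epsilon."""
    diff = w - w_star
    return 0.5 * diff @ hessian @ diff <= epsilon * (1.0 + 1e-9)
```

Because a linear map of a uniform ball is uniform on the resulting ellipsoid, every sampled weight vector satisfies the approximate loss bound by construction; no retraining is involved in drawing a new candidate.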

Abstract

Interpretability is critical for machine learning models in high-stakes settings because it allows users to verify the model's reasoning. In computer vision, prototypical part models (ProtoPNets) have become the dominant model type to meet this need. Users can easily identify flaws in ProtoPNets, but fixing problems in a ProtoPNet requires slow, difficult retraining that is not guaranteed to resolve the issue. This problem is called the "interaction bottleneck." We solve the interaction bottleneck for ProtoPNets by simultaneously finding many equally good ProtoPNets (i.e., a draw from a "Rashomon set"). We show that our framework, called Proto-RSet, quickly produces many accurate, diverse ProtoPNets, allowing users to correct problems in real time while maintaining performance guarantees with respect to the training set. We demonstrate the utility of this method in two settings: 1) removing synthetic bias introduced to a bird identification model and 2) debugging a skin cancer identification model. This tool empowers non-machine-learning experts, such as clinicians or domain experts, to quickly refine and correct machine learning models without repeated retraining by machine learning experts.

Paper Structure

This paper contains 28 sections, 18 equations, 17 figures, and 3 tables.

Figures (17)

  • Figure 1: How Proto-RSet addresses the interaction bottleneck. (Bottom) Without Proto-RSet, incorporating user feedback such as "this prototype does not make sense, this would be a better option" requires practitioners to make complicated adjustments to their training regime and train a whole new model. This process can take days and becomes prohibitively slow when multiple rounds of feedback are required. (Top) Proto-RSet allows practitioners to incorporate user feedback in real time by selecting different candidate models, eliminating the interaction bottleneck. Moreover, Proto-RSet guarantees that user constraints are met, producing their ideal model.
  • Figure 2: Models produced by Proto-RSet before (top) and after (bottom) a user specifies that prototype 414 must be removed. Proto-RSet guarantees that the bottom model has similar performance to the top, despite following a different reasoning process. If a prototype cannot be removed while maintaining performance, Proto-RSet quickly identifies and reports this.
  • Figure 3: Time to compute Proto-RSet across three datasets and six backbones as a stacked bar plot. The lower bar presents the time required to train the base ProtoPNet, and the top represents the time to compute Proto-RSet given that reference ProtoPNet. We find that Proto-RSet can be computed in less than twenty minutes across a variety of settings.
  • Figure 4: Change in test accuracy as random prototypes are removed. In all cases, we see that removing prototypes using Proto-RSet maintains or slightly improves the accuracy of the original model. Only naive removal with retraining maintains comparable accuracy.
  • Figure 5: Time in seconds required to remove a single prototype, averaged over 100 iterations of removal. In all cases, Proto-RSet removes prototypes almost instantly. In contrast, removing a prototype and then retraining the last layer can take orders of magnitude longer. We exclude naive removal without retraining because it simply updates a value in an array and is therefore nearly instantaneous.
  • ...and 12 more figures
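
The prototype-removal behavior described in the captions of Figures 2 and 5 can be sketched under the same illustrative quadratic approximation, {w : 0.5 (w - w*)^T H (w - w*) <= epsilon}. Treating "remove a prototype" as forcing the corresponding last-layer weights to zero, the least-costly feasible weights have a closed form, and the removal is reported as infeasible when the approximate loss increase exceeds epsilon. This is an assumption-laden sketch, not necessarily the paper's solver; `remove_coordinates`, `hessian`, `epsilon`, and `removed` (indices into the flattened weight vector) are hypothetical names.

```python
import numpy as np


def remove_coordinates(w_star, hessian, epsilon, removed):
    """Return (weights, loss_increase) with weights[removed] == 0.

    Solves min 0.5 (w - w*)^T H (w - w*) subject to w[removed] = 0 via its
    Lagrangian; weights is None when the increase exceeds epsilon.
    Assumption: `hessian` is symmetric positive definite (illustrative only).
    """
    removed = np.asarray(removed)
    h_inv = np.linalg.inv(hessian)
    # Multipliers: (E^T H^{-1} E) lambda = E^T w*, where E selects `removed`.
    sub = h_inv[np.ix_(removed, removed)]
    multipliers = np.linalg.solve(sub, w_star[removed])
    # Optimal shift away from w*: w = w* - H^{-1} E lambda.
    w = w_star - h_inv[:, removed] @ multipliers
    w[removed] = 0.0  # clear floating-point round-off on the constrained entries
    diff = w - w_star
    loss_increase = 0.5 * diff @ hessian @ diff
    if loss_increase > epsilon:
        return None, loss_increase  # removal is not possible within tolerance
    return w, loss_increase
```

The whole operation is a few small linear solves, which is consistent with removal being far faster than retraining the last layer, and the infeasible branch mirrors the "identifies and reports this" behavior from Figure 2.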