Federated Concept-Based Models: Interpretable models with distributed supervision
Dario Fenoglio, Arianna Casanova, Francesco De Santis, Mohan Li, Gabriele Dominici, Johannes Schneider, Martin Gjoreski, Marc Langheinrich, Pietro Barbiero, Giovanni De Felice
TL;DR
Federated Concept-based Models (F-CMs) address the need for interpretable predictions in privacy-sensitive, cross-institutional settings by grounding outputs in human-understandable concepts. F-CMs aggregate concept supervision distributed across clients and dynamically expand the shared concept space $\mathcal{M}_t$ and dependency graph $\mathcal{G}_t$, updating only the affected modules to handle temporal non-stationarity. Four concept-based model (CM) instantiations are evaluated on five synthetic Bayesian networks and the real-world SIIM-Pneumothorax dataset, showing that F-CMs keep task accuracy and concept coverage close to centralized training while enabling interpretable inference for concepts unseen at individual sites. The framework also adapts efficiently, changing only a limited number of parameters, and supports privacy via module-wise federated learning (FL) updates and optional differential privacy mechanisms.
Abstract
Concept-based models (CMs) enhance interpretability in deep learning by grounding predictions in human-understandable concepts. However, concept annotations are expensive to obtain and rarely available at scale within a single data source. Federated learning (FL) could alleviate this limitation by enabling cross-institutional training that leverages concept annotations distributed across multiple data owners. Yet, FL lacks interpretable modeling paradigms. Integrating CMs with FL is non-trivial: CMs assume a fixed concept space and a predefined model architecture, whereas real-world FL is heterogeneous and non-stationary, with institutions joining over time and bringing new supervision. In this work, we propose Federated Concept-based Models (F-CMs), a new methodology for deploying CMs in evolving FL settings. F-CMs aggregate concept-level information across institutions and efficiently adapt the model architecture in response to changes in the available concept supervision, while preserving institutional privacy. Empirically, F-CMs preserve the accuracy and intervention effectiveness of training settings with full concept supervision, while outperforming non-adaptive federated baselines. Notably, F-CMs enable interpretable inference on concepts not available to a given institution, a key novelty with respect to existing approaches.
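To make the idea of module-wise aggregation over an evolving concept space concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that each concept has its own parameter vector (a "module"), that clients send updates only for the concepts they can supervise, and that the server combines them with a sample-weighted average in the style of federated averaging. The names `aggregate_modules`, `global_modules`, and `client_updates` are hypothetical, and the dependency graph $\mathcal{G}_t$ is omitted for brevity.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical types: a module is a flat list of parameters for one concept;
# a client update maps concept name -> (local parameters, local sample count).
Module = List[float]
ClientUpdate = Dict[str, Tuple[Module, int]]


def aggregate_modules(
    global_modules: Dict[str, Module],
    client_updates: List[ClientUpdate],
) -> Tuple[Dict[str, Module], Set[str]]:
    """Sample-weighted average per concept module.

    Modules for concepts that no client updated are left unchanged.
    Concepts absent from the global model are added, which expands
    the shared concept space.
    """
    new_modules = dict(global_modules)
    touched: Set[str] = set()
    for update in client_updates:
        touched |= set(update.keys())

    for concept in sorted(touched):
        total = 0
        acc: List[float] = []
        for update in client_updates:
            if concept not in update:
                continue  # this client has no supervision for the concept
            params, n_samples = update[concept]
            if not acc:
                acc = [0.0] * len(params)
            for i, p in enumerate(params):
                acc[i] += n_samples * p
            total += n_samples
        new_modules[concept] = [a / total for a in acc]
    return new_modules, touched


# Toy example: two existing concepts; client B also brings a new concept "c3".
global_modules = {"c1": [0.0, 0.0], "c2": [1.0, 1.0]}
client_a: ClientUpdate = {"c1": ([1.0, 2.0], 10)}
client_b: ClientUpdate = {"c1": ([3.0, 4.0], 30), "c3": ([5.0, 6.0], 5)}

new_modules, touched = aggregate_modules(global_modules, [client_a, client_b])
print(new_modules)      # c1 is averaged, c2 is untouched, c3 is newly added
print(sorted(touched))  # only these modules changed in this round
```

In this toy round, only the modules for `c1` and `c3` are modified, while `c2` keeps its previous parameters. This mirrors, under the stated assumptions, the notion of updating only the modules affected by new supervision.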
