SuperMAN: Interpretable and Expressive Networks over Temporally Sparse Heterogeneous Data
Maya Bechler-Speicher, Andrea Zerio, Maor Huri, Marie Vibeke Vestergaard, Ran Gilad-Bachrach, Tine Jess, Samir Bhatt, Aleksejs Sazonovs
TL;DR
SuperMAN is a framework for learning from sets of sparse, irregularly sampled temporal signals. It represents each signal type as an implicit graph and aggregates across the resulting set of graphs, optionally grouping related signals. Its ExtGNAN component enables multivariate processing within a signal group, while the additive structure preserves interpretability at the node, graph, and subset levels; grouping priors can increase expressivity when domain knowledge is available. The method achieves state-of-the-art results on high-stakes medical tasks (Crohn's disease onset, ICU length of stay) and on fake-news detection, and its interpretability analyses yield clinically meaningful insights such as phase-transition detection and system-level biomarker contributions. Theoretical results establish that SuperMAN is strictly more expressive than GNAN and that grouping increases expressivity; empirical results across domains, together with ablations, support the value of its components and interpretability commitments.
Abstract
Real-world temporal data often consists of multiple signal types recorded at irregular, asynchronous intervals. For instance, in the medical domain, different types of blood tests can be measured at different times and frequencies, resulting in fragmented and unevenly scattered temporal data. Similar irregular-sampling issues occur in other domains, such as monitoring large systems through event log files. Learning effectively from such data requires handling sets of temporally sparse, heterogeneous signals. In this work, we propose Super Mixing Additive Networks (SuperMAN), a novel, interpretable-by-design framework that learns directly from such heterogeneous signals by modeling them as sets of implicit graphs. SuperMAN provides diverse interpretability capabilities, including node-level, graph-level, and subset-level importance, and enables practitioners to trade finer-grained interpretability for greater expressivity when domain priors are available. SuperMAN achieves state-of-the-art performance on real-world high-stakes tasks, including predicting Crohn's disease onset and hospital length of stay from routine blood test measurements, and detecting fake news. Furthermore, we demonstrate how SuperMAN's interpretability properties help reveal phase transitions in disease development and provide crucial insights in the healthcare domain.
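To make the node-level, graph-level, and subset-level importance described above concrete, the following is a minimal toy sketch of an additive score over a set of irregularly sampled signals. It is not the authors' architecture: the exponential time decay, the tanh shape function, the function names, and the example measurements are all illustrative assumptions. It only shows why an additive structure lets contributions be read off at each level.

```python
import math
from typing import Dict, List, Tuple

# A signal is a list of (time, value) measurements. Each signal type may have
# its own length and timestamps (irregular, asynchronous sampling).
Signal = List[Tuple[float, float]]


def node_contributions(signal: Signal, t_pred: float, scale: float) -> List[float]:
    """Toy per-measurement (node) contribution: shape(value) * weight(time gap).

    Both functions are hypothetical stand-ins for learned components.
    """
    contributions = []
    for t, v in signal:
        weight = math.exp(-(t_pred - t) / scale)  # assumed temporal decay
        shape = math.tanh(v)                      # assumed value shape function
        contributions.append(weight * shape)
    return contributions


def explain(signals: Dict[str, Signal], t_pred: float, scale: float = 30.0):
    """Return the total score with its graph-level and node-level breakdown."""
    nodes = {name: node_contributions(s, t_pred, scale) for name, s in signals.items()}
    graphs = {name: sum(c) for name, c in nodes.items()}  # one term per signal type
    score = sum(graphs.values())                          # additive over the set
    return score, graphs, nodes


def subset_contribution(graphs: Dict[str, float], subset: List[str]) -> float:
    """Importance of a subset of signal types is the sum of their graph terms."""
    return sum(graphs[name] for name in subset)


# Hypothetical standardized blood-test values at irregular times (in days).
signals = {
    "crp": [(0.0, 1.2), (45.0, 2.5)],
    "hemoglobin": [(10.0, -0.3), (20.0, -0.8), (80.0, -1.1)],
}
score, graphs, nodes = explain(signals, t_pred=90.0)
```

Because the score is a plain sum, each measurement, each signal type, and any subset of signal types has a contribution that adds up exactly to the prediction. A model that mixes signals non-additively would not have this property, which is the trade-off between interpretability and expressivity that the abstract refers to.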
