Time-series attribution maps with regularized contrastive learning
Steffen Schneider, Rodrigo González Laiz, Anastasiia Filippova, Markus Frey, Mackenzie Weygandt Mathis
TL;DR
The paper addresses the lack of identifiability in gradient-based time-series attribution by introducing xCEBRA, a regularized contrastive learning framework that uses Inverted Neuron Gradient to recover the data-generating Jacobian $\boldsymbol{J}_{\boldsymbol{g}}$ up to a linear indeterminacy. It formalizes identifiability concepts for time-series attribution maps and proves two theorems: a goodness-of-fit result for the contrastive objective and an identifiability result for the Jacobian, linking the pseudo-inverse Jacobian $\boldsymbol{J}_{\boldsymbol{f}}^{+}(\boldsymbol{x})$ to the ground-truth map $\boldsymbol{A}_{\boldsymbol{g}}$. Through extensive synthetic and RatInABox neural-data experiments, the method consistently outperforms gradient-based and Shapley baselines and demonstrates reliable dimensionality identification and applicability to neural dynamics. The approach offers a principled, data-generating–centered framework for time-series attribution with potential for broad use in neuroscience and related domains. Key constructs include the ground-truth attribution map $\boldsymbol{A}_{\boldsymbol{g}}$ tied to $\boldsymbol{J}_{\boldsymbol{g}}$, subspace identifiability of encoders, and the inverted neuron gradient $\boldsymbol{J}_{\boldsymbol{f}}^{+}(\boldsymbol{x})$ as a practical attribution tool.
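To make the role of the pseudo-inverse Jacobian concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a toy linear mixing $\boldsymbol{x} = \boldsymbol{G}\boldsymbol{z}$ and a hand-built linear encoder that already inverts the mixing up to an invertible matrix $\boldsymbol{L}$; in xCEBRA the encoder would instead be learned with regularized contrastive learning. The names `numerical_jacobian`, `inverted_neuron_gradient`, `G`, `L`, and `W` are hypothetical and chosen only for illustration.

```python
import numpy as np


def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of f at x, shape (out_dim, in_dim)."""
    x = np.asarray(x, dtype=float)
    cols = []
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        cols.append((f(x + step) - f(x - step)) / (2.0 * eps))
    return np.stack(cols, axis=1)


def inverted_neuron_gradient(f, x):
    """Pseudo-inverse of the encoder Jacobian, shape (in_dim, out_dim)."""
    return np.linalg.pinv(numerical_jacobian(f, x))


# Assumption: a linear data-generating process x = G z, so J_g = G.
# The zero entries of G are the structure we want to recover.
G = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [0.0, 3.0],
              [0.0, 1.0]])

# Assumption: the linear indeterminacy L is a diagonal (scaling) matrix.
L = np.array([[2.0, 0.0],
              [0.0, -0.5]])

# Hand-built encoder f(x) = W x that recovers the latents up to L.
W = L @ np.linalg.pinv(G)


def encoder(x):
    return W @ x


x0 = G @ np.array([0.3, -1.2])  # one observation generated from a latent
attribution = inverted_neuron_gradient(encoder, x0)

# Up to L, the attribution equals the ground-truth Jacobian G.
recovered = attribution @ L
same_support = np.array_equal(np.abs(attribution) > 1e-6, G != 0)
```

In this toy setting `attribution @ L` matches `G`, and because `L` is diagonal the zero versus non-zero pattern of `attribution` coincides with that of `G`. A general invertible `L` would mix columns, which is why the conditions under which the indeterminacy preserves the attribution structure matter.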
Abstract
Gradient-based attribution methods aim to explain decisions of deep learning models but so far lack identifiability guarantees. Here, we propose a method to generate attribution maps with identifiability guarantees by developing a regularized contrastive learning algorithm trained on time-series data plus a new attribution method called Inverted Neuron Gradient (collectively named xCEBRA). We show theoretically that xCEBRA has favorable properties for identifying the Jacobian matrix of the data generating process. Empirically, we demonstrate robust approximation of zero vs. non-zero entries in the ground-truth attribution map on synthetic datasets, and significant improvements over previous attribution methods based on feature ablation, Shapley values, and other gradient-based methods. Our work constitutes a first example of identifiable inference of time-series attribution maps and opens avenues to a better understanding of time-series data, such as for neural dynamics and decision processes within neural networks.
