Compositionality Unlocks Deep Interpretable Models
Thomas Dooms, Ward Gauderis, Geraint A. Wiggins, Jose Oramas
TL;DR
The paper addresses the challenge of achieving mechanistic interpretability without sacrificing accuracy by introducing χ-nets, an architecture that merges tensor-network compositional structure with deep nonlinear layers. It presents the ODT algorithm (Orthogonalisation, Diagonalisation, Truncation) to extract low-rank, interpretable features and compress the model, with a per-layer complexity of $O(L \cdot h^4)$. Empirically, a 3-layer χ-net trained on SVHN attains ~85% test accuracy while allowing substantial dimension reduction (≈70%–90%) via truncation, and the weight-based interpretability is illustrated through atom-level and eigenvector analyses that reveal prototypical digits and edge-like features. These results demonstrate a principled pathway toward interpretable, compositional AI that can inform safety, reliability, and efficiency in neural models, with clear directions for scaling and broader applications.
Abstract
We propose $\chi$-net, an intrinsically interpretable architecture combining the compositional multilinear structure of tensor networks with the expressivity and efficiency of deep neural networks. $\chi$-nets match the accuracy of their baseline counterparts. Our novel, efficient diagonalisation algorithm, ODT, reveals linear low-rank structure in a multilayer SVHN model. We leverage this toward formal weight-based interpretability and model compression.
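To give a feel for the three steps that ODT is named after (orthogonalisation, diagonalisation, truncation), the following is a minimal toy sketch in NumPy. It is not the paper's algorithm: it applies the same three steps to a single symmetric low-rank matrix $M = A \,\mathrm{diag}(s)\, A^\top$, and the function name `low_rank_eigh`, the factor `A`, the weights `s`, and all sizes are assumptions made for illustration only.

```python
import numpy as np

def low_rank_eigh(A, s, keep):
    """Toy sketch: eigendecompose M = A @ diag(s) @ A.T via its factors, then truncate.

    Hypothetical helper for illustration; not the ODT implementation from the paper.
    """
    # Orthogonalise: A = Q @ R, where Q (h x r) has orthonormal columns.
    Q, R = np.linalg.qr(A)
    # Diagonalise the small symmetric core C = R @ diag(s) @ R.T (r x r).
    C = (R * s) @ R.T
    lam, U = np.linalg.eigh(C)
    # Truncate: keep the `keep` eigenvalues of largest magnitude.
    order = np.argsort(-np.abs(lam))[:keep]
    return lam[order], Q @ U[:, order]

rng = np.random.default_rng(0)
h, r = 64, 8                                   # assumed toy sizes
A = rng.normal(size=(h, r))                    # non-orthogonal factor
s = np.array([5.0, -4.0, 3.0, 0.5, 0.1, 0.05, 0.01, 0.001])
M = (A * s) @ A.T                              # dense matrix, built only for checking

# Keeping all r components reproduces M up to rounding error.
lam, V = low_rank_eigh(A, s, keep=r)
full_err = np.linalg.norm(M - (V * lam) @ V.T) / np.linalg.norm(M)

# Keeping only 3 components gives a compressed approximation.
lam_k, V_k = low_rank_eigh(A, s, keep=3)
trunc_err = np.linalg.norm(M - (V_k * lam_k) @ V_k.T) / np.linalg.norm(M)

print(f"relative error, full rank: {full_err:.2e}")
print(f"relative error, rank 3:    {trunc_err:.2e}")
```

In this toy setting the columns of `V_k` are orthonormal directions ranked by eigenvalue magnitude, which is the kind of object the eigenvector analyses refer to. How the paper extends these steps across several layers is not shown here.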
