Compositionality Unlocks Deep Interpretable Models

Thomas Dooms; Ward Gauderis; Geraint A. Wiggins; Jose Oramas

TL;DR

The paper tackles mechanistic interpretability without sacrificing accuracy by introducing χ-nets, an architecture that combines the compositional structure of tensor networks with deep nonlinear layers. It presents the ODT algorithm (Orthogonalisation, Diagonalisation, Truncation), which extracts low-rank, interpretable features and compresses the model, with a per-layer complexity of $O(L \cdot h^4)$. Empirically, a 3-layer χ-net trained on SVHN reaches ~85% test accuracy while tolerating a substantial dimension reduction (≈70%–90%) through truncation. Weight-based interpretability is illustrated with atom-level and eigenvector analyses that reveal prototypical digits and edge-like features. These results point to a principled route toward interpretable, compositional models that can inform safety, reliability, and efficiency, with clear directions for scaling and broader applications.

Abstract

We propose $χ$-net, an intrinsically interpretable architecture that combines the compositional multilinear structure of tensor networks with the expressivity and efficiency of deep neural networks. $χ$-nets match the accuracy of their baseline counterparts. Our novel, efficient diagonalisation algorithm, ODT, reveals linear low-rank structure in a multilayer SVHN model. We use this structure for formal weight-based interpretability and model compression.
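
To make the idea of diagonalising and then truncating concrete, the sketch below applies it to a single symmetric matrix. This is not the paper's ODT algorithm, which this page does not spell out. It only illustrates the generic pattern of eigendecomposing a symmetric form and keeping the components with the largest eigenvalue magnitudes. The helper `truncate_symmetric`, the matrix `Q`, and all sizes are hypothetical choices made for illustration.

```python
import numpy as np

def truncate_symmetric(Q, k):
    """Return the k eigenpairs of symmetric Q with the largest |eigenvalue|.

    Hypothetical helper for illustration; not taken from the paper.
    """
    eigvals, eigvecs = np.linalg.eigh(Q)        # Q = V diag(w) V^T
    keep = np.argsort(-np.abs(eigvals))[:k]     # rank by magnitude, not by sign
    return eigvals[keep], eigvecs[:, keep]

rng = np.random.default_rng(0)

# Assumed toy data: a rank-4 symmetric matrix with mixed-sign spectrum,
# plus a small symmetric perturbation.
U = rng.standard_normal((32, 4))
signs = np.array([1.0, -1.0, 1.0, -1.0])
noise = rng.standard_normal((32, 32))
Q = (U * signs) @ U.T + 1e-3 * (noise + noise.T) / 2

w, V = truncate_symmetric(Q, k=4)
Q_k = (V * w) @ V.T                             # rank-4 reconstruction
rel_err = np.linalg.norm(Q - Q_k) / np.linalg.norm(Q)
print(f"kept {V.shape[1]} of {Q.shape[0]} directions, relative error {rel_err:.2e}")
```

Ranking by absolute eigenvalue matters here because a symmetric form can have strongly negative directions that carry as much structure as the positive ones. The retained eigenvectors are the objects one would then inspect, in the spirit of the eigenvector analyses mentioned in the TL;DR.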
