Tensorization is a powerful but underexplored tool for compression and interpretability of neural networks
Safa Hamreras, Sukhbinder Singh, Román Orús
TL;DR
The paper argues that scaling neural networks demands compression techniques that preserve performance and enhance interpretability. It advocates Tensorized Neural Networks (TNNs), which reshape dense weight matrices $W$ into higher-order tensors and approximate them with low-rank tensor networks (TNs) such as Matrix Product Operators (MPO, also called the tensor-train or TT format) and Tucker/CP decompositions. This reduces the parameter count and exposes latent bond indices. The paper highlights the stack view as a flexible design that yields multiple equivalent computation paths, accelerates the forward and backward passes for certain contraction orders, and opens new interpretability avenues via bond features and tensorized autoencoders. Finally, it outlines practical challenges (hardware/software support, inductive-bias characterization, hyperparameter complexity, and integration with other compression methods) and sketches a roadmap toward fully tensorized networks, with tensorized activations and nonlinearities, to advance scalable, trustworthy AI.
Abstract
Tensorizing a neural network involves reshaping some or all of its dense weight matrices into higher-order tensors and approximating them using low-rank tensor network decompositions. This technique has shown promise as a model compression strategy for large-scale neural networks. However, despite encouraging empirical results, tensorized neural networks (TNNs) remain underutilized in mainstream deep learning. In this position paper, we offer a perspective on both the potential and current limitations of TNNs. We argue that TNNs represent a powerful yet underexplored framework for deep learning, one that deserves greater attention from both engineering and theoretical communities. Beyond compression, we highlight the value of TNNs as a flexible class of architectures with distinctive scaling properties and increased interpretability. A central feature of TNNs is the presence of bond indices, which introduce new latent spaces not found in conventional networks. These internal representations may provide deeper insight into the evolution of features across layers, potentially advancing the goals of mechanistic interpretability. We conclude by outlining several key research directions aimed at overcoming the practical barriers to scaling and adopting TNNs in modern deep learning workflows.
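
To make the reshape-and-decompose step concrete, here is a minimal NumPy sketch that splits one dense weight matrix into two MPO cores joined by a single bond index, using a truncated SVD. The function names (`mpo_two_site`, `mpo_to_matrix`, `mpo_apply`), the two-core layout, and the chosen dimensions are illustrative assumptions, not the authors' implementation; practical TNNs typically use more cores and train the cores directly.

```python
import numpy as np


def mpo_two_site(W, out_dims, in_dims, bond_dim):
    """Split W of shape (o1*o2, i1*i2) into two cores sharing one bond index.

    Hypothetical helper for illustration: a two-core MPO via truncated SVD.
    """
    o1, o2 = out_dims
    i1, i2 = in_dims
    # Reshape the matrix into a 4th-order tensor with indices (o1, o2, i1, i2),
    # then group (o1, i1) against (o2, i2) so the SVD cuts between the two sites.
    T = W.reshape(o1, o2, i1, i2).transpose(0, 2, 1, 3).reshape(o1 * i1, o2 * i2)
    U, S, Vt = np.linalg.svd(T, full_matrices=False)
    r = min(bond_dim, len(S))  # truncated bond dimension
    A = (U[:, :r] * S[:r]).reshape(o1, i1, r)  # first core:  (o1, i1, bond)
    B = Vt[:r].reshape(r, o2, i2)              # second core: (bond, o2, i2)
    return A, B


def mpo_to_matrix(A, B):
    """Contract the bond index and restore the dense (o1*o2, i1*i2) matrix."""
    o1, i1, _ = A.shape
    _, o2, i2 = B.shape
    T = np.einsum("aib,bcj->acij", A, B)
    return T.reshape(o1 * o2, i1 * i2)


def mpo_apply(A, B, x):
    """Apply the tensorized layer to x without ever forming the dense matrix."""
    o1, i1, _ = A.shape
    _, o2, i2 = B.shape
    X = x.reshape(i1, i2)
    # Intermediate tensor that still carries the bond index b.
    h = np.einsum("bcj,ij->ibc", B, X)
    y = np.einsum("aib,ibc->ac", A, h)
    return y.reshape(o1 * o2)


rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
A, B = mpo_two_site(W, out_dims=(8, 8), in_dims=(8, 8), bond_dim=4)

# Parameter counts: dense matrix versus the two cores.
print("dense:", W.size, "tensorized:", A.size + B.size)

# Relative error of the rank-truncated approximation (large for a random W,
# which has no low-rank structure to exploit).
W_approx = mpo_to_matrix(A, B)
print("relative error:", np.linalg.norm(W - W_approx) / np.linalg.norm(W))
```

In this sketch the intermediate `h` inside `mpo_apply` is the quantity indexed by the bond, which is the kind of internal representation the abstract refers to as a new latent space. The bond dimension controls the trade-off: a larger value reproduces $W$ more faithfully at the cost of more parameters, and a matrix with exact Kronecker structure is recovered with bond dimension 1.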
