Tensorization of neural networks for improved privacy and interpretability

José Ramón Pareja Monturiol; Alejandro Pozas-Kerstjens; David Pérez-García

TL;DR

The paper addresses privacy and interpretability challenges of neural networks by introducing TT-RSS (Tensor Train via Recursive Sketching from Samples), an algorithm that builds a TT representation from black-box function evaluations using a small set of pivots. The method extends to continuous functions, is related to TT-CI, and recovers the TT cores efficiently through a sequence of subroutines (SketchForming, Sketching, Trimming, SystemForming, Solving). The authors demonstrate TT-RSS on a privacy defense (Private-TT), on reconstructing the AKLT state to estimate order parameters of SPT phases, and as an initialization and compression method, and they report its performance on synthetic tensors and neural networks. In practice, the method yields tensorized representations that are more private, more interpretable, and well suited as initializations, with potential extensions to higher-dimensional data and broader TN layouts. Overall, TT-RSS offers a scalable way to bring tensor networks to machine learning while keeping the black-box convenience of NN models.

Abstract

We present a tensorization algorithm for constructing tensor train representations of functions, drawing on sketching and cross interpolation ideas. The method only requires black-box access to the target function and a small set of sample points defining the domain of interest. Thus, it is particularly well-suited for machine learning models, where the domain of interest is naturally defined by the training dataset. We show that this approach can be used to enhance the privacy and interpretability of neural network models. Specifically, we apply our decomposition to (i) obfuscate neural networks whose parameters encode patterns tied to the training data distribution, and (ii) estimate topological phases of matter that are easily accessible from the tensor train representation. Additionally, we show that this tensorization can serve as an efficient initialization method for optimizing tensor trains in general settings, and that, for model compression, our algorithm achieves a superior trade-off between memory and time complexity compared to conventional tensorization methods of neural networks.
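To make the objects in the abstract concrete, the following is a minimal sketch of what a tensor train representation of a discrete function looks like and how it can be compared against a black-box function on a set of sample points. It does not implement TT-RSS itself. The helper names (`tt_evaluate`, `random_tt`, `relative_error`), the core layout `(r_{k-1}, d_k, r_k)`, and the use of a relative Euclidean error are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def tt_evaluate(cores, x):
    """Evaluate a tensor train at one discrete point x = (x_1, ..., x_n).

    Assumed layout: cores[k] has shape (r_{k-1}, d_k, r_k), with r_0 = r_n = 1.
    """
    vec = np.ones((1,))
    for core, xk in zip(cores, x):
        # Select the matrix slice for index x_k and multiply it from the left.
        vec = vec @ core[:, xk, :]
    return float(vec[0])


def random_tt(dims, rank, seed=0):
    """Build a random tensor train with uniform internal rank (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    ranks = [1] + [rank] * (len(dims) - 1) + [1]
    return [
        rng.standard_normal((ranks[k], d, ranks[k + 1]))
        for k, d in enumerate(dims)
    ]


def relative_error(f, cores, points):
    """Relative Euclidean error between a black-box f and a TT on given points.

    This is a generic error measure, not necessarily the paper's definition.
    """
    exact = np.array([f(p) for p in points])
    approx = np.array([tt_evaluate(cores, p) for p in points])
    return np.linalg.norm(exact - approx) / np.linalg.norm(exact)
```

Evaluating one point costs a chain of small matrix-vector products, so the cost grows linearly with the number of variables rather than with the size of the full tensor. The function `f` passed to `relative_error` is only ever called on the supplied points, which mirrors the black-box access setting described above.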

Paper Structure (43 sections, 1 theorem, 59 equations, 9 figures, 1 table, 1 algorithm)

Introduction
Contributions
Organization
Prior work on tensorization
Cross interpolation
Sketching
Prior work on applications
Privacy
Interpretability
Notations
Description of the algorithm
Main idea
Details of subroutines
SketchForming
Sketching
...and 28 more sections

Key Result

Proposition 2.1

Let $f: [d_1] \times \cdots \times [d_n] \to \mathbb{R}$ be a discrete function that admits a TT representation with ranks $r_1, \ldots, r_{n-1}$, and define tensors $\Phi_k: [d_1] \times \cdots \times [d_k] \times [r_k] \to \mathbb{R}$ such that the column space of $\Phi_k(x_{1:k}, \alpha_k)$ is the same as [...] which we refer to as the Core Determining Equations (CDEs). Each of these equations has a unique so [...]
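For reference, the hypothesis that $f$ "admits a TT representation with ranks $r_1, \ldots, r_{n-1}$" can be written out in the standard tensor train form shown below. The core symbols $G_k$ and the boundary convention $r_0 = r_n = 1$ are notation assumed here for illustration and may differ from the paper's.

```latex
$$
f(x_1, \ldots, x_n)
  = \sum_{\alpha_1 = 1}^{r_1} \cdots \sum_{\alpha_{n-1} = 1}^{r_{n-1}}
    G_1(x_1, \alpha_1)\, G_2(\alpha_1, x_2, \alpha_2) \cdots
    G_{n-1}(\alpha_{n-2}, x_{n-1}, \alpha_{n-1})\, G_n(\alpha_{n-1}, x_n),
\qquad
G_k : [r_{k-1}] \times [d_k] \times [r_k] \to \mathbb{R}.
$$
```

In this form, storing $f$ takes $\sum_{k} r_{k-1} d_k r_k$ numbers instead of the $\prod_{k} d_k$ entries of the full tensor.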

Figures (9)

Figure 3.1: Performance of TT-RSS applied to different functions when varying the number of pivots $N$. From left to right, the columns show: the relative error on the $N$ pivots $\mathbf{x}$ used in TT-RSS, $R(\mathbf{x})$; the relative error on a set of $M$ test samples $\mathbf{s}$ from the functions' domains, $R(\mathbf{s})$; and the time (in seconds) required to perform the decompositions. For each value of $N$, the decomposition is performed 10 times. The figures display the mean values with error bars at $\pm 0.5 \sigma$, where $\sigma$ denotes the standard deviation. Additionally, three configurations with different numbers of variables, $n = 100$, $n = 200$, and $n = 500$, are shown in different colors.
Figure 3.2: Performance of TT-RSS applied to different NN models when varying the number of pivots $N$. From left to right, columns show: relative error on the $N$ pivots $\mathbf{x}$ used in TT-RSS, $R(\mathbf{x})$; relative error on a set of $M$ test samples $\mathbf{s}$ from the test sets, $R(\mathbf{s})$; percentage of classifications of the TT models that differ from those of the original NN models; and time in seconds taken to perform the decompositions. For each value of $N$, the decomposition is performed 10 times, displaying in the figures the mean values with error bars at $\pm 0.5 \sigma$, where $\sigma$ denotes the standard deviation. Also, three configurations are considered with different numbers of variables, $n=144$, $n=256$, and $n=400$, which are displayed using different colors.
Figure 3.3: Distribution of the accuracies of 50 tensorized models for each configuration, represented using box plots. Each configuration is made by varying one of the hyperparameters $d$, $r$, and $N$ from the baseline point $d=2$, $r=5$, $N=50$. The orange lines connect the medians across all the different values for a hyperparameter. The horizontal dashed black lines represent the original accuracy of the NN model being tensorized.
Figure 4.1: Accuracy of the different models evaluated on a test dataset consisting of 300 points, equally distributed across the 4 subgroups: English woman, Canadian woman, English man, and Canadian man. From top to bottom, the models are: the original NN models, TT models output by TT-RSS, and TT models re-trained for 10 epochs. Accuracies are measured for all percentages $q$ of English speakers present in the datasets.
Figure 4.2: Difference between the accuracy measured only on the pivots used in TT-RSS and the accuracy on test samples. On the left, accuracies are measured for TT models directly output from the TT-RSS decomposition. On the right, accuracies are for TT models after re-training them for 10 epochs on a small dataset containing the pivots. These differences are measured for all percentages $q$ of the presence of English speakers in the datasets. For each $q$, mean differences in accuracy across all models trained for that $q$ are shown, with error bars at $\pm 0.5 \sigma$, where $\sigma$ denotes the standard deviation.
...and 4 more figures

Theorems & Definitions (15)

Definition 1.1
Proposition 2.1
Remark 2.1
Remark 2.2
Remark 2.3
Remark 2.4
Remark 2.5
Remark 2.6
Remark 2.7
Remark 3.1
...and 5 more
