Robust Feature Learning for Multi-Index Models in High Dimensions

Alireza Mousavi-Hosseini, Adel Javanmard, Murat A. Erdogdu

TL;DR

The paper addresses robust learning for high-dimensional data when the target depends on a low-dimensional projection (a multi-index model). It shows that under $\ell_2$ perturbations, the Bayes-optimal robust projection aligns with the standard low-dimensional subspace defined by the target directions $\boldsymbol{U}$, provided a mild independence assumption holds. A two-layer neural-network framework is proposed, where an oracle recovers $\boldsymbol{U}$ and a robust readout is trained on the projected, low-dimensional representation, yielding sample complexity that does not scale with the ambient dimension $d$. The work provides polynomial- and SFL/DFL-based guarantees, along with numerical experiments indicating practical benefits of pre-learning the latent subspace before adversarial tuning, and outlines open questions on extending to other perturbation models and deeper architectures.

Abstract

Recently, there have been numerous studies on feature learning with neural networks, specifically on learning single- and multi-index models where the target is a function of a low-dimensional projection of the input. Prior works have shown that in high dimensions, the majority of the compute and data resources are spent on recovering the low-dimensional projection; once this subspace is recovered, the remainder of the target can be learned independently of the ambient dimension. However, implications of feature learning in adversarial settings remain unexplored. In this work, we take the first steps towards understanding adversarially robust feature learning with neural networks. Specifically, we prove that the hidden directions of a multi-index model offer a Bayes optimal low-dimensional projection for robustness against $\ell_2$-bounded adversarial perturbations under the squared loss, assuming that the multi-index coordinates are statistically independent from the rest of the coordinates. Therefore, robust learning can be achieved by first performing standard feature learning, then robustly tuning a linear readout layer on top of the standard representations. In particular, we show that adversarially robust learning is just as easy as standard learning. Specifically, the additional number of samples needed to robustly learn multi-index models when compared to standard learning does not depend on dimensionality.
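The two-stage recipe in the abstract (recover the subspace, then robustly tune a readout on the projected features) can be illustrated with a deliberately simplified sketch. The assumptions here are ours, not the paper's: $\boldsymbol{U}$ is handed over by an oracle and has orthonormal rows, the readout is linear in $\boldsymbol{z} = \boldsymbol{U}\boldsymbol{x}$ rather than a two-layer network, and all function names are hypothetical. For a linear predictor with weight $\boldsymbol{w}$, the worst-case squared loss over $\|\boldsymbol{\delta}\|_2 \le \varepsilon$ has the closed form $(|\boldsymbol{w}^\top \boldsymbol{x} - y| + \varepsilon \|\boldsymbol{w}\|_2)^2$, which the sketch minimizes over the $k$-dimensional readout only.

```python
import numpy as np

def orthonormal_rows(k, d, rng):
    """Return a k x d matrix with orthonormal rows (stand-in for an oracle-provided U)."""
    q, _ = np.linalg.qr(rng.standard_normal((d, k)))  # d x k with orthonormal columns
    return q.T

def robust_sq_loss_linear(a, U, x, y, eps):
    """Worst-case squared loss of f(x) = a^T U x over perturbations with ||delta||_2 <= eps.

    With w = U^T a, the inner maximum equals (|w^T x - y| + eps * ||w||_2)^2.
    """
    resid = np.abs(x @ U.T @ a - y)
    return (resid + eps * np.linalg.norm(U.T @ a)) ** 2

def fit_robust_readout(Z, y, eps, lr=0.05, steps=2000):
    """Subgradient descent on the mean robust loss over the readout a only.

    Z = X U^T holds the fixed k-dimensional features, so no step touches dimension d.
    Assumes U has orthonormal rows, so that ||U^T a||_2 = ||a||_2.
    """
    n, k = Z.shape
    a = np.zeros(k)
    for _ in range(steps):
        r = Z @ a - y
        norm_a = np.linalg.norm(a)
        margin = np.abs(r) + eps * norm_a
        unit_a = a / norm_a if norm_a > 0 else np.zeros(k)
        # Subgradient of (|r| + eps * ||a||)^2 with respect to a, averaged over samples.
        grad = 2.0 * np.mean(
            margin[:, None] * (np.sign(r)[:, None] * Z + eps * unit_a), axis=0
        )
        a -= lr * grad
    return a
```

Once `Z` is computed, the optimization runs entirely in $\mathbb{R}^k$, which is the intuition behind the claim that the extra cost of robustness need not grow with $d$. The sketch makes no attempt to reproduce the paper's algorithms or guarantees.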

Paper Structure

This paper contains 21 sections, 14 theorems, 40 equations, 1 figure, 3 algorithms.

Key Result

Theorem 1

Suppose the independence assumption (assump:indep) holds and the adversarial risk minimization problem (eq:adv_risk_min) admits a minimizer. Then there exists a function $f^* : \mathbb{R}^d \to \mathbb{R}$ of the form $f^*(\boldsymbol{x}) = h(\boldsymbol{U}\boldsymbol{x})$, with $h : \mathbb{R}^k \to \mathbb{R}$ given by $h(\boldsymbol{z}) = \ldots$ [...] with equality when $f^* \in \mathcal{F}$.
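For orientation, the quantities in Theorem 1 can be written out under stated assumptions. This excerpt does not reproduce eq:adv_risk_min, so the following is only the standard $\ell_2$ adversarial risk under the squared loss suggested by the abstract. The symbol $\mathrm{AR}$ is our shorthand, $\varepsilon$ is the adversary budget, and $\boldsymbol{U} \in \mathbb{R}^{k \times d}$ is assumed to have orthonormal rows:

$$
\mathrm{AR}(f) = \mathbb{E}\Big[\max_{\|\boldsymbol{\delta}\|_2 \le \varepsilon} \big(f(\boldsymbol{x} + \boldsymbol{\delta}) - y\big)^2\Big].
$$

For a predictor of the form $f(\boldsymbol{x}) = h(\boldsymbol{U}\boldsymbol{x})$, the orthonormality assumption gives $\{\boldsymbol{U}\boldsymbol{\delta} : \|\boldsymbol{\delta}\|_2 \le \varepsilon\} = \{\boldsymbol{\xi} \in \mathbb{R}^k : \|\boldsymbol{\xi}\|_2 \le \varepsilon\}$, so the inner maximization collapses to $k$ dimensions:

$$
\max_{\|\boldsymbol{\delta}\|_2 \le \varepsilon} \big(h(\boldsymbol{U}\boldsymbol{x} + \boldsymbol{U}\boldsymbol{\delta}) - y\big)^2
= \max_{\boldsymbol{\xi} \in \mathbb{R}^k,\ \|\boldsymbol{\xi}\|_2 \le \varepsilon} \big(h(\boldsymbol{U}\boldsymbol{x} + \boldsymbol{\xi}) - y\big)^2.
$$

This identity only shows why predictors of this form face a low-dimensional adversary. The optimality of such predictors is the content of the theorem itself.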

Figures (1)

  • Figure 1: Adversarial test error of a two-layer ReLU network as a function of the number of adversarial training iterations. Each iteration uses a batch of 300 independent samples, except for He2 with unknown direction, which uses 500 samples to reduce variance. Full AD training performs adversarial training on all layers from random initialization. SD training is standard training, which provides a better initialization for $W$ before performing adversarial training. We use the adversary budget $\varepsilon=1$ for all experiments, each of which is averaged over three runs.
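The caption does not say how adversarial examples are generated during training. A common choice, assumed here purely for illustration, is projected gradient ascent on the input within an $\ell_2$ ball of radius $\varepsilon$. The sketch below applies it to a two-layer ReLU network $f(\boldsymbol{x}) = \boldsymbol{a}^\top \mathrm{ReLU}(W\boldsymbol{x} + \boldsymbol{b})$ under the squared loss. The step size, iteration count, bias term, and all function names are hypothetical and not taken from the paper.

```python
import numpy as np

def two_layer_relu(x, W, b, a):
    """f(x) = a^T relu(W x + b) for a single input x of shape (d,)."""
    return a @ np.maximum(W @ x + b, 0.0)

def input_gradient(x, y, W, b, a):
    """Gradient of the squared loss (f(x) - y)^2 with respect to the input x."""
    pre = W @ x + b
    active = (pre > 0).astype(float)  # ReLU derivative (taken as 0 at the kink)
    f = a @ np.maximum(pre, 0.0)
    return 2.0 * (f - y) * (W.T @ (a * active))

def project_l2(delta, eps):
    """Project delta onto the l2 ball of radius eps."""
    norm = np.linalg.norm(delta)
    return delta if norm <= eps else delta * (eps / norm)

def pgd_l2(x, y, W, b, a, eps=1.0, step=0.25, iters=20):
    """Normalized projected gradient ascent on the input.

    Returns the best perturbation seen, including zero, so the attacked
    loss is never below the clean loss.
    """
    def loss(d):
        return (two_layer_relu(x + d, W, b, a) - y) ** 2

    delta = np.zeros_like(x)
    best, best_loss = delta.copy(), loss(delta)
    for _ in range(iters):
        g = input_gradient(x + delta, y, W, b, a)
        g_norm = np.linalg.norm(g)
        if g_norm == 0.0:
            break  # flat region: no ascent direction available
        delta = project_l2(delta + step * g / g_norm, eps)
        cur = loss(delta)
        if cur > best_loss:
            best, best_loss = delta.copy(), cur
    return best
```

In an adversarial training loop, each batch would be perturbed this way before the parameter update. With "SD" initialization as described in the caption, $W$ would first be trained on clean data before such updates begin.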

Theorems & Definitions (16)

  • Theorem 1
  • Definition 2: DFL
  • Definition 3: SFL
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • Theorem 7
  • Proposition 8: lee2024neural
  • Corollary 9
  • Proposition 10: damian2022neural
  • ...and 6 more