Neural Networks Perform Sufficient Dimension Reduction

Shuntuo Xu; Zhou Yu

TL;DR

This work establishes a rigorous link between neural networks and sufficient dimension reduction (SDR) in regression. It shows that, under rank regularization of the first layer, the network learns a projection $B_0^{\top}x$ that captures the central mean subspace; at the population level, $\Pi_{\mathcal{T}(f^*)}=\Pi_{B_0}$. The paper proves both population-level unbiasedness (Theorem 1) and sample-level consistency (Theorem 2): the estimator converges as $n$ grows, provided the network depth and width scale suitably, and without strong distributional assumptions on $x$. The authors validate the theory through simulations and a real-data study on Seoul weather data, where the neural-network estimator performs competitively with, or better than, classical SDR methods. They also discuss an extension to the central subspace via kernel-based strategies, which suggests that neural networks may be useful for SDR beyond the mean subspace.
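For readers less familiar with the notation, the objects in the summary can be written out as follows. The definition of the central mean subspace is the standard one from the SDR literature. Reading $\mathcal{T}(f^*)$ as the first-layer weight matrix of the population-level minimizer $f^*$ is an assumption inferred from the summary, not a definition quoted from the paper.

$$
\mathbb{E}[y \mid x] = \mathbb{E}[y \mid B_0^{\top}x],
\qquad
\Pi_{B} = B\,(B^{\top}B)^{+}B^{\top}.
$$

Here the columns of $B_0$ span the central mean subspace, that is, the smallest subspace for which the first identity holds, and $(\cdot)^{+}$ is the Moore-Penrose pseudoinverse, which reduces to the ordinary inverse when $B$ has full column rank. Because $\Pi_B$ depends only on the column space of $B$, the statement $\Pi_{\mathcal{T}(f^*)}=\Pi_{B_0}$ says that the two matrices span the same subspace, not that they are equal entry by entry.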

Abstract

This paper investigates the connection between neural networks and sufficient dimension reduction (SDR), demonstrating that neural networks inherently perform SDR in regression tasks under appropriate rank regularization. Specifically, the weights in the first layer span the central mean subspace. We establish the statistical consistency of the neural network-based estimator for the central mean subspace, underscoring the suitability of neural networks for SDR problems. Numerical experiments further validate our theoretical findings and highlight the ability of neural networks to perform SDR relative to existing methods. Additionally, we discuss an extension to recovering the central subspace, broadening the scope of our investigation.
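Since the claim is about the span of the first-layer weights and not about the weights themselves, any comparison between an estimate and a target has to be invariant to the choice of basis. The sketch below illustrates one common way to do this, the Frobenius distance between orthogonal projection matrices. It is an illustration only: the function names and the toy matrices are hypothetical, and the paper may use a different evaluation metric.

```python
import math

def gram_schmidt(cols):
    """Orthonormalize a list of column vectors (each a list of floats)."""
    basis = []
    for v in cols:
        w = list(v)
        for q in basis:
            coef = sum(wi * qi for wi, qi in zip(w, q))
            w = [wi - coef * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > 1e-12:  # skip vectors already in the span
            basis.append([wi / norm for wi in w])
    return basis

def projection(cols):
    """Orthogonal projection matrix onto the span of the given columns."""
    q = gram_schmidt(cols)
    p = len(cols[0])
    return [[sum(qk[i] * qk[j] for qk in q) for j in range(p)]
            for i in range(p)]

def projection_distance(cols_a, cols_b):
    """Frobenius norm of the difference between the two projections."""
    pa, pb = projection(cols_a), projection(cols_b)
    p = len(pa)
    return math.sqrt(sum((pa[i][j] - pb[i][j]) ** 2
                         for i in range(p) for j in range(p)))

# Toy matrices in R^3, each given as a list of columns (hypothetical data).
b_true = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]    # spans e1, e2
b_mixed = [[2.0, 1.0, 0.0], [1.0, -3.0, 0.0]]  # same span, different basis
b_other = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]   # spans e1, e3

d_same = projection_distance(b_true, b_mixed)  # approximately 0
d_diff = projection_distance(b_true, b_other)  # sqrt(2), about 1.414

print(d_same, d_diff)
```

The first distance is zero up to floating-point error even though `b_true` and `b_mixed` look nothing alike entry by entry, which is why results of this kind are stated in terms of $\Pi_B$ instead of $B$.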
