High-dimensional Asymptotics of Denoising Autoencoders
Hugo Cui, Lenka Zdeborová
TL;DR
The paper analyzes the denoising of data drawn from a high-dimensional Gaussian mixture using a two-layer denoising autoencoder (DAE) with tied weights and a skip connection. Applying the replica method under a replica-symmetric (RS) ansatz, it derives sharp closed-form expressions for the denoising test MSE and related metrics as functions of the sample-to-dimension ratio $\alpha$, the noise level $\Delta$, and the architecture hyperparameters, reducing the problem to a finite set of equations on summary statistics. The analysis shows that the full DAE with a skip connection can outperform the bottleneck-only autoencoder, which behaves much like PCA, and that the skip connection and the bottleneck play complementary roles. The theoretical predictions agree with experiments on synthetic Gaussian mixtures and on real datasets such as MNIST and FashionMNIST, which suggests a form of Gaussian universality for this denoising task. The results give theoretical guidance for designing shallow nonlinear denoisers that improve on PCA baselines, and provide a framework for exact, tractable analysis of how architectural choices in DAEs affect denoising performance in high dimensions.
Abstract
We address the problem of denoising data from a Gaussian mixture using a two-layer non-linear autoencoder with tied weights and a skip connection. We consider the high-dimensional limit where the number of training samples and the input dimension jointly tend to infinity while the number of hidden units remains bounded. We provide closed-form expressions for the denoising mean-squared test error. Building on this result, we quantitatively characterize the advantage of the considered architecture over the autoencoder without the skip connection, which is closely related to principal component analysis. We further show that our results accurately capture the learning curves on a range of real data sets.
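To make the setting concrete, the following is a minimal NumPy sketch of a two-layer autoencoder with tied weights and a scalar skip connection, evaluated on noisy samples from a Gaussian mixture. Several details are assumptions for illustration and are not stated in the text above: the variance-preserving corruption $\tilde{x} = \sqrt{1-\Delta}\,x + \sqrt{\Delta}\,\xi$, the $1/\sqrt{d}$ scalings, the `tanh` activation, the isotropic equal-weight clusters, the per-dimension normalization of the MSE, and all function names.

```python
import numpy as np


def sample_mixture(n, means, sigma, rng):
    """Draw n points from an equal-weight isotropic Gaussian mixture.

    means has shape (K, d); sigma is the per-coordinate standard deviation.
    (Equal weights and isotropic clusters are simplifying assumptions.)
    """
    labels = rng.integers(len(means), size=n)
    return means[labels] + sigma * rng.standard_normal((n, means.shape[1]))


def corrupt(x, delta, rng):
    """Assumed noise model: sqrt(1 - delta) * x + sqrt(delta) * Gaussian noise."""
    return np.sqrt(1.0 - delta) * x + np.sqrt(delta) * rng.standard_normal(x.shape)


def dae_forward(x_noisy, w, b, activation=np.tanh):
    """Tied-weight two-layer autoencoder plus skip connection.

    w has shape (p, d) with p hidden units; the same w encodes and decodes.
    b is the scalar strength of the skip connection.
    """
    d = x_noisy.shape[1]
    hidden = activation(x_noisy @ w.T / np.sqrt(d))  # bottleneck, shape (n, p)
    return b * x_noisy + hidden @ w / np.sqrt(d)     # skip term + decoded term


def denoising_mse(x_clean, x_noisy, w, b):
    """Mean squared reconstruction error of the clean sample, per dimension."""
    d = x_clean.shape[1]
    residual = x_clean - dae_forward(x_noisy, w, b)
    return np.mean(np.sum(residual ** 2, axis=1)) / d
```

Setting `b = 0` removes the skip connection and leaves only the bottleneck term, which is the PCA-like baseline the abstract compares against; setting `w` to zero leaves only the rescaling `b * x_noisy`.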
