Provable Bounds for Learning Some Deep Representations
Sanjeev Arora, Aditya Bhaskara, Rong Ge, Tengyu Ma
TL;DR
This work gives provable guarantees for learning a broad class of deep neural networks under a random, sparse generative model. It introduces a layerwise unsupervised learning framework that uses correlations among observed features to recover inter-layer edges and then applies graph-recovery techniques to reconstruct the network. The algorithm runs in polynomial time, with sample complexity that is quadratic or cubic depending on the details of the model, for both discrete and real-weight variants. The authors show that adjacent layers form denoising autoencoders, and they prove that depth cannot in general be collapsed into a single layer. These results offer theoretical support for the weight-tying and reversibility assumptions common in practice and point toward provable deep net learning under randomness assumptions.
Abstract
We give algorithms with provable guarantees that learn a class of deep nets in the generative model view popularized by Hinton and others. Our generative model is an $n$-node multilayer neural net that has degree at most $n^\gamma$ for some $\gamma<1$, and each edge has a random edge weight in $[-1,1]$. Our algorithm learns almost all networks in this class with polynomial running time. The sample complexity is quadratic or cubic depending upon the details of the model. The algorithm uses layerwise learning. It is based upon a novel idea of observing correlations among features and using these to infer the underlying edge structure via a global graph recovery procedure. The analysis of the algorithm reveals interesting structure of neural networks with random edge weights.
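To make the correlation idea concrete, the following is a minimal sketch of a single random sparse layer and the pairwise co-occurrence statistic it induces. It is not the paper's algorithm: the sign-threshold activation, the fixed number of active hidden nodes per sample, the constant degree, and all function names and parameter values are assumptions chosen for illustration, and the global graph recovery step is not implemented.

```python
import random
from itertools import combinations


def random_sparse_layer(n_hidden, n_obs, degree, rng):
    """Give each hidden node `degree` random observed children, weights in [-1, 1]."""
    edges = {}
    for h in range(n_hidden):
        children = rng.sample(range(n_obs), degree)
        edges[h] = {c: rng.uniform(-1.0, 1.0) for c in children}
    return edges


def sample_observed(edges, n_obs, n_active, rng):
    """Activate `n_active` hidden nodes; an observed node fires if its total input > 0.

    The threshold-at-zero rule is an assumption of this sketch.
    """
    active = rng.sample(sorted(edges), n_active)
    total = [0.0] * n_obs
    for h in active:
        for c, w in edges[h].items():
            total[c] += w
    return [1 if t > 0 else 0 for t in total]


def cooccurrence_counts(samples):
    """Count how often each pair (i, j), i < j, of observed nodes fires together."""
    counts = {}
    for y in samples:
        on = [i for i, v in enumerate(y) if v]
        for pair in combinations(on, 2):
            counts[pair] = counts.get(pair, 0) + 1
    return counts


def shares_parent(edges, i, j):
    """True if some hidden node has edges to both observed nodes i and j."""
    return any(i in ch and j in ch for ch in edges.values())


rng = random.Random(0)
edges = random_sparse_layer(n_hidden=20, n_obs=40, degree=4, rng=rng)
samples = [sample_observed(edges, 40, n_active=2, rng=rng) for _ in range(2000)]
counts = cooccurrence_counts(samples)

# Inspect the most frequently co-firing pairs and whether they share a hidden parent.
top = sorted(counts, key=counts.get, reverse=True)[:10]
print([(pair, shares_parent(edges, *pair)) for pair in top])
```

In this toy setting, pairs of observed nodes that fire together often tend to share a hidden parent, which is the kind of signal a graph recovery procedure could use to infer the edge structure of the layer.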
