Riddled basin geometry sets fundamental limits to predictability and reproducibility in deep learning

Andrew Ly; Pulin Gong

TL;DR

This work derives sufficient conditions for the emergence of riddled basins by analytically linking features widely observed in deep learning, including chaotic learning dynamics and symmetry-induced invariant subspaces, to reveal a general route to riddling in realistic deep networks.

Abstract

Fundamental limits to predictability are central to our understanding of many physical and computational systems. Here we show that, despite its remarkable capabilities, deep learning exhibits such fundamental limits rooted in the fractal, riddled geometry of its basins of attraction: any initialization that leads to one solution lies arbitrarily close to another that leads to a different one. We derive sufficient conditions for the emergence of riddled basins by analytically linking features widely observed in deep learning, including chaotic learning dynamics and symmetry-induced invariant subspaces, to reveal a general route to riddling in realistic deep networks. The resulting basins of attraction possess an infinitely fine-scale fractal structure characterized by an uncertainty exponent near zero, so that even large increases in the precision of initial conditions yield only marginal gains in outcome predictability. Riddling thus imposes a fundamental limit on the predictability and hence reproducibility of neural network training, providing a unified account of many empirical observations. These results reveal a general organizing principle of deep learning with important implications for optimization and the safe deployment of artificial intelligence.
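The practical meaning of an uncertainty exponent near zero can be illustrated with a short calculation. The sketch below assumes the standard scaling definition of the uncertainty exponent, in which the fraction f of initial conditions whose outcome changes under a perturbation of size eps scales as f(eps) ~ eps^alpha. The function name `uncertain_fraction_ratio` and the example values of alpha are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def uncertain_fraction_ratio(alpha: float, precision_gain: float) -> float:
    """Factor by which the uncertain fraction shrinks when the perturbation
    size eps is reduced by `precision_gain`, assuming f(eps) ~ eps**alpha.

    A return value close to 1 means the extra precision bought almost nothing.
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    if precision_gain <= 0:
        raise ValueError("precision_gain must be positive")
    # f(eps / g) / f(eps) = (eps / g)**alpha / eps**alpha = g**(-alpha)
    return precision_gain ** (-alpha)


if __name__ == "__main__":
    gain = 1e10  # initial conditions specified ten orders of magnitude more precisely
    # Illustrative exponents only: 1.0 (smooth boundary), 0.5 (fractal boundary),
    # 0.01 (near-zero exponent, as in a riddled basin).
    for alpha in (1.0, 0.5, 0.01):
        ratio = uncertain_fraction_ratio(alpha, gain)
        print(f"alpha = {alpha:<5} remaining uncertain fraction: {ratio:.3g} of original")
```

Under these assumptions, an exponent of 1 turns a ten-order-of-magnitude gain in precision into a ten-order-of-magnitude drop in uncertainty, whereas an exponent of 0.01 leaves roughly 79% of the original uncertain fraction in place.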
