The Extrapolation Power of Implicit Models
Juliette Decugis, Alicia Y. Tsai, Max Emerling, Ashwin Ganesh, Laurent El Ghaoui
TL;DR
The paper studies the extrapolation capabilities of implicit models, which define hidden states via equilibrium equations and use closed-loop feedback to enable stable forward-backward information flow. It compares implicit models against non-implicit architectures across mathematical, time-series, and geospatial tasks to assess robustness to unobserved data. The authors show that the equilibrium-based representations adaptively grow in depth through iterative refinement and leverage closed-loop feedback to achieve superior out-of-distribution extrapolation. Predictions are given by $\hat{y} = Cx + Du$, where the hidden state $x$ solves the equilibrium equation $x = \phi(Ax + Bu)$ for an input $u$, an activation $\phi$, and learned matrices $A$, $B$, $C$, $D$. These findings suggest that implicit models can learn more general, task-agnostic representations, with practical advantages for real-world extrapolation and limited-data scenarios.
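The equilibrium equation $x = \phi(Ax + Bu)$ can be made concrete with a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: $\phi$ is taken to be ReLU, the weights are random rather than trained, the solver is plain fixed-point iteration, and $A$ is rescaled so that $\|A\|_\infty < 1$, which makes the ReLU iteration a contraction in the max norm and so guarantees a unique equilibrium. The function name `implicit_forward` and all sizes are hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def implicit_forward(A, B, C, D, u, tol=1e-8, max_iter=1000):
    """Solve x = relu(A x + B u) by fixed-point iteration, then return y_hat = C x + D u.

    Assumes ||A||_inf < 1 so that the iteration converges.
    Returns the prediction, the equilibrium state, and the iteration count.
    """
    x = np.zeros(A.shape[0])
    for k in range(1, max_iter + 1):
        x_next = relu(A @ x + B @ u)
        converged = np.max(np.abs(x_next - x)) < tol
        x = x_next
        if converged:
            break
    y_hat = C @ x + D @ u
    return y_hat, x, k

# Illustrative random weights (assumption: not trained, sizes chosen arbitrarily).
rng = np.random.default_rng(0)
n, p, q = 8, 3, 2  # hidden, input, and output dimensions
A = rng.standard_normal((n, n))
A *= 0.5 / np.linalg.norm(A, ord=np.inf)  # rescale so that ||A||_inf = 0.5
B = rng.standard_normal((n, p))
C = rng.standard_normal((q, n))
D = rng.standard_normal((q, p))
u = rng.standard_normal(p)

y_hat, x_star, n_iter = implicit_forward(A, B, C, D, u)

# How far x_star is from satisfying the equilibrium equation.
residual = np.max(np.abs(x_star - relu(A @ x_star + B @ u)))
```

The iteration count `n_iter` is not fixed in advance; it depends on the input and the tolerance. This is one way to read the claim that implicit models grow in depth adaptively through iterative refinement.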
Abstract
In this paper, we investigate the extrapolation capabilities of implicit deep learning models in handling unobserved data, where traditional deep neural networks may falter. Implicit models, distinguished by their adaptability in layer depth and incorporation of feedback within their computational graph, are put to the test across various extrapolation scenarios: out-of-distribution, geographical, and temporal shifts. Our experiments consistently demonstrate a significant performance advantage for implicit models. Unlike their non-implicit counterparts, which often rely on meticulous architectural design for each task, implicit models can learn complex model structures without task-specific design, highlighting their robustness in handling unseen data.
