On the Dataless Training of Neural Networks
Alvaro Velasquez, Susmit Jha, Ismail R. Alkhouri
TL;DR
The paper tackles training-data-free optimization by defining dataless neural networks (dNNs), which optimize over a single problem instance. The setting is formalized by an empty training set, $D_{dNN}=\emptyset$, with the problem embedded through $G\times c \to \theta$ and the loss $\mathcal{L}$. It develops a taxonomy separating architecture-agnostic (loss-embedded) from architecture-specific (architecture-embedded) methods, and situates dNNs relative to zero-shot/one-shot learning and lifting. It surveys applications across linear and quadratic programs, NP-hard graph problems, SAT, inverse imaging, and PDEs, highlighting representative approaches such as PI-GNN, DIP, SIREN, and differentiable schedulers, and discusses theoretical and practical connections to lifting. Overall, dNNs offer a promising data-free optimization pathway for data-scarce scientific domains, but face challenges in scalability, convergence, and generalization that warrant further theoretical and empirical exploration.
Abstract
This paper surveys studies on the use of neural networks for optimization in the training-data-free setting. Specifically, we examine the dataless application of neural network architectures in optimization by re-parameterizing problems using fully connected (or MLP), convolutional, graph, and quadratic neural networks. Although MLPs were used to solve linear programs a few decades ago, this approach has recently gained increasing attention due to its promising results across diverse applications, including those based on combinatorial optimization, inverse problems, and partial differential equations. The motivation for this setting stems from two key (possibly overlapping) factors: (i) data-driven learning approaches are still underdeveloped and have yet to demonstrate strong results, as seen in combinatorial optimization, and (ii) the availability of training data is inherently limited, such as in medical image reconstruction and other scientific applications. In this paper, we define the dataless setting and categorize it into two variants based on how a problem instance, defined by a single datum, is encoded in the neural network: (i) architecture-agnostic methods and (ii) architecture-specific methods. Additionally, we discuss similarities and clarify distinctions between the dataless neural network (dNN) setting and related concepts such as zero-shot learning, one-shot learning, lifting in optimization, and over-parameterization.
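To make the dataless setting concrete, the following is a minimal sketch of an architecture-agnostic (loss-embedded) formulation, in which a single graph instance is encoded entirely in the loss and no training set is used. It is an illustration under stated assumptions, not a method taken from the surveyed works: the problem (maximum independent set), the penalized relaxation $-\sum_i x_i + \gamma \sum_{(i,j)\in E} x_i x_j$, the sigmoid parameterization $x=\sigma(\theta)$, the repair step, and all names (`solve_mis_dataless`, `gamma`, `lr`, `steps`) are choices made here for exposition.

```python
import numpy as np


def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))


def solve_mis_dataless(edges, n, gamma=2.0, lr=0.5, steps=500, seed=0):
    """Optimize trainable parameters theta on ONE graph instance (no dataset).

    Assumed loss (illustrative): L(x) = -sum_i x_i + gamma * sum_{(i,j) in E} x_i x_j,
    with x = sigmoid(theta) acting as a one-layer 'network' output.
    """
    # The problem instance enters only through the adjacency matrix used in the loss.
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0

    rng = np.random.default_rng(seed)
    theta = 0.01 * rng.standard_normal(n)  # small random init to break ties

    for _ in range(steps):
        x = sigmoid(theta)
        grad_x = -1.0 + gamma * (A @ x)        # dL/dx
        theta -= lr * grad_x * x * (1.0 - x)   # chain rule through the sigmoid

    chosen = sigmoid(theta) > 0.5  # round the relaxed solution

    # Assumed repair step: if rounding leaves both endpoints of an edge selected,
    # drop one of them so that the returned set is always independent.
    for i, j in edges:
        if chosen[i] and chosen[j]:
            chosen[j] = False
    return [int(v) for v in np.flatnonzero(chosen)]


def is_independent(nodes, edges):
    selected = set(nodes)
    return not any(i in selected and j in selected for i, j in edges)


# Single problem instance: the path graph 0 - 1 - 2 - 3.
edges = [(0, 1), (1, 2), (2, 3)]
solution = solve_mis_dataless(edges, n=4)
print(solution, is_independent(solution, edges))
```

Here the "training" loop is simply the solver: the parameters are fit to one instance and then discarded, so nothing is expected to generalize to other graphs. An architecture-specific variant would instead encode the graph in the network's structure (for example, its connectivity) rather than in the loss.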
