Towards a theory of out-of-distribution learning
Jayanta Dey, Ali Geisa, Ronak Mehta, Tyler M. Tomita, Hayden S. Helm, Haoyin Xu, Eric Eaton, Jeffery Dick, Carey E. Priebe, Joshua T. Vogelstein
TL;DR
This paper proposes a unified PAC-based framework for studying both in-distribution (ID) and out-of-distribution (OOD) learning, arguing that many OOD paradigms can be viewed as instances of generalized learnability. It introduces the concept of learning efficiency (LE) to quantify transfer and forgetting across ID, transfer, multitask, meta-, continual, and lifelong learning, and lays out formal definitions of weak, strong, and non-uniform learnability and of consistency in OOD settings. Theoretical results establish how strong learnability implies weaker variants and highlight impossibilities and caveats (e.g., weak but not strong OOD learnability), while zero-shot and few-shot settings are accommodated via LE without requiring large sample sizes. The framework also emphasizes computational constraints and advocates precise reporting of problem scope, with the aim of enabling principled comparisons and bridging machine learning with biological learning to advance toward more general, robust OOD capabilities. Overall, the work offers a flexible, hierarchical lens for quantifying learnability and transfer across a broad spectrum of learning paradigms, potentially informing future theory and practice in AI systems capable of robust out-of-distribution reasoning.
Abstract
Learning is a process wherein a learning agent enhances its performance through exposure to experience or data. Throughout this process, the agent may encounter diverse learning environments. For example, data may be presented to the learner all at once, in multiple batches, or sequentially. Furthermore, the data samples could be either independent and identically distributed (iid) or non-iid. Additionally, there may exist computational and space constraints on the deployment of the learning algorithms. The complexity of a learning task can vary significantly, depending on the learning setup and the constraints imposed upon it. However, the current literature lacks formal definitions for many of the in-distribution and out-of-distribution learning paradigms. Establishing proper and universally agreed-upon definitions for these learning setups is essential for thoroughly exploring the evolution of ideas across different learning scenarios and deriving generalized mathematical bounds for these learners. In this paper, we aim to address this issue by proposing a chronological approach to defining different learning tasks using the probably approximately correct (PAC) learning framework. We start with in-distribution learning and progress to the recently proposed lifelong or continual learning. We employ consistent terminology and notation to demonstrate how each of these learning frameworks represents a specific instance of a broader, more generalized concept of learnability. Our hope is that this work will inspire a universally agreed-upon approach to quantifying different types of learning, fostering greater understanding and progress in the field.
