Learning from Snapshots of Discrete and Continuous Data Streams
Pramith Devulapalli, Steve Hanneke
TL;DR
The paper develops a theoretical framework for online learning from continuous data streams. It formalizes two querying settings, update-and-deploy and blind-prediction, and introduces pattern classes that capture time- and data-dependent structure. In update-and-deploy, every concept class with finite Littlestone dimension $LD(H)$ is learnable with a non-adaptive uniform sampling strategy, yielding a mistake-bound proportional to $LD(H)$; in blind-prediction, by contrast, non-trivial concept classes are unlearnable under linear querying. The results also reveal a gap for pattern classes: adaptive learners are necessary to learn certain continuous pattern classes. For discrete data streams, the paper develops a theory of learning pattern classes under blind-prediction via the query-learning distance $QLD$, linking $M_Q(\mathcal{P})$ to $QLD(\mathcal{P},Q)$ and presenting a constructive BP-SOA algorithm. Overall, the work connects Littlestone-dimension-based learnability, pattern-class complexity, and adaptive querying in both continuous and discrete online-learning regimes, clarifying when learning is possible under restricted feedback.
Abstract
Imagine a smart camera trap selectively clicking pictures to understand animal movement patterns within a particular habitat. These "snapshots", or pieces of data captured from a data stream at adaptively chosen times, provide a glimpse of different animal movements unfolding through time. Learning a continuous-time process through snapshots, such as those taken by smart camera traps, is a central theme governing a wide array of online learning situations. In this paper, we adopt a learning-theoretic perspective in understanding the fundamental nature of learning different classes of functions from both discrete data streams and continuous data streams. In our first framework, the \emph{update-and-deploy} setting, a learning algorithm discretely queries from a process to update a predictor that makes predictions given the data stream as input. We construct a uniform sampling algorithm that can learn with bounded error any concept class with finite Littlestone dimension. Our second framework, known as the \emph{blind-prediction} setting, consists of a learning algorithm generating predictions independently of observing the process, only engaging with the process when it chooses to make queries. Interestingly, we show a stark contrast in learnability where non-trivial concept classes are unlearnable. However, we show that adaptive learning algorithms are necessary to learn sets of time-dependent and data-dependent functions, called pattern classes, in either framework. Finally, we develop a theory of pattern classes under discrete data streams for the blind-prediction setting.
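To make the update-and-deploy idea concrete, the following is a minimal toy sketch, not the paper's construction. It assumes a finite class of threshold functions, a realizable target, and a learner that queries the stream at uniformly spaced times and updates its predictor with the classical Halving rule (swapped in here for simplicity). It only counts mistakes at query times, whereas the paper measures error over the continuous stream. All names (`make_thresholds`, `run_uniform_sampling`, the sine-wave stream) are hypothetical and chosen for illustration.

```python
import math
from typing import Callable, List, Tuple

Hypothesis = Callable[[float], int]


def make_thresholds(cutoffs: List[float]) -> List[Hypothesis]:
    """Build threshold classifiers h_c(x) = 1 if x >= c else 0."""
    return [lambda x, c=c: int(x >= c) for c in cutoffs]


def majority_vote(version_space: List[Hypothesis], x: float) -> int:
    """Deployed predictor: majority label of the surviving hypotheses (ties -> 1)."""
    ones = sum(h(x) for h in version_space)
    return int(2 * ones >= len(version_space))


def run_uniform_sampling(
    hypotheses: List[Hypothesis],
    target: Hypothesis,
    stream: Callable[[float], float],
    horizon: float,
    num_queries: int,
) -> Tuple[int, int]:
    """Query the stream at num_queries + 1 evenly spaced times in [0, horizon].

    At each query the learner sees (x(t), y(t)), records whether the currently
    deployed predictor was wrong, and then discards inconsistent hypotheses.
    Returns (mistakes at query times, size of the final version space).
    """
    version_space = list(hypotheses)
    mistakes = 0
    for k in range(num_queries + 1):
        t = horizon * k / num_queries  # non-adaptive, uniform query schedule
        x = stream(t)
        y = target(x)  # label revealed only because the learner queried
        if majority_vote(version_space, x) != y:
            mistakes += 1
        # Update step: keep only hypotheses consistent with the snapshot.
        version_space = [h for h in version_space if h(x) == y]
    return mistakes, len(version_space)


if __name__ == "__main__":
    hypotheses = make_thresholds([i / 8 for i in range(8)])
    target = hypotheses[5]  # realizable case: the target lies in the class
    stream = lambda t: 0.5 + 0.5 * math.sin(t)  # toy continuous-time process
    mistakes, remaining = run_uniform_sampling(hypotheses, target, stream, 10.0, 100)
    print(f"query-time mistakes: {mistakes}, surviving hypotheses: {remaining}")
```

Because each query-time mistake removes at least half of the version space, the number of such mistakes in this toy is at most $\log_2 |H|$, which upper-bounds the Littlestone dimension of a finite class. The schedule is fixed in advance, so the learner is non-adaptive in the sense used above.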
