Learnability in Online Kernel Selection with Memory Constraint via Data-dependent Regret Analysis

Junfan Li, Shizhong Liao

TL;DR

This work studies online kernel selection with a fixed memory budget, addressing the gap between worst-case regret and learnability by introducing data-dependent regret bounds that depend on kernel alignment $\mathcal{A}_{T,\kappa_i}$ and cumulative losses $L_T(f)$. It proposes a buffer-based algorithmic framework that reduces online kernel selection to prediction with expert advice using adaptive sampling and strategic removal of examples, yielding two specialized algorithms: M-OMD-H for hinge loss and M-OMD-S for smooth losses. Theoretical results establish sub-linear regret under sub-linear data complexities, with tight upper and matching lower bounds for smooth losses, demonstrating learnability within memory $\mathcal{R}=O(\log T)$ when the data are favorable. Empirical validation on benchmark datasets confirms improved performance under memory constraints, illustrating practical feasibility for resource-limited devices. Overall, the paper provides a principled data-dependent perspective on memory-regret trade-offs in online kernel selection and proposes practical, memory-aware learning strategies.

Abstract

Online kernel selection is a fundamental problem of online kernel methods. In this paper, we study online kernel selection with a memory constraint, in which the memory of the kernel selection and online prediction procedures is limited to a fixed budget. An essential question is: what is the intrinsic relationship among online learnability, memory constraint, and data complexity? To answer the question, it is necessary to show the trade-offs between regret and memory constraint. Previous work gives a worst-case lower bound depending on the data size, and shows that learning is impossible within a small memory constraint. In contrast, we present distinct results by offering data-dependent upper bounds that rely on two data complexities: kernel alignment and the cumulative losses of the competitive hypothesis. We propose an algorithmic framework giving data-dependent upper bounds for two types of loss functions. For the hinge loss function, our algorithm achieves an expected upper bound depending on kernel alignment. For smooth loss functions, our algorithm achieves a high-probability upper bound depending on the cumulative losses of the competitive hypothesis. We also prove a matching lower bound for smooth loss functions. Our results show that if the two data complexities are sub-linear, then learning is possible within a small memory constraint. Our algorithmic framework depends on a new buffer-maintaining framework and a reduction from online kernel selection to prediction with expert advice. Finally, we empirically verify the prediction performance of our algorithms on benchmark datasets.
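
To make the two ingredients named in the abstract more concrete (a bounded buffer per kernel, and a reduction to prediction with expert advice), the following is a minimal Python sketch. It is not M-OMD-H or M-OMD-S: the Hedge-style weight update, the "drop the older half" removal rule, the Gaussian kernels, the toy data stream, and every name and parameter (`BudgetedKernelLearner`, `run_kernel_selection`, `budget`, `eta`, `lr`) are assumptions chosen only for illustration.

```python
import math
import random


def gaussian_kernel(sigma):
    """Return a Gaussian kernel function with bandwidth sigma."""
    def k(x, z):
        sq = sum((a - b) ** 2 for a, b in zip(x, z))
        return math.exp(-sq / (2.0 * sigma ** 2))
    return k


def hinge(y, score):
    return max(0.0, 1.0 - y * score)


class BudgetedKernelLearner:
    """Hinge-loss kernel learner whose buffer never exceeds `budget` examples."""

    def __init__(self, kernel, budget, lr=0.5):
        self.kernel, self.budget, self.lr = kernel, budget, lr
        self.buffer = []  # list of (coefficient, example) pairs

    def score(self, x):
        return sum(a * self.kernel(z, x) for a, z in self.buffer)

    def update(self, x, y):
        if hinge(y, self.score(x)) > 0.0:
            if len(self.buffer) >= self.budget:
                # Illustrative removal rule (an assumption, not the paper's):
                # discard the older half of the buffer to free space.
                self.buffer = self.buffer[(len(self.buffer) + 1) // 2:]
            self.buffer.append((self.lr * y, x))


def softmax(log_w):
    m = max(log_w)
    w = [math.exp(v - m) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def run_kernel_selection(stream, sigmas, budget, eta=0.5, seed=0):
    """Treat each kernel as an expert and select one per round with Hedge."""
    rng = random.Random(seed)
    learners = [BudgetedKernelLearner(gaussian_kernel(s), budget) for s in sigmas]
    log_w = [0.0] * len(sigmas)
    total_loss = 0.0
    for x, y in stream:
        probs = softmax(log_w)
        chosen = rng.choices(range(len(sigmas)), weights=probs)[0]
        scores = [learner.score(x) for learner in learners]
        total_loss += hinge(y, scores[chosen])
        for j, learner in enumerate(learners):
            log_w[j] -= eta * hinge(y, scores[j])  # exponential-weights update
            learner.update(x, y)
    return softmax(log_w), total_loss, learners


def make_toy_stream(n, seed=0):
    """Hypothetical 2-D stream: the label is the sign of the first coordinate."""
    rng = random.Random(seed)
    points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    return [(p, 1 if p[0] >= 0 else -1) for p in points]


if __name__ == "__main__":
    probs, loss, learners = run_kernel_selection(
        make_toy_stream(200, seed=1), sigmas=[0.1, 1.0, 10.0], budget=8)
    print("final kernel weights:", probs)
    print("cumulative hinge loss:", loss)
    print("buffer sizes:", [len(l.buffer) for l in learners])
```

In this sketch the total memory is the number of kernels times `budget`, so the memory constraint is enforced per kernel; how the paper divides its budget among kernels and chooses which examples to remove is not reproduced here.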

Paper Structure (27 sections, 14 theorems, 130 equations, 3 tables, 2 algorithms)

Key Result

Lemma 1

Let $M > 1$ and $\alpha\mathcal{R} := B \geq 2M(1+\ln{T})$. For any $\mathcal{I}_T$, the expected number of times that M-OMD-H executes the removal operation on $S_i$ is at most $\left\lceil\frac{4K\tilde{\mathcal{A}}_{T,\kappa_i}}{Bk_1}\right\rceil$, in which

Theorems & Definitions (29)

  • Definition 1: Memory Budget (Li2022Worst)
  • Definition 2: Online Learnability
  • Lemma 1
  • Theorem 1
  • Theorem 2: Algorithm-dependent Bound
  • Definition 3
  • Lemma 2
  • Theorem 3
  • Theorem 4: Lower Bound
  • Theorem 5: Algorithm-dependent Bound
  • ...and 19 more