Supervised Learning with Evolving Tasks and Performance Guarantees

Verónica Álvarez, Santiago Mazuelas, Jose A. Lozano

TL;DR

This work addresses learning across sequences of evolving classification tasks, unifying batch and online settings, namely multi-source domain adaptation (MDA), multi-task learning (MTL), supervised classification under concept drift (SCD), and continual learning (CL), under a single robust framework. It introduces minimax risk classifiers whose uncertainty sets are defined by expectations of a feature mapping, and develops forward (Kalman-filter-like) and backward (Rauch-Tung-Striebel smoother-like) recursions that estimate each task's mean vector and its uncertainty while accounting for multidimensional changes between tasks. Theoretical results characterize the gain in effective sample size (ESS) and provide computable, tight performance guarantees on the error probability for each task; the approach also extends to higher-order dependencies among tasks. Experiments on multiple benchmark datasets support the multidimensional adaptation and the guarantees, showing consistent improvements over state-of-the-art baselines in both batch and online scenarios. The methodology thus offers a principled way to transfer information across evolving tasks with rigorous performance guarantees.
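
The forward and backward recursions mentioned above can be illustrated with a minimal sketch. This is not the paper's exact recursion: it assumes a single scalar feature whose task mean follows a random walk with expected quadratic change `d`, and that each task supplies a sample mean with known variance `obs_var` (that is, sigma^2 / n). The function name `forward_backward` and all parameter names are hypothetical. Under those assumptions, the forward pass is a standard Kalman filter and the backward pass is a standard Rauch-Tung-Striebel smoother.

```python
import numpy as np

def forward_backward(sample_means, obs_var, d):
    """Scalar Kalman filter + RTS smoother for a random-walk task mean.

    sample_means: (k,) array, sample mean of the feature for each task
    obs_var:      variance of each sample mean (assumed sigma^2 / n)
    d:            expected quadratic change between consecutive tasks
    """
    k = len(sample_means)
    fwd_mean = np.empty(k)
    fwd_var = np.empty(k)

    # Forward pass: task j uses its own samples plus tasks 1..j-1.
    fwd_mean[0] = sample_means[0]
    fwd_var[0] = obs_var
    for j in range(1, k):
        pred_var = fwd_var[j - 1] + d            # uncertainty grows by d
        gain = pred_var / (pred_var + obs_var)   # Kalman gain
        fwd_mean[j] = fwd_mean[j - 1] + gain * (sample_means[j] - fwd_mean[j - 1])
        fwd_var[j] = (1.0 - gain) * pred_var

    # Backward pass: task j additionally uses tasks j+1..k.
    sm_mean = fwd_mean.copy()
    sm_var = fwd_var.copy()
    for j in range(k - 2, -1, -1):
        pred_var = fwd_var[j] + d
        c = fwd_var[j] / pred_var                # smoother gain
        sm_mean[j] = fwd_mean[j] + c * (sm_mean[j + 1] - fwd_mean[j])
        sm_var[j] = fwd_var[j] + c ** 2 * (sm_var[j + 1] - pred_var)

    return fwd_mean, fwd_var, sm_mean, sm_var

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_means = np.cumsum(rng.normal(0.0, 0.1, size=10))    # slowly evolving tasks
    observed = true_means + rng.normal(0.0, 0.5, size=10)    # noisy sample means
    _, fwd_var, _, sm_var = forward_backward(observed, obs_var=0.25, d=0.01)
    print("forward variances: ", np.round(fwd_var, 4))
    print("smoothed variances:", np.round(sm_var, 4))
```

In this sketch the smoothed variance never exceeds the forward variance, which mirrors the claim that backward learning adds information from later tasks. Handling a multidimensional feature mapping, as the paper does, would require one such recursion per component or a vector-valued version.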

Abstract

Multiple supervised learning scenarios are composed of a sequence of classification tasks. For instance, multi-task learning and continual learning aim to learn a sequence of tasks that is either fixed or grows over time. Existing techniques for learning a sequence of tasks are tailored to specific scenarios and do not adapt to others. In addition, most existing techniques consider situations in which the order of the tasks in the sequence is not relevant. However, tasks in a sequence are commonly evolving, in the sense that consecutive tasks often have a higher similarity. This paper presents a learning methodology that is applicable to multiple supervised learning scenarios and adapts to evolving tasks. Unlike existing techniques, we provide computable tight performance guarantees and analytically characterize the increase in the effective sample size. Experiments on benchmark datasets show the performance improvement of the proposed methodology in multiple scenarios and the reliability of the presented performance guarantees.

Paper Structure (35 sections, 7 theorems, 84 equations, 12 figures, 6 tables, 5 algorithms)


Key Result

Theorem 1

Let $\boldsymbol{\sigma}_j^2$ and $\boldsymbol{d}_j$ in recursions eq:tau_general-eq:gain_general be the actual variance and the actual expected quadratic change between tasks, that is, $\boldsymbol{\sigma}_j^2=[\mathbb{V}\text{ar}_{\mathrm{p}_j}\{\Phi^{(1)}(x,y)\}, \ldots$

Figures (12)

  • Figure 1: Multiple supervised learning scenarios are composed of a sequence of tasks. Tasks are commonly evolving, in the sense that consecutive tasks often have a higher similarity (e.g., gender classification in pictures of people with similar ages). The proposed methodology is applicable to multiple supervised learning scenarios and adapts to evolving tasks.
  • Figure 2: In common practical scenarios, tasks are evolving and the changes between tasks are multidimensional.
  • Figure 3: The proposed methodology obtains an uncertainty set $\mathcal{U}_j^k$ for each $j$-th task using the sample set $D_j$, the adjacent uncertainty sets $\mathcal{U}_{j-1}^k, \mathcal{U}_{j+1}^k$, and the changes between consecutive tasks $\boldsymbol{d}_j, \boldsymbol{d}_{j+1}$. The uncertainty set is then used to obtain the classification rule $\mathrm{h}_j^k$ together with the minimax risk $R(\mathcal{U}_j^k)$, which directly gives performance guarantees.
  • Figure 4: The ESS provided by forward learning increases significantly with the number of tasks, especially for small values of the sample size $n$ and the expected quadratic change $d$.
  • Figure 5: The ESS provided by forward and backward learning (red and orange lines) increases significantly in comparison with forward learning alone (green line), especially for small values of the sample size $n$ and the expected quadratic change $d$.
  • ...and 7 more figures
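
The trend described in the captions of Figures 4 and 5 can be reproduced qualitatively with a small sketch. It rests on assumptions that are not taken from the paper: a scalar feature, a random-walk task mean, and an ESS defined as the number of i.i.d. samples from a single task that would give the same variance for the mean estimate (sigma^2 divided by the forward variance). The function `forward_ess` and its parameters are hypothetical names.

```python
def forward_ess(n, sigma2, d, k):
    """ESS of forward learning for tasks 1..k under a scalar random-walk model.

    n:      samples per task
    sigma2: variance of the feature within a task
    d:      expected quadratic change between consecutive tasks
    k:      number of tasks
    Assumed definition: ESS_j = sigma2 / (forward variance of task j's mean).
    """
    obs_var = sigma2 / n           # variance of a single task's sample mean
    var = obs_var                  # first task only has its own samples
    ess = [sigma2 / var]
    for _ in range(1, k):
        pred_var = var + d         # information from earlier tasks decays by d
        var = pred_var * obs_var / (pred_var + obs_var)  # combine with new samples
        ess.append(sigma2 / var)
    return ess

if __name__ == "__main__":
    for d in (0.0, 0.01, 1.0):
        print(d, [round(e, 1) for e in forward_ess(n=10, sigma2=1.0, d=d, k=5)])
```

Under these assumptions the ESS equals `j * n` when `d = 0` (all tasks are identical), and it stays close to `n` when `d` is large relative to `sigma2 / n`, which is consistent with the gain being largest for small `n` and small `d`.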

Theorems & Definitions (7)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • Theorem 7