
Adaptive Hyperparameter Optimization for Continual Learning Scenarios

Rudy Semola, Julio Hurtado, Vincenzo Lomonaco, Davide Bacciu

TL;DR

Hyperparameter choice is crucial for continual learning but impractical to optimize across all tasks in real time. The authors introduce an adaptive HPO framework that uses fANOVA to identify the most impactful hyperparameters and tunes only a crucial subset per task, warm-starting from prior solutions. Empirical results across multiple class- and domain-incremental benchmarks show that this approach improves final stream accuracy with substantial efficiency gains and enhanced robustness to task order. The work advances practical, automated, and scalable CL systems by integrating task-aware hyperparameter adaptation into the training loop.

Abstract

Hyperparameter selection in continual learning scenarios is a challenging and underexplored problem, especially in practical non-stationary environments. Traditional approaches, such as grid search with held-out validation data from all tasks, are unrealistic for building accurate lifelong learning systems. This paper explores the role of hyperparameter selection in continual learning and the need to tune hyperparameters continually and automatically according to the complexity of the task at hand. To that end, we propose leveraging the sequential nature of task learning to improve Hyperparameter Optimization efficiency. Using techniques based on functional analysis of variance (fANOVA), we identify the hyperparameters that have the greatest impact on performance. We demonstrate empirically that this approach, agnostic to continual scenarios and strategies, speeds up hyperparameter optimization continually across tasks and remains robust under varying sequential task orders. We believe that our findings can contribute to the advancement of continual learning methodologies towards more efficient, robust and adaptable models for real-world applications.
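
The loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the search space, the toy `evaluate` objective, the random-search tuner, and the names `adaptive_hpo` and `main_effect_importance` are all hypothetical. In particular, the importance score is a crude main-effects stand-in (binned between-group variance over total variance), not a full fANOVA.

```python
import random
import statistics

# Hypothetical search space (name -> (low, high)); not taken from the paper.
SPACE = {"lr": (0.0, 1.0), "weight_decay": (0.0, 1.0), "momentum": (0.0, 1.0)}


def sample(space, rng, fixed=None):
    """Draw a random configuration, keeping any `fixed` values unchanged."""
    config = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
    config.update(fixed or {})
    return config


def main_effect_importance(trials, space, bins=4):
    """Crude stand-in for fANOVA: fraction of score variance explained by
    the binned marginal mean of each hyperparameter (main effects only)."""
    scores = [score for _, score in trials]
    mean = statistics.fmean(scores)
    total = statistics.pvariance(scores)
    importance = {}
    for name, (lo, hi) in space.items():
        buckets = [[] for _ in range(bins)]
        for config, score in trials:
            idx = min(int((config[name] - lo) / (hi - lo) * bins), bins - 1)
            buckets[idx].append(score)
        between = sum(
            len(b) * (statistics.fmean(b) - mean) ** 2 for b in buckets if b
        ) / len(scores)
        importance[name] = between / total if total > 0 else 0.0
    return importance


def adaptive_hpo(tasks, evaluate, space, n_trials=40, top_k=1, seed=0):
    """Full search on the first task, then tune only the top-k most
    important hyperparameters on later tasks, warm-starting from the
    previous best configuration."""
    rng = random.Random(seed)
    best, tuned, history = None, list(space), []
    for index, task in enumerate(tasks):
        if index == 0:
            candidates = [sample(space, rng) for _ in range(n_trials)]
            trials = [(c, evaluate(task, c)) for c in candidates]
            imp = main_effect_importance(trials, space)
            tuned = sorted(imp, key=imp.get, reverse=True)[:top_k]
        else:
            # Freeze everything outside the important subset.
            frozen = {k: v for k, v in best.items() if k not in tuned}
            # Warm start: the previous best is always a candidate.
            candidates = [dict(best)]
            candidates += [sample(space, rng, frozen) for _ in range(n_trials // 4)]
            trials = [(c, evaluate(task, c)) for c in candidates]
        best = max(trials, key=lambda pair: pair[1])[0]
        history.append(best)
    return tuned, history


def evaluate(task_optimum, config):
    """Toy objective: the score depends almost entirely on `lr`."""
    return -(config["lr"] - task_optimum) ** 2 - 0.001 * config["momentum"]


# Each "task" is represented only by its (hypothetical) optimal learning rate.
tuned, history = adaptive_hpo([0.3, 0.5, 0.7], evaluate, SPACE)
print("tuned subset:", tuned)
for step, config in enumerate(history):
    print(step, {k: round(v, 3) for k, v in config.items()})
```

In this sketch the later tasks spend a quarter of the first task's trial budget, which is where the efficiency gain would come from. How the budget is actually split, and which tuner is used, are design choices the summary above does not fix.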

Paper Structure

This paper contains 30 sections, 3 equations, 6 figures, 2 tables, 1 algorithm.

Figures (6)

  • Figure 1: Importance of hyperparameters averaged over the four continual learning benchmarks. We report the results for the ER and DER continual strategies. The performance variability, estimated by fANOVA, is largely caused by a few hyperparameters, which define a subspace to which we can restrict the configuration search space.
  • Figure 2: Importance values for incremental per-task learning using ER alone and ER combined with SI and LwF, with the AdamW optimizer and TPE as the hyperparameter optimization (HPO) method. The results on the Online benchmarks show that a small set of hyperparameters is responsible for most of the variation in performance. The learning rate plays an important role in online incremental scenarios.
  • Figure 3: Importance values for incremental per-task learning using ER alone and ER combined with SI and LwF, with the AdamW optimizer and TPE as the hyperparameter optimization (HPO) method. The results on the Batch benchmarks show that a small set of hyperparameters is responsible for most of the variation in performance.
  • Figure 4: Time demand of executing training and optimization jointly on Tiny (over the whole sequence); [$\downarrow$] lower is better.
  • Figure 5: Time demand of executing training and optimization jointly on CORe50; [$\downarrow$] lower is better.
  • ...and 1 more figures
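
For context on the importance values in the captions above, the standard fANOVA formulation can be written as below. The notation is an assumption taken from the usual presentation of fANOVA and may differ from the paper's own equations: $f$ is the performance as a function of a configuration $\theta$ over hyperparameters $N = \{1, \dots, n\}$, and $\Theta_U$ is the domain of the subset $U$.

$$
f(\theta) = \sum_{U \subseteq N} f_U(\theta_U),
\qquad
\mathbb{V} = \sum_{\emptyset \neq U \subseteq N} \mathbb{V}_U,
\qquad
\mathbb{V}_U = \frac{1}{\lVert \Theta_U \rVert} \int f_U(\theta_U)^2 \, d\theta_U
$$

$$
\mathbb{F}_U = \frac{\mathbb{V}_U}{\mathbb{V}}
$$

Here $f_\emptyset$ is the mean performance and $\mathbb{F}_U$ is the fraction of performance variance attributed to the subset $U$. A hyperparameter counts as important when its single-variable term $\mathbb{F}_{\{i\}}$ is large, which matches the captions' observation that a few hyperparameters account for most of the variation.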