HyperbolicLR: Epoch insensitive learning rate scheduler

Tae-Geun Kim

TL;DR

This work tackles epoch sensitivity in learning rate scheduling by introducing HyperbolicLR and ExpHyperbolicLR, two schedulers anchored in hyperbolic curve properties to stabilize early learning-rate changes across different epoch counts $N$. The authors provide explicit formulations, analyze their theoretical properties, and compare them against common schedulers, including PolynomialLR and CosineAnnealingLR, using three diverse tasks: image classification on CIFAR-10, time-series forecasting with oscillations, and operator learning with DeepONet/TraONet. Through a two-stage experimental protocol involving hyperparameter optimization at 50 epochs and evaluation up to 200 epochs, the study demonstrates that HyperbolicLR and ExpHyperbolicLR exhibit superior learning-curve stability (low smoothed learning-curve differences) and stronger, more consistent performance gains (via power-regression metrics and $R^2$), particularly as training duration increases. The results suggest that hyperbolic-based schedulers offer robust, resource-efficient optimization, with ExpHyperbolicLR delivering especially stable behavior in settings prone to overfitting, while the authors acknowledge scenario-dependent scheduler choice and call for broader validation and theoretical development.
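The power-regression metric mentioned above can be illustrated with a small sketch. It assumes the metric is an ordinary least-squares fit of $y = a x^b$ in log-log space with $R^2$ computed on the log-transformed values; the summary does not spell out the exact definition, so the function name `fit_power_law` and the synthetic numbers are purely illustrative.

```python
import math

def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b in log-log space; returns (a, b, r2)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx                      # exponent of the power law
    log_a = my - b * mx                # intercept in log space
    ss_res = sum((v - (log_a + b * u)) ** 2 for u, v in zip(lx, ly))
    ss_tot = sum((v - my) ** 2 for v in ly)
    r2 = 1.0 - ss_res / ss_tot         # R^2 measured in log space
    return math.exp(log_a), b, r2

# Synthetic values generated from an exact power law (NOT results from the paper)
epochs = [50, 100, 150, 200]
losses = [0.5 * n ** -0.4 for n in epochs]
a, b, r2 = fit_power_law(epochs, losses)
print(f"a={a:.3f}, b={b:.3f}, R^2={r2:.6f}")
```

Under this reading, a more negative exponent `b` would indicate stronger improvement as the epoch count grows, and an $R^2$ close to 1 would indicate that the improvement is consistent across epoch settings.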

Abstract

This study proposes two novel learning rate schedulers -- Hyperbolic Learning Rate Scheduler (HyperbolicLR) and Exponential Hyperbolic Learning Rate Scheduler (ExpHyperbolicLR) -- to address the epoch sensitivity problem that often causes inconsistent learning curves in conventional methods. By leveraging the asymptotic behavior of hyperbolic curves, the proposed schedulers maintain more stable learning curves across varying epoch settings. Specifically, HyperbolicLR applies this property directly in the epoch-learning rate space, while ExpHyperbolicLR extends it to an exponential space. We first determine optimal hyperparameters for each scheduler on a small number of epochs, fix these hyperparameters, and then evaluate performance as the number of epochs increases. Experimental results on various deep learning tasks (e.g., image classification, time series forecasting, and operator learning) demonstrate that both HyperbolicLR and ExpHyperbolicLR achieve more consistent performance improvements than conventional schedulers as training duration grows. These findings suggest that our hyperbolic-based schedulers offer a more robust and efficient approach to deep network optimization, particularly in scenarios constrained by computational resources or time.
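As a rough illustration of the idea, the sketch below builds a learning-rate schedule from a hyperbola whose asymptote does not depend on the total epoch count $N$. The abstract does not give the formula, so the function `h`, the way it is combined with `lr_init` and `lr_inf`, and all names here are assumptions chosen to be consistent with the description, not necessarily the authors' exact definition.

```python
import math

def h(n, N, U):
    # ASSUMED hyperbolic branch: equals 0 at n = N, asymptote slope -1/U for every N
    return math.sqrt((N - n) / U * (2 - (N + n) / U))

def hyperbolic_lr(n, N, U, lr_init, lr_inf):
    # ASSUMED schedule: starts at lr_init; reaches lr_inf exactly only when N == U
    return lr_init + (lr_init - lr_inf) * (h(n, N, U) - h(0, N, U))

U, lr_init, lr_inf = 1000, 1.0, 1e-4   # upper bound on epochs, initial LR, infimum LR
totals = [250, 500, 750, 1000]         # different total epoch counts N

# Learning rate at the same early epoch for each choice of N
n = 100
hyp_at_n = [hyperbolic_lr(n, N, U, lr_init, lr_inf) for N in totals]
hyp_spread = max(hyp_at_n) - min(hyp_at_n)
print([round(v, 4) for v in hyp_at_n], round(hyp_spread, 4))
```

With these settings the learning rate at epoch 100 changes only slightly when $N$ is quadrupled, which is the behavior the abstract describes as epoch insensitivity.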

Paper Structure

This paper contains 41 sections, 4 theorems, 25 equations, 9 figures, 5 tables.

Key Result

Proposition 1.1

Let $h$ be a function defined by (equation not shown). Then the graph $\{(n, h(n; N, U)) \mid 0 \leq n \leq N < U\}$ represents part of a hyperbolic curve.
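To make the statement concrete, here is a short worked check for one candidate function. The specific form of $h$ below is an assumption (the defining equation is not reproduced in this text); it is simply one function on $0 \leq n \leq N < U$ for which the claim can be verified directly.

```latex
% Assumed form of h (not taken from the text above):
\[
  h(n; N, U) = \sqrt{\frac{N - n}{U}\left(2 - \frac{N + n}{U}\right)}
\]
% Write y = h(n; N, U) and square both sides:
\begin{align*}
  U^2 y^2 &= (N - n)(2U - N - n) \\
          &= \bigl[(U - n) - (U - N)\bigr]\bigl[(U - n) + (U - N)\bigr] \\
          &= (U - n)^2 - (U - N)^2 .
\end{align*}
% Dividing by (U - N)^2 gives the standard form of a hyperbola:
\[
  \frac{(n - U)^2}{(U - N)^2} - \frac{y^2}{\bigl((U - N)/U\bigr)^2} = 1 ,
\]
% centred at (U, 0) with a vertex at (N, 0) and asymptotes
\[
  y = \pm \frac{n - U}{U} .
\]
```

Under this assumption the asymptote slope is $\mp 1/U$, which does not depend on $N$; changing the total epoch count moves the vertex but leaves the asymptote fixed.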

Figures (9)

  • Figure 1: Comparison of learning curves for CosineAnnealingLR (a) and ExpHyperbolicLR (b) in a time series forecasting task. Blue solid curves represent training with 50 epochs, while red dashed curves represent training with 100 epochs. Note the significant decoupling in CosineAnnealingLR after 20 epochs, compared to the more consistent behavior of ExpHyperbolicLR.
  • Figure 2: Comparison of learning rate schedules for different total epochs ($N = 250,\,500,\,750,\,1000$). The schedulers shown are: (a) PolynomialLR ($p = 0.5$), (b) CosineAnnealingLR ($\eta_\text{min} = 10^{-4}$), (c) HyperbolicLR ($\eta_\text{inf} = 10^{-4},\,U = 1000$), and (d) ExpHyperbolicLR ($\eta_\text{inf} = 10^{-4},\,U = 1000$). All schedulers start with $\eta_\text{init} = 1$.
  • Figure 3: Performance of all schedulers for all tasks and architectures. Scheduler abbreviations: N = No scheduler (constant learning rate), P = PolynomialLR, C = CosineAnnealingLR, E = ExponentialLR, H = HyperbolicLR, EH = ExpHyperbolicLR.
  • Figure 4: Model architectures: SimpleCNN for CIFAR-10 classification (red solid box), LSTM sequence-to-sequence model for oscillation prediction (blue solid box), DeepONet for learning an integral operator (green dashed box), and TraONet for learning an integral operator (orange solid box).
  • Figure 5: Learning curves of validation loss for SimpleCNN on CIFAR-10.
  • ...and 4 more figures
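The epoch sensitivity of the conventional schedulers in Figure 2 can be reproduced numerically. The sketch below uses the standard closed forms for polynomial decay and cosine annealing (without restarts) with the parameters from the Figure 2 caption; the helper names `polynomial_lr` and `cosine_lr` and the choice of epoch 100 as the comparison point are illustrative assumptions.

```python
import math

def polynomial_lr(n, N, lr_init, power):
    # Polynomial decay to zero over N epochs
    return lr_init * (1 - min(n, N) / N) ** power

def cosine_lr(n, N, lr_init, lr_min):
    # Closed form of cosine annealing without restarts
    return lr_min + (lr_init - lr_min) * (1 + math.cos(math.pi * n / N)) / 2

lr_init, lr_min, power = 1.0, 1e-4, 0.5   # values from the Figure 2 caption
totals = [250, 500, 750, 1000]            # total epoch counts N from Figure 2

# Learning rate at the same early epoch for each choice of N
n = 100
poly_at_n = [polynomial_lr(n, N, lr_init, power) for N in totals]
cos_at_n = [cosine_lr(n, N, lr_init, lr_min) for N in totals]
poly_spread = max(poly_at_n) - min(poly_at_n)
cos_spread = max(cos_at_n) - min(cos_at_n)
print("PolynomialLR     :", [round(v, 4) for v in poly_at_n], round(poly_spread, 4))
print("CosineAnnealingLR:", [round(v, 4) for v in cos_at_n], round(cos_spread, 4))
```

Both schedules depend on $n$ only through the ratio $n/N$, so the learning rate at a fixed early epoch shifts noticeably whenever $N$ changes. This is the sensitivity that panels (a) and (b) of Figure 2 display.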

Theorems & Definitions (4)

  • Proposition 1.1
  • Proposition 1.2
  • Proposition 1.3
  • Proposition 1.4