AdamZ: An Enhanced Optimisation Method for Neural Network Training

Ilia Zaznov, Atta Badii, Alfonso Dufour, Julian Kunkel

TL;DR

Benchmarking results demonstrate the effectiveness of AdamZ in maintaining optimal learning rates, leading to improved model performance across diverse tasks.

Abstract

AdamZ is an advanced variant of the Adam optimiser, developed to enhance convergence efficiency in neural network training. This optimiser dynamically adjusts the learning rate by incorporating mechanisms to address overshooting and stagnation, which are common challenges in optimisation. Specifically, AdamZ reduces the learning rate when overshooting is detected and increases it during periods of stagnation, utilising hyperparameters such as overshoot and stagnation factors, thresholds, and patience levels to guide these adjustments. While AdamZ may lead to slightly longer training times compared to some other optimisers, it consistently excels in minimising the loss function, making it particularly advantageous for applications where precision is critical. Benchmarking results demonstrate the effectiveness of AdamZ in maintaining optimal learning rates, leading to improved model performance across diverse tasks.
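
The adjustment rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `LearningRateAdjuster`, the detection rules (a loss worse than every loss in the recent window counts as overshooting, a window whose spread falls below a threshold counts as stagnation), and all default values are assumptions chosen for illustration. Only the general idea, reducing the learning rate on overshooting and raising it on stagnation, comes from the text.

```python
from collections import deque


class LearningRateAdjuster:
    """Hypothetical helper that rescales a learning rate from recent losses.

    All detection rules and defaults below are illustrative assumptions,
    not values taken from the AdamZ paper.
    """

    def __init__(self, lr=0.01, overshoot_factor=0.5, stagnation_factor=1.2,
                 stagnation_threshold=1e-3, patience=5,
                 min_lr=1e-7, max_lr=1.0):
        self.lr = lr
        self.overshoot_factor = overshoot_factor          # < 1: shrinks the rate
        self.stagnation_factor = stagnation_factor        # > 1: grows the rate
        self.stagnation_threshold = stagnation_threshold  # minimum loss spread
        self.patience = patience                          # window length
        self.min_lr = min_lr
        self.max_lr = max_lr
        self.history = deque(maxlen=patience)             # recent losses only

    def update(self, loss):
        """Record a new loss and return the (possibly adjusted) learning rate."""
        if len(self.history) == self.patience:
            worst = max(self.history)
            best = min(min(self.history), loss)
            if loss > worst:
                # Assumed overshoot rule: loss is worse than the whole window.
                self.lr *= self.overshoot_factor
            elif worst - best < self.stagnation_threshold:
                # Assumed stagnation rule: losses have barely moved.
                self.lr *= self.stagnation_factor
            # Keep the rate inside the allowed bounds.
            self.lr = min(max(self.lr, self.min_lr), self.max_lr)
        self.history.append(loss)
        return self.lr
```

In a training loop, `update(loss)` would be called once per step and the returned value passed to the underlying Adam update. Until the window holds `patience` losses, the rate is left unchanged.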

Paper Structure

This paper contains 11 sections, 13 equations, 9 figures, and 3 tables.

Figures (9)

  • Figure 1: AdamZ mechanism of detecting and responding to overshooting and stagnation
  • Figure 2: Visualisation of the make_circles dataset
  • Figure 3: A neural network architecture applied to the make_circles dataset.
  • Figure 4: Accuracy and training duration on the make_circles dataset.
  • Figure 5: Optimisers' loss evolution for the make_circles dataset.
  • ...and 4 more figures