Explore the Loss space with Hill-ADAM

Meenakshi Manikandan; Leilani Gilpin

TL;DR

Hill-ADAM addresses the problem of local minima trapping in gradient-based optimization by introducing a deterministic two-phase strategy that alternates between loss minimization and maximization to explore the loss landscape. Building on the ADAM framework, it derives a step-size approximation and an escape condition, then implements a Hill-ADAM algorithm that stores the best encountered state. Empirical results on polynomial loss surfaces and a color-correction task show Hill-ADAM achieving lower minima than ADAM and performing competitively with NADAM and RMSprop, highlighting improved exploration and reduced stagnation. The work offers a principled mechanism for escaping local minima in multi-modal landscapes, with potential impact on robust optimization in practical ML applications.

Abstract

This paper introduces Hill-ADAM, an optimizer focused on escaping local minima in prescribed loss landscapes in order to find the global minimum. Hill-ADAM escapes minima by deterministically exploring the state space. This eliminates the uncertainty introduced by random gradient updates in stochastic algorithms, while seldom converging at the first minimum it visits. In the paper, we first derive an analytical approximation of the ADAM optimizer's step size at a particular model state. From there, we define the primary condition that limits ADAM's ability to escape local minima. The proposed optimizer, Hill-ADAM, alternates between error minimization and maximization: it maximizes to escape the local minimum and minimizes again afterward. This alternation provides an overall exploration of the loss space, which allows the state of the global minimum to be deduced. Hill-ADAM was tested on 5 loss functions and 12 instances of image color correction from amber-saturated to cooler shades.
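The alternation described above can be illustrated with a minimal sketch. It is not the authors' algorithm: the 1-D polynomial `loss`, the function name `hill_adam_sketch`, the hyperparameter values, and above all the phase-switching rule (flip direction when the gradient is nearly zero or a per-phase step budget runs out) are assumptions made for illustration, standing in for the escape condition derived in the paper. The sketch keeps only the two ideas stated in the text: standard ADAM updates whose sign alternates between minimization and maximization, and storage of the best state encountered.

```python
import math


def loss(x):
    # Hypothetical multi-modal polynomial, chosen only for illustration.
    return x**4 - 3 * x**2 + x


def grad(x):
    # Analytical derivative of the polynomial above.
    return 4 * x**3 - 6 * x + 1


def hill_adam_sketch(x0, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8,
                     total_steps=4000, phase_steps=500, tol=1e-4):
    """Alternate ADAM descent and ascent, remembering the best state seen.

    The switching rule is an assumed stand-in, not the paper's condition.
    """
    x = x0
    best_x, best_loss = x0, loss(x0)
    direction = 1.0   # +1.0 minimizes the loss, -1.0 maximizes it
    m = v = 0.0       # ADAM first and second moment estimates
    t = 0             # step counter within the current phase

    for _ in range(total_steps):
        g = direction * grad(x)  # flipped gradient turns descent into ascent
        t += 1
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1**t)  # bias-corrected moments
        v_hat = v / (1 - beta2**t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)

        # Store the lowest-loss state encountered in either phase.
        current = loss(x)
        if current < best_loss:
            best_x, best_loss = x, current

        # Assumed rule: switch phase at a stationary point or on budget.
        if abs(grad(x)) < tol or t >= phase_steps:
            direction = -direction
            m = v = 0.0
            t = 0

    return best_x, best_loss


best_x, best_loss = hill_adam_sketch(x0=1.5)
print(best_x, best_loss)
```

Because the best state is stored separately from the current state, the maximization phases can climb freely without losing the lowest point already found. Whether this simplified version actually reaches the global minimum depends on the starting point and on the assumed switching rule.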
