Fuzzy hyperparameters update in a second order optimization

Abdelaziz Bensadok; Muhammad Zeeshan Babar

TL;DR

The paper tackles the inefficiency of traditional first-order optimizers in deep learning by introducing SALO, a second-order optimizer that uses an online diagonal Hessian approximation $H_d$ computed via finite differences to guide weight updates with $\Delta w = - lr \cdot H_d^{-1} \cdot g$. It couples this with a fuzzy logic scheduler that adaptively tunes the learning rate and second-derivative momentum ($\beta_1$, $\beta_3$), reducing sensitivity to hyperparameter choices. Empirical results on TinyImageNet and ImageNet demonstrate SALO’s ability to achieve lower training loss and higher validation accuracy than SGD, Adam, and AdamW, with competitive runtime overhead. The work suggests that combining online curvature information with fuzzy control yields a robust, scalable second-order optimization paradigm for large-scale vision models, and points to future gains from precomputed fuzzy policies and expanded rule sets.
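To make the update rule $\Delta w = - lr \cdot H_d^{-1} \cdot g$ concrete, here is a minimal sketch on a toy quadratic. It is not the paper's implementation. The per-coordinate secant estimate of the diagonal Hessian (difference of successive gradients divided by difference of successive weights), the curvature floor, the plain-gradient bootstrap step, and all function names are assumptions chosen for illustration.

```python
def grad(w):
    # Gradient of the toy objective f(w) = 0.5 * (1*w0^2 + 10*w1^2).
    # The true diagonal Hessian is therefore (1, 10).
    a = (1.0, 10.0)
    return [ai * wi for ai, wi in zip(a, w)]


def diag_hessian_secant(g_new, g_old, w_new, w_old, h_prev, eps=1e-12):
    # Assumed finite-difference scheme: per-coordinate secant estimate
    # h_i ~ (g_i(w_t) - g_i(w_{t-1})) / (w_t,i - w_{t-1},i).
    # When a coordinate barely moved, keep the previous estimate.
    h = []
    for gn, go, wn, wo, hp in zip(g_new, g_old, w_new, w_old, h_prev):
        dw = wn - wo
        h.append((gn - go) / dw if abs(dw) > eps else hp)
    return h


def curvature_step(w, g, h, lr=1.0, floor=1e-8):
    # Delta w = -lr * H_d^{-1} * g, with |h_i| floored to avoid dividing by ~0.
    return [wi - lr * gi / max(abs(hi), floor) for wi, gi, hi in zip(w, g, h)]


def run(steps=5):
    w_old = [1.0, 1.0]
    g_old = grad(w_old)
    h = [1.0, 1.0]
    # Bootstrap with one small plain gradient step so a secant pair exists.
    w = [wi - 0.01 * gi for wi, gi in zip(w_old, g_old)]
    for _ in range(steps):
        g = grad(w)
        h = diag_hessian_secant(g, g_old, w, w_old, h)
        w_old, g_old = w, g
        w = curvature_step(w, g, h)
    return w, h


if __name__ == "__main__":
    w, h = run()
    print("weights:", w)
    print("diagonal Hessian estimate:", h)
```

On this separable quadratic the secant estimate recovers the true curvature, so a single curvature-scaled step with `lr=1.0` lands at the minimizer up to rounding. On a non-separable or noisy objective the estimate is only approximate, which is where smoothing of the second-derivative estimate would come in.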

Abstract

This research presents a hybrid approach to accelerating convergence in second-order optimization. An online finite-difference approximation of the diagonal Hessian matrix is introduced, along with fuzzy inference of several hyperparameters. Competitive results have been achieved.
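The idea of inferring a hyperparameter with fuzzy rules can be sketched as follows. This is a hypothetical illustration, not the paper's scheduler: the single input (relative change in loss), the three triangular membership functions, the rule outputs (1.2, 1.0, 0.5), and the weighted-average defuzzification are all assumptions, and only a learning-rate multiplier is produced here.

```python
def tri(x, a, b, c):
    # Triangular membership function with feet at a and c and peak at b.
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def fuzzy_lr_scale(rel_loss_change):
    # Clamp the input to the universe of discourse [-1, 1].
    x = max(-1.0, min(1.0, rel_loss_change))

    # Fuzzification: degree to which the loss is decreasing, flat, or increasing.
    mu = {
        "decreasing": tri(x, -2.0, -1.0, 0.0),
        "flat": tri(x, -1.0, 0.0, 1.0),
        "increasing": tri(x, 0.0, 1.0, 2.0),
    }

    # Hypothetical rule base: each fuzzy set maps to a crisp multiplier.
    out = {"decreasing": 1.2, "flat": 1.0, "increasing": 0.5}

    # Defuzzification by weighted average of the rule outputs.
    total = sum(mu.values())
    return sum(mu[k] * out[k] for k in mu) / total


if __name__ == "__main__":
    lr = 0.1
    for change in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(change, lr * fuzzy_lr_scale(change))
```

Because neighbouring membership functions overlap, the multiplier varies smoothly between the rule outputs instead of jumping at a threshold. The same pattern could be applied to a momentum coefficient with its own rule outputs.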

Paper Structure (14 sections, 7 equations, 5 figures, 1 table, 1 algorithm)

Introduction
Second Order Optimization Review
Diagonal Hessian Approximation
Diagonal Approximation of the Hessian by Finite Differences for Unconstrained Optimization
ADAHESSIAN
Fuzzy Logic Based Scheduling: Literature Review
Diagonal Hessian Approximation: Introducing Our Method
Initial Experimental Results: Discussion On Enhancing the Method
Fuzzy Logic Based Scheduling: Our Implementation
Empirical Performance: Application on ImageNet
Experiment Setup
Experimental Results: TinyImageNet
Experimental Results: Discussion
Conclusion and Future Work

Figures (5)

Figure 1: Comparison of our optimizer against Adam, NewtonCG, BFGS, and CG
Figure 3: Block Diagram of Our Fuzzy Scheduling System
Figure 4: Output Variables
Figure 5: Training Accuracy: SALO vs. Adam, AdamW, and SGD
Figure 6: Validation Accuracy: SALO vs. Adam, AdamW, and SGD
