
Factored Levenberg-Marquardt for Diffeomorphic Image Registration: An efficient optimizer for FireANTs

Rohit Jena, Pratik Chaudhari, James C. Gee

Abstract

FireANTs introduced a novel Eulerian descent method that allows arbitrary optimizers to be used in a plug-and-play manner for diffeomorphic image registration, posed as a test-time optimization problem, with a GPU-accelerated implementation. FireANTs uses Adam as its default optimizer for fast and robust optimization. However, Adam requires storing state variables (i.e., momentum and squared-momentum estimates), each of which can consume significant memory, prohibiting its use for very large images. In this work, we propose a modified Levenberg-Marquardt (LM) optimizer that requires only a single scalar damping parameter as optimizer state, which is adaptively tuned using a trust-region approach. The resulting optimizer reduces memory by up to 24.6% for large volumes while retaining performance across all four datasets. A single hyperparameter configuration tuned on brain MRI transfers without modification to lung CT and cross-modal abdominal registration, matching or outperforming Adam on three of four benchmarks. We also perform ablations on the effectiveness of a Metropolis-Hastings-style rejection step that prevents updates that worsen the loss function.
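To make the idea of "a single scalar damping parameter as optimizer state" concrete, the following is a minimal sketch of a textbook Levenberg-Marquardt loop on a toy two-parameter least-squares problem (the Rosenbrock function written as residuals). It is not the paper's factored variant and does not touch images or warp fields. The function and argument names (`lm_scalar_damping`, `mu_up`, `mu_down`) are hypothetical, and the rule "shrink the damping on an accepted step, grow it on a rejected step" is an assumed, commonly used trust-region heuristic, with default values borrowed from the Figure 1 caption.

```python
import numpy as np

def lm_scalar_damping(residual, jacobian, x0, lam0=0.006,
                      mu_up=1.5, mu_down=0.975, iters=200):
    """Textbook LM where the only persistent optimizer state is the scalar `lam`.

    Assumed update rule (not confirmed by the source): multiply `lam` by
    `mu_down` after an accepted step and by `mu_up` after a rejected one.
    """
    x = np.asarray(x0, dtype=float)
    lam = lam0
    loss = 0.5 * float(np.sum(residual(x) ** 2))
    history, rejected = [loss], 0
    for _ in range(iters):
        r, J = residual(x), jacobian(x)
        # Damped Gauss-Newton step: (J^T J + lam * I) dx = -J^T r
        dx = np.linalg.solve(J.T @ J + lam * np.eye(x.size), -(J.T @ r))
        trial_loss = 0.5 * float(np.sum(residual(x + dx) ** 2))
        if trial_loss < loss:
            # Accept: the step lowered the loss, so trust the model more.
            x, loss = x + dx, trial_loss
            lam *= mu_down
        else:
            # Reject: keep the old iterate and damp more heavily next time.
            rejected += 1
            lam *= mu_up
        history.append(loss)
    return x, lam, history, rejected

# Toy problem: Rosenbrock as a residual vector, started from the classic point.
def rosenbrock_residual(x):
    return np.array([10.0 * (x[1] - x[0] ** 2), 1.0 - x[0]])

def rosenbrock_jacobian(x):
    return np.array([[-20.0 * x[0], 10.0], [-1.0, 0.0]])

x, lam, history, rejected = lm_scalar_damping(
    rosenbrock_residual, rosenbrock_jacobian, [-1.2, 1.0]
)
print("final x:", x, "final loss:", history[-1], "rejected steps:", rejected)
```

Because a trial step is kept only when it strictly lowers the loss, the recorded loss never increases. In contrast to Adam, nothing of the size of the parameter vector is carried between iterations in this sketch; only `lam` persists.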

Paper Structure (28 sections, 14 equations, 2 figures, 5 tables)


Figures (2)

  • Figure 1: Sensitivity of mean Dice on LUMIR (38 pairs) to each of the three LM hyperparameters. Our defaults ($\lambda_0=0.006$, $\mu^+=1.5$, $\mu^-=0.975$), selected via Bayesian optimization on LUMIR, are marked with a red dashed line in each panel. Left: Dice is essentially insensitive to $\lambda_0$, varying by $<0.001$ over a $5000\times$ range. Center: Dice falls off a catastrophic cliff between $\mu^+=2.3$ and $\mu^+=3.0$; our default of $1.5$ sits at the safe peak. Right: Dice improves monotonically as $\mu^-$ increases up to $0.963$, then drops slightly; our default of $0.975$ sits in the optimal plateau.
  • Figure 2: Mean Dice ($\pm$ std) on LUMIR (38 pairs) as the tile size $k$ varies from 1 to 10, with all other LM hyperparameters fixed at their defaults. Pointwise updates ($k{=}1$, red dashed) achieve the best mean Dice ($0.8716$) and the lowest variance; mean Dice drops by $0.009$ at $k{=}2$ and degrades monotonically to $0.828$ for $k{=}7$--$10$, with the standard deviation rising throughout.
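
The default values quoted in the Figure 1 caption can be read through a short piece of arithmetic. The sketch below assumes, as is common for trust-region damping schedules but not confirmed by the text above, that the damping is multiplied by $\mu^+$ after a rejected step and by $\mu^-$ after an accepted one. The helper name `damping_after` is hypothetical.

```python
import math

# Defaults quoted in the Figure 1 caption.
lam0, mu_up, mu_down = 0.006, 1.5, 0.975

# Assumption: reject -> lam *= mu_up, accept -> lam *= mu_down.
# Number of accepted steps needed to undo the growth from one rejection:
accepts_per_reject = math.log(mu_up) / -math.log(mu_down)

def damping_after(n_accept, n_reject, lam=lam0):
    """Damping value after the given counts of accepted and rejected steps."""
    return lam * (mu_down ** n_accept) * (mu_up ** n_reject)

print(f"accepted steps per rejection: {accepts_per_reject:.2f}")
print(f"lam after 16 accepts + 1 reject: {damping_after(16, 1):.6f}")
print(f"lam after 17 accepts + 1 reject: {damping_after(17, 1):.6f}")
```

Under this assumed rule the schedule is strongly asymmetric: one rejection raises the damping by 50%, and roughly sixteen consecutive accepted steps are needed to bring it back to its previous value.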