Table of Contents
Fetching ...

Scalable higher-order nonlinear solvers via higher-order automatic differentiation

Songchen Tan, Keming Miao, Alan Edelman, Christopher Rackauckas

TL;DR

The paper tackles the inefficiency of traditional Newton-based nonlinear solvers by leveraging Taylor-mode automatic differentiation to compute higher-order directional derivatives, enabling scalable higher-order solvers. It develops Householder's method for univariate problems and generalizes Halley's method to multivariate systems, showing that Halley's method can rival or surpass Newton's method in dense, sparse, and stiff settings. Through implementations in Julia (including SimpleNonlinearSolve.jl), the work demonstrates significant speedups in large-scale and ill-conditioned problems and demonstrates integration into stiff ODE solvers. A key practical insight is that the second-order Halley variant often yields the best cost-accuracy trade-off, though the approach assumes the availability of Jacobian factorizations; extending to matrix-free, iterative linear solves remains for future work.

Abstract

This paper demonstrates new methods and implementations of nonlinear solvers with higher-order of convergence, which is achieved by efficiently computing higher-order derivatives. Instead of computing full derivatives, which could be expensive, we compute directional derivatives with Taylor-mode automatic differentiation. We first implement Householder's method with arbitrary order for one variable, and investigate the trade-off between computational cost and convergence order. We find that the second-order variant, i.e., Halley's method, to be the most valuable, and further generalize Halley's method to systems of nonlinear equations and demonstrate that it can scale efficiently to large-scale problems. We further apply Halley's method on solving large-scale ill-conditioned nonlinear problems, as well as solving nonlinear equations inside stiff ODE solvers, and demonstrate that it could outperform Newton's method.

Scalable higher-order nonlinear solvers via higher-order automatic differentiation

TL;DR

The paper tackles the inefficiency of traditional Newton-based nonlinear solvers by leveraging Taylor-mode automatic differentiation to compute higher-order directional derivatives, enabling scalable higher-order solvers. It develops Householder's method for univariate problems and generalizes Halley's method to multivariate systems, showing that Halley's method can rival or surpass Newton's method in dense, sparse, and stiff settings. Through implementations in Julia (including SimpleNonlinearSolve.jl), the work demonstrates significant speedups in large-scale and ill-conditioned problems and demonstrates integration into stiff ODE solvers. A key practical insight is that the second-order Halley variant often yields the best cost-accuracy trade-off, though the approach assumes the availability of Jacobian factorizations; extending to matrix-free, iterative linear solves remains for future work.

Abstract

This paper demonstrates new methods and implementations of nonlinear solvers with higher-order of convergence, which is achieved by efficiently computing higher-order derivatives. Instead of computing full derivatives, which could be expensive, we compute directional derivatives with Taylor-mode automatic differentiation. We first implement Householder's method with arbitrary order for one variable, and investigate the trade-off between computational cost and convergence order. We find that the second-order variant, i.e., Halley's method, to be the most valuable, and further generalize Halley's method to systems of nonlinear equations and demonstrate that it can scale efficiently to large-scale problems. We further apply Halley's method on solving large-scale ill-conditioned nonlinear problems, as well as solving nonlinear equations inside stiff ODE solvers, and demonstrate that it could outperform Newton's method.

Paper Structure

This paper contains 10 sections, 10 equations, 5 figures, 2 algorithms.

Figures (5)

  • Figure 1: Cost-effectiveness for Householder's method with different orders. Left: the computation time per iteration for each nonlinear function and each Householder's method with different order. Right: the relative speedup for each function of the total computation time to solve the nonlinear equation, normalized by the $p=1$ solver (Newton's method). The functions and initial conditions are: (1) $f_1(x)=x^2-2$, $x_0=1.0$; (2) $f_2(x)=\sqrt{x}-\pi$, $x_0=10.0$; (3) $f_3(x)=x-\exp(-x)$, $x_0=0.0$; (4) $f_4(x)=x^2-2^x$, $x_0=3.3$; (5) $f_5(x)=x+\sin(x)-1$, $x_0=0.5$; (6) $f_6(x)=\log(x)+x$, $x_0=1.0$.
  • Figure 1: Scaling for solving a nonlinear problem that has a dense Jacobian, at different problem sizes. Solvers: Newton, Halley (Taylor-mode AD), and Naive Halley (full Hessian).
  • Figure 2: Work-precision diagram for solving a nonlinear problem that has a dense Jacobian, with different sizes. Solvers: Newton and Halley (Taylor-mode AD).
  • Figure 3: Scaling for solving discretized two-dimensional Brusselator steady-state problem with different problem sizes. The problems are solved to default tolerance. Left: computation time for Newton and Halley's method, each with or without sparsity detection, for each problem size. Right: relative speedup of Halley's method over Newton's method with or without sparsity detection, for each problem size.
  • Figure 4: Work-precision diagram for solving discretized two-dimensional Brusselator time-dependent PDE. Implicit Solvers: Trapezoidvladimirescu_spice_1994, TRBDF2hosea_analysis_1996, FBDFshampine_solving_2002, QNDFshampine_matlab_1997, QBDF (alias of QNDF with $\kappa=0$), KenCarp4kennedy_additive_2003, Kvaerno5kvaerno_singly_2004. For each solver, the Newton's method is drawn in circle and solid line, and the Halley's method is drawn in diamond and dashed line.