Affine Normal Directions via Log-Determinant Geometry: Scalable Computation under Sparse Polynomial Structure

Yi-Shuai Niu, Artan Sheshmani, Shing-Tung Yau

Abstract

Affine normal directions provide intrinsic affine-invariant descent directions derived from the geometry of level sets. Their practical use, however, has long been hindered by the need to evaluate third-order derivatives and invert tangent Hessians, which becomes computationally prohibitive in high dimensions. In this paper, we show that affine normal computation admits an exact reduction to second-order structure: the classical third-order contraction term is precisely the gradient of the log-determinant of the tangent Hessian. This identity replaces explicit third-order tensor contraction by a matrix-free formulation based on tangent linear solves, Hessian-vector products, and log-determinant gradient evaluation. Building on this reduction, we develop exact and stochastic matrix-free procedures for affine normal evaluation. For sparse polynomial objectives, the algebraic closure of derivatives further yields efficient sparse kernels for gradients, Hessian-vector products, and directional third-order contractions, leading to scalable implementations whose cost is governed by the sparsity structure of the polynomial representation. We establish end-to-end complexity bounds showing near-linear scaling with respect to the relevant sparsity scale under fixed stochastic and Krylov budgets. Numerical experiments confirm that the proposed MF-LogDet formulation reproduces the original autodifferentiation-based affine normal direction to near machine precision, delivers substantial runtime improvements in moderate and high dimensions, and exhibits empirical near-linear scaling in both dimension and sparsity. These results provide a practical computational route for affine normal evaluation and reveal a new connection between affine differential geometry, log-determinant curvature, and large-scale structured optimization.
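
The reduction described in the abstract suggests a compact stochastic routine. The sketch below is a minimal illustration, not the authors' implementation: it assumes user-supplied callables hvp(x, v) for Hessian-vector products and third_contract(x, w, z) for the directional third-order contraction with components $\sum_{j,k} w_j z_k\,\partial^3 f/\partial x_i\partial x_j\partial x_k$, combines Hutchinson probes with conjugate-gradient tangent solves, and is checked against a separable quartic whose log-determinant gradient is available in closed form. All names (e.g. logdet_grad_hutchinson) are hypothetical.

    # Hypothetical sketch: stochastic matrix-free estimate of grad log det H(x),
    # whose i-th component is tr(H(x)^{-1} d_i H(x)) by Jacobi's formula.
    # Assumes H(x) is symmetric positive definite.
    import numpy as np
    from scipy.sparse.linalg import LinearOperator, cg

    def logdet_grad_hutchinson(hvp, third_contract, x, q=32, seed=None):
        """q-probe Hutchinson estimate of grad log det H(x).

        hvp(x, v)               -> H(x) v (matrix-free)
        third_contract(x, w, z) -> vector with components
                                   sum_{j,k} w_j z_k d^3 f / dx_i dx_j dx_k
        """
        rng = np.random.default_rng(seed)
        d = x.size
        H_op = LinearOperator((d, d), matvec=lambda v: hvp(x, v))
        est = np.zeros(d)
        for _ in range(q):
            z = rng.choice([-1.0, 1.0], size=d)   # Rademacher probe
            w, _ = cg(H_op, z)                    # tangent solve w = H(x)^{-1} z
            est += third_contract(x, w, z)        # grad_x (w^T H(x) z), w and z fixed
        return est / q

    # Check on the separable quartic f(x) = sum_i (x_i^4 + x_i^2 / 2),
    # where H(x) = diag(12 x_i^2 + 1) and everything is closed form.
    hvp = lambda x, v: (12.0 * x**2 + 1.0) * v
    third_contract = lambda x, w, z: 24.0 * x * w * z
    x = np.linspace(0.5, 2.0, 8)
    g_est = logdet_grad_hutchinson(hvp, third_contract, x, q=64, seed=0)
    g_exact = 24.0 * x / (12.0 * x**2 + 1.0)      # grad of sum_i log(12 x_i^2 + 1)

Per probe, the only operations are one Krylov solve and one directional third-order contraction, which is the source of the fixed "stochastic and Krylov budgets" in the complexity bounds.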

Paper Structure

This paper contains 84 sections, 16 theorems, 184 equations, 9 figures, 3 tables, 3 algorithms.

Key Result

Lemma 3.1

Assume that the tangent Hessian block $H_T(x)$ is invertible in a neighborhood of $z$. Then for each tangent index $i\in\{1,\dots,n\}$, $\partial_i \log\det H_T(x) = \operatorname{tr}\bigl(H_T(x)^{-1}\,\partial_i H_T(x)\bigr)$.
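
In the aligned representation, where the tangent block satisfies $(H_T)_{jk}=\partial_j\partial_k f$ over tangent indices (an assumption about the setup, made here only for illustration), expanding the trace shows why this identity eliminates explicit third-order tensors:

$$\operatorname{tr}\bigl(H_T(x)^{-1}\,\partial_i H_T(x)\bigr)=\sum_{j,k=1}^{n}\bigl(H_T(x)^{-1}\bigr)_{jk}\,\frac{\partial^3 f}{\partial x_i\,\partial x_j\,\partial x_k}(x).$$

The right-hand side is the classical third-order contraction, so the affine normal can be assembled from Hessian-vector products and tangent solves alone, without ever forming an $n\times n\times n$ tensor.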

Figures (9)

  • Figure 1: Accuracy of MF-LogDet relative to the original AD-based affine normal computation. The normalized direction error remains around $10^{-9}$, while the angular error stays near machine precision, showing that the proposed reformulation reproduces the same affine normal direction very accurately.
  • Figure 2: Speedup of affine normal computation across dimensions, defined by $T_{\text{AD}}/T_{\text{MF-LogDet}}$. Values above $1$ indicate that MF-LogDet is faster than AD. The crossover occurs around $d\approx 11$, and by $d=20$ the speedup reaches about $11.9\times$.
  • Figure 3: Average runtime per affine normal evaluation for AD and MF-LogDet as a function of the dimension $d$. While AD is competitive in very low dimensions, its runtime grows much more rapidly. MF-LogDet becomes more efficient once the dimension is moderately large.
  • Figure 4: Effect of the Hutchinson probe count $q$ on affine normal accuracy. The left panel shows the mean normalized direction error, and the right panel shows the maximum angular error. In all tested dimensions, increasing $q$ improves the accuracy of the recovered affine normal direction.
  • Figure 5: Runtime of the stochastic trace approximation as a function of the number of Hutchinson probes $q$. After warm-up and repeated averaging, the cost grows approximately linearly with $q$, consistent with the theoretical complexity model (a minimal illustration of this cost-accuracy tradeoff follows the figure list).
  • ...and 4 more figures
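
Figures 4 and 5 reflect the standard Hutchinson tradeoff: the cost of the trace estimator is linear in the probe count $q$, while its error decays like $1/\sqrt{q}$. A self-contained toy illustration on a generic SPD matrix (synthetic, not the paper's objectives):

    # Hutchinson estimator tr(A) ~ (1/q) sum_j z_j^T A z_j with Rademacher
    # probes: cost grows linearly in q, relative error shrinks like 1/sqrt(q).
    import numpy as np

    rng = np.random.default_rng(0)
    d = 200
    B = rng.standard_normal((d, d))
    A = B @ B.T / d                                # generic SPD test matrix
    exact = np.trace(A)
    for q in (4, 16, 64, 256):
        Z = rng.choice([-1.0, 1.0], size=(d, q))   # q Rademacher probes
        est = np.einsum('iq,ij,jq->', Z, A, Z) / q
        print(q, abs(est - exact) / abs(exact))    # error ~ 1/sqrt(q)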

Theorems & Definitions (43)

  • Lemma 3.1: Log-determinant gradient identity
  • proof
  • Remark 3.2
  • Proposition 3.3: Equivalence of the aligned and ambient representations
  • proof
  • Lemma 4.1: Algebraic closure of polynomial derivatives
  • proof
  • Remark 4.2
  • Proposition 4.3: Matrix-free Hessian-vector kernel (a sketch of such a kernel follows this list)
  • proof
  • ...and 33 more
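
The sparse kernels behind Lemma 4.1 and Proposition 4.3 rest on a simple observation: differentiating a monomial only rescales its coefficient and lowers one exponent, so gradients, Hessian-vector products, and directional third-order contractions all stay inside the sparse monomial representation, with cost governed by the number of terms and their support rather than the ambient dimension. Below is a minimal sketch under the assumption that the polynomial is stored as (coefficient, exponent-map) terms with positive exponents; the data layout and names are illustrative, not the paper's:

    # Hypothetical sparse-polynomial kernels for f(x) = sum_t c_t prod_k x_k^{e_tk},
    # stored as [(c_t, {index: exponent, ...}), ...] with positive exponents.
    import numpy as np

    def grad(terms, x):
        """Gradient assembled monomial by monomial: O(#terms * support^2)."""
        g = np.zeros_like(x)
        for c, exp in terms:
            for i, ei in exp.items():
                m = c * ei
                for k, ek in exp.items():          # monomial with e_i lowered by 1
                    m *= x[k] ** (ek - 1 if k == i else ek)
                g[i] += m
        return g

    def hvp(terms, x, v):
        """Matrix-free H(x) v from the second partials of each monomial."""
        hv = np.zeros_like(x)
        for c, exp in terms:
            for i, ei in exp.items():
                for j, ej in exp.items():
                    cij = c * ei * (ei - 1 if i == j else ej)
                    if cij == 0.0:                 # e.g. d^2/dx_i^2 of a linear factor
                        continue
                    m = cij
                    for k, ek in exp.items():
                        drop = (2 if k == i else 0) if i == j else (k == i) + (k == j)
                        m *= x[k] ** (ek - drop)
                    hv[i] += m * v[j]
        return hv

    # Demo: f(x) = 3 x0^2 x1 + x2^4, so grad f = (6 x0 x1, 3 x0^2, 4 x2^3).
    terms = [(3.0, {0: 2, 1: 1}), (1.0, {2: 4})]
    x = np.array([1.0, 2.0, 0.5])
    v = np.array([1.0, 0.0, 1.0])
    g, hv = grad(terms, x), hvp(terms, x, v)

Only the indices appearing in a term are ever touched, which is the mechanism behind the near-linear scaling in the sparsity scale claimed in the abstract.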