Non-Euclidean Broximal Point Method: A Blueprint for Geometry-Aware Optimization
Kaja Gruntkowska, Peter Richtárik
TL;DR
This work extends the convergence theory of the Broximal Point Method (BPM) from Euclidean to non-Euclidean geometries by replacing the Euclidean ball constraint with a general norm ball and proving that the core BPM guarantees persist. The Non-Euclidean BPM retains linear-type progress in function values and gradient norms, as well as finite-time convergence with a fixed radius, although monotonicity of the distance to the solution may fail in general normed spaces. The contraction bound $f(x_{k+1})-f_\star \le \left(1 + \frac{t_k}{\|x_{k+1}-x_\star\|}\right)^{-1}(f(x_k)-f_\star)$, with $t_k$ the ball radius at step $k$, $x_\star$ a minimizer, and $f_\star$ the optimal value, exemplifies the function-value guarantee that carries over. The analysis clarifies which BPM guarantees survive outside Euclidean geometry and positions Non-Euclidean BPM as a blueprint for geometry-aware optimization, linking it to practical LMO-based methods such as Muon and Scion and showing how the choice of norm acts as a form of geometric preconditioning. It also discusses the trade-offs between exact subproblem solutions and implementable approximations, and outlines future work on stochastic gradients, momentum, and non-convex objectives.
Abstract
The recently proposed Broximal Point Method (BPM) [Gruntkowska et al., 2025] offers an idealized optimization framework based on iteratively minimizing the objective function over norm balls centered at the current iterate. It enjoys striking global convergence guarantees, converging linearly and in a finite number of steps for proper, closed and convex functions. However, its theoretical analysis has so far been confined to Euclidean geometry. At the same time, emerging trends in deep learning optimization, exemplified by algorithms such as Muon [Jordan et al., 2024] and Scion [Pethick et al., 2025], demonstrate the practical advantages of minimizing over balls defined via non-Euclidean norms that better align with the underlying geometry of the associated loss landscapes. In this note, we ask whether the convergence theory of BPM can be extended to this more general, non-Euclidean setting. We give a positive answer, showing that most of the elegant guarantees of the original method carry over to arbitrary norm geometries. Along the way, we clarify which properties are preserved and which necessarily break down when leaving the Euclidean realm. Our analysis positions Non-Euclidean BPM as a conceptual blueprint for understanding a broad class of geometry-aware optimization algorithms, shedding light on the principles behind their practical effectiveness.
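To make the idea of minimizing over norm balls concrete, the following is a minimal sketch of a ball-constrained step under the $\ell_\infty$ norm. It is not the authors' implementation. It assumes a toy separable quadratic objective, chosen only because the $\ell_\infty$ ball is a box and the subproblem then has a closed-form solution (coordinate-wise clipping toward the minimizer). The names `objective`, `broximal_step_linf`, and `run_bpm_linf`, as well as the inputs `a`, `c`, and `t`, are hypothetical and introduced purely for illustration.

```python
def objective(x, a, c):
    """Toy objective (an assumption): f(x) = 0.5 * sum_i a_i (x_i - c_i)^2, a_i > 0."""
    return 0.5 * sum(ai * (xi - ci) ** 2 for ai, xi, ci in zip(a, x, c))


def broximal_step_linf(x, c, t):
    """Exact minimizer of the toy objective over {z : ||z - x||_inf <= t}.

    The l-infinity ball is the box [x_i - t, x_i + t] in every coordinate,
    and the objective is separable, so each coordinate moves toward c_i
    and is clipped to the box. The weights a_i do not affect the minimizer.
    """
    return [min(max(ci, xi - t), xi + t) for xi, ci in zip(x, c)]


def run_bpm_linf(x0, a, c, t, max_iters=1000):
    """Iterate the ball-constrained step with a fixed radius t.

    Returns the list of iterates and the matching objective values.
    Stops once the iterate equals the minimizer c or max_iters is reached.
    """
    xs = [list(x0)]
    vals = [objective(x0, a, c)]
    while xs[-1] != list(c) and len(xs) <= max_iters:
        x_next = broximal_step_linf(xs[-1], c, t)
        xs.append(x_next)
        vals.append(objective(x_next, a, c))
    return xs, vals
```

For instance, calling `run_bpm_linf([4.5, -1.0, -1.25], [1.0, 10.0, 0.1], [1.0, 2.0, -1.0], 1.0)` starts at $\ell_\infty$ distance 3.5 from the minimizer and, with radius 1, reaches it exactly after 4 steps, illustrating finite-time convergence with a fixed radius on this toy problem. General convex objectives and norms do not admit such a closed-form step, which is why this remains an idealized sketch.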
