Proximal Galerkin: A structure-preserving finite element method for pointwise bound constraints

Brendan Keith; Thomas M. Surowiec

TL;DR

The paper introduces proximal Galerkin, a nonlinear, high-order finite element method that preserves the multiplicative structure of pointwise bound constraints via entropy regularization and the latent variable proximal point (LVPP) algorithm. A central idea is to replace the variational inequality (VI) arising from a bound constraint with a sequence of semilinear PDE subproblems (e.g., the entropic Poisson equation $-\Delta u+\theta\ln u=f$) whose solutions converge to the constrained optimum as the entropy weight $\theta$ vanishes, while remaining in the interior of $L^\infty_+$. The LVPP framework yields two coupled representations and a saddle-point discretization in which the primal variable $\widetilde{u}_h=\phi+\exp(\psi_h)$ stays strictly above the lower bound $\phi$ by construction and the underlying algebraic structure is preserved, with a pair of stable finite element spaces enabling mesh-independent iteration counts. The methodology solves the obstacle problem, enforces discrete maximum principles, and extends to nonconvex objectives and topology optimization via entropic mirror descent, all supported by open-source code. Collectively, the approach offers a unifying, structure-preserving pathway for bound-constrained variational problems, with demonstrated high-order accuracy and robust convergence across benchmark tests and applications in optimal design.
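
As a rough illustration of the subproblem sequence described above, the following Python sketch applies an LVPP-style iteration to a one-dimensional obstacle problem. It is not the authors' implementation. It assumes each subproblem has the form $\alpha(-u'' - f) + \psi - \psi^{k-1} = 0$ with $u = \phi + \exp(\psi)$, replaces the finite element spaces with second-order finite differences, and uses an illustrative parabolic obstacle, a constant step size $\alpha = 1$, and a damped Newton solver. The function name `lvpp_obstacle_1d` and all of its parameters are hypothetical.

```python
import numpy as np


def lvpp_obstacle_1d(n=63, alpha=1.0, outer_iters=20, newton_tol=1e-8):
    """Sketch: minimize 0.5*int |u'|^2 - int f*u over u >= phi on (0, 1), u(0)=u(1)=0.

    Assumed (not taken from the paper's code): finite differences, f = 0,
    obstacle phi(x) = 0.5 - 4*(x - 0.5)^2, constant step size alpha.
    """
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)            # interior grid nodes
    phi = 0.5 - 4.0 * (x - 0.5) ** 2          # illustrative obstacle, negative at x=0,1
    f = np.zeros(n)                           # illustrative load
    # Finite-difference approximation of -d^2/dx^2 with zero Dirichlet data.
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

    def residual(psi, psi_prev):
        # alpha * (A u - f) + psi - psi_prev with u = phi + exp(psi)
        return alpha * (A @ (phi + np.exp(psi)) - f) + psi - psi_prev

    psi = np.zeros(n)                         # latent variable; initial u = phi + 1
    for _ in range(outer_iters):              # proximal point (outer) loop
        psi_prev = psi.copy()
        for _ in range(100):                  # damped Newton (inner) loop
            F = residual(psi, psi_prev)
            norm_F = np.linalg.norm(F)
            if norm_F < newton_tol:
                break
            # Jacobian: alpha * A @ diag(exp(psi)) + I (column scaling by broadcasting).
            J = alpha * A * np.exp(psi) + np.eye(n)
            step = np.linalg.solve(J, -F)
            t = 1.0
            while t > 1e-10:                  # backtracking on the residual norm
                trial = psi + t * step
                if np.linalg.norm(residual(trial, psi_prev)) < (1.0 - 1e-4 * t) * norm_F:
                    break
                t *= 0.5
            psi = psi + t * step
    return x, phi, psi
```

Calling `x, phi, psi = lvpp_obstacle_1d()` and forming `u = phi + np.exp(psi)` gives an iterate whose gap `exp(psi)` is positive at every node by construction, with no projection step. For this particular obstacle the exact solution is the parabola on a central contact interval joined to tangent lines through the endpoints, which gives a simple accuracy check.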

Abstract

The proximal Galerkin finite element method is a high-order, low-iteration complexity, nonlinear numerical method that preserves the geometric and algebraic structure of point-wise bound constraints in infinite-dimensional function spaces. This paper introduces the proximal Galerkin method and applies it to solve free boundary problems, enforce discrete maximum principles, and develop a scalable, mesh-independent algorithm for optimal design with pointwise bound constraints. This paper also introduces the latent variable proximal point (LVPP) algorithm, from which the proximal Galerkin method derives. When analyzing the classical obstacle problem, we discover that the underlying variational inequality can be replaced by a sequence of second-order partial differential equations (PDEs) that are readily discretized and solved with, e.g., the proximal Galerkin method. Throughout this work, we arrive at several contributions that may be of independent interest. These include (1) a semilinear PDE we refer to as the entropic Poisson equation; (2) an algebraic/geometric connection between high-order positivity-preserving discretizations and certain infinite-dimensional Lie groups; and (3) a gradient-based, bound-preserving algorithm for two-field, density-based topology optimization. The complete proximal Galerkin methodology combines ideas from nonlinear programming, functional analysis, tropical algebra, and differential geometry and can potentially lead to new synergies among these areas as well as within variational and numerical analysis. This work is accompanied by open-source implementations of our methods to facilitate reproduction and broader adoption.

Paper Structure (48 sections, 17 theorems, 295 equations, 21 figures, 4 tables)

Introduction
Notation
Outline
Preserving multiplicative structure
Deconstructing the semiring of non-negative functions
Dirichlet free energy
Pointwise-positivity for every polynomial degree
Contributions and related work
Optimization methods for pointwise bound constraints
Numerical methods for pointwise bound constraints
Contributions of the present work
The obstacle problem
The entropy gradient
The entropy gradient is an isomorphism
Relative entropy
...and 33 more sections

Key Result

Theorem 4.1

Let $S: L^{p}(\Omega) \to \mathbb R \cup \left\{+\infty\right\}$, $p \in [1,\infty]$, be the negative entropy functional defined by

Figures (21)

Figure 1: A trinity is formed by the three isomorphic representations of the iterates in the latent variable proximal point method. In this figure, equations for the three representations are given for the problem of minimizing the Dirichlet energy \ref{eq:DirichletEnergy} over non-negative functions $u \in H^1_g(\Omega) \cap H^1_+(\Omega)$. Note that, for simplicity, the step size here is set to $\alpha = 1$. See \ref{thm:PrimalProblem} and \ref{thm:ConvergenceContinuousLevel} for further details and consequences for variable step sizes.
Figure 1: The exponential map $(\nabla S)^{-1}(\varphi) = \exp \varphi$ is an analytic isomorphism between the Banach algebra $L^\infty(\Omega)$ and the Banach--Lie group $\operatorname{int} L^\infty_+(\Omega) = \{v \in L^\infty(\Omega) \mid \operatorname*{ess\,inf} v > 0 \}$; see \ref{prop:Equivalence}. Moreover, its restriction to the subalgebra $H^1(\Omega)\cap L^\infty(\Omega)$ forms an isomorphism with the subgroup $H^1(\Omega) \cap \operatorname{int} L^\infty_+(\Omega)$; see \ref{prop:logexpChainRule}.
Figure 1: The sigmoid map $(\nabla B)^{-1}(\varphi) = \tanh (\varphi/2)$ is a diffeomorphism between the Banach algebra $L^\infty(\Omega)$ and the $L^\infty(\Omega)$-unit ball.
Figure 1: Illustration of convergence to the solution $x^\ast = 0$ for the constrained minimization problem $\min_{x\in[0,\infty)}\, e(x)$, where $e(x) = \frac{1}{2}x^2 + x$, by solving the sequence of minimization problems $x_{k+1} = \operatorname*{arg\,min}_{x\in[0,\infty)}\, \{ e'(x_k)x + D_s(x,x_k) \}$ starting at $x_0 = 1$.
Figure 2: The convex function $s(x) = x \ln x - x$, its supporting hyperplane $\{s(x_0) + s^\prime(x_0)(x - x_0) \mid x \in \mathbb{R}\}$, and its Bregman divergence $D_s(x_1,x_0) = x_1 \ln ({x_1}/{x_0}) - x_1 + x_0$.
...and 16 more figures
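
The scalar iteration in the mirror-descent caption above can be reproduced in a few lines. The sketch below assumes only what the captions state: $s(x) = x\ln x - x$, $D_s(x_1,x_0) = x_1\ln(x_1/x_0) - x_1 + x_0$, $e(x) = \frac{1}{2}x^2 + x$, and $x_0 = 1$. Setting the derivative of $e'(x_k)x + D_s(x,x_k)$ to zero gives $e'(x_k) + \ln(x/x_k) = 0$, hence the closed-form update $x_{k+1} = x_k \exp(-e'(x_k))$, which stays positive without any projection. The function names are hypothetical.

```python
import math


def bregman_entropy(x1, x0):
    """Bregman divergence of s(x) = x*ln(x) - x, for x1, x0 > 0."""
    return x1 * math.log(x1 / x0) - x1 + x0


def subproblem_objective(x, xk):
    """Linearized objective e'(xk)*x + D_s(x, xk) with e(x) = 0.5*x^2 + x."""
    return (xk + 1.0) * x + bregman_entropy(x, xk)


def mirror_descent(x0=1.0, steps=10):
    """Entropic mirror descent for min_{x >= 0} e(x) with unit step size."""
    xs = [x0]
    for _ in range(steps):
        xk = xs[-1]
        # Stationarity: e'(xk) + ln(x / xk) = 0  =>  x = xk * exp(-e'(xk)).
        xs.append(xk * math.exp(-(xk + 1.0)))
    return xs
```

Each iterate is a positive multiple of the previous one, so the sequence approaches the constrained minimizer $x^\ast = 0$ from inside the feasible set.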

Theorems & Definitions (66)

Remark 1.1: Latent variable proximal point vs. proximal Galerkin
Remark 1.2: Why pursue high-order discretizations?
Definition 2.1: Semiring
Remark 2.2: Dirichlet free energy
Theorem 4.1: Gradient representation
Corollary 4.2: Gradient of the shifted entropy functional
Remark 4.3: Empty interior in the $H^1(\Omega)$ topology
Remark 4.4: No Riesz representation theorem
Remark 4.5: Exploiting the geometry of the feasible set
Proposition 4.6: Properties of Bregman divergences
...and 56 more
