Hybrid Nitsche method for distributed computing

Tom Gustafsson, Antti Hannukainen, Vili Kohonen, Juha Videman

TL;DR

This work presents a domain-decomposition framework that integrates a hybrid Nitsche interface with local model order reduction to enable arbitrary polynomial degree FEM on distributed hardware. By introducing a trace variable on the skeleton and performing local low-rank reductions via a lifting operator, the method achieves a reduced global system whose solution converges optimally in the mesh parameter $h$ and linearly in the tolerance $\epsilon$. The key theoretical result is a bound of the form $\|u-\tilde{u}_h\|_h \le C (h^p + \epsilon m) \|u^C\|_{p+1}$, with extensive numerical validation on unit cubes, large-scale meshes, and engineering geometries demonstrating scalability and accuracy. The approach streamlines implementation, avoids extra overlap layers, and reduces memory requirements by transforming subdomain problems to small, diagonal-like systems, thereby enabling large-scale computations on laptops and cloud resources. Future work includes developing specialized preconditioners for the reduced Schur complement to further enhance performance.

Abstract

We extend a distributed finite element method built upon model order reduction to arbitrary polynomial degree using a hybrid Nitsche scheme. The new method considerably simplifies the transformation of the finite element system to the reduced basis for large problems. We prove that the error of the reduced Nitsche solution converges optimally with respect to the approximation order of the finite element spaces and linearly with respect to the dimension reduction parameter. Numerical tests with nontrivial tetrahedral meshes using second-degree polynomial bases support the theoretical results.
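
The tolerance-driven dimension reduction mentioned above can be illustrated with a truncated singular value decomposition. The following is a minimal sketch, not the authors' implementation: the matrix `Z` is a synthetic stand-in with geometrically decaying singular values (the paper's weighted local operator is not reproduced here), and the cutoff rule "keep singular values larger than `eps`" as well as the helper names `truncate_by_tolerance` and `make_test_matrix` are assumptions made for illustration.

```python
import numpy as np


def truncate_by_tolerance(Z, eps):
    """Return the left singular vectors of Z whose singular value exceeds eps.

    Assumed cutoff rule for illustration; the paper may weight or scale
    the operator differently before truncation.
    """
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    k = int(np.sum(s > eps))  # number of retained modes
    return U[:, :k], s


def make_test_matrix(n_rows=60, n_cols=40, decay=0.5, seed=0):
    """Synthetic matrix with prescribed singular values decay**j (hypothetical data)."""
    rng = np.random.default_rng(seed)
    Q1, _ = np.linalg.qr(rng.standard_normal((n_rows, n_cols)))
    Q2, _ = np.linalg.qr(rng.standard_normal((n_cols, n_cols)))
    s = decay ** np.arange(n_cols)
    return (Q1 * s) @ Q2.T, s


Z, s_true = make_test_matrix()
eps = 1e-2
basis, s = truncate_by_tolerance(Z, eps)

# Spectral-norm error of projecting Z onto the retained modes; by the
# Eckart-Young theorem it equals the first discarded singular value.
err = np.linalg.norm(Z - basis @ (basis.T @ Z), 2)
print(basis.shape[1], err)
```

In this synthetic setting the projection error stays below `eps` by construction, which mirrors in a simplified way how a tolerance controls the size of the reduced basis.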

Paper Structure

This paper contains 13 sections, 6 theorems, 50 equations, 5 figures, 4 tables.

Key Result

Lemma

Fix $\epsilon >0$ and $g_h\in V_{h,i}^+|_{\partial\Omega_i^+}$. Let $w_{h,i}|_{\Omega_i}\in V_{h,i}$ and $\widetilde{w}_{h,i}|_{\Omega_i}\in \widetilde{V}_{h,i}$ be restrictions of finite element solutions to eq:localproblem. Further, let $w_{h,i}\in V_{h,i}^+$ such that $\| g_h \|_{1/2, \partial\Om

Figures (5)

  • Figure 1: The computational implementation on a high level and its separation between a local main node and distributed cloud worker nodes. For more details, see gustafsson2024 and our code sourcepackage. The most expensive worker subtask complexities were analyzed empirically in gustafsson2024.
  • Figure 2: Convergence of the method using second-degree polynomials on a log-log scale. On the $x$-axis is the mesh parameter $h$, and on the $y$-axis is the error \ref{eq:energynormerror} of the approximation. The lines depict approximations $\tilde{u}_h$ with different reduction tolerance parameter $\epsilon$, and the gray line has the slope of the theoretical FEM convergence rate. The approximation error converges quadratically in $h$ as expected until the reduction error becomes the dominant factor for tolerance $\epsilon=10^{-2}$.
  • Figure 3: Errors and reduction errors for the approximations $\tilde{u}_h$ in Figure \ref{fig:hepsilonplot} with different tolerances $\epsilon$ on a log-log scale. Each subplot displays the error \ref{eq:energynormerror} (varying color line) of the respective $\tilde{u}_h$ and the reduction error \ref{eq:reductionerror} (dark red line). For the smaller tolerances the reduction error does not affect the approximation, but for $\epsilon=10^{-2}$ it is larger than the conventional FEM error for smaller $h$ and becomes the dominating factor.
  • Figure 4: The sorted singular values of weighted $\bm Z_i$ with three different subdomain extension parameters $r$ plotted on a logarithmic $y$-axis using a second-degree polynomial basis. The original subdomain had 3045 DOFs and the extensions ranged from 12159 to 29069 DOFs. The subdomain diameter is roughly doubled for $r=4h$. Larger extensions produce faster spectral decay as high-frequency modes diminish faster. The numbers of singular vectors for different extensions and tolerances are presented in Table \ref{tbl:localcutoffs}.
  • Figure 5: The pipe geometry discretized into 345821 nodes and 800 subdomains.
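
The convergence behaviour described in the captions of Figures 2 and 3 (quadratic decay in $h$ until the reduction error dominates) can be checked on a log-log scale by estimating observed rates. The sketch below uses synthetic numbers only, not data from the paper: the additive model `C * h**p + floor` is an assumed illustration motivated by the form of the error bound, and the function names are hypothetical.

```python
import numpy as np


def fitted_rate(h, err):
    """Least-squares slope of log(err) against log(h)."""
    slope, _ = np.polyfit(np.log(h), np.log(err), 1)
    return slope


def pairwise_rates(h, err):
    """Observed order between consecutive mesh refinements."""
    h, err = np.asarray(h), np.asarray(err)
    return np.log(err[:-1] / err[1:]) / np.log(h[:-1] / h[1:])


# Synthetic mesh sizes and errors (hypothetical values, p = 2).
h = np.array([0.2, 0.1, 0.05, 0.025])
pure_fem = 3.0 * h**2            # error without a reduction floor
with_floor = 3.0 * h**2 + 1e-2   # a constant term mimics a dominant reduction error

rate = fitted_rate(h, pure_fem)
rates_floor = pairwise_rates(h, with_floor)
print(rate)         # close to 2 for the pure h**2 model
print(rates_floor)  # decreasing rates as the floor takes over
```

Pairwise rates are used for the second data set because a single fitted slope would hide the transition from the $h^p$ regime to the plateau.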

Theorems & Definitions (23)

  • Remark
  • Definition: $\mathcal{Z}_i$ operator
  • Definition: Low-rank approximation of $\mathcal{Z}_i$
  • Definition: Reduced space $\widetilde{V}_{h,i}$
  • Lemma: Local error
  • Proof
  • Remark
  • Remark
  • Remark
  • Lemma: Coercivity of $\mathcal{B}_h$
  • ...and 13 more