Adaptive Mesh-Quantization for Neural PDE Solvers

Winfried van den Dool, Maksim Zhdanov, Yuki M. Asano, Max Welling

TL;DR

This paper introduces Adaptive Mesh Quantization (AMQ), a spatially adaptive, mixed-precision framework for neural PDE solvers that allocates higher bit-width to regions of greater complexity using a lightweight auxiliary model trained to predict local loss. By integrating AMQ with state-of-the-art GNN-based PDE solvers (MP-PDE) and mesh-transformer architectures (GraphViT), the authors demonstrate consistent Pareto improvements over uniformly quantized baselines across diverse tasks, including Darcy flow, large-scale 2D unsteady dynamics, 3D Navier–Stokes, and 2D hyper-elasticity, achieving up to 50% gains at the same computational cost. The method hinges on a fixed budget of compute, node/edge/cluster bit-width assignments guided by predicted spatial complexity, and a training procedure that jointly optimizes the auxiliary predictor and the quantized main model. Hardware-aware optimizations for non-uniform quantization and a bucketed, single-GEMM implementation underpin practical efficiency. Overall, AMQ enables higher-resolution PDE surrogates within fixed budgets, with notable improvements in low-bit regimes and robustness across model scales and datasets.

Abstract

Physical systems commonly exhibit spatially varying complexity, presenting a significant challenge for neural PDE solvers. While Graph Neural Networks can handle the irregular meshes required for complex geometries and boundary conditions, they still apply uniform computational effort across all nodes regardless of the underlying physics complexity. This leads to inefficient resource allocation where computationally simple regions receive the same treatment as complex phenomena. We address this challenge by introducing Adaptive Mesh Quantization: spatially adaptive quantization across mesh node, edge, and cluster features, dynamically adjusting the bit-width used by a quantized model. We propose an adaptive bit-width allocation strategy driven by a lightweight auxiliary model that identifies high-loss regions in the input mesh. This enables dynamic resource distribution in the main model, where regions of higher difficulty are allocated increased bit-width, optimizing computational resource utilization. We demonstrate our framework's effectiveness by integrating it with two state-of-the-art models, MP-PDE and GraphViT, to evaluate performance across multiple tasks: 2D Darcy flow, large-scale unsteady fluid dynamics in 2D, steady-state Navier-Stokes simulations in 3D, and a 2D hyper-elasticity problem. Our framework demonstrates consistent Pareto improvements over uniformly quantized baselines, yielding up to 50% improvements in performance at the same cost.
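The allocation idea in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `assign_bits` and `fake_quantize` are hypothetical, the top-fraction thresholding rule and the symmetric per-node max-scaling quantizer are assumed for simplicity, and the per-node scores stand in for whatever the auxiliary model would predict.

```python
import numpy as np

def assign_bits(scores, high_fraction, low_bits=4, high_bits=8):
    """Give high_bits to the top `high_fraction` of nodes by score, low_bits to the rest.

    Assumption: a simple top-k rule under a fixed budget; the paper's exact rule may differ.
    """
    scores = np.asarray(scores, dtype=float)
    n_high = int(round(high_fraction * scores.size))
    bits = np.full(scores.size, low_bits, dtype=int)
    if n_high > 0:
        # Indices of the n_high largest scores.
        top = np.argsort(-scores, kind="stable")[:n_high]
        bits[top] = high_bits
    return bits

def fake_quantize(features, bits):
    """Quantize each node's feature row to a signed grid of bits[i] bits, then dequantize.

    Assumption: symmetric quantization with one max-abs scale per node.
    """
    features = np.asarray(features, dtype=float)
    # Largest representable magnitude: 7 for Int4, 127 for Int8.
    qmax = (2.0 ** (np.asarray(bits) - 1) - 1)[:, None]
    max_abs = np.abs(features).max(axis=1, keepdims=True)
    scale = np.where(max_abs > 0, max_abs / qmax, 1.0)
    q = np.clip(np.round(features / scale), -qmax, qmax)
    return q * scale

# Hypothetical usage: random features and scores in place of a real mesh and auxiliary model.
rng = np.random.default_rng(0)
features = rng.normal(size=(10, 16))
scores = rng.random(10)
bits = assign_bits(scores, high_fraction=0.3)
dequantized = fake_quantize(features, bits)
per_node_error = np.abs(dequantized - features).mean(axis=1)
```

Raising `high_fraction` moves more nodes to the higher bit-width, which is the knob that trades cost against accuracy in this sketch.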

Paper Structure

This paper contains 38 sections, 10 equations, 7 figures, 5 tables, 3 algorithms.

Figures (7)

  • Figure 1: Overview of the proposed framework. Given a point cloud or graph, a bit assigner returns a node-wise quantization scheme for a larger model. The goal is to assign higher precision to more difficult and complex regions.
  • Figure 2: Overview of the training phase (A). A lightweight auxiliary model assigns a complexity weight to every mesh node to guide the resource allocation of the large main message-passing model. Gradients do not flow through the dotted black arrows, i.e., backpropagation happens through the solid black arrows. (B). The bit assignment process is schematically shown in (C).
  • Figure 3: Validation loss vs. cost (estimated in $10^9$ MACs) for uniform vs. adaptive (ours) quantization. The adaptive curve is obtained by increasing the fraction of Int8 nodes, and its cost includes the auxiliary model. Our framework demonstrates consistent Pareto improvements across all benchmarks. The main model is an MP-PDE for Darcy, ShapeNet-Car, and Elasticity, and GraphViT for EAGLE.
  • Figure 4: The proposed framework on an arbitrary example from the EAGLE dataset. The model predicts the field from the PDE solution at the previous timestep. The bitmap assignment process captures the airflow and allocates resources to the relevant region. In this example, 30% of the nodes are assigned Int8, leaving the rest in low precision (Int4).
  • Figure 5: Performance of smaller (left) vs larger (right) main model on ShapeNet-Car and EAGLE tasks.
  • ...and 2 more figures
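
As a worked reading of the Figure 4 example, the mean bit-width per node for a 30% Int8 / 70% Int4 split is computed below. This is plain arithmetic on the numbers in the caption; the symbol $\bar{b}$ is introduced here for illustration, and the value is not the MAC-based cost plotted in Figure 3.

```latex
\bar{b} = 0.3 \cdot 8 + 0.7 \cdot 4 = 2.4 + 2.8 = 5.2 \ \text{bits per node}
```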