Metriplector: From Field Theory to Neural Architecture

Dan Oprisa, Peter Toth

Abstract

We present Metriplector, a neural architecture primitive in which the input configures an abstract physical system -- fields, sources, and operators -- and the dynamics of that system is the computation. Multiple fields evolve via coupled metriplectic dynamics, and the stress-energy tensor $T^{\mu\nu}$, derived from Noether's theorem, provides the readout. The metriplectic formulation admits a natural spectrum of instantiations: the dissipative branch alone yields a screened Poisson equation solved exactly via conjugate gradient; activating the full structure -- including the antisymmetric Poisson bracket -- gives field dynamics for image recognition and language modeling. We evaluate Metriplector across four domains, each using a task-specific architecture built from this shared primitive with progressively richer physics: F1=1.0 on maze pathfinding, generalizing from 15x15 training grids to unseen 39x39 grids; 97.2% exact Sudoku solve rate with zero structural injection; 81.03% on CIFAR-100 with 2.26M parameters; and 1.182 bits/byte on language modeling with 3.6x fewer training tokens than a GPT baseline.

Paper Structure

This paper contains 83 sections, 2 theorems, 29 equations, 13 figures, and 7 tables.

Key Result

Theorem 1

If the action $\mathcal{A}[\bm{\psi}] = \int \mathcal{L}(\bm{\psi}, \partial_\mu\bm{\psi})\, d^n x$ is invariant under a continuous one-parameter family of transformations $\bm{\psi} \to \bm{\psi} + \epsilon\, \delta\bm{\psi}$, then the Noether current $j^\mu = \frac{\partial \mathcal{L}}{\partial(\partial_\mu \bm{\psi})} \cdot \delta\bm{\psi}$ is conserved: $\partial_\mu j^\mu = 0$.
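
For completeness, the standard one-line derivation for an internal symmetry (assuming the Lagrangian is exactly invariant, with no boundary term):

$$
0 = \delta\mathcal{L}
  = \underbrace{\left[\frac{\partial\mathcal{L}}{\partial\bm{\psi}}
  - \partial_\mu \frac{\partial\mathcal{L}}{\partial(\partial_\mu\bm{\psi})}\right]}_{=\,0 \text{ on-shell (Euler--Lagrange)}} \cdot\, \delta\bm{\psi}
  \;+\; \partial_\mu\!\underbrace{\left[\frac{\partial\mathcal{L}}{\partial(\partial_\mu\bm{\psi})} \cdot \delta\bm{\psi}\right]}_{j^\mu},
$$

so on solutions of the field equations the first bracket vanishes and $\partial_\mu j^\mu = 0$ follows.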

Figures (13)

  • Figure 1: Metriplector field interaction. Top: $K$ fields $\bm{\psi}_k$ evolve via metriplectic dynamics over the spatial grid (gradient arrows show $\nabla\bm{\psi}_k$); the outer product $\nabla\bm{\psi}_a \otimes \nabla\bm{\psi}_b$ yields three stress-energy components. Bottom: per-field gradient energy $E_k = |\nabla\bm{\psi}_k|^2$, cross-field correlation $E_{ab} = \nabla\bm{\psi}_a \cdot \nabla\bm{\psi}_b$, and vorticity $V_{ab} = \nabla\bm{\psi}_a \times \nabla\bm{\psi}_b$, summed and projected via Conv$1{\times}1$ into $\mathbf{h}$. Shown for $K{=}3$; the full model uses $K{=}32$. A code sketch of this readout follows the figure list.
  • Figure 2: The metriplectic spectrum. All four domains instantiate the same GENERIC equation. Maze and Sudoku use only the dissipative branch ($M$), solved at equilibrium via CG. Language uses causal dissipation via scan. CIFAR-100 activates full metriplectic structure ($M{+}L$) via Euler integration.
  • Figure 3: Metriplector architecture. A single round of the recurrent V-cycle (repeated $R$ times with shared weights). The cell encoder produces per-cell features $\mathbf{h}$ from input, previous predictions, position, and round fraction. Learned symmetric conductances $w_{ij}$ define the graph Laplacian. Damping and source MLPs produce per-cell screening and forcing terms. $K$ independent screened Poisson equations are solved via CG (a minimal CG sketch follows the figure list). The dissipation readout $D_k = \sum_j w_{ij}(\bm{\psi}_i - \bm{\psi}_j)^2 \approx |\nabla\bm{\psi}_k|^2$ extracts the diagonal of the stress-energy tensor (the same $T^{\mu\nu}$ readout used in CIFAR-100, cf. Section \ref{sec:discussion}); 8-way directional scans add global context. The multigrid object layer restricts to learned soft-assignment groups, solves a coarse Poisson, and prolongates back. The decoder reads all features to produce per-cell predictions, fed back via annealed softmax ($\tau\!: 1.0 \to 0.2$). Blue: prediction feedback across rounds. Orange: multigrid V-cycle within each round. Reasoning domains (maze, Sudoku) share this CG-based architecture with different $K$, $R$, and output dimension $C$; CIFAR-100 uses a distinct Euler-based instantiation (Section \ref{sec:cifar100_method}).
  • Figure 4: Causal Poisson language model. Tokens are embedded and passed through $L{=}6$ non-shared CausalPoissonLayers. Each layer solves the causal Poisson recurrence via $O(N \log N)$ parallel associative scan (a toy scan sketch follows the figure list), applies progressive multigrid (token $\to$ chunk $\to$ section scales with shifted pooling for causal safety), computes cross-field outer products ($\bm{\psi} \otimes \bm{\psi}$), and integrates all features into the hidden state $\mathbf{h}$ via a round MLP with residual connection.
  • Figure 5: CIFAR-100 metriplectic layer ($\times 12$, non-shared weights). The representation $\mathbf{h}$ ($D{=}128$) flows along the top via residual connections. Each layer projects $\mathbf{h}$ down to $K{=}32$ physics fields $\bm{\psi}$, evolves them under the full metriplectic equation (diffusion $+$ advection $+$ damping $+$ source; a toy Euler step follows the figure list), extracts physically meaningful features via the stress-energy tensor (Noether readout), and projects back to $D$ via gated mixing. The $K$-field bottleneck reduces pairwise interaction cost by $16\times$ compared to operating in the full $D$ space.
  • ...and 8 more figures
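
Code Sketches

As a concrete illustration of the Figure 1 readout, the following PyTorch sketch computes the three stress-energy feature families ($E_k$, $E_{ab}$, $V_{ab}$) from $K$ fields on a grid and projects them through a Conv$1{\times}1$. It is a minimal reconstruction from the caption alone: the finite-difference gradients, the feature ordering, and the `stress_energy_readout`/`proj` names are our assumptions, not the paper's code.

```python
import torch

def stress_energy_readout(psi, proj):
    """Noether-style readout from K scalar fields on a 2D grid (a sketch).

    psi:  (B, K, H, W) field values
    proj: a 1x1 conv projecting the stacked features to the hidden size D
    """
    # Finite-difference spatial gradients along H (y) and W (x).
    gy, gx = torch.gradient(psi, dim=(2, 3))

    feats = []
    K = psi.shape[1]
    for a in range(K):
        # Per-field gradient energy E_k = |grad psi_k|^2.
        feats.append(gx[:, a] ** 2 + gy[:, a] ** 2)
        for b in range(a + 1, K):
            # Cross-field correlation E_ab = grad psi_a . grad psi_b.
            feats.append(gx[:, a] * gx[:, b] + gy[:, a] * gy[:, b])
            # Vorticity V_ab = grad psi_a x grad psi_b (2D cross product).
            feats.append(gx[:, a] * gy[:, b] - gy[:, a] * gx[:, b])

    # Stack to (B, K^2, H, W) and project to h via Conv1x1.
    return proj(torch.stack(feats, dim=1))

B, K, H, W, D = 2, 3, 8, 8, 16
proj = torch.nn.Conv2d(K * K, D, kernel_size=1)  # K + 2*C(K,2) = K^2 channels
h = stress_energy_readout(torch.randn(B, K, H, W), proj)  # -> (B, D, H, W)
```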
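The dissipative branch reduces to a screened Poisson solve, so a minimal NumPy/SciPy sketch of the CG step in Figure 3 looks like the following. It uses a fixed 5-point Laplacian with uniform conductance; in the paper the conductances $w_{ij}$, per-cell damping, and sources come from learned MLPs, so `screened_poisson_cg` and its arguments are illustrative assumptions.

```python
import numpy as np
from scipy.sparse import diags, identity, kron
from scipy.sparse.linalg import cg

def screened_poisson_cg(source, kappa, n):
    """Solve (L + diag(kappa)) psi = source on an n x n grid via conjugate
    gradient, where L is the 5-point graph Laplacian (uniform conductance)."""
    # 1D second-difference matrix; a Kronecker sum gives the 2D Laplacian.
    d1 = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
    L = kron(identity(n), d1) + kron(d1, identity(n))
    A = L + diags(kappa.ravel())   # screening term makes A positive definite
    psi, info = cg(A, source.ravel())
    assert info == 0               # 0 means CG converged
    return psi.reshape(n, n)

n = 15                             # matches the paper's training grid size
psi = screened_poisson_cg(np.random.randn(n, n), np.full((n, n), 0.1), n)
```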
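Figure 4's $O(N \log N)$ scan presumably exploits the associativity of composing per-token affine updates; the actual coefficients in the paper are derived from the causal Poisson operator, which we do not model here. The toy below scans a generic first-order linear recurrence $h_t = a_t h_{t-1} + b_t$ with Hillis-Steele rounds and checks it against the sequential loop; all names are hypothetical.

```python
import numpy as np

def linear_scan(a, b):
    """All-prefix evaluation of h_t = a_t * h_{t-1} + b_t (h_{-1} = 0).
    The pair (a, b) composes associatively as (a2*a1, a2*b1 + b2), so
    ceil(log2(n)) rounds of pairwise combines suffice."""
    n = len(a)
    A, B = a.copy(), b.copy()
    step = 1
    while step < n:
        A2, B2 = A.copy(), B.copy()
        A2[step:] = A[step:] * A[:-step]
        B2[step:] = A[step:] * B[:-step] + B[step:]
        A, B = A2, B2
        step *= 2
    return B                       # B_t holds the fully composed h_t

a = np.full(8, 0.9)
b = np.ones(8)
h = linear_scan(a, b)
# Check against the sequential recurrence.
ref, prev = [], 0.0
for t in range(8):
    prev = a[t] * prev + b[t]
    ref.append(prev)
assert np.allclose(h, ref)
```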
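Finally, one explicit Euler step of the kind Figure 5 describes (diffusion from the dissipative $M$ branch, advection standing in for the antisymmetric $L$ branch, plus damping and source) can be sketched as below. The paper's operators are learned per layer, so the fixed stencil, the velocity field `v`, and the `metriplectic_euler_step` signature are assumptions.

```python
import torch
import torch.nn.functional as F

def metriplectic_euler_step(psi, v, gamma, s, nu=0.2, dt=0.1):
    """One explicit Euler step of
        d psi/dt = nu * Lap(psi) - v . grad(psi) - gamma * psi + s,
    i.e. diffusion + advection + damping + source (a sketch).

    psi: (B, K, H, W) fields; v: (B, 2, H, W) velocity (vx, vy);
    gamma, s: (B, K, H, W) per-cell damping and source.
    """
    K = psi.shape[1]
    # Depthwise 5-point Laplacian: the same fixed stencil for every field.
    kern = psi.new_tensor([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]])
    lap = F.conv2d(psi, kern.view(1, 1, 3, 3).repeat(K, 1, 1, 1),
                   padding=1, groups=K)
    gy, gx = torch.gradient(psi, dim=(2, 3))
    advect = v[:, :1] * gx + v[:, 1:] * gy   # v . grad(psi), broadcast over K
    return psi + dt * (nu * lap - advect - gamma * psi + s)

B, K, H, W = 2, 4, 8, 8
psi = metriplectic_euler_step(torch.randn(B, K, H, W),
                              torch.randn(B, 2, H, W),
                              torch.rand(B, K, H, W),
                              torch.randn(B, K, H, W))
```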

Theorems & Definitions (2)

  • Theorem 1: Noether's theorem, 1918
  • Proposition 2: Well-posedness