Transformer Causality Regularization for Dynamic Inverse Problems

Gesa Sarnighausen, Anne Wald, Andreas Hauptmann

Abstract

We study how to include the causality principle as a regularizer in the solution of linear time-dependent inverse problems. This is achieved by combining transformer-based predictions with classical variational regularization, resulting in what we call transformer causality regularization (TCR). The causality principle states that an object at time $t'$ depends only on its previous states at $t < t'$ and is independent of future states at $t > t'$. Since the transformer architecture represents sequence-to-sequence functions and can be equipped with a causal attention mask, transformers are a natural choice for a learned causality function that predicts the state of an object at time $t'$ given the previous states at $t < t'$. We combine this with the inductive bias of convolutional neural networks (CNNs) for imaging tasks to treat the spatial variable. The output of the spatial-temporal transformer is then used as a prior for variational regularization, so that classical results on regularization and on the convergence of solution methods transfer directly to our case. Using the example of dynamic computerized tomography, we compare TCR to a static and a dynamic version of the previously introduced unrolled adversarial regularizer (UAR) on simulated and measured data. The results show that using TCR within a variational framework improves reconstruction results and data consistency.
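
The causal attention mask mentioned in the abstract can be illustrated with a minimal sketch. This is not the architecture of the paper: it assumes a single attention head, identity query/key/value projections, and a hypothetical function name `causal_self_attention`, and it only shows how a lower-triangular mask prevents a time step from attending to later ones. In the usual next-frame setup, the output at position $t$ is trained to predict frame $t+1$, so the prediction depends only on frames up to $t$.

```python
import numpy as np


def causal_self_attention(x):
    """Single-head self-attention with a causal mask.

    x: array of shape (T, D), one D-dimensional embedding per time step.
    Identity projections are used for queries, keys and values
    (a simplifying assumption, not the model from the paper).
    Returns the attended sequence (T, D) and the attention weights (T, T).
    """
    T, D = x.shape
    scores = x @ x.T / np.sqrt(D)                # (T, T) similarity scores
    mask = np.tril(np.ones((T, T), dtype=bool))  # row t may see columns <= t
    scores = np.where(mask, scores, -np.inf)     # block attention to the future
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)                     # exp(-inf) = 0 for masked entries
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ x, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.standard_normal((5, 4))         # 5 time steps, 4 features each
    out, w = causal_self_attention(frames)
    print(np.round(w, 2))                        # upper triangle is zero
```

Changing the embeddings of later time steps leaves the outputs at earlier time steps unchanged, which is the discrete analogue of the causality principle stated above.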

Paper Structure (31 sections, 21 equations, 6 figures, 3 tables)

Figures (6)

  • Figure 1: Left: Architecture of the temporal transformer encoder with rotary position embedding (RoPE), $h = 8$ heads and $6$ transformer layers. Right: Architecture of the next-frame prediction/refinement model, with number of patches $N = 356$, transformer model dimension $D = 512$, input sequence length $T$, and training batch size $B = 8$.
  • Figure 2: Results for a test phantom with 20 measurement angles for the initial time steps 0 and 1 for filtered backprojection (FBP), the $L^1$ transformer-based reconstruction (causality reconstruction) and the corresponding prediction (causality prediction). For the remaining time steps, measurements from 3 equidistantly rotating angles are taken. For the UAR reconstructions trained on static 2D phantoms (UAR static) and on dynamic 3D phantoms (UAR dynamic), measurements from 3 angles are taken for all time steps.
  • Figure 3: Results for the last time step of several test phantoms with 3 measurement angles. From top to bottom: ground truth, the $L^1$ transformer-based reconstruction obtained with the learned causality function, the associated prediction, and the prediction obtained auto-regressively by using only the transformer with two input frames (Landweber reconstructions obtained from 20 measurement angles).
  • Figure 4: From top to bottom: Reference solution for the rolling stones data set (reference), $L^1$ transformer based reconstruction for 10 initial angles and $3$ angles for the remaining time steps ($L^1$-reco), corresponding transformer prediction ($L^1$-pred), $L^1$-TV transformer based reconstruction for 10 initial angles and $3$ angles for the remaining time steps ($L^1$-TV-reco), corresponding transformer prediction ($L^1$-TV-pred).
  • Figure 5: Last time step of the test phantom in Figure \ref{fig:Phantom4} for the initial baseline generator of UAR, optimized using \ref{eq:lossGenDiscr}, in the static case (trained on 2D phantoms) and the dynamic case (trained on 3D phantoms).
  • ...and 1 more figures