Multimodal Neural Operators for Real-Time Biomechanical Modelling of Traumatic Brain Injury

Anusha Agarwal, Dibakar Roy Sarkar, Somdatta Goswami

TL;DR

Traumatic brain injury modeling requires integrating heterogeneous multimodal data, which traditional neural operators cannot handle. The authors formulate multimodal operator learning and extend four architectures (FNO, F-FNO, MG-FNO, DeepONet) with fusion strategies to predict full-field brain displacements from MRI anatomy and scalar demographics, evaluated on 249 MRE samples. MG-FNO achieves the highest accuracy (MSE ≈ 0.0023; 94.3% spatial fidelity) while DeepONet offers the fastest inference for edge deployment; all models provide real-time predictions orders of magnitude faster than finite-element baselines. This work establishes a generalizable framework for heterogeneous data fusion in scientific domains, enabling real-time digital twins for TBI and informing precision neurobiomechanics and beyond.

Abstract

Background: Traumatic brain injury (TBI) is a major global health concern with 69 million annual cases. While neural operators have revolutionized scientific computing, existing architectures cannot handle the heterogeneous multimodal data (anatomical imaging, scalar demographics, and geometric constraints) required for patient-specific biomechanical modeling.

Objective: This study introduces the first multimodal neural operator framework for biomechanics, fusing heterogeneous inputs to predict brain displacement fields for rapid TBI risk assessment.

Methods: TBI modeling was reformulated as a multimodal operator learning problem. We proposed two fusion strategies: field projection for Fourier Neural Operator (FNO) architectures and branch decomposition for Deep Operator Networks (DeepONet). Four architectures (FNO, Factorized FNO, Multi-Grid FNO, and DeepONet) were extended with fusion mechanisms and evaluated on 249 in vivo Magnetic Resonance Elastography (MRE) datasets (20-90 Hz).

Results: Multi-Grid FNO achieved the highest accuracy (MSE = 0.0023, 94.3% spatial fidelity). DeepONet offered the fastest inference (14.5 iterations/s, 7× speedup), suitable for edge deployment. All architectures reduced computation from hours to milliseconds.

Conclusion: Multimodal neural operators enable efficient, real-time, patient-specific TBI risk assessment. This framework establishes a generalizable paradigm for heterogeneous data fusion in scientific domains, including precision medicine.
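The field-projection strategy named in the Methods can be illustrated with a minimal NumPy sketch: each scalar feature is broadcast to a constant field on the imaging grid and stacked with the T1 volume and coordinate channels. This is not the authors' code. The function names, the normalization of scalars and coordinates to [0, 1], and the channel ordering are all assumptions made for illustration; only the grid size [80, 80, 44] and the list of five scalar features come from the paper's figure captions.

```python
import numpy as np

def project_scalars_to_fields(scalars, grid_shape):
    """Broadcast each scalar to a constant field on the grid -> [S, X, Y, Z]."""
    scalars = np.asarray(scalars, dtype=np.float32)
    target = (scalars.size, *grid_shape)
    return np.broadcast_to(scalars[:, None, None, None], target).copy()

def positional_encoding(grid_shape):
    """Coordinate channels X, Y, Z scaled to [0, 1] (assumed) -> [3, X, Y, Z]."""
    axes = [np.linspace(0.0, 1.0, n, dtype=np.float32) for n in grid_shape]
    return np.stack(np.meshgrid(*axes, indexing="ij"), axis=0)

def build_fno_input(t1, scalars):
    """Concatenate image, scalar fields, and coordinates along the channel axis."""
    grid_shape = t1.shape
    return np.concatenate(
        [
            t1[None].astype(np.float32),                     # 1 anatomical channel
            project_scalars_to_fields(scalars, grid_shape),  # one channel per scalar
            positional_encoding(grid_shape),                 # 3 coordinate channels
        ],
        axis=0,
    )

grid_shape = (80, 80, 44)
rng = np.random.default_rng(0)
t1 = rng.random(grid_shape, dtype=np.float32)  # random stand-in for a T1 volume
# Hypothetical normalized values for age, sex, volume, frequency, direction.
scalars = [0.35, 1.0, 0.62, 0.5, 0.0]
x = build_fno_input(t1, scalars)
print(x.shape)  # (9, 80, 80, 44)
```

Under these assumptions the operator sees 1 + 5 + 3 = 9 input channels on a common grid, so the scalar demographics can be processed by the same spectral layers as the image.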

Paper Structure

This paper contains 29 sections, 11 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Representative visualization of the dataset, showing T1-weighted anatomical MRI (left), brain mask (middle), and displacement fields (right), along the central axial cross section for two samples. Displacement magnitude combines both the real and imaginary components of the displacement field. Direction relates to how the T1-weighted MRI was scanned and frequency relates to the rate at which mechanical vibrations are applied for MRE.
  • Figure 2: Fourier Neural Operator (FNO) Architecture for Brain Displacement Prediction. The network takes T1 MRI images [80, 80, 44] as primary input, which are augmented with scalar subject features (age, sex, volume, frequency, direction) and 3D positional encodings (X, Y, Z coordinates), all projected to matching grid dimensions [80, 80, 44]. These multimodal inputs are concatenated along the channel dimension and processed through the core Spectral Conv3D operation, which performs convolution in the frequency domain via Fast Fourier Transform (FFT), spectral convolution (R), and Inverse FFT. The spectral convolution is followed by standard 3D convolution, GELU activation, and dropout layers (repeated 4 times) to predict 3D brain displacement fields for Magnetic Resonance Elastography (MRE).
  • Figure 3: Multi-Grid Fourier Neural Operator (FNO) Architecture for Hierarchical Brain Displacement Prediction. The input domain [80, 80, 44] is spatially decomposed into 32 non-overlapping subsections of size [20, 20, 22] to enable multi-scale processing. The brain MRI data is processed at multiple resolution levels: Level 0 operates on downsampled blocks [20, 20, 22], while Level 1 processes the full domain context. Each level is fed into separate FNO networks that independently predict 3-component displacement fields [3, 20, 20, 22] for their respective scales. The multi-scale predictions are then recombined through a prediction recombination module to reconstruct the final high-resolution brain displacement output [3, 80, 80, 44] for MRE analysis. This hierarchical approach allows the model to capture both local fine-grained deformation patterns within individual patches and global long-range dependencies across the entire brain domain, improving computational efficiency while maintaining prediction accuracy for biomechanical modeling.
  • Figure 4: Deep Operator Network (DeepONet) Architecture for Brain Displacement Prediction. The architecture employs six branch networks and one trunk network to map multimodal inputs to 3D displacement fields. A CNN branch processes T1-weighted MRI slices [80$\times$80$\times$44] through convolutional layers with batch normalization and pooling to extract 300D anatomical features, while five separate FNN branches encode scalar parameters (scan direction, vibration frequency, sex, brain volume, age) into 300D embeddings each. All branch outputs are fused via element-wise multiplication to create a unified 300D representation encoding both anatomical and demographic information. The trunk network processes 3D spatial coordinates ($x$, $y$, $z$) through multilayer perceptrons to generate 300D spatial basis functions at each voxel location. For displacement prediction, the 300D branch vector is partitioned into three 100D segments corresponding to $x$-, $y$-, and $z$-components, which undergo inner-product fusion with corresponding trunk segments via Einstein summation. This operator learning framework generates continuous 3D displacement fields [Batch $\times$ $N_{\text{voxels}}$ $\times$ 3] by decoupling spatial and functional dependencies, enabling prediction of brain tissue deformation from multimodal anatomical and acquisition parameters for MR elastography applications.
  • Figure 5: Training and Validation Losses across Models. Both the FNO and F-FNO converged much faster on the imaginary displacement fields than on the real displacement fields. In general, the F-FNO converged in fewer epochs than the other FNO variants.
  • ...and 2 more figures
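
The branch-trunk fusion described in the Figure 4 caption can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' implementation: random arrays stand in for the outputs of the six branch networks and the trunk network, the function names are hypothetical, the trunk output is assumed to be shared across the batch, and the three 100D segments are assumed to be contiguous slices of the 300D vector.

```python
import numpy as np

P, COMPONENTS = 300, 3  # latent width and displacement components (Figure 4 caption)

def fuse_branches(branch_outputs):
    """Element-wise product of branch embeddings, each [B, P] -> [B, P]."""
    fused = np.ones_like(branch_outputs[0])
    for b in branch_outputs:
        fused = fused * b
    return fused

def deeponet_output(fused_branch, trunk):
    """Segment-wise inner product: [B, P] and [N, P] -> displacement [B, N, 3]."""
    batch = fused_branch.shape[0]
    n_voxels = trunk.shape[0]
    seg = P // COMPONENTS  # 100 latent dimensions per displacement component
    b = fused_branch.reshape(batch, COMPONENTS, seg)
    t = trunk.reshape(n_voxels, COMPONENTS, seg)
    # Sum over the latent index p separately for each component c.
    return np.einsum("bcp,ncp->bnc", b, t)

rng = np.random.default_rng(0)
batch, n_voxels = 2, 50  # the full grid would have 80 * 80 * 44 = 281,600 voxels
# Stand-ins for one CNN branch and five FNN branches, each emitting a 300D vector.
branch_outputs = [rng.standard_normal((batch, P)) for _ in range(6)]
# Stand-in for the trunk output evaluated at each voxel coordinate.
trunk = rng.standard_normal((n_voxels, P))

fused = fuse_branches(branch_outputs)
u = deeponet_output(fused, trunk)
print(u.shape)  # (2, 50, 3)
```

Because the trunk is evaluated per coordinate, the same fused branch vector can be queried at any subset of voxels, which is one reason this design is cheap at inference time.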