
Evaluation of data driven low-rank matrix factorization for accelerated solutions of the Vlasov equation

Bhavana Jonnalagadda, Stephen Becker

TL;DR

A data-driven factorization method based on artificial neural networks with a convolutional layer architecture is trained on existing simulation data. It achieves comparable reconstruction accuracy on interpolation tasks, and it is better suited to interpolation than to predicting future states of time-evolving systems.

Abstract

Low-rank methods have shown success in accelerating simulations of a collisionless plasma described by the Vlasov equation, but they still rely on computationally costly linear algebra at every time step. We propose a data-driven factorization method using artificial neural networks, specifically with a convolutional layer architecture, that trains on existing simulation data. At inference time, the model outputs a low-rank decomposition of the distribution field of the charged particles, and we demonstrate that this step is faster than the standard linear algebra technique. Numerical experiments show that the method effectively interpolates time-series data, generalizing to unseen test data in a manner beyond just memorizing training data; patterns in the factorization also followed the same numerical trend as those of algebraic methods (e.g., truncated singular-value decomposition). However, when trained on the first 70% of a time series and tested on the remaining 30%, the method fails to meaningfully extrapolate. Despite this limiting result, the technique may have benefits for simulations in a statistical steady state or otherwise showing temporal stability.
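To make the algebraic baseline concrete, the following is a minimal NumPy sketch of a rank-r truncated singular-value decomposition, written so that it returns two factors U and V in the same form the network is described as producing. It is an illustration under stated assumptions, not the authors' code: the synthetic frame, the row/column orientation, the function names, and the choice to split the singular values evenly between U and V are all hypothetical.

```python
import numpy as np


def truncated_svd_factors(F, rank):
    """Return U (m x r) and V (n x r) such that U @ V.T is the rank-r truncated SVD of F."""
    W, s, Zt = np.linalg.svd(F, full_matrices=False)
    # Assumption: absorb the singular values symmetrically into both factors.
    root = np.sqrt(s[:rank])
    U = W[:, :rank] * root
    V = Zt[:rank, :].T * root
    return U, V


def relative_error(F, U, V):
    """Relative Frobenius-norm error of the reconstruction U @ V.T."""
    return np.linalg.norm(F - U @ V.T) / np.linalg.norm(F)


# Synthetic stand-in for one phase-space frame (NOT the paper's data).
# Assumed orientation: 128 velocity rows by 256 spatial columns.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0 * np.pi, 256)
v = np.linspace(-6.0, 6.0, 128)
F = np.exp(-0.5 * (v[:, None] - 0.5 * np.sin(x)[None, :]) ** 2)
F += 0.01 * rng.standard_normal(F.shape)

# Reconstruction error at the ranks shown in the reconstruction figure.
errors = {r: relative_error(F, *truncated_svd_factors(F, r)) for r in (6, 12, 30)}
for r, err in errors.items():
    print(f"rank {r:2d}: relative error {err:.3e}")
```

By the Eckart-Young theorem, the truncated SVD gives the smallest Frobenius-norm error of any rank-r approximation, so it serves as a lower bound on the reconstruction error that a learned factorization of the same rank can reach.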

Paper Structure (24 sections, 8 equations, 10 figures, 1 table)

Figures (10)

  • Figure 1: Visualization of generated data. Phase-space representation of the generated plasma data at $128\times256$ pixels, showing selected frames (matrices) from the full time-series sequence. The $x$-axis is space and the $y$-axis is velocity. Left: One of the first frames in the sequence, and also the "easiest" frame for the networks to learn (lowest average loss over all networks). Middle: A randomly selected frame from the middle of the sequence. Right: One of the last frames, and also the frame that was "hardest" to learn (highest average loss over all networks).
  • Figure 2: ConvMF structure. The structure of the best network selected after training and hyperparameter testing. The input passes through several convolutional and then linear layers before splitting into two paths (a fork), each with several linear layers, one for each output, $U$ and $V$. The layers are not drawn to the actual dimensions used; their visual sizes indicate only the relative scale of the layers to each other.
  • Figure 3: Rank vs average scaled loss. The average scaled loss, on validation and held-out test data, for the best neural network tested ("ConvMF") and for the SVD, across different ranks of the resulting $U$ and $V$. For ConvMF, a separate network was trained for each rank. Left: The resulting loss per rank for input data of size $64\times128$. Right: The resulting loss per rank for input data of size $128\times256$. Bottom: The same two plots with a logarithmic $y$-axis.
  • Figure 4: Reconstructed output from models vs original input. Matrices reconstructed by multiplying the output factors of each method, shown against the original input of size $128\times256$, for selected ranks. The input at time index 16025 was chosen because it had the highest average loss across all models (and ranks) used. Top row: The reconstruction from the output $U, V$ of ConvMF, at ranks 6, 12, 30. Middle row: The reconstruction from the output $U, \Sigma, V$ of the SVD, at ranks 6, 12, 30. Bottom image: The original input to both methods, at size $128\times256$ and 98% of the way into the time series.
  • Figure 5: Average execution time across ranks. The average execution time for the two SVD methods tested and for the selected network, evaluated on test data. Left: The times for input matrix size $64\times128$. Right: The times for input matrix size $128\times256$.
  • ...and 5 more figures
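
The two evaluation regimes named in the abstract, interpolation within a time series and extrapolation past its end, differ only in how frame indices are assigned to training and test sets. The sketch below illustrates that difference. The 70%/30% chronological cut is taken from the abstract; the random held-out scheme for interpolation, its 30% test fraction, the frame count, and all function names are assumptions made for illustration.

```python
import numpy as np


def extrapolation_split(n_frames, train_fraction=0.7):
    """Chronological split: train on the first part of the series, test on the rest."""
    cut = int(round(n_frames * train_fraction))
    idx = np.arange(n_frames)
    return idx[:cut], idx[cut:]


def interpolation_split(n_frames, test_fraction=0.3, seed=0):
    """Random split (assumed scheme): test frames are scattered among training frames."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_frames)
    n_test = int(round(n_frames * test_fraction))
    return np.sort(perm[n_test:]), np.sort(perm[:n_test])


n_frames = 1000  # hypothetical series length
train_ex, test_ex = extrapolation_split(n_frames)
train_in, test_in = interpolation_split(n_frames)

# Extrapolation: every test frame lies later in time than every training frame.
print(train_ex.max() < test_ex.min())
# Interpolation: test frames fall inside the time range covered by training frames.
print(test_in.min() < train_in.max())
```

In the interpolation case each test frame has nearby training frames on both sides in time, whereas in the extrapolation case the model must factorize frames from a part of the evolution it has never seen.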