RoseNNa: A performant, portable library for neural network inference with application to computational fluid dynamics

Ajay Bati; Spencer H. Bryngelson

TL;DR

CFD practitioners face a language gap between Python-based ML tools and HPC solvers written in Fortran or C/C++. RoseNNa is a fast, non-invasive neural network inference library that converts ONNX-encoded models into an optimized Fortran library with C and Fortran APIs, using the Python-based fypp preprocessor for metaprogramming. The tool targets small CFD-relevant architectures such as MLPs and LSTMs, with optional support for convolutions and pooling. On a single core and without external dependencies, it runs roughly 10x faster than PyTorch and libtorch for small networks and roughly 2x faster for larger ones. This work offers a practical path to integrating learned closures into CFD solvers, supported by open-source access and a clear contributor workflow for extending architectures.

Abstract

The rise of neural network-based machine learning ushered in high-level libraries, including TensorFlow and PyTorch, to support their functionality. Computational fluid dynamics (CFD) researchers have benefited from this trend and produced powerful neural networks that promise shorter simulation times. For example, multilayer perceptrons (MLPs) and long short-term memory (LSTM) recurrent neural network (RNN) architectures can represent sub-grid physical effects, like turbulence. Implementing neural networks in CFD solvers is challenging because the programming languages used for machine learning and CFD are mostly non-overlapping. We present the RoseNNa library, which bridges the gap between neural network inference and CFD. RoseNNa is a non-invasive, lightweight (1000 lines), and performant tool for neural network inference, with a focus on the smaller networks used to augment PDE solvers, like those of CFD, which are typically written in C/C++ or Fortran. RoseNNa accomplishes this by automatically converting trained models from typical neural network training packages into a high-performance Fortran library with C and Fortran APIs. This reduces the effort needed to access trained neural networks and maintains performance in the PDE solvers that CFD researchers build and rely upon. Results show that RoseNNa reliably outperforms PyTorch (Python) and libtorch (C++) on MLPs and LSTM RNNs with fewer than 100 hidden layers and 100 neurons per layer, even after removing the overhead cost of API calls. Speedups over these established libraries range from a factor of about 10 at the smaller end of the tested network sizes to a factor of about 2 at the larger end.

Paper Structure (12 sections, 4 figures)

This paper contains 12 sections, 4 figures.

Introduction
Design strategy
Design options
ONNX
Metaprogramming
RoseNNa capabilities
User interface
Results
Flexibility and portability
Performance on example cases
Comparison to a lower-level implementation
Conclusions

Figures (4)

Figure 1: RoseNNa (c) is a neural network converter that integrates into the inference process. It encodes the ONNX-converted neural network and transforms it into performant Fortran code with C and Fortran APIs. The user provides the components outside of (c).
Figure 2: Multilayer perceptron (MLP) time comparison (RoseNNa versus PyTorch). $\delta$ represents a specific hidden size (neurons per layer), and the x-axis represents the depth (number of hidden layers). Random activation functions (ReLU, Tanh, Sigmoid) were chosen for each MLP and assigned to each hidden layer.
Figure 3: Long Short-Term Memory (LSTM) time comparison (RoseNNa/PyTorch). The horizontal axis is the number of time steps (depth), and $\lambda$ is the hidden dimension size. All the typical operations and activation functions were incorporated into the timing of the LSTM cells.
Figure 4: Multilayer perceptron (MLP) model time comparison (RoseNNa/libtorch). $\delta$ is the hidden size, and the horizontal axis is the number of layers. Libtorch is PyTorch's C++ API. The same scheme for testing the RoseNNa to PyTorch speed ratio for MLPs was used for these tests.
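
The MLP test cases described in the captions of Figures 2 and 4 vary two parameters: the hidden size $\delta$ and the depth. The sketch below illustrates what such a test network might look like, using only the Python standard library. It is not RoseNNa code and not the authors' benchmark script. The names `build_mlp` and `forward`, the random placeholder weights, the input and output sizes, and the choice to draw one activation per hidden layer are all assumptions made for illustration.

```python
import math
import random

# Activation functions named in the Figure 2 caption.
ACTIVATIONS = {
    "relu": lambda x: max(0.0, x),
    "tanh": math.tanh,
    "sigmoid": lambda x: 1.0 / (1.0 + math.exp(-x)),
}


def build_mlp(n_inputs, hidden_size, depth, n_outputs, seed=0):
    """Build a hypothetical MLP as a list of (weights, biases, activation).

    hidden_size plays the role of delta and depth is the number of hidden
    layers. Weights are random placeholders, not trained values.
    """
    rng = random.Random(seed)
    sizes = [n_inputs] + [hidden_size] * depth + [n_outputs]
    layers = []
    for i in range(len(sizes) - 1):
        fan_in, fan_out = sizes[i], sizes[i + 1]
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        biases = [rng.uniform(-0.1, 0.1) for _ in range(fan_out)]
        # Assumption: one randomly drawn activation per hidden layer,
        # and no activation on the output layer.
        act = rng.choice(sorted(ACTIVATIONS)) if i < depth else None
        layers.append((weights, biases, act))
    return layers


def forward(layers, x):
    """Run inference: a dense layer followed by its activation, repeated."""
    for weights, biases, act in layers:
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if act is not None:
            f = ACTIVATIONS[act]
            x = [f(v) for v in x]
    return x
```

Calling `forward(build_mlp(4, 8, 3, 2), [0.5, -0.25, 1.0, 0.0])` returns a list of two floats. Wrapping repeated `forward` calls in `time.perf_counter()` measurements over a grid of `hidden_size` and `depth` values would give a rough analogue of the sweeps the figures report, though an interpreted pure-Python loop says nothing about the performance of compiled Fortran or libtorch.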
