RoseNNa: A performant, portable library for neural network inference with application to computational fluid dynamics
Ajay Bati, Spencer H. Bryngelson
TL;DR
CFD practitioners face a language gap between Python-based ML tools and HPC solvers written in C/C++ or Fortran. RoseNNa is a fast, non-invasive neural network inference library that converts ONNX models into an optimized Fortran library with C and Fortran APIs, using the Python-powered Fortran preprocessor fypp as a metaprogramming step. The tool targets small CFD-relevant architectures like MLPs and LSTMs, with optional support for convolutions and pooling. On a single core and without external dependencies, it demonstrates speedups over PyTorch and libtorch of roughly 10x for small networks and 2x for larger ones. This work offers a practical path to integrating learned closures in CFD solvers, supported by open-source access and a clear contributor workflow for extending architectures.
Abstract
The rise of neural network-based machine learning ushered in high-level libraries, including TensorFlow and PyTorch, to support their functionality. Computational fluid dynamics (CFD) researchers have benefited from this trend and produced powerful neural networks that promise shorter simulation times. For example, multilayer perceptrons (MLPs) and long short-term memory (LSTM) recurrent neural network (RNN) architectures can represent sub-grid physical effects, like turbulence. Implementing neural networks in CFD solvers is challenging because the programming languages used for machine learning and CFD are mostly non-overlapping. We present the roseNNa library, which bridges the gap between neural network inference and CFD. RoseNNa is a non-invasive, lightweight (1000 lines), and performant tool for neural network inference, with a focus on the smaller networks used to augment PDE solvers, like those of CFD, which are typically written in C/C++ or Fortran. RoseNNa accomplishes this by automatically converting trained models from typical neural network training packages into a high-performance Fortran library with C and Fortran APIs. This reduces the effort needed to access trained neural networks and maintains performance in the PDE solvers that CFD researchers build and rely upon. Results show that RoseNNa reliably outperforms PyTorch (Python) and libtorch (C++) on MLPs and LSTM RNNs with fewer than 100 hidden layers and 100 neurons per layer, even after removing the overhead cost of API calls. Speedups over these established libraries range from a factor of about 10 for the smallest networks tested to about 2 for the largest.
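To make concrete what inference for a small MLP involves, the following is a minimal pure-Python sketch of a dense forward pass. It is not RoseNNa's API or its generated code: the function names (`dense`, `mlp_forward`), the 2-3-1 layer sizes, and all weight values are hypothetical and chosen only for illustration.

```python
import math


def dense(x, weights, biases):
    """One fully connected layer: each output is dot(row, x) + bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def identity(v):
    """Linear (no-op) activation for the output layer."""
    return v


def mlp_forward(x, layers):
    """Apply each (weights, biases, activation) layer in order."""
    for weights, biases, activation in layers:
        x = [activation(v) for v in dense(x, weights, biases)]
    return x


# Hypothetical 2-3-1 network: tanh hidden layer, linear output layer.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1], math.tanh),
    ([[1.0, -1.0, 0.5]], [0.2], identity),
]

y = mlp_forward([1.0, 2.0], layers)
print(y)
```

For networks of this size the arithmetic is only a handful of multiply-adds, so the fixed per-call cost of a general-purpose framework can be a large share of the total run time. This is the regime the abstract targets.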
