A Preliminary Assessment of Coding Agents for CFD Workflows

Ke Xiao; Haoze Zhang; Yangchen Xu; Runze Mao; Han Li; Zhi X. Chen

TL;DR

The paper investigates tool-using coding agents for automating OpenFOAM CFD workflows and proposes a lightweight prompt strategy that prioritizes tutorial reuse and log-driven repair to achieve end-to-end execution. Evaluated on FoamBench-Advanced, the approach markedly raises execution completion rates and reduces unnecessary tool calls on tutorial-derivative tasks, while stronger LLMs such as GPT-5.2 substantially improve mesh generation for obstacle flows. The results indicate that coding agents can automate substantial portions of CFD pipelines with minimal configuration, yet geometry, meshing, and complex 3D/physics regimes still require human oversight and further study. Overall, coding agents show practical potential to streamline CFD workflows, with tangible benefits for 2D and tutorial-like tasks and clear directions for extending to more demanding simulations.

Abstract

We investigate the use of tool-using coding agents to automate end-to-end workflows in the open-source CFD package OpenFOAM. Building on general-purpose coding agent interfaces, we introduce a lightweight configuration that guides an agent toward tutorial reuse and log-driven repair to improve case setup and execution. We evaluate this approach on the FoamBench-Advanced benchmark, covering both tutorial-derivative and planar 2D obstacle-flow tasks. For tutorial-derivative cases, prompt guidance dramatically increases execution completion rates and reduces unnecessary tool calls. For obstacle-flow cases, stronger language models such as GPT-5.2 markedly improve mesh generation and overall task completion compared to earlier models. Our findings show that coding agents can correctly execute a range of CFD simulations with minimal configuration and that model capability significantly influences performance on tasks requiring geometry and mesh creation. These results suggest that coding agents have practical utility for automating portions of CFD workflows while highlighting areas that require further investigation.

Paper Structure (15 sections, 7 figures)

This paper contains 15 sections, 7 figures.

Introduction
Related Work
LLM coding agents
LLM agents for CFD
Benchmarks
Experimental Setup
Agent configuration
Benchmark and compared settings
Results
FoamBench-Advanced tutorial-derivative tasks
FoamBench-Advanced planar 2D obstacle flows
Discussion
Conclusion
Agent System Prompt
Descriptions for FoamBench-Advanced Planar 2D Obstacle-Flow Cases

Figures (7)

Figure 1: Overview of our setup. (a) A tool-using coding agent (OpenCode) executes OpenFOAM workflows by issuing function calls (e.g., bash, read, edit) to use external tools (OpenFOAM/Gmsh/Python). The OpenFOAM-focused system prompt (CFD prompt) replaces the default agent prompt while keeping the execution loop unchanged. (b) The CFD prompt emphasizes tutorial-first retrieval and reuse, minimal dictionary edits, and an iterative log-driven repair loop that reruns from the appropriate stage until the required endTime is reached, with completion evidence reported.
Figure 2: Tutorial-derivative tasks (9 cases). Comparison between the OpenFOAM-focused system prompt (CFD Prompt) and the default OpenCode prompt (Default Prompt). (a) FoamBench metrics aggregated over the 9 tasks. (b) Average token usage and tool-call count. (c) Tool-call breakdown by case.
Figure 3: MiniMax-M2.1 meshing outcomes on four planar 2D obstacle-flow cases under the OpenFOAM-focused prompt. Two cases fail to represent the obstacle; the offset-cylinder case reuses a tutorial mesh without updating the offset and fluid domain. The only successful case uses snappyHexMesh rather than a multi-block blockMesh setup.
Figure 4: GPT-5.2 prompt ablation on four planar 2D obstacle-flow cases. (a) Average token usage and total tool-call count under the default OpenCode prompt and the OpenFOAM-focused prompt. (b) Tool-call breakdown by case.
Figure 5: Meshes generated by GPT-5.2 for the cylinder obstacle case under the default OpenCode prompt and the OpenFOAM-focused prompt. The OpenFOAM-focused prompt produces a more reasonable initial mesh on the first attempt.
...and 2 more figures
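
The log-driven repair loop described in the Figure 1 caption can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`run_with_repair`, `last_time`, `reached_end_time`), the stage list, and the `run_stage`/`repair` callables are all hypothetical stand-ins for whatever the agent does through its tool calls. The only OpenFOAM-specific assumption is that the solver log contains lines of the form `Time = <value>`, which is used here as completion evidence.

```python
import re
from typing import Callable, List, Tuple

# Assumption: solver logs contain lines starting with "Time = <value>"
# (optionally followed by a unit suffix such as "s").
TIME_RE = re.compile(r"^Time = ([0-9.eE+-]+)", re.MULTILINE)


def last_time(log: str) -> float:
    """Return the last 'Time = <t>' value found in a log, or -1.0 if none."""
    matches = TIME_RE.findall(log)
    return float(matches[-1]) if matches else -1.0


def reached_end_time(log: str, end_time: float, tol: float = 1e-9) -> bool:
    """Completion evidence: the last reported time has reached end_time."""
    return last_time(log) >= end_time - tol


def run_with_repair(
    stages: List[str],
    run_stage: Callable[[str], Tuple[bool, str]],
    repair: Callable[[str, str], int],
    end_time: float,
    max_repairs: int = 5,
) -> Tuple[bool, List[str]]:
    """Run stages in order; on failure, repair and rerun from a chosen stage.

    run_stage(name) returns (ok, log) for one stage (hypothetical callable).
    repair(name, log) applies a fix based on the log and returns the index
    of the stage to restart from, or a negative value to give up.
    The final stage only counts as done once end_time appears in its log.
    """
    i, repairs, history = 0, 0, []
    while i < len(stages):
        ok, log = run_stage(stages[i])
        history.append(stages[i])
        is_last = i == len(stages) - 1
        if ok and (not is_last or reached_end_time(log, end_time)):
            i += 1
            continue
        if repairs >= max_repairs:
            return False, history
        restart = repair(stages[i], log)
        if restart < 0:
            return False, history
        repairs += 1
        i = restart  # rerun from the stage the diagnosis points to
    return True, history
```

In this sketch, `repair` decides which stage to resume from, so a boundary-condition fix can rerun only the solver while a geometry fix can restart from meshing. The cap on repairs is an illustrative design choice to keep the loop from running indefinitely.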
