CFDagent: A Language-Guided, Zero-Shot Multi-Agent System for Complex Flow Simulation

Zhaoyue Xu, Long Wang, Chunyu Wang, Yixin Chen, Qingyong Luo, Hua-Dong Yao, Shizhao Wang, Guowei He

TL;DR

CFDagent introduces a zero-shot, language-guided, end-to-end CFD framework that couples three GPT-4o–driven agents to generate geometry, configure and run an immersed boundary (IB) flow solver, and postprocess results from natural-language prompts. The system uses Point-E for text/image–to–geometry generation, a marching cubes–based mesh pipeline, and an IB solver on parallel grids to simulate flows around arbitrary geometries. It is validated against canonical sphere benchmarks at $Re=100$ and $Re=300$, where the computed $C_d$ and $C_l$ agree with the literature. It also generalizes to real-world objects described by text or images, producing multimodal visualizations, vortex structures identified with the $Q$-criterion, and GPT-4o–synthesized renderings of the results. The work highlights the potential to lower barriers to expert CFD by integrating generative AI with high-fidelity simulations for education, research, and engineering workflows. It acknowledges limitations in geometry quality and computational efficiency, and outlines future extensions to moving/deforming objects and smarter visualizations.

Abstract

We introduce CFDagent, a zero-shot, multi-agent system that enables fully autonomous computational fluid dynamics (CFD) simulations from natural language prompts. CFDagent integrates three specialized LLM-driven agents: (i) the Preprocessing Agent, which generates 3D geometries from textual or visual inputs using a hybrid text-to-3D diffusion model (Point-E) and automatically meshes the geometries; (ii) the Solver Agent, which configures and executes an immersed boundary flow solver; and (iii) the Postprocessing Agent, which analyzes and visualizes the results, including multimodal renderings. These agents are interactively guided by GPT-4o via conversational prompts, enabling intuitive and user-friendly interaction. We validate CFDagent by reproducing canonical sphere flows at Reynolds numbers of 100 and 300 using three distinct inputs: a simple text prompt (i.e., "sphere"), an image-based input, and a standard sphere model. The drag and lift coefficients computed from the meshes produced by each input approach closely match available data. The proposed system enables the synthesis of flow simulations and photorealistic visualizations for complex geometries. Through extensive tests on canonical and realistic scenarios, we demonstrate the robustness, versatility, and practical applicability of CFDagent. By bridging generative AI with high-fidelity simulations, CFDagent significantly lowers barriers to expert-level CFD, unlocking broad opportunities in education, scientific research, and practical engineering applications.
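
The sphere validation compares nondimensional quantities, so it may help to see how they are usually formed. The following is a minimal sketch, not the paper's implementation: it assumes the common convention of a Reynolds number based on the sphere diameter and force coefficients normalized by the frontal area $\pi D^2/4$. The function and parameter names are hypothetical, and the normalization actually used by CFDagent is not stated in this summary.

```python
import math


def reynolds_number(u_inf, diameter, nu):
    """Reynolds number Re = U * D / nu (diameter-based, assumed convention)."""
    return u_inf * diameter / nu


def force_coefficient(force, rho, u_inf, diameter):
    """Nondimensionalize a force component on a sphere.

    Assumes the reference area is the frontal area pi * D^2 / 4, so
    C = F / (0.5 * rho * U^2 * A). Pass the streamwise force to get a
    drag coefficient, or a transverse force to get a lift coefficient.
    """
    area = math.pi * diameter ** 2 / 4.0
    return force / (0.5 * rho * u_inf ** 2 * area)
```

With unit density, velocity, and diameter, a kinematic viscosity of 0.01 corresponds to the lower of the two validation Reynolds numbers under this convention.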

Paper Structure

This paper contains 11 sections, 7 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Workflow diagram of CFDagent. The system consists of three core agents: the Preprocessing Agent, the Solver Agent, and the Postprocessing Agent. Each agent's role is defined through a tailored system prompt that conditions the LLM to perform stage-specific tasks within the simulation workflow. The agents interact with the user via natural language prompts, enabling dynamic input interpretation and autonomous execution of key operations. These include geometry and mesh generation, solver configuration and flow simulation, as well as postprocessing for quantitative analysis and result visualization.
  • Figure 2: Instantaneous vortical structure for flow past a sphere at $Re = 300$. The vortical structures are visualized using iso-surfaces of the $Q$-criterion, highlighting regions with a high ratio of vorticity to strain. The iso-surfaces are colored by the streamwise velocity, ranging from -0.2 to 1, with colors transitioning from blue to white, illustrating the variation of the flow speed around the vortex structures.
  • Figure 3: Postprocessing results from CFDagent for flow past a human geometry at $Re = 300$. (a) instantaneous drag coefficient, (b) streamwise velocity component at slice $y=0$, (c) streamlines colored by velocity magnitude at slice $y=0$, (d) iso-surface visualization of the $Q$-criterion ($Q=0.1$) colored by streamwise velocity, and (e) GPT-4 synthesized visualization based on simulation results.
  • Figure 4: CFDagent postprocessing results demonstrating generalization for different prompts: Streamlines, velocity distributions, vortex structures ($Q$-criterion iso-surface), and GPT-4o synthesized visualizations.
  • Figure 5: Text-to-3D Mesh Generation Pipeline. A natural language prompt is passed to a fine-tuned GLIDE model to generate a synthetic image, which is then converted into a 3D point cloud using point cloud diffusion with CLIP and a transformer. The resulting point cloud is transformed into a 3D mesh using signed distance function (SDF) estimation and Marching Cubes.
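
Several of the captions above rely on the $Q$-criterion to identify vortices. As a rough illustration, assuming the standard definition $Q = \tfrac{1}{2}\left(\lVert\Omega\rVert^2 - \lVert S\rVert^2\right)$, where $S$ and $\Omega$ are the symmetric and antisymmetric parts of the velocity gradient, the sketch below evaluates $Q$ at a single point. The function name and the two test tensors are hypothetical; how CFDagent's Postprocessing Agent actually computes $Q$ is not described here.

```python
def q_criterion(grad_u):
    """Return Q = 0.5 * (||Omega||^2 - ||S||^2) at one point.

    grad_u is a 3x3 nested list with grad_u[i][j] = d u_i / d x_j.
    Positive Q marks points where rotation dominates strain.
    """
    q = 0.0
    for i in range(3):
        for j in range(3):
            s_ij = 0.5 * (grad_u[i][j] + grad_u[j][i])  # strain-rate part
            w_ij = 0.5 * (grad_u[i][j] - grad_u[j][i])  # rotation-rate part
            q += 0.5 * (w_ij * w_ij - s_ij * s_ij)
    return q


# Velocity gradient of solid-body rotation, u = (-y, x, 0): pure rotation.
rotation = [[0.0, -1.0, 0.0],
            [1.0, 0.0, 0.0],
            [0.0, 0.0, 0.0]]

# Velocity gradient of planar strain, u = (x, -y, 0): pure strain.
strain = [[1.0, 0.0, 0.0],
          [0.0, -1.0, 0.0],
          [0.0, 0.0, 0.0]]
```

The pure-rotation tensor gives a positive $Q$ and the pure-strain tensor a negative one, which is why an iso-surface at a small positive threshold such as the $Q=0.1$ in Figure 3 isolates vortex cores rather than shear layers.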