A Misleading Gallery of Fluid Motion by Generative Artificial Intelligence
Ali Kashefi
TL;DR
The paper evaluates the reliability of several generative AI tools for producing text-to-image, text-to-video, image-to-text, and video-to-text outputs for classical fluid dynamics phenomena. It systematically compares outputs from Midjourney, DALL·E 3, Gemini Advanced, Meta AI, Runway ML, Leonardo AI, Video-LLaMA, LLaVA, and ChatGPT-4 against ground-truth references from laboratory experiments and numerical simulations. The findings reveal widespread misrepresentation of fluid dynamics concepts, with only occasional partial alignment and no tool delivering consistently faithful results. The author conjectures that this stems largely from limited domain-specific training data, a consequence of copyright restrictions on scientific imagery, and advocates targeted data curation and collaboration between AI developers and fluid mechanics experts to improve educational and research utility.
Abstract
In this technical report, we extensively investigate the accuracy of outputs from well-known generative artificial intelligence (AI) applications in response to prompts describing common fluid motion phenomena familiar to the fluid mechanics community. We examine a range of applications, including Midjourney, DALL·E, Runway ML, Microsoft Designer, Gemini, Meta AI, and Leonardo AI, introduced by prominent companies such as Google, OpenAI, Meta, and Microsoft. Our text prompts for generating images or videos include, among others, "Von Karman vortex street", "flow past an airfoil", "Kelvin-Helmholtz instability", and "shock waves on a sharp-nosed supersonic body". We compare the images generated by these applications with reference images from laboratory experiments and numerical simulations. Our findings indicate that these generative AI models are not adequately trained on fluid dynamics imagery, leading to potentially misleading outputs. Beyond text-to-image and text-to-video generation, we also examine image-to-text and video-to-text generation with these AI tools to assess how accurately they describe fluid motion phenomena. This report serves as a cautionary note for educators in academic institutions, highlighting the potential for these tools to mislead students. It also aims to inform researchers at these companies and encourage them to address this issue. We conjecture that a primary reason for this shortcoming is the limited access to copyright-protected fluid motion images from scientific journals.
