Adding New Capability in Existing Scientific Application with LLM Assistance
Anshu Dubey, Akash Dhruv
TL;DR
This work tackles the challenge of generating code for new algorithms with LLM assistance by proposing a specification-first, interactive prompting workflow to reduce hallucinations. It showcases enhancements to CodeScribe for seamless generation and integration of new code into the Flash-X framework. The authors detail a CIC-based deposition algorithm on a Cartesian AMR mesh and introduce virtual particles to avoid expensive reverse ghost-fill, illustrating the approach in both 2D and 3D contexts. The findings suggest that specification-driven prompts combined with test-driven development and persistent chat logs substantially improve code correctness, readability, and developer productivity when using LLMs for software engineering in scientific codes.
Abstract
With the emergence and rapid evolution of large language models (LLMs), automating coding tasks has become an important research topic. Many efforts are underway, and the literature abounds with studies of the efficacy of models and their ability to generate code. A less explored aspect is code generation for new algorithms, where the training dataset would not have included any previous example of similar code. In this paper, we propose a new methodology for writing code from scratch for a new algorithm with LLM assistance, and describe enhancements to a previously developed code-translation tool, Code-Scribe, that support new code generation.
