Adding New Capability in Existing Scientific Application with LLM Assistance

Anshu Dubey, Akash Dhruv

TL;DR

This work tackles the challenge of generating code for new algorithms with LLM assistance by proposing a specification-first, interactive prompting workflow to reduce hallucinations. It showcases enhancements to CodeScribe for seamless generation and integration of new code into the Flash-X framework. The authors detail a cloud-in-cell (CIC) deposition algorithm on a Cartesian AMR mesh and introduce virtual particles to avoid an expensive reverse ghost-fill, illustrating the approach in both 2D and 3D contexts. The findings suggest that specification-driven prompts combined with test-driven development and persistent chat logs substantially improve code correctness, readability, and developer productivity when using LLMs for software engineering in scientific codes.

Abstract

With the emergence and rapid evolution of large language models (LLMs), automating coding tasks has become an important research topic. Many efforts are underway, and the literature abounds with studies of the efficacy of models and their ability to generate code. A less explored aspect is code generation for new algorithms, where the training dataset would not have included any previous example of similar code. In this paper we propose a new methodology for writing code from scratch for a new algorithm using LLM assistance, and describe enhancements to a previously developed code-translation tool, Code-Scribe, to support new code generation.

Paper Structure

This paper contains 7 sections, 1 equation, 4 figures, 2 algorithms.

Figures (4)

  • Figure 1: Deposition from a particle onto the mesh. The darker cells are where the deposition occurs. The left panel shows a situation where the particle lies on an interior cell, while the right panel shows a particle that lies on a cell at the boundary, and periodic boundary conditions are assumed.
  • Figure 2: Deposition on the edge of the block where halo cells get some of the mass from the particle as density.
  • Figure 3: Mesh used for testing the code as we develop it.
  • Figure 4: Snapshot of interaction to correct specifications as understood by the model.
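
The deposition pictured in Figure 1 can be illustrated with a minimal sketch of cloud-in-cell (CIC) deposition in 2D. This is not the paper's Flash-X implementation: it assumes a single uniform, cell-centered grid starting at the origin with periodic boundaries in both directions, and it ignores AMR blocks, halo cells, and virtual particles. The function name `deposit_cic_2d` and its parameters are hypothetical, chosen only for this example.

```python
import math


def deposit_cic_2d(particles, nx, ny, dx, dy):
    """Deposit particle mass onto a periodic, cell-centered 2D grid with CIC weights.

    particles: iterable of (x, y, mass) tuples (hypothetical layout for this sketch).
    Returns density[i][j] for an nx-by-ny grid whose cell (i, j) is centered at
    ((i + 0.5) * dx, (j + 0.5) * dy).
    """
    density = [[0.0] * ny for _ in range(nx)]
    cell_volume = dx * dy  # cell area in 2D

    for x, y, mass in particles:
        # Particle position measured in cell widths from the first cell center.
        sx = x / dx - 0.5
        sy = y / dy - 0.5

        # Index of the lower-left cell center of the 2x2 deposition stencil.
        i0 = math.floor(sx)
        j0 = math.floor(sy)

        # Fractional distance from that cell center, in [0, 1).
        fx = sx - i0
        fy = sy - j0

        # Share the mass among the four nearest cells using linear weights.
        for di, wx in ((0, 1.0 - fx), (1, fx)):
            for dj, wy in ((0, 1.0 - fy), (1, fy)):
                # Periodic wrap: a stencil cell outside the grid maps to the
                # opposite side, as in the right panel of Figure 1.
                i = (i0 + di) % nx
                j = (j0 + dj) % ny
                density[i][j] += mass * wx * wy / cell_volume

    return density
```

The four weights of each particle sum to one, so total mass is conserved regardless of where the particle sits. A particle near the domain edge deposits part of its mass into cells on the opposite side, which is the single-grid analogue of the boundary case in Figure 1.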