Leveraging Large Language Models for Code Translation and Software Development in Scientific Computing

Akash Dhruv, Anshu Dubey

TL;DR

The paper tackles the productivity gap in translating legacy Fortran scientific codes to modern C++ for HPC interoperability. It introduces CodeScribe, a four-command workflow (Index, Inspect, Draft, Translate) that uses seed prompts and retrieval-augmented generation (RAG) to guide AI-assisted translation while mandating human verification, and it explicitly handles the difference between 1-based Fortran and 0-based C++ indexing during data-structure translation. Through MCFM and other targets such as Noah-MP and ERF, the work provides a quantitative comparison of several configurations and analyzes the impact of RAG on translation quality. The results show that GPT-4o offers the strongest overall performance, while highlighting persistent correctness challenges and suggesting directions for integration with frameworks like LASSI to broaden applicability and reliability in scientific workflows.
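The indexing mismatch mentioned above can be illustrated with a minimal sketch. The class name `FortranArray2D` and its layout are assumptions made for illustration only; they are not taken from CodeScribe or from the translated codes. The idea is that a thin wrapper can preserve Fortran's 1-based, column-major access pattern on top of 0-based C++ storage, so translated loops keep their original index bounds.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of CodeScribe): a 2D array that accepts
// Fortran-style 1-based indices and stores data in column-major order,
// matching Fortran's memory layout.
class FortranArray2D {
public:
    FortranArray2D(std::size_t nrows, std::size_t ncols)
        : nrows_(nrows), ncols_(ncols), data_(nrows * ncols, 0.0) {}

    // Element access with 1 <= i <= nrows and 1 <= j <= ncols.
    // The "- 1" terms shift to the 0-based offsets used by std::vector.
    double& operator()(std::size_t i, std::size_t j) {
        assert(i >= 1 && i <= nrows_ && j >= 1 && j <= ncols_);
        return data_[(i - 1) + (j - 1) * nrows_];
    }

    // Raw pointer to the underlying 0-based, column-major storage.
    double* data() { return data_.data(); }

    std::size_t rows() const { return nrows_; }
    std::size_t cols() const { return ncols_; }

private:
    std::size_t nrows_;
    std::size_t ncols_;
    std::vector<double> data_;
};
```

With this wrapper, a Fortran reference such as `a(2, 3)` on a 2-by-3 array maps to offset 5 in the flat storage, which is the same position Fortran itself would use. A translation could instead rewrite every index to be 0-based; the wrapper approach is shown here only because it keeps the translated loops visually close to the original.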

Abstract

The emergence of foundational models and generative artificial intelligence (GenAI) is poised to transform productivity in scientific computing, especially in code development, refactoring, and translating from one programming language to another. However, because the output of GenAI cannot be guaranteed to be correct, manual intervention remains necessary. Some of this intervention can be automated through task-specific tools, alongside additional methodologies for correctness verification and effective prompt development. We explored the application of GenAI in assisting with code translation, language interoperability, and codebase inspection within a legacy Fortran codebase used to simulate particle interactions at the Large Hadron Collider (LHC). In the process, we developed a tool, CodeScribe, which combines prompt engineering with user supervision to establish an efficient process for code conversion. In this paper, we demonstrate how CodeScribe assists in converting Fortran code to C++, generating Fortran-C APIs for integrating legacy systems with modern C++ libraries, and providing developer support for code organization and algorithm implementation. We also address the challenges of AI-driven code translation and highlight its benefits for enhancing productivity in scientific computing workflows.
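As a rough illustration of what a Fortran-C interface layer involves, the sketch below shows the C++ side of one such binding. The names `kernels::sum_squares` and `sum_squares_c` are hypothetical and do not come from MCFM or CodeScribe; the sketch shows only one direction (Fortran calling into C++) and relies on the standard `extern "C"` and Fortran `bind(C)` mechanisms.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical C++ kernel standing in for a routine translated from Fortran.
namespace kernels {
inline double sum_squares(const double* x, std::size_t n) {
    double total = 0.0;
    for (std::size_t k = 0; k < n; ++k) {
        total += x[k] * x[k];
    }
    return total;
}
}  // namespace kernels

// C-linkage wrapper: extern "C" disables C++ name mangling so that a Fortran
// interface block can bind to the symbol by name. Fortran passes arguments by
// reference unless the VALUE attribute is used, so n arrives as a pointer.
//
// Matching Fortran interface (shown as a comment, assumed for illustration):
//   interface
//     function sum_squares_c(x, n) bind(C, name="sum_squares_c") result(s)
//       use iso_c_binding, only: c_double, c_int
//       real(c_double), intent(in) :: x(*)
//       integer(c_int), intent(in) :: n
//       real(c_double) :: s
//     end function sum_squares_c
//   end interface
extern "C" double sum_squares_c(const double* x, const int* n) {
    assert(x != nullptr && n != nullptr && *n >= 0);
    return kernels::sum_squares(x, static_cast<std::size_t>(*n));
}
```

Keeping the wrapper thin, with the real work in an ordinary C++ function, means the C++ code stays testable on its own while the legacy Fortran driver continues to call it through the C-compatible symbol.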

Paper Structure

This paper contains 15 sections and 8 figures.

Figures (8)

  • Figure 1: Workflow diagram illustrating the interaction between C++ driver code and the core codebase, with the Fortran-C API facilitating interoperability.
  • Figure 2: Structure of the chat completion template encoded in TOML format. This template outlines code conversion rules and includes an example representing a set of files that follow a similar pattern. The chat template is appended with code from the target source file and provided to the LLM, which generates the corresponding C++ and Fortran-C interface source code.
  • Figure 3: Schematic of the workflow for the LLM-based code conversion process. Steps in blue are managed using CodeScribe, while steps in red are manual and require developer intervention.
  • Figure 4: Sample contents of the hierarchical index file generated in each project subdirectory. This index captures relative directory location within the project structure and lists files along with associated constructs, such as modules, subroutines, and functions.
  • Figure 5: Developer time (minutes) for converting and testing a single Fortran source file in MCFM with various GenAI models.
  • ...and 3 more figures