Towards Trustworthy AI Software Development Assistance

Daniel Maninger; Krishna Narasimhan; Mira Mezini

TL;DR

This work identifies the trust and reliability gaps in AI software development assistants, notably incorrect and insecure code generation, and proposes a holistic architecture centered on a foundational code model trained on representative real-world data. It combines graph-based code representations, a semantically enriched knowledge graph for grounding, and a modular constrained decoding framework to deliver correctness and security guarantees, supplemented by RL-based code quality optimization and explainability mechanisms. Concrete research directions include curating realistic datasets, enhancing semantic understanding with graph-based models, grounding explanations via a code knowledge graph, and implementing constrained decoding to bound output behavior. If realized, this approach could enable AI software development assistants to support software engineers across design, debugging, and deployment with improved correctness, readability, security, and explainability, ultimately advancing practical adoption in industry.

Abstract

It is expected that in the near future, AI software development assistants will play an important role in the software industry. However, current software development assistants tend to be unreliable, often producing incorrect, unsafe, or low-quality code. We seek to resolve these issues by introducing a holistic architecture for constructing, training, and using trustworthy AI software development assistants. At the center of the architecture is a foundational LLM trained on datasets representative of real-world coding scenarios and complex software architectures, and fine-tuned on code quality criteria beyond correctness. The LLM will make use of graph-based code representations for advanced semantic comprehension. We envision a knowledge graph integrated into the system to provide up-to-date background knowledge and to enable the assistant to provide appropriate explanations. Finally, a modular framework for constrained decoding will ensure that certain guarantees (e.g., for correctness and security) hold for the generated code.

Paper Structure (24 sections, 1 figure)

This paper contains 24 sections and 1 figure.

Introduction
Approach
Representative Datasets
  Challenge
  Envisioned solution
  Evaluation
Capturing Code Structure and Semantics
  Challenge
  Envisioned solution
  Evaluation
Code Quality
  Challenge
  Envisioned solution
  Evaluation
Explainability
...and 9 more sections

Figures (1)

Figure 1: High-level architecture of our envisioned AI software development assistant. It consists of five main components: (1) A curated training dataset representing real-world coding patterns and software architectures. (2) A foundational code model that uses graph representations for better understanding of program semantics. (3) An RL-based feedback mechanism to fine-tune the model for improved code quality. (4) A semantically enriched code knowledge graph to help the assistant explain its code. (5) A modular constrained decoding framework on top of the model that prevents the generation of undesired code.
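
To make component (5) more concrete, the following is a minimal sketch of constrained decoding in general, not the authors' framework. It assumes a hypothetical scoring function standing in for the code model and a hypothetical constraint (parentheses must never close before they open, and generation may only end when they are balanced). All names here (`constrained_greedy_decode`, `toy_score`, `balanced_prefix`) are illustrative.

```python
from typing import Callable, Dict, List, Sequence

EOS = "<eos>"  # hypothetical end-of-sequence token


def constrained_greedy_decode(
    score: Callable[[Sequence[str]], Dict[str, float]],
    is_allowed: Callable[[Sequence[str]], bool],
    max_len: int,
) -> List[str]:
    """Greedy decoding that only considers tokens passing the constraint."""
    prefix: List[str] = []
    for _ in range(max_len):
        scores = score(prefix)
        # Mask out every token whose extended prefix violates the constraint.
        candidates = {
            tok: s for tok, s in scores.items() if is_allowed(prefix + [tok])
        }
        if not candidates:
            break  # no admissible continuation
        best = max(candidates, key=candidates.get)
        if best == EOS:
            break
        prefix.append(best)
    return prefix


def balanced_prefix(tokens: Sequence[str]) -> bool:
    """Illustrative constraint: never close an unopened parenthesis,
    and allow EOS only when all parentheses are closed."""
    depth = 0
    for tok in tokens:
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
            if depth < 0:
                return False
        elif tok == EOS:
            return depth == 0
    return True


def toy_score(prefix: Sequence[str]) -> Dict[str, float]:
    """Stand-in for a model: it always prefers ')' and, after two
    tokens, prefers to stop."""
    scores = {")": 0.9, "(": 0.5, "x": 0.4, EOS: 0.1}
    if len(prefix) >= 2:
        scores[EOS] = 1.0
    return scores


result = constrained_greedy_decode(toy_score, balanced_prefix, max_len=10)
print(result)
```

Without the mask, the toy scorer's first choice would be an unmatched `)`. With it, the decoder is forced onto an admissible token at each step. In this sketch the constraint is a plain predicate on the prefix, so different checks could be swapped in without changing the decoding loop.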
