DRAFT-ing Architectural Design Decisions using LLMs

Rudra Dhar, Adyansh Kakran, Amey Karan, Karthik Vaidhyanathan, Vasudeva Varma

TL;DR

This work tackles the persistent challenge of Architectural Knowledge Management (AKM) and the limited adoption of Architecture Decision Records (ADRs) by proposing DRAFT, a domain-specific retrieval-augmented few-shot fine-tuning approach. DRAFT combines few-shot prompting and retrieval-augmented generation (RAG) with fine-tuning (including parameter-efficient methods like LoRA) in two phases: offline fine-tuning on retrieved context-decision exemplars, and online inference that uses retrieved ADRs to generate Architecture Design Decisions (ADDs). On a dataset of 4,911 ADRs, DRAFT outperforms prompting, RAG, and fine-tuning baselines on automated metrics (ROUGE-1, BLEU, METEOR, BERTScore) and, because it can run on self-hosted models, supports privacy-preserving in-house deployment. The results indicate that domain-aligned retrieval and fine-tuning can yield high-quality ADDs while addressing privacy and resource constraints, positioning DRAFT as a viable co-pilot for architects and a stepping stone toward broader AKM automation.

Abstract

Architectural Knowledge Management (AKM) is crucial for software development but remains challenging due to the lack of standardization and high manual effort. Architecture Decision Records (ADRs) provide a structured approach to capture Architecture Design Decisions (ADDs), but their adoption is limited due to the manual effort involved and insufficient tool support. Our previous work has shown that Large Language Models (LLMs) can assist in generating ADDs. However, simply prompting an LLM does not produce quality ADDs. Moreover, using third-party LLMs raises privacy concerns, while self-hosting them poses resource challenges. To this end, we experimented with different approaches, namely few-shot prompting, retrieval-augmented generation (RAG), and fine-tuning, to enhance an LLM's ability to generate ADDs. Our results show that these techniques improve effectiveness. Building on this, we propose Domain Specific Retrieval Augmented Few Shot Fine Tuning, DRAFT, which combines the strengths of all three approaches for more effective ADD generation. DRAFT operates in two phases: an offline phase that fine-tunes an LLM on generating ADDs augmented with retrieved examples, and an online phase that generates ADDs by leveraging retrieved ADRs and the fine-tuned model. We evaluated DRAFT against existing approaches on a dataset of 4,911 ADRs with various LLMs, and analyzed the results using automated metrics and human evaluations. Results show DRAFT outperforms all other approaches in effectiveness while maintaining efficiency. Our findings indicate that DRAFT can aid architects in drafting ADDs while addressing privacy and resource constraints.
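
The two-phase idea described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the retriever here is a plain bag-of-words cosine similarity chosen only to keep the example dependency-free, and the prompt template, the sample ADRs, and all function names (`retrieve`, `build_prompt`, `build_finetuning_set`) are hypothetical. The sketch shows how the same retrieval-augmented few-shot prompt could be used offline to assemble fine-tuning pairs and online to query the model; the fine-tuning step itself is out of scope.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a stand-in for a real embedding model."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query_context, adr_store, k=2, exclude_index=None):
    """Return the k (context, decision) pairs most similar to the query context."""
    query_vec = Counter(tokenize(query_context))
    scored = []
    for i, (context, _decision) in enumerate(adr_store):
        if i == exclude_index:  # never retrieve the record being predicted
            continue
        scored.append((cosine(query_vec, Counter(tokenize(context))), i))
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [adr_store[i] for _, i in scored[:k]]


def build_prompt(query_context, exemplars):
    """Few-shot prompt: retrieved exemplars first, then the open query."""
    parts = [
        f"## Context\n{context}\n## Decision\n{decision}\n"
        for context, decision in exemplars
    ]
    parts.append(f"## Context\n{query_context}\n## Decision\n")
    return "\n".join(parts)


def build_finetuning_set(adr_store, k=2):
    """Offline phase (assumed shape): one prompt/completion pair per ADR."""
    examples = []
    for i, (context, decision) in enumerate(adr_store):
        exemplars = retrieve(context, adr_store, k=k, exclude_index=i)
        examples.append(
            {"prompt": build_prompt(context, exemplars), "completion": decision}
        )
    return examples


# Hypothetical ADR store, invented for illustration only.
ADRS = [
    ("We need a database for relational order data with transactions.",
     "Use PostgreSQL."),
    ("Services must exchange events asynchronously at high volume.",
     "Use Kafka as the event bus."),
    ("We need a cache for session data with low latency.",
     "Use Redis."),
]

# Online phase (assumed shape): retrieve exemplars for a new context and
# build the prompt that would be sent to the fine-tuned model.
new_context = "We need a relational database with transactions for invoices."
online_prompt = build_prompt(new_context, retrieve(new_context, ADRS, k=1))
```

In this sketch the record being predicted is excluded from its own exemplars during the offline phase, so the target decision never leaks into the training prompt.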

Paper Structure

This paper contains 34 sections, 10 equations, 10 figures, 7 tables, 4 algorithms.

Figures (10)

  • Figure 1: Sample ADR after extracting Context-Decision
  • Figure 2: Prompting
  • Figure 3: RAG
  • Figure 4: Fine-tuning
  • Figure 5: DRAFT
  • ...and 5 more figures