EmbC-Test: How to Speed Up Embedded Software Testing Using LLMs and RAG

Maximilian Harnot; Sebastian Komarnicki; Michal Polok; Timo Oksanen

TL;DR

This paper presents a Retrieval-Augmented Generation (RAG) pipeline that partially automates the verification of embedded C software. Grounding a large language model in project-specific artifacts reduces hallucinations and improves project alignment.

Abstract

Manual development of automatic tests for embedded C software is a strenuous and time-consuming task that does not scale well. With the accelerating pace of software release cycles, verification increasingly becomes the bottleneck in the embedded development workflow. This paper presents a Retrieval-Augmented Generation (RAG) pipeline as a solution for partial automation of the verification process. By grounding a large language model in project-specific artifacts, the approach reduces hallucinations and improves project alignment. An industrial evaluation showed that the generated tests are 100 % syntactically correct, with 85 % successfully passing runtime validation. The proposed solution has the potential to save up to 66 % of the testing time compared to manual test writing while generating 270 tests per hour.
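
To make the idea of grounding concrete, the following minimal Python sketch retrieves the project artifacts most relevant to a requirement and places them in the prompt ahead of the task. It is an illustration under stated assumptions, not the pipeline evaluated in the paper: the knowledge-base entries, the function names, and the simple token-overlap scoring are all hypothetical stand-ins, and no LLM is actually called.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract identifier-like tokens."""
    return re.findall(r"[a-z_][a-z0-9_]*", text.lower())


def score(query, chunk):
    """Multiset overlap between query tokens and chunk tokens."""
    q, c = Counter(tokenize(query)), Counter(tokenize(chunk))
    return sum(min(q[t], c[t]) for t in q)


def retrieve(query, chunks, k=2):
    """Return the k chunks with the highest overlap score."""
    ranked = sorted(chunks, key=lambda ch: score(query, ch), reverse=True)
    return ranked[:k]


def build_prompt(requirement, context_chunks):
    """Put the retrieved project context in front of the task description."""
    context = "\n---\n".join(context_chunks)
    return (
        "Project context:\n" + context + "\n\n"
        "Requirement:\n" + requirement + "\n\n"
        "Write a test for the requirement using only the context above."
    )


# Hypothetical project artifacts (assumption, not taken from the paper).
KNOWLEDGE_BASE = [
    "int pump_set_speed(int rpm); /* returns 0 on success, -1 if rpm is out of range */",
    "void led_toggle(void); /* toggles the status LED */",
    "#define PUMP_MAX_RPM 3000",
]

requirement = "pump_set_speed shall reject rpm above PUMP_MAX_RPM"
top = retrieve(requirement, KNOWLEDGE_BASE, k=2)
prompt = build_prompt(requirement, top)
print(prompt)
```

Because the prompt contains only the retrieved declarations, the model is steered toward identifiers that exist in the project rather than invented ones. A real system would likely replace the overlap score with a stronger retriever.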

Paper Structure (11 sections, 9 figures, 1 table)

Introduction
Background
Problem Statement and Objectives
System Architecture
Knowledge Base and Chunking Strategies
Hybrid Retrieval
Prompt Construction and LLM Choice
Evaluation Methodology
Results
Discussion and Industrial Application
Conclusion

Figures (9)

Figure 1: AI-assisted software testing ecosystem at Hydac Software
Figure 2: UMAP visualization of the embedding space for fixed-size chunking (1446 chunks). The clusters overlap due to arbitrary split boundaries.
Figure 3: UMAP visualization of the embedding space for AST-based chunking (833 chunks). In AST-based chunking, the clusters are grouped more tightly together.
Figure 4: Test generation prompt structure.
Figure 5: Comparison of the best-performing RAG configuration and manual test baseline.
...and 4 more figures
