Architecting software monitors for control-flow anomaly detection through large language models and conformance checking

Francesco Vitale; Francesco Flammini; Mauro Caporuscio; Nicola Mazzocca

TL;DR

The paper tackles detecting control-flow anomalies in complex computer-based systems under unknown unknowns by combining LLM-powered source-code instrumentation with conformance checking against design-time process models. It introduces a three-phase methodology (software development, monitoring design, anomaly detection) and demonstrates it on the ERTMS/ETCS Start of Mission case study. Results show that LLM-based instrumentation can achieve up to 84.8% control-flow coverage and anomaly-detection performance up to 96.6% F1 and 93.5% AUC, indicating robust, explainable run-time monitoring. The approach promises practical impact for safety-critical domains and digital twins, with future work extending case studies and refining LLM prompting strategies.

Abstract

Context: Ensuring high levels of dependability in modern computer-based systems has become increasingly challenging due to their complexity. Although systems are validated at design time, their behavior can differ at run-time, possibly showing control-flow anomalies due to "unknown unknowns". Objective: We aim to detect control-flow anomalies through software monitoring, which verifies run-time behavior by logging software execution and detecting deviations from the expected control flow. Methods: We propose a methodology to develop software monitors for control-flow anomaly detection through Large Language Models (LLMs) and conformance checking. The methodology builds on existing software development practices to maintain traditional V&V while providing an additional level of robustness and trustworthiness. It leverages LLMs to link design-time models and implementation code, automating source-code instrumentation. The resulting event logs are analyzed via conformance checking, an explainable and effective technique for control-flow anomaly detection. Results: We test the methodology on a case-study scenario from the European Rail Traffic Management System / European Train Control System (ERTMS/ETCS), a standard for modern interoperable railways. The results show that LLM-based source-code instrumentation can achieve up to 84.775% control-flow coverage of the reference design-time process model, while the subsequent conformance checking-based anomaly detection reaches a peak performance of 96.610% F1-score and 93.515% AUC. Conclusion: Incorporating domain-specific knowledge to guide LLMs in source-code instrumentation yielded reliable, high-quality software logs and enabled effective control-flow anomaly detection through conformance checking.
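To make the idea of checking logged behavior against a design-time model concrete, the following is a minimal sketch, not the paper's implementation. It assumes the reference model can be reduced to a set of allowed directly-follows pairs, which is a strong simplification of conformance checking against a full process model. The activity names, the `ALLOWED` set, and the `fitness`, `deviations`, and `is_anomalous` helpers are all hypothetical.

```python
from typing import List, Set, Tuple

Step = Tuple[str, str]

# Hypothetical reference model: the directly-follows pairs the design allows.
# "START" and "END" are artificial markers added around every trace.
ALLOWED: Set[Step] = {
    ("START", "read_input"),
    ("read_input", "validate"),
    ("validate", "read_input"),  # retry loop
    ("validate", "commit"),
    ("commit", "END"),
}


def steps_of(trace: List[str]) -> List[Step]:
    """Pair each logged activity with its successor, including START/END."""
    return list(zip(["START"] + trace, trace + ["END"]))


def fitness(trace: List[str]) -> float:
    """Fraction of directly-follows steps in the trace that the model allows."""
    steps = steps_of(trace)
    allowed = sum(1 for step in steps if step in ALLOWED)
    return allowed / len(steps)


def deviations(trace: List[str]) -> List[Step]:
    """Steps not allowed by the model; these explain why a trace was flagged."""
    return [step for step in steps_of(trace) if step not in ALLOWED]


def is_anomalous(trace: List[str], threshold: float = 1.0) -> bool:
    """Flag a trace whose fitness falls below the (assumed) threshold."""
    return fitness(trace) < threshold


normal = ["read_input", "validate", "commit"]
skipped = ["read_input", "commit"]  # validation step was skipped

print(fitness(normal), is_anomalous(normal))
print(fitness(skipped), deviations(skipped))
```

The list returned by `deviations` is what makes this style of detection explainable: a flagged trace comes with the exact control-flow steps that the model does not permit.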
