Rigorous Assessment of Model Inference Accuracy using Language Cardinality

Donato Clun, Donghwan Shin, Antonio Filieri, Domenico Bianculli

TL;DR

The paper addresses how to reliably assess the accuracy of an inferred finite-state model when a ground-truth reference model is available, replacing statistical estimation over randomly sampled traces with deterministic language-cardinality measures grounded in analytic combinatorics. It introduces a precision/recall framework that counts traces up to a user-defined maximum length using ordinary generating functions (OGFs), and provides per-length assessments that reveal how accuracy varies with trace length. A fast state-elimination method for computing the OGFs makes the approach scalable. The method is evaluated on a broad set of reference models and compared with existing trace-similarity and model-based testing (MBT) approaches. The results indicate that it avoids sampling bias, yields reproducible results, and provides per-length insight, with practical scalability demonstrated on real model-inference benchmarks.

Abstract

Models such as finite state automata are widely used to abstract the behavior of software systems by capturing the sequences of events observable during their execution. Nevertheless, models rarely exist in practice and, when they do, get easily outdated; moreover, manually building and maintaining models is costly and error-prone. As a result, a variety of model inference methods that automatically construct models from execution traces have been proposed to address these issues. However, performing a systematic and reliable accuracy assessment of inferred models remains an open problem. Even when a reference model is given, most existing model accuracy assessment methods may return misleading and biased results. This is mainly due to their reliance on statistical estimators over a finite number of randomly generated traces, introducing avoidable uncertainty about the estimation and being sensitive to the parameters of the random trace generative process. This paper addresses this problem by developing a systematic approach based on analytic combinatorics that minimizes bias and uncertainty in model accuracy assessment by replacing statistical estimation with deterministic accuracy measures. We experimentally demonstrate the consistency and applicability of our approach by assessing the accuracy of models inferred by state-of-the-art inference tools against reference models from established specification mining benchmarks.
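
To make the idea of a deterministic, cardinality-based accuracy measure concrete, here is a minimal sketch in Python. It is not the paper's method: instead of ordinary generating functions and state elimination, it uses a plain dynamic program over a deterministic automaton to count accepted traces of each length. It also assumes that precision is the fraction of the inferred model's traces (up to a maximum length) that the reference model accepts, and that recall is the converse. The DFA encoding and all function names (`count_by_length`, `intersect`, `precision_recall`) are hypothetical and chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# Assumed encoding: a DFA is (initial_state, accepting_states, transitions),
# where transitions maps (state, symbol) -> state. A missing entry means the
# trace is rejected. Determinism matters: each trace follows at most one path,
# so counting paths is the same as counting traces.


def count_by_length(dfa, max_len):
    """Return [c_0, ..., c_max_len], where c_k counts accepted traces of length k."""
    init, accepting, delta = dfa
    paths = {init: 1}  # state -> number of traces of the current length reaching it
    counts = []
    for _ in range(max_len + 1):
        counts.append(sum(n for state, n in paths.items() if state in accepting))
        nxt = {}
        for (src, _symbol), dst in delta.items():
            if src in paths:
                nxt[dst] = nxt.get(dst, 0) + paths[src]
        paths = nxt
    return counts


def intersect(a, b):
    """Product automaton accepting exactly the traces accepted by both a and b."""
    (init_a, acc_a, delta_a), (init_b, acc_b, delta_b) = a, b
    delta = {}
    for (sa, sym_a), ta in delta_a.items():
        for (sb, sym_b), tb in delta_b.items():
            if sym_a == sym_b:
                delta[((sa, sb), sym_a)] = (ta, tb)
    return (init_a, init_b), set(product(acc_a, acc_b)), delta


def precision_recall(reference, inferred, max_len):
    """Exact precision and recall over all traces of length <= max_len."""
    both = sum(count_by_length(intersect(reference, inferred), max_len))
    n_inferred = sum(count_by_length(inferred, max_len))
    n_reference = sum(count_by_length(reference, max_len))
    precision = Fraction(both, n_inferred) if n_inferred else None
    recall = Fraction(both, n_reference) if n_reference else None
    return precision, recall


# Toy models (illustrative only): the reference accepts (ab)*, while the
# over-general inferred model accepts every trace over {a, b}.
REFERENCE = (0, {0}, {(0, "a"): 1, (1, "b"): 0})
INFERRED = (0, {0}, {(0, "a"): 0, (0, "b"): 0})

print(count_by_length(REFERENCE, 4))            # [1, 0, 1, 0, 1]
print(precision_recall(REFERENCE, INFERRED, 4))  # precision 3/31, recall 1
```

Because the counts are exact integers, rerunning the assessment always gives the same result, and the per-length list returned by `count_by_length` shows how the two languages diverge as traces grow longer. No random walk parameters are involved.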

Paper Structure

This paper contains 42 sections, 9 equations, 12 figures, 8 tables, 4 algorithms.

Figures (12)

  • Figure 1: Example of reference and inferred models
  • Figure 2: Sensitivity of statistical estimation to changes in the random walk for the models in Figure \ref{fig:trace-sim-failure}
  • Figure 3: An example case where an assessment based on the W-method would give misleading results
  • Figure 4: Example of digraph construction
  • Figure 5: Node elimination
  • ...and 7 more figures