Outline of an Independent Systematic Blackbox Test for ML-based Systems

Hans-Werner Wiesbrock; Jürgen Großmann

TL;DR

The paper addresses the challenge of validating ML-based systems in a training-independent, statistically sound manner. It introduces Probabilistically Extended Ontologies (PEON), which attach probability distributions to the partitions of the Operational Design Domain (ODD). On this basis it develops a formal testing framework in which test outcomes within each partition follow a Bernoulli process and end-of-test criteria are derived from significance level and power. The approach is demonstrated with toy and real-data experiments (e.g., COCO/CenterNet object detection and PETA), which show improved representativeness when marginal and conditional distributions are modeled. The work highlights the limitations of purely combinatorial (N-wise) testing for ML systems and offers a concrete data-generation pipeline that turns abstract PEONs into executable test cases via simulation. Overall, PEON provides a principled path toward reproducible, statistically valid black-box testing and potential certification of ML-based systems. Planned enhancements concern data generation, ethical assessment, and sample-size estimation.

Abstract

This article proposes a test procedure for testing ML models and ML-based systems independently of the actual training process. Typical quality statements about these models and systems, such as accuracy and precision, can thus be verified independently, taking into account their black-box character and the inherent stochastic properties of ML models and their training data. The article presents first results from a set of test experiments and suggests extensions to existing test methods that reflect the stochastic nature of ML models and ML-based systems.

Paper Structure (15 sections, 7 equations, 7 figures, 1 table)

Introduction
Related work
Structure of this work
Testing ML-based systems
Systematic testing of conventional systems
Ontologies
Statistical nature of ML-based systems
Statistics of N-wise Testing
Probabilistic extension of ontologies
A constructive approach to probabilistically extend ontologies
Some experimental results
End-of-test criteria for ML-based systems
Ethical points of view
A probabilistically extended ontology and test data generation
Conclusion and future work

Figures (7)

Figure 1: Simplified ontology specifying a person
Figure 2: Partitioning of a dataset according to an ontology
Figure 3: COCO ontology
Figure 4: 2-, 3-, 4-wise Combinatorial Testing vs. random sampling
Figure 5: Probabilistic distribution over some dataset
...and 2 more figures

Theorems & Definitions (6)

Remark 1.1
Example 2.1
Example 2.2
Example 2.3
Remark 2.1
Remark 3.1
