Causal Models in Requirement Specifications for Machine Learning: A vision
Hans-Martin Heyn, Yufei Mao, Roland Weiss, Eric Knauss
TL;DR
This paper addresses the challenge of specifying data requirements for ML systems by introducing causal modelling as a formal requirements engineering (RE) activity. It presents a workflow for translating high-level prior knowledge into causal graphs, from which data requirements, model requirements, and runtime testing criteria can be derived. The authors demonstrate the approach on an industrial prototype that detects cooling faults in electric motors, deriving a directed acyclic graph (DAG) with observable and latent variables, together with concrete data collection and testing rules. The discussion outlines a research agenda for establishing causal modelling as an RE practice for ML systems, including integration with natural language requirements and validation of causality-driven criteria.
Abstract
Specifying data requirements for machine learning (ML) software systems remains a challenge in requirements engineering (RE). This vision paper explores causal modelling as an RE activity that allows the systematic integration of prior domain knowledge into the design of ML software systems. We propose a workflow to elicit low-level model and data requirements from high-level prior knowledge using causal models. The approach is demonstrated on an industrial fault detection system. This paper outlines future research needed to establish causal modelling as an RE practice.
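As a rough illustration of the idea of eliciting data requirements from a causal model, the following minimal Python sketch encodes a small causal graph and reads candidate data requirements off its structure. It is not the authors' workflow: the variable names, the graph itself, and the derivation rule (collect observable effects of the latent fault, and record the other causes of those effects) are assumptions chosen only for illustration.

```python
from collections import deque

# Hypothetical causal graph: each key is a cause, each value lists its direct effects.
# All variable names are illustrative assumptions, not taken from the paper.
CAUSAL_GRAPH = {
    "ambient_temperature": ["motor_temperature"],
    "motor_load": ["motor_temperature"],
    "cooling_fault": ["coolant_flow"],
    "coolant_flow": ["motor_temperature"],
    "motor_temperature": [],
}
LATENT = {"cooling_fault"}  # assumed not directly measurable


def reachable(graph, start):
    """Return all nodes reachable from `start` by following edges in `graph`."""
    seen, queue = set(), deque([start])
    while queue:
        for nxt in graph[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def reverse(graph):
    """Return the same graph with every edge flipped (effect -> causes)."""
    rev = {node: [] for node in graph}
    for cause, effects in graph.items():
        for effect in effects:
            rev[effect].append(cause)
    return rev


def derive_data_requirements(graph, latent, target):
    """Illustrative rule: measure observable effects of `target`, and record
    the remaining observable causes of those effects."""
    effects = reachable(graph, target)
    symptoms = sorted(effects - latent)
    rev = reverse(graph)
    other_causes = set()
    for symptom in symptoms:
        other_causes |= reachable(rev, symptom)
    other_causes -= effects | {target} | latent
    reqs = [f"Collect '{v}' as an input signal." for v in symptoms]
    reqs += [
        f"Record '{v}' and cover its operating range in the data."
        for v in sorted(other_causes)
    ]
    return reqs


if __name__ == "__main__":
    for line in derive_data_requirements(CAUSAL_GRAPH, LATENT, "cooling_fault"):
        print(line)
```

In this sketch the graph plays the role of documented prior knowledge, and the generated sentences stand in for low-level data requirements that a requirements engineer would still need to review and refine.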
