A Workflow for Full Traceability of AI Decisions
Julius Wenzel, Syeda Umaima Alam, Andreas Schmidt, Hanwei Zhang, Holger Hermanns
TL;DR
This work tackles the problem of trustworthy AI in high-stakes contexts by proposing a practical, end-to-end workflow for tamper-proof traceability of AI decisions. It extends the Bill of Materials concept into running, verifiable traces for both training and inference (TBOM and IBOM) using confidential computing to protect data-in-use and ensure integrity. The authors present a FungAI use case to demonstrate implementation, threat analysis, and a first ecosystem of tools to support auditing, compliance, and vigilance. The study reports substantial overhead from confidential computing but argues that traceability is essential for accountability, with potential for deployment-focused adoption and gradual ecosystem expansion as confidential hardware matures. The work aims to enable robust responsibility chains and regulatory alignment, offering concrete artifacts, formats, and an architecture for verifiable AI decision processes.
Abstract
An ever-increasing number of high-stakes decisions are made or assisted by automated systems employing brittle artificial intelligence technology. There is a substantial risk that some of these decisions induce harm to people, by infringing their well-being or their fundamental human rights. State-of-the-art AI systems make little effort to document the decision process appropriately. This obstructs the ability to trace what went into a decision, which in turn is a prerequisite to any attempt to reconstruct a responsibility chain. Specifically, such traceability is linked to documentation that will stand up in court when determining the cause of some AI-based decision that inadvertently or intentionally violates the law. This paper takes a radical, yet practical, approach to this problem, by enforcing the documentation of each and every component that goes into the training or inference of an automated decision. As such, it presents the first running workflow supporting the generation of tamper-proof, verifiable and exhaustive traces of AI decisions. In doing so, we expand the DBOM concept into an effective running workflow leveraging confidential computing technology. We demonstrate the inner workings of the workflow in the development of an app to tell poisonous and edible mushrooms apart, meant as a playful example of high-stakes decision support.
