A Workflow for Full Traceability of AI Decisions

Julius Wenzel, Syeda Umaima Alam, Andreas Schmidt, Hanwei Zhang, Holger Hermanns

TL;DR

This work tackles the problem of trustworthy AI in high-stakes contexts by proposing a practical, end-to-end workflow for tamper-proof traceability of AI decisions. It extends the Bill of Materials concept into running, verifiable traces for both training and inference (TBOM and IBOM) using confidential computing to protect data-in-use and ensure integrity. The authors present a FungAI use case to demonstrate implementation, threat analysis, and a first ecosystem of tools to support auditing, compliance, and vigilance. The study reports substantial overhead from confidential computing but argues that traceability is essential for accountability, with potential for deployment-focused adoption and gradual ecosystem expansion as confidential hardware matures. The work aims to enable robust responsibility chains and regulatory alignment, offering concrete artifacts, formats, and an architecture for verifiable AI decision processes.

Abstract

An ever-increasing number of high-stakes decisions are made or assisted by automated systems employing brittle artificial intelligence technology. There is a substantial risk that some of these decisions induce harm to people, by infringing their well-being or their fundamental human rights. The state of the art in AI systems makes little effort to document the decision process appropriately. This obstructs the ability to trace what went into a decision, which in turn is a prerequisite for any attempt to reconstruct a responsibility chain. Specifically, such traceability is linked to documentation that will stand up in court when determining the cause of an AI-based decision that inadvertently or intentionally violates the law. This paper takes a radical, yet practical, approach to this problem by enforcing the documentation of each and every component that goes into the training or inference of an automated decision. As such, it presents the first running workflow supporting the generation of tamper-proof, verifiable and exhaustive traces of AI decisions. In doing so, we expand the DBOM concept into an effective running workflow leveraging confidential computing technology. We demonstrate the inner workings of the workflow in the development of an app to tell poisonous and edible mushrooms apart, meant as a playful example of high-stakes decision support.

Paper Structure

This paper contains 39 sections, 10 figures.

Figures (10)

  • Figure 1: A generic AI system can be separated into components; information flows in the direction of the arrows. Our contributions are marked in blue.
  • Figure 2: Inspector, showing FungAI's exemplary analysis with respect to confidence, uncertainty, and explainability.
  • Figure 3: Runtimes for different configurations. The (r) variants refer to the runs where only dependencies were loaded and no training took place.
  • Figure A4: Overview Tab
  • Figure A5: Performance Tab
  • ...and 5 more figures