TracE2E: Easily Deployable Middleware for Decentralized Data Traceability
Daniel Pressensé, Elisavet Kozyri
TL;DR
The paper addresses the need for explainability and regulatory compliance in distributed data processing by proposing TracE2E, a Rust-based middleware that records provenance and enforces data-protection policies. It achieves end-to-end traceability across multiple nodes by wrapping the Rust IO library to mediate process inputs and outputs, and it supports a modular compliance layer built on top of the recorded provenance. Key contributions include a decentralized provenance layer with a synchronous data-flow protocol (P2M and M2M) that records flows atomically, a global resource identification system, a labeled provenance/compliance structure, and a gRPC-based interface for coordination. The evaluation shows that the framework enforces local confidentiality and integrity policies at the cost of measurable I/O overhead, which is most noticeable in workloads dominated by small I/O operations. This illustrates both the practical viability and the trade-offs of decentralized traceability in real-world systems.
Abstract
This paper presents TracE2E, a middleware written in Rust that provides both data explainability and compliance across multiple nodes. By mediating the inputs and outputs of processes, TracE2E records provenance information and enforces data-protection policies (e.g., confidentiality, integrity) that depend on the recorded provenance. Unlike existing approaches that require substantial application modifications, TracE2E is designed for easy integration into existing and future applications through a wrapper around the IO module of the Rust standard library. We describe how TracE2E consistently records provenance information across nodes, and we demonstrate how its compliance layer can accommodate the enforcement of multiple policies.
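To make the idea of mediating IO concrete, the following is a minimal, single-process sketch of a writer wrapper that checks a confidentiality policy before each write and records a provenance entry after it. It is an illustration only and not the TracE2E API: the names `TracedWriter`, `Label`, `FlowRecord`, `taint`, and `records` are hypothetical, the policy is a simple two-level label check, and the decision is made locally rather than by a separate middleware reached over gRPC.

```rust
use std::io::{self, Write};

/// Hypothetical confidentiality label attached to a data source.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Label {
    Public,
    Secret,
}

/// One recorded flow: `sink` received `bytes` bytes derived from `sources`.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct FlowRecord {
    pub sink: String,
    pub sources: Vec<(String, Label)>,
    pub bytes: usize,
}

/// Illustrative wrapper (not the TracE2E type) that mediates every write
/// to an inner `Write` implementation.
pub struct TracedWriter<W: Write> {
    inner: W,
    sink: String,
    sink_allows_secret: bool,
    sources: Vec<(String, Label)>,
    log: Vec<FlowRecord>,
}

impl<W: Write> TracedWriter<W> {
    pub fn new(inner: W, sink: &str, sink_allows_secret: bool) -> Self {
        TracedWriter {
            inner,
            sink: sink.to_string(),
            sink_allows_secret,
            sources: Vec::new(),
            log: Vec::new(),
        }
    }

    /// Declare that data written from now on derives from `source`.
    pub fn taint(&mut self, source: &str, label: Label) {
        self.sources.push((source.to_string(), label));
    }

    /// Provenance entries recorded so far.
    pub fn records(&self) -> &[FlowRecord] {
        &self.log
    }

    /// Give back the wrapped writer.
    pub fn into_inner(self) -> W {
        self.inner
    }
}

impl<W: Write> Write for TracedWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Policy check happens before any byte reaches the sink.
        let leaks = !self.sink_allows_secret
            && self.sources.iter().any(|(_, label)| *label == Label::Secret);
        if leaks {
            return Err(io::Error::new(
                io::ErrorKind::PermissionDenied,
                "confidentiality policy denies this flow",
            ));
        }
        // The flow is allowed: perform it, then record what happened.
        let written = self.inner.write(buf)?;
        self.log.push(FlowRecord {
            sink: self.sink.clone(),
            sources: self.sources.clone(),
            bytes: written,
        });
        Ok(written)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}
```

Because the wrapper implements `std::io::Write`, application code that is generic over `Write` can use it without changes, which is the integration property the abstract emphasizes. A denied flow surfaces as an ordinary `io::Error`, so existing error handling continues to apply. A real deployment would additionally need globally unique resource identifiers and cross-node coordination, which this sketch leaves out.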
