Enabling Personal Dataflow Sovereignty via Bolt-on Data Escrow
Zhiru Zhu, Raul Castro Fernandez
TL;DR
This work tackles the lack of transparency and control in personal data usage by introducing a bolt-on data escrow that enables delegated computation inside the individual's trust zone. Platforms specify data access declaratively via a run(access(), compute()) interface, and the escrow enforces purpose-based data use. Execution happens either on-device or on an escrow-controlled server, so the raw data $D_s$ never leaves the trust zone and only the result $D_t$ is exposed. The authors implement this architecture in the Apple ecosystem, designing a relational data virtualization layer and evaluating three relational-engine strategies (Materialized Tables, Virtual Tables, and Virtual Tables with Pushdown) alongside an end-to-end offloading pipeline with strong security guarantees. Qualitative and quantitative evaluations show that the escrow can express real-world dataflows with negligible overhead, especially when it employs predicate pushdown, which supports its applicability as a practical step toward personal dataflow sovereignty.
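The delegated-computation idea behind run(access(), compute()) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Escrow` class, the `purpose` string argument, the allow-list of approved purposes, and the audit log are all assumptions introduced here for illustration. The sketch only shows the shape of the dataflow, in which the platform hands its computation to the escrow, the raw data $D_s$ is read inside the escrow, and only the result $D_t$ is returned.

```python
from typing import Any, Callable, Dict, Iterable, List

Row = Dict[str, Any]
Sources = Dict[str, List[Row]]


class Escrow:
    """Hypothetical escrow holding raw data D_s inside the trust zone."""

    def __init__(self, sources: Sources, approved_purposes: Iterable[str]):
        self._sources = sources                  # raw data, never returned directly
        self._approved = set(approved_purposes)  # purposes the individual allows (assumed model)
        self.audit_log: List[str] = []           # record of every attempted dataflow

    def run(self,
            access: Callable[[Sources], List[Row]],
            compute: Callable[[List[Row]], Any],
            purpose: str) -> Any:
        # Reject dataflows whose declared purpose was not approved.
        if purpose not in self._approved:
            self.audit_log.append(f"DENIED {purpose}")
            raise PermissionError(f"purpose not approved: {purpose}")
        d_s = access(self._sources)   # D_s is selected inside the escrow
        d_t = compute(d_s)            # delegated computation runs inside the escrow
        self.audit_log.append(f"ALLOWED {purpose}")
        return d_t                    # only D_t crosses the boundary


def access_steps(sources: Sources) -> List[Row]:
    # Declares which data the computation needs.
    return sources["steps"]


def average_count(rows: List[Row]) -> float:
    # The platform's computation: an aggregate over the selected rows.
    return sum(r["count"] for r in rows) / len(rows)


escrow = Escrow(
    sources={"steps": [{"day": 1, "count": 4000},
                       {"day": 2, "count": 6000},
                       {"day": 3, "count": 8000}]},
    approved_purposes={"weekly_step_summary"},
)
avg_steps = escrow.run(access_steps, average_count, purpose="weekly_step_summary")
print(avg_steps)
```

Calling `escrow.run` with a purpose outside the approved set raises `PermissionError` and leaves a `DENIED` entry in the audit log. Note that plain Python cannot stop `compute` from returning the raw rows themselves, so a real escrow would need sandboxing and output checks that this sketch does not attempt.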
Abstract
The digital economy is powered by a continuous and massive exchange of personal data. Individuals provide data to platforms in return for services, from social networking and search to health monitoring, entertainment, and access to LLMs. This exchange has created immense value, but it has also established a fundamental asymmetry of power: individuals have only coarse-grained control over who accesses their data, not fine-grained control over the purpose for which it is used. This gap allows data to be repurposed for undisclosed uses, e.g., platforms selling the data to data brokers, and results in a critical loss of personal data sovereignty. This paper reframes this socio-technical challenge as a dataflow management problem. We propose a bolt-on data escrow architecture based on delegated computation. In our model, instead of data flowing to platforms, platforms delegate their computation to a trustworthy escrow. This inversion gives individuals transparency and control over their dataflows. We present four contributions: (1) a dataflow model that explicitly incorporates computational purpose as a first-class primitive; (2) a minimally invasive programming interface, run(access(), compute()), built on a unified relational interface that virtualizes on-device data sources and a computation offloading component; (3) a concrete implementation of our escrow within the Apple ecosystem, demonstrating its practicality; and (4) qualitative and quantitative evaluations demonstrating that our solution is expressive enough to implement a wide range of dataflows from real-world applications and introduces minimal runtime overhead. In summary, our work serves as a stepping stone toward achieving personal dataflow sovereignty.
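To make the relational virtualization idea more concrete, the following Python sketch contrasts the three strategies named in the TL;DR (Materialized Tables, Virtual Tables, and Virtual Tables with Pushdown) on a toy data source. It is a conceptual illustration under stated assumptions, not the paper's engine: `CountingSource`, its `fetch` method, and the three `query_*` functions are hypothetical names, and the only thing measured is how many rows the source hands to the query layer.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

Row = Dict[str, Any]
Predicate = Callable[[Row], bool]


class CountingSource:
    """Hypothetical on-device data source that counts the rows it hands out."""

    def __init__(self, rows: List[Row]):
        self._rows = rows
        self.rows_read = 0

    def fetch(self, predicate: Optional[Predicate] = None) -> Iterator[Row]:
        # If a predicate is supplied, the source filters before handing rows over.
        for row in self._rows:
            if predicate is None or predicate(row):
                self.rows_read += 1
                yield row


def query_materialized(source: CountingSource, predicate: Predicate) -> List[Row]:
    # Copy the whole source into an engine-side table first, then filter the copy.
    table = list(source.fetch())
    return [r for r in table if predicate(r)]


def query_virtual(source: CountingSource, predicate: Predicate) -> List[Row]:
    # No stored copy: rows are streamed at query time, but the filter
    # still runs in the engine, so every row is read from the source.
    return [r for r in source.fetch() if predicate(r)]


def query_virtual_pushdown(source: CountingSource, predicate: Predicate) -> List[Row]:
    # The predicate is pushed down to the source, so only matching rows are read.
    return list(source.fetch(predicate))


rows = [{"day": d, "count": 1000 * d} for d in range(1, 101)]


def recent(row: Row) -> bool:
    return row["day"] > 90


src_m, src_v, src_p = CountingSource(rows), CountingSource(rows), CountingSource(rows)
res_m = query_materialized(src_m, recent)
res_v = query_virtual(src_v, recent)
res_p = query_virtual_pushdown(src_p, recent)
print(src_m.rows_read, src_v.rows_read, src_p.rows_read)
```

All three functions return the same rows. In this toy setting the pushdown variant reads only the matching rows from the source, which is one plausible reason pushdown would reduce overhead. The sketch does not model storage cost, data freshness, or the actual engine used in the paper.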
