Data Mesh: a Systematic Gray Literature Review
Abel Goedegebuure, Indika Kumara, Stefan Driessen, Dario Di Nucci, Geert Monsieur, Willem-Jan van den Heuvel, Damian Andrew Tamburri
TL;DR
This paper systematically reviews 114 industrial gray literature sources to define data mesh and its four core principles: domain-oriented ownership, data as a product, self-serve platforms, and federated governance. It constructs three reference architectures by mapping the gray literature findings onto established SOA layers and concepts, giving a practical blueprint for the organization, development, and runtime aspects of data mesh. The work combines practitioner insights with academic perspectives to identify benefits, concerns, and open research challenges, and it offers replication data and a roadmap for further theoretical refinement. It contributes a concrete, SOA-grounded framework and highlights areas where academic inquiry can formalize concepts such as data contracts, tiered governance, and domain-driven platform design, with the aim of helping practitioners implement data mesh at scale.
Abstract
Data mesh is an emerging domain-driven decentralized data architecture that aims to minimize or avoid the operational bottlenecks associated with centralized, monolithic data architectures in enterprises. The topic has piqued practitioners' interest, and there is considerable gray literature on it. At the same time, we observe a lack of academic attempts at defining and building upon the concept. Hence, in this article, we aim to start from the foundations and characterize the data mesh architecture in terms of its design principles, architectural components, capabilities, and organizational roles. We systematically collected, analyzed, and synthesized 114 industrial gray literature articles. The review provides insights into practitioners' perspectives on the four key principles of data mesh: data as a product, domain ownership of data, self-serve data platform, and federated computational governance. Moreover, because data mesh is comparable to service-oriented architecture (SOA), we mapped the findings from the gray literature onto reference architectures from the SOA academic literature to create reference architectures describing three key dimensions of data mesh: organization of capabilities and roles, development, and runtime. Finally, we discuss open research issues in data mesh, partially based on the findings from the gray literature.
