Data Mesh: a Systematic Gray Literature Review

Abel Goedegebuure, Indika Kumara, Stefan Driessen, Dario Di Nucci, Geert Monsieur, Willem-jan van den Heuvel, Damian Andrew Tamburri

TL;DR

This paper conducts a systematic gray literature review of 114 industrial sources to define data mesh and its four core principles, emphasizing domain-oriented ownership, data as a product, self-serve platforms, and federated governance. It constructs three reference architectures by mapping gray literature findings to established SOA layers and concepts, providing a practical blueprint for organization, development, and runtime aspects of data mesh. The work combines practitioner insights with academic perspectives to identify benefits, concerns, and open research challenges, offering replication data and a roadmap for further theoretical refinement. The study advances the field by delivering a concrete, SOA-grounded framework and highlighting areas where academic inquiry can formalize concepts such as data contracts, tiered governance, and domain-driven platform design, ultimately aiding practitioners in implementing data mesh at scale.

Abstract

Data mesh is an emerging domain-driven decentralized data architecture that aims to minimize or avoid operational bottlenecks associated with centralized, monolithic data architectures in enterprises. The topic has piqued practitioners' interest, and there is considerable gray literature on it. At the same time, we observe a lack of academic attempts at defining and building upon the concept. Hence, in this article, we aim to start from the foundations and characterize the data mesh architecture regarding its design principles, architectural components, capabilities, and organizational roles. We systematically collected, analyzed, and synthesized 114 industrial gray literature articles. The review provides insights into practitioners' perspectives on the four key principles of data mesh: data as a product, domain ownership of data, self-serve data platform, and federated computational governance. Moreover, because data mesh is comparable to SOA (service-oriented architecture), we mapped the findings from the gray literature onto reference architectures from the SOA academic literature to create reference architectures describing three key dimensions of data mesh: organization of capabilities and roles, development, and runtime. Finally, we discuss open research issues in data mesh, partially based on the findings from the gray literature.

Paper Structure (40 sections, 8 figures)

Figures (8)

  • Figure 1: Google trends for the term "Data Mesh".
  • Figure 2: Systematic gray literature review process.
  • Figure 3: The search and selection process. Steps are shown sequentially, and each step reduces the number of candidate sources until the final selection. After the final exclusion step, an inter-rater reliability test using the well-established Cohen's kappa coefficient was performed, yielding $\kappa = 0.79$, to ensure that reviewer bias was not inappropriately high.
  • Figure 4: Source statistics.
  • Figure 5: An example of data products in different domains, adopted from S110.
  • ...and 3 more figures