Towards Avoiding the Data Mess: Industry Insights from Data Mesh Implementations

Jan Bode, Niklas Kühl, Dominik Kreuzberger, Sebastian Hirschl, Carsten Holtmann

TL;DR

The paper investigates empirical industry insights into data mesh adoption through 15 semi-structured interviews across multiple sectors. It identifies motivational factors, challenges, and concrete implementation strategies, and documents observed impacts on accessibility, speed, data quality, and organizational data culture. The findings largely support Dehghani’s data mesh framework while highlighting persistent issues in federated governance, ownership transitions, and metadata quality, and it proposes practical guidelines and two organizational archetypes. This work fills a critical empirical gap and offers a foundation for future quantitative studies and more detailed architectural realizations.

Abstract

With the increasing importance of data and artificial intelligence, organizations strive to become more data-driven. However, current data architectures are not necessarily designed to keep up with the scale and scope of data and analytics use cases. In fact, existing architectures often fail to deliver the promised value associated with them. Data mesh is a socio-technical, decentralized, distributed concept for enterprise data management. As the concept of data mesh is still novel, it lacks empirical insights from the field. Specifically, an understanding of the motivational factors for introducing data mesh, the associated challenges, implementation strategies, its business impact, and potential archetypes is missing. To address this gap, we conduct 15 semi-structured interviews with industry experts. Our results show, among other insights, that organizations have difficulties with the transition toward federated governance associated with the data mesh concept, the shift of responsibility for the development, provision, and maintenance of data products, and the comprehension of the overall concept. In our work, we derive multiple implementation strategies and suggest organizations introduce a cross-domain steering unit, observe the data product usage, create quick wins in the early phases, and favor small dedicated teams that prioritize data products. While we acknowledge that organizations need to apply implementation strategies according to their individual needs, we also deduce two archetypes that provide suggestions in more detail. Our findings synthesize insights from industry experts and provide researchers and professionals with preliminary guidelines for the successful adoption of data mesh.

Paper Structure (15 sections, 5 figures, 1 table)

Figures (5)

  • Figure 1: Conceptual overview of a data mesh based on the four key principles: 1) domain-oriented decentralized data ownership, 2) data as a product, 3) self-serve data platform, and 4) federated data governance. The figure shows different levels of granularity (high on the left and low on the right).
  • Figure 2: Word cloud of the main concepts discussed during the interviews. Stop words were excluded and terms lemmatized. Concepts in a larger type size have a higher relevance across interviews, based on linear scaling. For reference, mesh was found 570 times, whereas ml was found 51 times. The term data was excluded because its high frequency of 2447 would have skewed the visualization of other concepts.
  • Figure 3: Pie-chart of interview themes. Themes are sorted clockwise according to the interview guideline. 39 codes of the archive theme are omitted.
  • Figure 4: Bar chart of coded segments for each interviewee. Number of coded segments on the y-axis; interviewees as defined in Table 1 on the x-axis.
  • Figure 5: Data mesh relationships between challenges and implementation strategies, framed by motivational factors and impacts.
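
The Figure 2 caption describes a small text-processing pipeline: drop stop words, lemmatize, exclude the dominant term, and scale type size linearly with frequency. A minimal sketch of that idea follows. The paper does not name its tooling, so the stop-word list, the lemma map, the function names, and the point sizes below are all hypothetical; only the counts for mesh (570) and ml (51) come from the caption.

```python
from collections import Counter

# Hypothetical stop-word list and lemma map (assumptions, not from the paper).
STOP_WORDS = {"the", "a", "of", "and", "is", "are", "in", "to"}
LEMMAS = {"meshes": "mesh", "products": "product", "domains": "domain"}
# "data" is excluded because its frequency would dominate the scale.
EXCLUDED = {"data"}


def term_frequencies(text: str) -> Counter:
    """Count lemmatized terms, skipping stop words and excluded terms."""
    counts = Counter()
    for raw in text.lower().split():
        token = raw.strip(".,;:!?()\"'")
        if not token or token in STOP_WORDS:
            continue
        lemma = LEMMAS.get(token, token)
        if lemma in EXCLUDED:
            continue
        counts[lemma] += 1
    return counts


def linear_font_sizes(counts, min_pt=10.0, max_pt=60.0):
    """Map each count linearly onto the range [min_pt, max_pt]."""
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    if hi == lo:
        # All terms are equally frequent, so they share one size.
        return {term: max_pt for term in counts}
    span = max_pt - min_pt
    return {
        term: min_pt + span * (count - lo) / (hi - lo)
        for term, count in counts.items()
    }


# Counts taken from the Figure 2 caption; the point sizes are illustrative.
sizes = linear_font_sizes({"mesh": 570, "ml": 51})
```

With only these two terms, mesh lands at the top of the size range and ml at the bottom. Leaving data (2447 occurrences) in the input would compress every other term toward the minimum size, which is the skew the caption mentions.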