Data mesh
Data mesh is a sociotechnical approach to building a decentralized data architecture through a
domain-oriented, self-serve design (from a software development perspective). It borrows from Eric Evans’
theory of domain-driven design[1] and from Manuel Pais’ and Matthew Skelton’s theory of team topologies.[2]
Data mesh is concerned mainly with the data itself, treating the data lake and the pipelines as a secondary
concern.[3] The main proposition is scaling analytical data by domain-oriented decentralization.[4] With
data mesh, the responsibility for analytical data is shifted from the central data team to the domain teams,
supported by a data platform team that provides a domain-agnostic data platform.[5]
History
The term data mesh was first defined by Zhamak Dehghani in 2019[6] while she was working as a
principal consultant at the technology company Thoughtworks.[7][8] She then provided greater detail
on its principles and logical architecture throughout 2020. The approach was
predicted to be a “big contender” for companies in 2022.[9][10] Data meshes have been implemented by
companies such as Zalando,[11] Netflix,[12] Intuit,[13] VistaPrint, PayPal[14] and others.
In 2022, Dehghani left Thoughtworks to found Nextdata Technologies to focus on decentralized data.[15]
Principles
Data mesh is based on four core principles:[16]
Domain ownership
Data as a product[17]
Self-serve data platform
Federated computational governance
In addition to these principles, Dehghani writes that the data products created by each domain team should
be discoverable, addressable, trustworthy, possess self-describing semantics and syntax, be interoperable,
secure, and governed by global standards and access controls.[18] In other words, the data should be treated
as a product that is ready to use and reliable.[19]
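The properties listed above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory catalog standing in for the self-serve data platform; the names DataProduct and Catalog, the mesh:// address scheme, and all field names are illustrative assumptions, not part of Dehghani's specification or of any particular product.

```python
from dataclasses import dataclass


@dataclass
class DataProduct:
    """Hypothetical descriptor for one domain-owned data product."""
    domain: str        # owning domain team
    name: str          # product name, unique within the domain
    description: str   # human-readable summary used for discovery
    schema: dict       # column name -> type, i.e. self-describing syntax
    tags: frozenset = frozenset()

    @property
    def address(self) -> str:
        # Addressable: a stable identifier built from domain and name.
        # The "mesh://" scheme is an assumption made for this sketch.
        return f"mesh://{self.domain}/{self.name}"


class Catalog:
    """Hypothetical registry that makes data products discoverable."""

    def __init__(self) -> None:
        self._products = {}

    def register(self, product: DataProduct) -> str:
        # Reject duplicates so that every address resolves to one product.
        if product.address in self._products:
            raise ValueError(f"address already taken: {product.address}")
        self._products[product.address] = product
        return product.address

    def resolve(self, address: str) -> DataProduct:
        return self._products[address]

    def search(self, term: str) -> list:
        # Discoverable: match the term against descriptions and tags.
        term = term.lower()
        return sorted(
            p.address
            for p in self._products.values()
            if term in p.description.lower() or term in p.tags
        )


orders = DataProduct(
    domain="sales",
    name="orders",
    description="Confirmed customer orders",
    schema={"order_id": "string", "amount": "decimal"},
    tags=frozenset({"orders"}),
)
catalog = Catalog()
catalog.register(orders)
```

In this sketch, a consumer in another domain would call catalog.search("customer") to find the product and catalog.resolve(...) to obtain its schema, without contacting a central data team.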
Data mesh in practice
After its introduction in 2019,[6] multiple companies started to implement a data mesh[11][13][14] and share
their experiences. Challenges (C) and best practices (BP) for practitioners include:
C1. Federated data governance
Companies report difficulty adopting a federated governance structure for activities and
processes that were previously centrally owned and enforced. This is especially true for
security, privacy, and regulatory topics.[20][21][22]
C2. Responsibility shift
In data mesh, individuals within domains are end-to-end responsible for data products.
This new responsibility can be challenging, because it is rarely compensated and usually
benefits other domains.[20][21]
C3. Comprehension
Research has shown a severe lack of comprehension of the data mesh paradigm among
employees of companies implementing a data mesh.[20]
BP1. Cross-domain unit
Addressing C1, organizations should introduce a cross-domain steering unit responsible
for strategic planning, use case prioritization, and the enforcement of specific governance
rules—especially concerning security, regulatory, and privacy-related topics.
Nevertheless, a cross-domain steering unit can only complement and support the
federated governance structure and may grow obsolete with the increasing maturity of the
data mesh.[20][23]
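One way to picture how globally enforced rules can coexist with domain autonomy is to express the rules as code. The sketch below is an assumption-laden illustration, not a method described in the cited studies: the descriptor fields (owner, contains_pii, access), the two global policies, and the check function are all hypothetical.

```python
from typing import Callable, Optional

# A policy inspects a product descriptor and returns a violation
# message, or None when the descriptor complies.
Policy = Callable[[dict], Optional[str]]


def require_owner(product: dict) -> Optional[str]:
    # Hypothetical global rule: every data product names an owner.
    return None if product.get("owner") else "missing owner"


def pii_needs_restricted_access(product: dict) -> Optional[str]:
    # Hypothetical global privacy rule for personal data.
    if product.get("contains_pii") and product.get("access") != "restricted":
        return "PII must use restricted access"
    return None


# Rules that a cross-domain unit might enforce for every domain.
GLOBAL_POLICIES = [require_owner, pii_needs_restricted_access]


def check(product: dict, domain_policies=()) -> list:
    """Run global policies first, then any rules the domain adds itself."""
    violations = []
    for policy in [*GLOBAL_POLICIES, *domain_policies]:
        message = policy(product)
        if message is not None:
            violations.append(message)
    return violations
```

Here the global list models the small set of centrally enforced rules, while each domain passes its own additional policies, so most governance decisions stay with the domain teams.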
BP2. Track and observe
Addressing C2, organizations should observe and score data product quality, as tracking
and ranking key data products can encourage high-quality offerings, motivate domain
owners, and support budget negotiations.[20]
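Scoring and ranking of this kind can be sketched as a weighted average over a few quality metrics. The metric names, the weights, and the 0-to-1 scale below are assumptions chosen for illustration; the cited study does not prescribe a specific formula.

```python
from dataclasses import dataclass

# Hypothetical weights; a real organization would agree on its own.
WEIGHTS = {"completeness": 0.4, "freshness": 0.3, "availability": 0.3}


@dataclass
class QualityReport:
    """Hypothetical quality metrics for one data product, each in [0, 1]."""
    product: str
    completeness: float
    freshness: float
    availability: float


def score(report: QualityReport) -> float:
    # Weighted average of the metrics, rounded for stable reporting.
    for metric in WEIGHTS:
        value = getattr(report, metric)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{metric} must be between 0 and 1, got {value}")
    return round(sum(w * getattr(report, m) for m, w in WEIGHTS.items()), 3)


def rank(reports: list) -> list:
    # Highest-scoring data products first.
    scored = [(r.product, score(r)) for r in reports]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A ranking produced this way could feed a shared dashboard, giving domain owners a visible incentive and a basis for budget discussions.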
BP3. Conscious adoption
Organizations should thoroughly assess and evaluate their existing data systems,
consider organizational factors, and weigh the potential benefits before implementing a
data mesh. When adopting it, organizations are advised to introduce data mesh terminology
carefully and deliberately to ensure a clear understanding of the concept (addressing C3).[20]
Community
Scott Hirleman started a data mesh community with over 7,500 members in its Slack
channel.[24]
See also
Data management
Data platform
Data vault modeling, method of data modeling with storage of data from various operational
systems and tracing of data origin, facilitating auditing, loading speeds and resilience
Data warehouse, a well-established type of database system for organizing data in a
thematic way
ETL and ELT
References
1. Evans, Eric (2004). Domain-driven design : tackling complexity in the heart of software
(https://www.worldcat.org/oclc/52134890). Boston: Addison-Wesley. ISBN 0-321-12521-5.
OCLC 52134890 (https://www.worldcat.org/oclc/52134890).
2. Skelton, Matthew; Pais, Manuel (2019). Team topologies : organizing business and technology
teams for fast flow (https://www.worldcat.org/oclc/1108538721). Portland, OR.
ISBN 978-1-942788-84-3. OCLC 1108538721 (https://www.worldcat.org/oclc/1108538721).
3. Machado, Inês Araújo; Costa, Carlos; Santos, Maribel Yasmina (2022-01-01). "Data Mesh:
Concepts and Principles of a Paradigm Shift in Data Architectures"
(https://doi.org/10.1016%2Fj.procs.2021.12.013). Procedia Computer Science. International Conference on
ENTERprise Information Systems / ProjMAN - International Conference on Project
MANagement / HCist - International Conference on Health and Social Care Information
Systems and Technologies 2021. 196: 263–271. doi:10.1016/j.procs.2021.12.013
(https://doi.org/10.1016%2Fj.procs.2021.12.013). ISSN 1877-0509
(https://www.worldcat.org/issn/1877-0509). S2CID 245864612 (https://api.semanticscholar.org/CorpusID:245864612).
4. "Data Mesh Architecture" (https://datamesh-architecture.com/). datamesh-architecture.com.
Retrieved 2022-06-13.
5. Dehghani, Zhamak (2022). Data Mesh (https://www.worldcat.org/oclc/1260236796).
Sebastopol, CA. ISBN 978-1-4920-9236-0. OCLC 1260236796
(https://www.worldcat.org/oclc/1260236796).
6. "How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh"
(https://martinfowler.com/articles/data-monolith-to-mesh.html). martinfowler.com. Retrieved 2022-01-28.
7. Baer (dbInsight), Tony. "Data Mesh: Should you try this at home?"
(https://www.zdnet.com/article/data-mesh-should-you-try-this-at-home/). ZDNet. Retrieved 2022-02-10.
8. Andy Mott (2022-01-12). "Driving Faster Insights with a Data Mesh"
(https://www.rtinsights.com/driving-faster-insights-with-a-data-mesh/). RTInsights. Retrieved 2022-03-01.
9. "Developments that will define data governance and operational security in 2022"
(https://www.helpnetsecurity.com/2021/12/28/data-governance-2022/). Help Net Security. 2021-12-28.
Retrieved 2022-03-01.
10. Bane, Andy. "Council Post: Where Is Industrial Transformation Headed In 2022?"
(https://www.forbes.com/sites/forbestechcouncil/2022/01/13/where-is-industrial-transformation-headed-in-2022/).
Forbes. Retrieved 2022-03-01.
11. Schultze, Max; Wider, Arif (2021). Data Mesh in Practice. ISBN 978-1-09-810849-6.
12. Netflix Data Mesh: Composable Data Processing - Justin Cunningham
(https://www.youtube.com/watch?v=TO_IiN06jJ4), retrieved 2022-04-29
13. Baker, Tristan (2021-02-22). "Intuit's Data Mesh Strategy"
(https://medium.com/intuit-engineering/intuits-data-mesh-strategy-778e3edaa017). Intuit Engineering.
Retrieved 2022-04-29.
14. "The next generation of Data Platforms is the Data Mesh"
(https://medium.com/paypal-tech/the-next-generation-of-data-platforms-is-the-data-mesh-b7df4b825522/).
2022-08-03. Retrieved 2023-02-08.
15. "Why We Started Nextdata" (https://medium.com/@zhamakd/why-we-started-nextdata-dd30b8528fca/).
2022-01-16. Retrieved 2023-02-08.
16. Dehghani, Zhamak (2022). Data Mesh (https://www.worldcat.org/oclc/1260236796).
Sebastopol, CA. ISBN 978-1-4920-9236-0. OCLC 1260236796
(https://www.worldcat.org/oclc/1260236796).
17. "Data Mesh defined | James Serra's Blog" (https://www.jamesserra.com/archive/2021/02/data-mesh/).
2021-02-16. Retrieved 2022-01-28.
18. "Analytics in 2022 Means Mastery of Distributed Data Politics"
(https://thenewstack.io/analytics-in-2022-means-mastery-of-distributed-data-politics/). The New Stack.
2021-12-29. Retrieved 2022-03-03.
19. "Developments that will define data governance and operational security in 2022"
(https://www.helpnetsecurity.com/2021/12/28/data-governance-2022/). Help Net Security. 2021-12-28.
Retrieved 2022-03-01.
20. Bode, Jan; Kühl, Niklas; Kreuzberger, Dominik; Hirschl, Sebastian; Holtmann, Carsten
(2023-05-04). "Data Mesh: Motivational Factors, Challenges, and Best Practices".
arXiv:2302.01713v2 (https://arxiv.org/abs/2302.01713v2).
21. Vestues, Kathrine; Hanssen, Geir Kjetil; Mikalsen, Marius; Buan, Thor Aleksander; Conboy,
Kieran (2022). Agile Data Management in NAV: A Case Study. Lecture Notes in Business
Information Processing 445 LNBIP. Springer. pp. 220–235. doi:10.1007/978-3-031-08169-9_14
(https://doi.org/10.1007%2F978-3-031-08169-9_14).
22. Joshi, Divya; Pratik, Sheetal; Rao, Madhu Podila (2021). "Data governance in data mesh
infrastructures: The Saxo bank case study". Proceedings of the International Conference on
Electronic Business (ICEB). Vol. 21. pp. 599–604.
23. Whyte, Martin; Odenkirchen, Andreas; Bautz, Stephan; Heringer, Agnes; Krukow, Oliver
(2022). "Data Mesh - Just another buzzword or the next generation data platform?"
(https://www.pwc.de/en/digitale-transformation/data-mesh-the-next-generation-enterprise-data-platform.html).
PwC study 2022: Changing data platforms.
24. "The Global Home for Data Mesh" (https://datameshlearning.com/). The Global Home for
Data Mesh. Retrieved 2022-04-24.