The Data Mesh is a new-new thing. In concept, it’s really just a way to describe a large data platform made up of several distributed data estates that are managed and understood as one environment. The Microsoft Learn site actually has a very nice description of the Data Mesh concept. At the heart of making the Data Mesh real is the “Data Domain”, which is a way of thinking about the structure of data from a business point of view. Let’s explore this concept:
The Data Domain concept (using a diagram from the Microsoft Cloud Adoption Framework) segments data by functional area while still covering all of those areas with one governance and lineage platform. It assumes that there may be some diversity across the platform, but also that data in one domain can ultimately be mapped to data in another. The following diagram shows a Data Mesh containing several Data Landing Zones, each of which contains Data Domains and Data Products.
With that diagram in mind, think about each Data Domain functionally or structurally. A Data Domain might focus on Marketing data, or Financial data, or OT data from the manufacturing floor. Each Data Domain may have different stewards and may use the best-practice tools that accelerate time-to-value for its particular use case. Of course, there is an advantage in using the same tool across Data Domains when possible.
In relation to the greater Azure environment, notice that each of these represents its own Landing Zone. Remember that a Landing Zone is a sub-concept of the greater Azure architecture.
So, in the diagram above, each Landing Zone (A1, A2, etc.) maps to a Data Domain that might be present in the Data Mesh. In fact, some of the data might even live in another cloud, such as Google or Amazon, and still be part of the same mesh.
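To make the hierarchy concrete, here is a minimal sketch in Python of a mesh that contains Landing Zones, which contain Data Domains, which contain Data Products. All class names, field names, steward names, tool labels, and product names are hypothetical and chosen only for illustration; this is not an Azure or Cloud Adoption Framework API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class DataProduct:
    name: str


@dataclass
class DataDomain:
    name: str                      # functional area, e.g. "Marketing"
    steward: str                   # team accountable for the domain (hypothetical)
    tools: List[str] = field(default_factory=list)  # placeholder tool labels
    products: List[DataProduct] = field(default_factory=list)


@dataclass
class DataLandingZone:
    name: str                      # e.g. "A1"
    cloud: str                     # a mesh can span more than one cloud
    domains: List[DataDomain] = field(default_factory=list)


@dataclass
class DataMesh:
    landing_zones: List[DataLandingZone] = field(default_factory=list)

    def products(self) -> List[Tuple[str, str, str]]:
        """Flatten the mesh into (landing zone, domain, product) rows."""
        return [
            (zone.name, domain.name, product.name)
            for zone in self.landing_zones
            for domain in zone.domains
            for product in domain.products
        ]

    def stewards(self) -> Dict[str, str]:
        """Map each domain name to its steward."""
        return {
            domain.name: domain.steward
            for zone in self.landing_zones
            for domain in zone.domains
        }


# Illustrative data only: three domains, one of them in another cloud.
mesh = DataMesh([
    DataLandingZone("A1", "Azure", [
        DataDomain("Marketing", "marketing-data-team", ["warehouse"],
                   [DataProduct("campaign_spend")]),
    ]),
    DataLandingZone("A2", "Azure", [
        DataDomain("Finance", "finance-data-team", ["warehouse"],
                   [DataProduct("general_ledger")]),
    ]),
    DataLandingZone("B1", "AWS", [
        DataDomain("Manufacturing OT", "plant-data-team", ["stream-processor"],
                   [DataProduct("sensor_readings")]),
    ]),
])

for row in mesh.products():
    print(row)
```

The point of the sketch is that the domains stay separate, each with its own steward and tools, while a single object can still list everything in the mesh.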
Transporting Data Between Data Domains
What do we do when we need to leverage data from one domain in another? Most likely the data is transported between domains rather than shared in place from the other domain, although the latter is possible. The following diagram shows a concept of transport between domains where one group needs to leverage another’s data.
Notice above that a diverse set of tools is used across the domains (in this case all Microsoft, but it doesn’t need to be). The goal is to establish unique and standard services, all of which facilitate each other’s needs.
Management of the Data and Domains
The connected question is “how do I manage all this data?” The answer ultimately rests on creating a management layer that can understand and map the Certified Data Set, each individual data point’s owner, and the lineage between them. This is where tools like Microsoft Purview come in, with the purpose of creating one view of the whole picture. For more on that, check out a blog I wrote on the topic of Purview and the “Picture of the Elephant”.
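As a rough illustration of what such a management layer keeps track of, the sketch below registers data assets with an owner and a certified flag, records lineage between them, and traces everything upstream of a given asset. It is a toy in-memory catalog written under my own assumptions, not the Microsoft Purview API; the Catalog class, its methods, and the asset names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Asset:
    name: str
    domain: str
    owner: str
    certified: bool = False


class Catalog:
    """Toy catalog: assets plus 'source feeds target' lineage edges."""

    def __init__(self) -> None:
        self.assets: Dict[str, Asset] = {}
        self.upstream: Dict[str, Set[str]] = {}

    def register(self, asset: Asset) -> None:
        self.assets[asset.name] = asset
        self.upstream.setdefault(asset.name, set())

    def add_lineage(self, source: str, target: str) -> None:
        # Both ends must be registered before lineage can be recorded.
        for name in (source, target):
            if name not in self.assets:
                raise KeyError(f"unknown asset: {name}")
        self.upstream[target].add(source)

    def trace_upstream(self, name: str) -> List[str]:
        """Return every asset that directly or indirectly feeds `name`."""
        seen: Set[str] = set()
        stack = [name]
        while stack:
            current = stack.pop()
            for source in self.upstream.get(current, ()):
                if source not in seen:
                    seen.add(source)
                    stack.append(source)
        seen.discard(name)  # guard against cycles listing the asset itself
        return sorted(seen)

    def certified_assets(self) -> List[str]:
        return sorted(a.name for a in self.assets.values() if a.certified)


# Illustrative data only: Marketing data transported into a Finance product.
catalog = Catalog()
catalog.register(Asset("raw_campaigns", "Marketing", "marketing-data-team"))
catalog.register(Asset("campaign_spend", "Marketing", "marketing-data-team", certified=True))
catalog.register(Asset("marketing_roi", "Finance", "finance-data-team"))
catalog.add_lineage("raw_campaigns", "campaign_spend")
catalog.add_lineage("campaign_spend", "marketing_roi")

print(catalog.trace_upstream("marketing_roi"))
print(catalog.certified_assets())
```

Even in this toy form, the lineage crosses a domain boundary: the Finance asset can be traced back to two Marketing assets and their owner, which is the kind of single view a governance tool is meant to provide at scale.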
For more reading, check out Microsoft’s site on Data Mesh.