Data Mesh with Amazon DataZone

The document outlines a reference architecture for implementing a data mesh using Amazon DataZone, which facilitates decentralized data management across an enterprise. It details the processes of data ingestion, transformation, storage, and governance, utilizing various AWS services. End users can access and request data assets through a centralized portal, enabling efficient data utilization for analytics and machine learning.


Data Mesh Reference Architecture with Amazon DataZone

Data mesh is a decentralized architectural and organizational framework that helps organizations accelerate innovation and drive business value. This reference architecture shows how Amazon Web Services customers can leverage Amazon DataZone to build a data mesh-based data solution.

[Architecture diagram: enterprise data sources feed ingestion services in producer member accounts, where data is stored, processed, and registered in the Amazon DataZone domain hosted in a central governance account; consumer member accounts discover and subscribe to assets through the Amazon DataZone data portal and analyze them with Amazon Athena, Amazon Redshift, Amazon QuickSight, and Amazon SageMaker.]

1. Data is gathered from data sources across the enterprise, including databases, file shares, edge devices, logs, social networks, SaaS applications, and streaming media.

2. Based on the source system and end-user requirements, raw data is ingested using Amazon AppFlow, AWS Database Migration Service (AWS DMS), Amazon Kinesis, AWS IoT Core, and Amazon Managed Streaming for Apache Kafka (Amazon MSK).
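For example, streaming sources such as clickstreams or device logs can be pushed into an Amazon Kinesis data stream from application code. The sketch below is a minimal illustration using boto3; the stream name, region, and record layout are hypothetical.

```python
# Minimal sketch of step 2: ingest a JSON event into a Kinesis data stream.
# The stream name, region, and event fields are hypothetical examples.
import json
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")

event = {"order_id": "1234", "amount": 42.50, "source": "web"}

kinesis.put_record(
    StreamName="clickstream-raw",            # hypothetical stream in the producer account
    Data=json.dumps(event).encode("utf-8"),
    PartitionKey=event["order_id"],          # distributes records across shards
)
```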

3. In the producer account, raw data is transformed using AWS Glue, metadata is stored in the AWS Glue Data Catalog, and data quality is measured using AWS Glue Data Quality. Data stored in Amazon Simple Storage Service (Amazon S3), Amazon Redshift, Amazon Relational Database Service (Amazon RDS), and third-party sources is registered as an asset in the Amazon DataZone catalog hosted in the central governance account using Amazon DataZone data source jobs.
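A producer team might orchestrate this step from code. The sketch below assumes an existing AWS Glue ETL job, a Glue Data Quality ruleset, and an Amazon DataZone data source already configured for the producer project; all names, ARNs, and identifiers are hypothetical, and the DataZone call should be checked against the current boto3 reference.

```python
# Minimal sketch of step 3: transform raw data with AWS Glue, evaluate data quality,
# and run the Amazon DataZone data source job that registers the resulting assets.
# Job, ruleset, role, domain, and data source identifiers are hypothetical.
import boto3

glue = boto3.client("glue")
datazone = boto3.client("datazone")

# 1) Transform raw data with an existing Glue ETL job.
glue.start_job_run(JobName="transform-orders")

# 2) Evaluate data quality against a ruleset defined on the curated table.
glue.start_data_quality_ruleset_evaluation_run(
    DataSource={"GlueTable": {"DatabaseName": "sales_curated", "TableName": "orders"}},
    Role="arn:aws:iam::111122223333:role/GlueDataQualityRole",
    RulesetNames=["orders-ruleset"],
)

# 3) Trigger the DataZone data source run that publishes the table metadata to the
#    catalog in the central governance account (parameter names may differ slightly;
#    see the boto3 DataZone reference).
datazone.start_data_source_run(
    domainIdentifier="dzd_exampledomainid",
    dataSourceIdentifier="exampledatasourceid",
)
```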
4. The central governance account hosts the Amazon DataZone domain and the related data portal. The AWS accounts of the data producers and consumers are associated with the Amazon DataZone domain. Amazon DataZone projects belonging to the data producers and consumers are created under the related Amazon DataZone domain units.
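Setting up this governance layer can also be scripted. The following sketch uses the boto3 DataZone client to create the domain and a producer project; the names, role ARN, and account ID are placeholders, and exact parameter names should be verified against the boto3 DataZone reference.

```python
# Minimal sketch of step 4: create the DataZone domain in the central governance
# account and a producer project under it. All names and ARNs are placeholders.
import boto3

datazone = boto3.client("datazone")

# Domain hosted in the central governance account.
domain = datazone.create_domain(
    name="enterprise-data-mesh",
    domainExecutionRole="arn:aws:iam::111122223333:role/DataZoneExecutionRole",
    description="Data mesh governance domain",
)
domain_id = domain["id"]

# Project owned by a data-producer team (producer and consumer member accounts are
# associated with the domain separately, for example through account invitations).
datazone.create_project(
    domainIdentifier=domain_id,
    name="sales-producer",
    description="Producer project for sales data assets",
)
```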
5. End users of assets log in to the Amazon DataZone data portal hosted in the central governance account using their AWS Identity and Access Management (IAM) credentials or single sign-on (SSO) through IAM Identity Center. They search, filter, and view asset information such as data quality, business metadata, and technical metadata.
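Discovery normally happens in the data portal UI, but the catalog can also be searched programmatically. The sketch below assumes the boto3 DataZone search_listings operation; the domain ID and search text are placeholders, and the response fields shown should be confirmed in the boto3 reference.

```python
# Minimal sketch of step 5: search published listings in the DataZone catalog.
# Domain ID and search text are placeholder values.
import boto3

datazone = boto3.client("datazone")

response = datazone.search_listings(
    domainIdentifier="dzd_exampledomainid",
    searchText="orders",      # free-text search over business and technical metadata
    maxResults=10,
)

for item in response.get("items", []):
    listing = item.get("assetListing", {})
    print(listing.get("name"), listing.get("entityId"))
```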
6. When the end users find assets of interest, they request access using the subscription feature of Amazon DataZone. Based on the validity of the request, the asset owner approves or rejects the request.
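The subscription handshake can likewise be driven through the API. The sketch assumes the boto3 DataZone create_subscription_request and accept_subscription_request operations; all identifiers are placeholders and the exact request shapes should be confirmed in the boto3 reference.

```python
# Minimal sketch of step 6: a consumer project requests access to a listing,
# and the asset owner approves the request. Identifiers are placeholders.
import boto3

datazone = boto3.client("datazone")
domain_id = "dzd_exampledomainid"

# Consumer side: request a subscription to a published listing.
request = datazone.create_subscription_request(
    domainIdentifier=domain_id,
    subscribedListings=[{"identifier": "examplelistingid"}],
    subscribedPrincipals=[{"project": {"identifier": "exampleconsumerprojectid"}}],
    requestReason="Need order data for churn model training",
)

# Producer side: the asset owner reviews and approves (or rejects) the request.
datazone.accept_subscription_request(
    domainIdentifier=domain_id,
    identifier=request["id"],
)
```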

7. After the subscription request is granted and fulfilled, the asset is accessed in the consumer account: Amazon SageMaker is used for AI/ML model development, and Amazon Athena, Amazon Redshift, and Amazon QuickSight are used for analytics and reporting.
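Once the subscription is fulfilled, the consumer can query the shared asset directly. The sketch below runs a query with Amazon Athena from the consumer account; the database, table, and S3 output location are hypothetical.

```python
# Minimal sketch of step 7: query a subscribed asset with Amazon Athena from the
# consumer account. Database, table, and S3 output location are hypothetical.
import boto3

athena = boto3.client("athena")

athena.start_query_execution(
    QueryString="SELECT order_id, amount FROM orders LIMIT 100",
    QueryExecutionContext={"Database": "sales_curated"},
    ResultConfiguration={"OutputLocation": "s3://example-consumer-athena-results/"},
)
```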
Reviewed for technical accuracy November 2024
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. AWS Reference Architecture
