January 2024
Azure Data Engineer Learning Pathway (1/2) www.aka.ms/pathways
Getting started
Learn how to implement and manage data engineering workloads on Microsoft Azure, using Azure services such as Azure Synapse Analytics, Azure Data Lake Storage Gen2, Azure Stream Analytics, Azure Databricks, and others.
Explore common data engineering tasks such as orchestrating data transfer and transformation pipelines, working with data files in a data lake, creating and loading relational data warehouses, capturing and aggregating streams of real-time data, and tracking data assets and lineage.
Audience Profile:
The primary audience for this course is data professionals, data architects, and business intelligence professionals who want to learn about data engineering and building analytical solutions using data platform technologies that exist on Microsoft Azure.
• New to the Cloud or Azure? Start with Azure Fundamentals
• New to data solutions on Azure? Build your knowledge with Data Fundamentals
Get started with data engineering on Azure
• Introduction to data engineering on Azure
• Introduction to Azure Data Lake Storage Gen2
• Introduction to Azure Synapse Analytics

Microsoft Learn/Documentation
Design and implement data storage
• Understand Azure Data Lake Storage Gen2
• Access tiers for Azure Blob Storage
• Storage considerations when using Azure Synapse serverless SQL pools
• Query a Parquet file using Azure Synapse serverless SQL pools
• Dynamic file pruning
• Understand table distribution design
• Partitioning tables in dedicated SQL pool
• Best practices for dedicated SQL pools in Azure Synapse Analytics
• Star Schema
• Multidimensional Schemas and Data Analytics
• Manage retention of historical data in system-versioned temporal tables
• Getting started with temporal tables
• Create and configure a self-hosted integration runtime
• Manage self-hosted integration runtime
• Choosing an analytical data store in Azure
• Synapse Analytics shared metadata tables
• When do you use Apache Spark pools?
• Intro to data classification and protection
• Data Compression
• Exercise - Use table distribution and indexes to improve performance
• Change how a storage account is replicated
• Slowly Changing Dimension Transformation
• Create external tables in Azure Synapse serverless SQL pools
• Views in Synapse serverless SQL pools
• Tutorial: Load data to Azure Synapse Analytics SQL pool
• Create, develop, and maintain Synapse notebooks in Azure Synapse Analytics
Design and develop data processing
• Common practices for data loading
• Tutorial: Extract, transform, and load data by using Azure Databricks
• Understand the Streaming Analytics Workflow
• Handling bad records and files
• Prepare and transform data with Azure Synapse
• Analyse complex data types in Azure Synapse Analytics
• Understand data store models
• Prepare and transform data
• Define a modern data warehouse architecture
• Choosing a batch processing technology
• Manage source data files
• Copy activity in Azure Data Factory
• MERGE (Transact-SQL)
• Continuous integration and delivery for Azure Synapse workspace
• Handle SQL truncation error rows in Data Factory mapping data flows
• Populate slowly changing dimensions

Role based Certification
DP-203: Azure Data Engineer
Skills Measured
• Design and implement data storage
• Develop data processing
• Secure, monitor, and optimize data storage and data processing
Self Study:
• Get started with data engineering on Azure
• Build data analytics solutions using Azure Synapse serverless SQL pools
• Perform data engineering with Azure Synapse Apache Spark Pools
• Work with Data Warehouses using Azure Synapse Analytics
• Transfer and transform data with Azure Synapse Analytics pipelines
• Work with Hybrid Transactional and Analytical Processing Solutions using Azure Synapse Analytics
• Implement a Data Streaming Solution with Azure Stream Analytics
• Govern data across an enterprise
• Data engineering with Azure Databricks
Course Page
Exam Page
Exam Study Guide
30 Day Challenge
Practice Assessment
Video on Demand (soon)
Azure Data Architecture Guide

Azure Data Engineer Learning Pathway (2/2) www.aka.ms/pathways
Additional Study
Design and develop data processing
• Backup and restore in Azure Synapse Dedicated SQL pool
• Implement workload management
• Use extended Apache Spark history server to debug and diagnose Apache Spark applications
• Enterprise Data Warehouse Architecture
• Stream processing with Azure Databricks
• Azure Synapse Analytics
• Monitoring for performance efficiency
• Work with windowing functions
• Schema drift
• Time handling in Stream Analytics
• Checkpoint and replay concepts in Azure Stream Analytics jobs
• Scale an Azure Stream Analytics job to increase throughput
• Use repartitioning to optimize processing
• Azure Stream Analytics output error policy
• Stream Analytics output to Cosmos DB
• Stream processing with Stream Analytics
• Data Loading best practices
• Get Started with Synapse Analytics
• Monitor your Synapse Workspace
Design and implement data security
• Implement encryption
• Data ingestion security considerations
• Configure authentication
• Access control lists (ACLs) in Azure Data Lake Storage Gen2
• Synapse access control
• Column-level security
• Manage authorization through column and row level security
• Manage user permissions
• Auditing for Azure SQL Database and Azure Synapse Analytics
• Retention Policy on storage accounts
• Understand network security options
• Dynamic Data Masking
• Secure a dedicated SQL pool
Monitor and optimize data storage and data processing
• Auto Optimize in Azure Databricks
• Modify user-defined functions
• Designing distributed tables
• Data spillage scenario - Search and purge
• Quickstart: Create an Azure Synapse workspace using an ARM template
• Indexing dedicated SQL pool tables
• Performance tuning with result set caching
• Optimize Apache Spark jobs
• Troubleshoot library installation errors
• Debug data factory pipelines
• Monitor and Alert Data Factory by using Azure Monitor
• Exercise – Implement workload management
• Monitor your Azure Synapse Analytics dedicated SQL pool workload using DMVs
• Collect custom logs with Log Analytics agent
• Use Synapse Studio to monitor your workspace pipeline runs
• Deploying Apache Airflow in Azure to build and run data pipelines

Microsoft Applied Skills
Targeted validation for real-world scenarios. Demonstrate proficiency in specific, scenario-based skill sets so you can make a bigger impact on every project, at your organization, and in your career.
Explore Applied Skills

30 Days to Learn It Challenge
30 Days to Learn It can help you build skills and start your preparation for Microsoft Certifications for AI, DevOps, Microsoft 365, low code, IoT, data science, cloud development, and more. Select your challenge below, work through learning modules, and exchange ideas with peers through a global community forum.
Explore the challenges