Data Engineering With Databricks

Uploaded by

krishna.krishnasai330

Welcome to

Data Engineering with Databricks

©2022 Databricks Inc. — All rights reserved 1

Course Objectives
1. Use the Databricks Data Science and Engineering Workspace to perform
common code development tasks in a data engineering workflow.
2. Use Spark to extract data from a variety of sources, apply common cleaning
transformations, and manipulate complex data with advanced functions.
3. Define and schedule data pipelines that incrementally ingest and process
data through multiple tables in the lakehouse using Delta Live Tables.
4. Orchestrate data pipelines with Databricks Workflow Jobs and schedule
dashboard updates to keep analytics up-to-date.
5. Configure permissions in Unity Catalog to ensure that users have proper
access to databases for analytics and dashboarding.

Course Overview
Module 0: Get Started with PySpark Programming (OPTIONAL)
Module 1: Get Started with Databricks Data Science and Engineering Workspace
Module 2: Transform Data with Spark (SQL or PySpark)
Module 3: Manage Data with Delta Lake
Module 4: Build Data Pipelines with Delta Live Tables (SQL or PySpark)
Module 5: Deploy Workloads with Databricks Workflows
Module 6: Manage Data Access for Analytics with Unity Catalog

Module Agendas

Module Agenda
Get Started with PySpark Programming

Spark SQL Overview

DE 0.1 - Spark SQL
DE 0.2L - Spark SQL Lab
DE 0.3 - DataFrame & Column
DE 0.4L - Purchase Revenues Lab
DE 0.5 - Aggregation
DE 0.6L - Revenue by Traffic Lab

Module Agenda
Get Started with Databricks Data Science and Engineering Workspace

Introduction to the Databricks Lakehouse Platform

Databricks Architecture and Services
Demo - Navigating the Workspace
DE 1.1 - Create and Manage Clusters Interactively
DE 1.2 - Notebook Basics
Git Versioning with Databricks Repos
Demo - Using Databricks Repos
DE 1.3L - Getting Started with the Databricks Lakehouse Platform Lab

Module Agenda
Transform Data with Spark SQL

DE 2.1 - Querying Files Directly
DE 2.2 - Options for External Sources
DE 2.3L - Extract Data Lab
DE 2.4 - Cleaning Data
DE 2.5 - Complex Transformations
DE 2.6 - UDFs and Control Flow
DE 2.7L - Reshape Data Lab

Transform Data with PySpark

DE 3.1 - Querying Files Directly
DE 3.2 - Reader & Writer
DE 3.3L - Extract Data Lab
DE 3.4 - Cleaning Data
DE 3.5 - Complex Transformations
DE 3.6 - UDFs
DE 3.7L - Reshape Data Lab

Module Agenda
Manage Data with Delta Lake

What is Delta Lake

DE 4.1 - Schemas and Tables
DE 4.2 - Version and Optimize Delta Tables
DE 4.3L - Manipulate Delta Tables Lab
DE 4.4 - Set Up Delta Tables
DE 4.5 - Load Data into Delta Lake
DE 4.6 - Load Data Lab

Module Agenda
Build Data Pipelines with Delta Live Tables

Introduction to Delta Live Tables

DE 5.1 - DLT UI Walkthrough
DE 5.1A - SQL Pipelines
DE 5.1B - Python Pipelines
DE 5.2 - Python vs SQL
DE 5.3 - Pipeline Results
DE 5.4 - Pipeline Event Logs

Module Agenda
Deploy Workloads with Databricks Workflows

Introduction to Workflows
Building and Monitoring Workflow Jobs
DE 6.1 - Scheduling Tasks with the Jobs UI
DE 6.2L - Jobs Lab
DE 6.3 - Navigating Databricks SQL and Attaching to Endpoints
DE 6.4 - Last Mile ETL with DBSQL

Module Agenda
Manage Data Access for Analytics with Unity Catalog

Introduction to Unity Catalog

DE 7.1 - Managing principals in Unity Catalog
DE 7.2 - Managing Unity Catalog metastores
DE 7.3 - Creating compute resources for Unity Catalog access
DE 7.4 - Creating and governing data objects with Unity Catalog
DE 7.5 - Create and Share Tables in Unity Catalog
DE 7.6 - Create external tables in Unity Catalog
DE 7.7 - Upgrade a table to Unity Catalog
DE 7.8 - Create views and limit table access
