Azure Data Factory
1) Azure Data Factory
    1. What is Azure Data Factory?
    2. Azure Data Factory Architecture
    3. Azure Data Factory Portal UI
    4. Top-level concepts
              1. Pipelines
              2. Activities
              3. Linked services
              4. Datasets
              5. Triggers
              6. Data Flows
              7. Integration Runtimes
2) Pipeline
    1. What is a Pipeline?
    2. Create a new pipeline
    3. Organize pipelines into folders
    4. Debug pipeline
    5. Publish pipeline
    6. Parameters / Pipeline Parameters
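Pipelines, parameters, and runs can also be exercised programmatically. The sketch below is illustrative only and assumes the azure-identity and azure-mgmt-datafactory Python packages; the subscription, resource group, factory, pipeline, and parameter names are placeholders.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

# Placeholder identifiers -- replace with your own subscription/factory details.
SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "<resource-group>"
FACTORY_NAME = "<data-factory-name>"
PIPELINE_NAME = "<pipeline-name>"

# Authenticate with whatever credential is available (Azure CLI, managed identity, etc.).
adf_client = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Trigger a run of a published pipeline, passing pipeline parameters by name.
run = adf_client.pipelines.create_run(
    RESOURCE_GROUP,
    FACTORY_NAME,
    PIPELINE_NAME,
    parameters={"sourceFolder": "input", "sinkFolder": "output"},
)

# Poll the run status (Queued / InProgress / Succeeded / Failed / Cancelled).
status = adf_client.pipeline_runs.get(RESOURCE_GROUP, FACTORY_NAME, run.run_id).status
print(f"Pipeline run {run.run_id}: {status}")
```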
3) Linked Service
    1. What is a Linked Service?
    2. Create a Linked Service for –
              1. BLOB
              2. SQL Database
              3. SQL Server
              4. Azure Data Lake Storage Gen1
              5. Azure Data Lake Storage Gen2, etc.
    3. Parameters / Linked Service Parameterization
4) DataSets
    1. What is a Data Set?
    2. Create a Data Set for –
                1. Avro, Binary, CSV, Excel, JSON, ORC, Parquet, XML in BLOB/ADLS Gen1/ADLS Gen2.
                2. Table in SQL Database, SQL Server, Oracle Database, etc.
    3. Parameters / Data Set Parameterization
5) Activities
    1. Wait
    2. Variables
                1. Create a variable
                2. Set variable
                3. Append variable
    3. Copy Data
                1. General
                2. Source
                3. Sink
                4. Mapping
                5. Settings
                6. User Properties
    4. Copy file(s) from one BLOB Container to another Container
                1. One file from a folder
                2. All files from a folder
                3. All files and folders recursively from a folder
    5. Copy data / file from BLOB to SQL Database / ADLS Gen2
                1. As CSV, TSV, Parquet, Avro, ORC, etc.
    6. Databricks Notebook
    7. Azure Function
    8. Lookup, Stored Procedure
    9. Get Metadata, Delete
    10. Execute Pipeline
    11. Validation, Fail
    12. Iteration & Conditionals
                1. Filter
                2. ForEach
                3. If Condition
                4. Switch
                5. Until
6) What is a Trigger?
    1. Types
             1. Schedule
             2. Tumbling window
             3. Storage Events
    2. Triggers with Parameters
7) Integration Runtime (IR)
    1. Azure AutoResolveIntegrationRuntime
    2. Azure Managed Virtual Network
    3. Self-Hosted
    4. Linked Self-Hosted
8) Source control
    1. Git configuration
    2. ARM Template
             1. Export / Import
    3. Azure DevOps Repos
9) Global parameters
10) Credentials
11) Monitoring ADF Jobs
12) Alerts
13) Send Failure Notifications using Logic Apps
14) Data Flows
    1. What is a Data Flow?
    2. Mapping Data Flow
    3. Data Flow Debug
    4. Transformations
             1. Filter, Aggregate, Join
             2. Conditional Split, Derived Column
             3. Exists, Union, Lookup, Sort
             4. Group By, Pivot, Unpivot
             5. Flatten, Parse, Stringify
             6. Alter Row, Assert
             7. Flowlet
   5. Validate Schema, Schema Drift
   6. Remove Duplicate Rows using Mapping Data Flows in Azure Data Factory
15) Azure DevOps
   1. Repos
16) SDLC
17) Agile Methodology
18) ADF Interview Questions
19) ADF Resume Preparation
20) End-to-End ADF Project
21) ADF Exercises
    1. Create variables using the Set Variable activity
    2. How to use the If Condition activity
    3. Iterate over files using the ForEach activity
    4. Creating linked services and datasets
    5. Copy activity – Blob to Blob
    6. Copy activity – Blob to Azure SQL
    7. Copy activity – copy files using pattern matching
    8. Copy activity – copy only the filtered file formats
    9. Copy activity – copy multiple files from one Blob container to another
    10. Copy activity – delete source files after the copy completes
    11. Copy activity – using parameterized datasets
    12. Copy activity – convert one file format to another
    13. Copy activity – add additional columns to the source columns
    14. Copy activity – filter files and copy from one Blob container to another
    15. Delete files larger than 100 KB from Blob storage
    16. How to use the Get Metadata activity
    17. Bulk copy tables and files
    18. How to integrate Key Vault in ADF
    19. How to set up an integration runtime
    20. Copy data from on-premises to the Azure cloud
    21. How to use the Databricks Notebook activity and pass parameters to it
    22. How to use a schedule trigger
    23. How to use a tumbling window trigger
    24. How to use an event-based trigger
    25. How to use the Wait activity
    26. How to use the Until activity
    27. Data Flows – Select rows
    28. Data Flows – Filter rows
    29. Data Flows – Join transformation
    30. Data Flows – Union transformation
    31. Data Flows – Lookup transformation
    32. Data Flows – Window functions transformation
    33. Data Flows – Pivot and Unpivot transformations
    34. Data Flows – Alter Row transformation
    35. Data Flows – Removing duplicates
    36. How to pass parameters to a pipeline
    37. How to create alerts and alert rules
    38. How to set global parameters
    39. How to import and export ARM templates
    40. How to integrate ADF with Azure DevOps
    41. How to use Azure DevOps Repos
    42. How to send mail notifications using Logic Apps
    43. How to monitor pipelines
    44. How to debug pipelines
    45. How to schedule a pipeline using triggers
    46. How to create trigger dependencies
    47. How to execute one pipeline from another pipeline
Azure Databricks
1) Introduction to Big Data
      What is Data?
      What is a Database?
      What is Big Data?
      What are the challenges of Big Data?
      Why traditional databases don't handle Big Data
2) Introduction to Hadoop
      What is Hadoop?
      How Hadoop overcomes Big Data challenges
      Hadoop Architecture
      Hadoop Daemons
      HDFS
      YARN
      MapReduce
3) Introduction to Spark
      Spark Architecture
      Spark internals
      Spark RDD
      Spark DataFrame
      Spark Streaming
4) Introduction To Databricks
      What is Databricks?
      Databricks Architecture
      Working in Databricks workspace
      Working with Databricks notebooks
5) Working with the Databricks File System – DBFS
      What is DBFS?
      DBFS commands – mkdirs, cp, mv, head, put, rm, rmdir
      How to handle multiple files in DBFS
      How to process files in DBFS
      How to archive files in DBFS
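A minimal sketch of the DBFS commands listed above, as they would run inside a Databricks notebook where `dbutils` is predefined; all paths are placeholders.

```python
# Runs inside a Databricks notebook, where `dbutils` is predefined; paths are placeholders.

dbutils.fs.mkdirs("dbfs:/demo/raw")                         # create a folder
dbutils.fs.put("dbfs:/demo/raw/sample.txt", "hello", True)  # write a small text file (overwrite)
print(dbutils.fs.head("dbfs:/demo/raw/sample.txt", 100))    # preview up to 100 bytes

dbutils.fs.cp("dbfs:/demo/raw/sample.txt", "dbfs:/demo/archive/sample.txt")    # copy
dbutils.fs.mv("dbfs:/demo/raw/sample.txt", "dbfs:/demo/processed/sample.txt")  # move

# List and iterate over multiple files in a folder.
for f in dbutils.fs.ls("dbfs:/demo/processed"):
    print(f.path, f.size)

dbutils.fs.rm("dbfs:/demo/archive", True)                   # remove a folder recursively
```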
6) Databricks – Spark Core
       RDD Programming
       Operations on RDDs
       Transformations – Narrow
       Transformations – Wide
       Actions
       Loading Data and Saving Data
       Key Value Pair RDD
       Broadcast variables
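An illustrative PySpark sketch of the RDD topics above (narrow vs. wide transformations, actions, key-value pair RDDs, and broadcast variables), assuming a notebook where `spark` is already available.

```python
# Assumes a Databricks notebook where `spark` is already defined.
sc = spark.sparkContext

# Create an RDD and apply a narrow transformation (map works within each partition).
words = sc.parallelize(["spark", "rdd", "spark", "databricks", "rdd", "spark"])
pairs = words.map(lambda w: (w, 1))              # key-value pair RDD

# Wide transformation: reduceByKey shuffles data across partitions.
counts = pairs.reduceByKey(lambda a, b: a + b)

# Broadcast variable: small lookup data shared read-only with all executors.
stop_words = sc.broadcast({"rdd"})
filtered = counts.filter(lambda kv: kv[0] not in stop_words.value)

# Actions trigger execution and return results to the driver.
print(filtered.collect())
print(counts.count())
```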
7) Databricks – Spark SQL – DataFrames
       Creating Data Frames
       DataFrames internal execution
       Transformations using DataFrame API
       Actions using DataFrame API
       User-defined functions in Spark SQL
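A short sketch of the DataFrame topics above: creating a DataFrame, applying transformations and actions, and registering a user-defined function. The column names and data are illustrative.

```python
from pyspark.sql import functions as F
from pyspark.sql.types import StringType

# Create a DataFrame from in-memory rows (illustrative data).
df = spark.createDataFrame(
    [(1, "alice", 4200.0), (2, "bob", 3100.0), (3, "carol", 5000.0)],
    ["id", "name", "salary"],
)

# Transformations are lazy; they build a logical plan that Catalyst optimizes.
high_paid = df.filter(F.col("salary") > 3500).withColumn("name_upper", F.upper("name"))

# A simple UDF (prefer built-in functions where possible -- UDFs bypass many optimizations).
@F.udf(returnType=StringType())
def salary_band(salary):
    return "high" if salary >= 4500 else "standard"

result = high_paid.withColumn("band", salary_band("salary"))

# Actions trigger execution.
result.show()
```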
8) Databricks – Handling multiple file formats
       CSV Data
       JSON Data
       Parquet files
       Excel files
       ORC file format
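A sketch of reading and writing the file formats listed above with the DataFrame reader/writer API. The paths are placeholders; Excel has no built-in Spark reader and typically relies on an external connector (for example the spark-excel library) or pandas, so it is only noted in a comment.

```python
# Placeholder base path on DBFS / mounted storage.
base = "dbfs:/demo/formats"

# CSV with header and schema inference (inference costs an extra pass over the data).
csv_df = spark.read.option("header", True).option("inferSchema", True).csv(f"{base}/in/csv/")

# JSON (one JSON object per line by default; use multiLine for pretty-printed files).
json_df = spark.read.option("multiLine", True).json(f"{base}/in/json/")

# Columnar formats carry their own schema.
parquet_df = spark.read.parquet(f"{base}/in/parquet/")
orc_df = spark.read.orc(f"{base}/in/orc/")

# Excel has no built-in Spark reader; an external connector such as spark-excel
# (or reading via pandas) is commonly used instead.

# Write back out, converting between formats.
csv_df.write.mode("overwrite").parquet(f"{base}/out/parquet/")
json_df.write.mode("overwrite").option("header", True).csv(f"{base}/out/csv/")
```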
9) Databricks utilities
       Credentials utility
       FileSystem utility
       Notebook utility
       Secrets utility
       Widgets utility
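An illustrative sketch of the dbutils utilities listed above, run inside a Databricks notebook; the secret scope, key names, and child notebook path are placeholders.

```python
# Widgets: notebook parameters that can be set from jobs or other notebooks.
dbutils.widgets.text("env", "dev", "Environment")
env = dbutils.widgets.get("env")

# Secrets: read a secret from a (placeholder) secret scope, e.g. one backed by Azure Key Vault.
sql_password = dbutils.secrets.get(scope="demo-scope", key="sql-password")

# File system utility (same commands as in the DBFS section above).
files = dbutils.fs.ls("dbfs:/demo")

# Notebook utility: run another notebook with arguments and capture its exit value.
result = dbutils.notebook.run("/Shared/child_notebook", 600, {"env": env})
print(env, len(files), result)
```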
10) Databricks Cluster Management
       Creating and configuring clusters
      Managing Clusters
      Displaying clusters
      Starting a cluster
      Terminating a cluster
      Delete a cluster
      Cluster Information
      Cluster logs
      Types of Clusters
      All-purpose clusters
      Job clusters
      Cluster Modes
      Standard
      High Concurrency
      Autoscaling
      Databricks runtime versions
11) Databricks – Batch Processing
      Historical Data load
      Incremental Data load
      Date Transformations
      Aggregations
      Join Operations
      Window functions
      Union operations
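A sketch illustrating the batch-processing topics above: an incremental (delta) load filtered on a watermark column, a join, an aggregation, and a window function. The table names, columns, and watermark value are assumptions.

```python
from pyspark.sql import functions as F
from pyspark.sql.window import Window

# Assumed source tables / columns.
orders = spark.table("bronze.orders")        # order_id, customer_id, amount, updated_at
customers = spark.table("bronze.customers")  # customer_id, region

# Incremental load: keep only rows newer than the last successful load (placeholder watermark).
last_watermark = "2024-01-01 00:00:00"
incremental = orders.filter(F.col("updated_at") > F.lit(last_watermark))

# Join and aggregate.
enriched = incremental.join(customers, "customer_id", "left")
per_region = enriched.groupBy("region").agg(F.sum("amount").alias("total_amount"))

# Window function: rank customers by spend within each region.
w = Window.partitionBy("region").orderBy(F.desc("amount"))
ranked = enriched.withColumn("rank_in_region", F.row_number().over(w))

per_region.show()
ranked.show()
```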
12) Introduction to Azure
      Azure Portal Walkthrough
      What is Subscription?
      What is a Resource Group?
      What is a Resource?
      Overview of Azure Resources / Services
      Azure Databricks
      BLOB Storage, Data Lake Storage Gen2
      Azure SQL Server, SQL Database
      Key Vault
13) Databricks Integration with
      Blob Storage
      Azure Data Lake Storage Gen2
      Azure SQL Database
      Synapse
      Azure Key Vault
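A sketch of the storage and Key Vault integrations listed above: reading a storage account key from a Key Vault-backed secret scope and accessing ADLS Gen2 over the abfss:// protocol. The scope, secret, storage account, and container names are placeholders, and other authentication options (service principal, credential passthrough, mounts) also exist.

```python
# Placeholder names.
storage_account = "mystorageaccount"
container = "raw"

# Pull the account key from a Key Vault-backed secret scope instead of hard-coding it.
account_key = dbutils.secrets.get(scope="kv-scope", key="storage-account-key")

# Configure Spark to authenticate to ADLS Gen2 with the account key.
spark.conf.set(
    f"fs.azure.account.key.{storage_account}.dfs.core.windows.net",
    account_key,
)

# Read directly from the container over abfss://.
path = f"abfss://{container}@{storage_account}.dfs.core.windows.net/sales/2024/"
df = spark.read.parquet(path)
df.show()
```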
14) Databricks – Streaming API
      What is streaming?
      Process streaming data using the PySpark API
      Handling bad records
      Stream data into ADLS Gen2
      Load the data into Tables
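A minimal Structured Streaming sketch for the streaming topics above: read JSON files as they arrive, separate malformed records, and write the stream into a Delta table. The paths, schema, and table name are assumptions.

```python
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

# Assumed schema for incoming JSON events.
schema = StructType([
    StructField("device_id", StringType()),
    StructField("reading", DoubleType()),
    StructField("event_time", TimestampType()),
    StructField("_corrupt_record", StringType()),  # bad records land here in PERMISSIVE mode
])

# Read a stream of JSON files dropped into a landing folder (placeholder path).
events = (
    spark.readStream
         .schema(schema)
         .option("mode", "PERMISSIVE")
         .json("abfss://landing@mystorageaccount.dfs.core.windows.net/events/")
)

good = events.filter("_corrupt_record IS NULL").drop("_corrupt_record")

# Write the stream into a Delta table; the checkpoint folder tracks progress between restarts.
query = (
    good.writeStream
        .format("delta")
        .outputMode("append")
        .option("checkpointLocation", "dbfs:/checkpoints/events")
        .toTable("bronze.events")
)
```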
15) Databricks – Lakehouse (Delta Lake)
      Difference between a Data Lake and Delta Lake
      Introduction to Delta Lake
      Features of Delta Lake
      How to create a Delta table
      How to perform DML operations on a Delta table
      Merge statements
      Handling SCD Type 1 and Type 2
      Handling data deduplication in Delta tables
      Handling streaming data in Delta Lake
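A sketch of the Delta Lake operations listed above: creating a Delta table, deduplicating incoming data, and using MERGE for an SCD Type 1 style upsert. Table and column names are illustrative.

```python
from delta.tables import DeltaTable

# Create a Delta table from an initial load (illustrative names).
initial = spark.createDataFrame(
    [(1, "alice", "NY"), (2, "bob", "TX")], ["customer_id", "name", "city"]
)
initial.write.format("delta").mode("overwrite").saveAsTable("silver.customers")

# Incoming batch: may contain duplicates and updated rows.
updates = spark.createDataFrame(
    [(2, "bob", "CA"), (2, "bob", "CA"), (3, "carol", "WA")],
    ["customer_id", "name", "city"],
).dropDuplicates(["customer_id"])          # deduplicate on the business key

# SCD Type 1 upsert: overwrite matched rows, insert new ones.
target = DeltaTable.forName(spark, "silver.customers")
(
    target.alias("t")
          .merge(updates.alias("s"), "t.customer_id = s.customer_id")
          .whenMatchedUpdateAll()
          .whenNotMatchedInsertAll()
          .execute()
)

spark.table("silver.customers").orderBy("customer_id").show()
```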
16) Databricks – Unity Catalog
      What is Unity Catalog?
      Creating an access connector for Databricks
      Creating a metastore in Unity Catalog
      Unity Catalog object model
      Roles in Unity Catalog
      User and group management
      Unity Catalog privileges
      Managed and external tables in Unity Catalog
17) Workflows in Databricks
      Introduction to workflows
      Create, run and manage Databricks jobs
      Schedule Databricks jobs
      Monitor Databricks Jobs
18) Azure DevOps – Repos
      What are Azure DevOps Repos?
      Integrate databricks notebooks with Repos
      Commit, Sync notebooks to and from Repos
19) SDLC and Agile methodology
20) End to End Data Migration Project from On Premises to Cloud.
21) Interview Questions
22) Mock Interviews
Azure Synapse
1) Introduction & Overview
   1. Azure Synapse Analytics Overview
   2. Azure Synapse Analytics Architecture
   3. Create Azure Free Account for Synapse
2) Overview of pools in Synapse Analytics
   1. Dedicated SQL pools
   2. Serverless SQL pool
   3. Apache Spark pools
   4. Data Explorer pools
3) Using Azure Synapse Analytics to Query Data Lake
   1. Creating Azure Synapse Analytics Workspace
   2. Uploading Sample Data into Data Lake Storage
   3. Exploring Azure Synapse Workspace and Studio
   4. Querying a Data Lake Store using serverless SQL pools in Azure Synapse Analytics
   5. Creating a View for CSV Data with a Serverless SQL Pool
4) Azure Storage Account Integration with Azure Synapse
   1. Copy multiple files from blob to blob using wildcard file options
   2. Copy multiple folders from blob to blob using dataset parameters
   3. Get File Names from Folder Dynamically and copy latest file from folder
5) Azure Synapse Triggers
   1. Schedule Trigger in Azure Synapse
   2. Event Based Trigger in Azure Synapse
6) Azure SQL Database integration with Azure Synapse
   1. Azure SQL Database – Introduction to relational databases in Azure
   2. Copy data from SQL Database to ADLS Gen2 using table, query and stored procedure
   3. Overwrite and Append Modes in Copy Activity in Azure Synapse
   4. Use the ForEach loop activity to copy multiple tables – step-by-step explanation
7) Incremental Load to Azure Synapse in Azure Synapse
   1. Incremental Load or Delta load from SQL to blob Storage in Azure Synapse
   2. Multi-table Incremental Load or Delta load from SQL to Azure Synapse
   3. Incrementally copy new and changed files based on Last Modified Date
8) Logging, Notifications, and Key Vault Integration with Azure Logic Apps
   1. Log Pipeline Executions to SQL Table using Azure Synapse
   2. Custom Email Notifications and keyvault integration with Linked Service
   3. Send Error notification with logic app
   4. Use the ForEach loop activity to copy multiple tables with pipeline logging and notifications
9) Deep dive into Copy Activity in Azure Synapse
   1. Load data from on-premises SQL Server to Azure Synapse
   2. Copy data from SQL Server to Azure Synapse with PolyBase & Bulk Insert
   3. Copy data from an on-premises file system to Azure Synapse
   4. Loop through a REST API and copy data to ADLS Gen2 using Linked Service parameters
10) Data Flows Introduction
   1. Azure Data Flows Introduction
   2. Setup Integration Runtime for Data Flows
   3. Basics of SQL Joins for Azure Data Flows – Serverless SQL Pool Demo
   4. Joins in Azure Data Flows – Dedicated SQL Pool Demo
   5. Difference between the Join vs. Lookup Transformation & Merge Functionality – Spark Pool Demo
   6. Data Flows – Select rows
   7. Data Flows – Filter rows
   8. Data Flows – Join transformation
   9. Data Flows – Union transformation
   10. Data Flows – Lookup transformation
   11. Data Flows – Window functions transformation
   12. Data Flows – Pivot and Unpivot transformations
   13. Data Flows – Alter Row transformation
   14. Data Flows – Removing duplicates
11) Spark Pool Introduction in Azure Synapse
   1. Spark Introduction and components
   2. Spark Architecture
   3. Create notebooks, explore notebook options, and create notebooks in different languages
   4. MSSparkUtils for file system
   5. MSSparkUtils for creating notebook parameters
   6. Magic commands, calling one Synapse notebook from another, and returning the output of a
      Synapse notebook
   7. Configure Key Vault in an Azure Synapse notebook
   8. Different ways to connect to ADLS Gen2 from a Synapse notebook
   9. Different ways to connect to Blob Storage from a Synapse notebook
   10. Different ways to connect to Azure SQL Database from a Synapse notebook
   11. Different ways to connect to on-premises SQL Server from a Synapse notebook
   12. Optimization while Reading and writing CSV files from Azure Synapse
   13. Reading and writing parquet files from Azure Synapse
   14. Reading and writing JSON files from Azure Synapse
   15. Reading and writing avro and orc files from Azure Synapse
   16. Reading and writing EXCEL files from Azure Synapse
   17. Different ways to create RDD in synapse notebook
   18. Different ways to create dataframes in synapse notebook
   19. When to use repartition and coalesce
   20. Joins in Synapse Notebook
   21. Broadcast Joins in Synapse Notebook and configuration of spark for optimization
   22. What is the Catalyst optimizer, and handling skewness issues in Spark
   23. Optimization techniques in PySpark
   24. Implementing SCD1 in Synapse Notebook
   25. Implementing SCD2 in Synapse Notebook
   26. Executing Synapse notebooks from Synapse pipelines with input and output parameters
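A sketch of several of the Synapse notebook topics above: MSSparkUtils for the file system, reading a Key Vault secret, and calling one notebook from another with parameters. The Key Vault, secret, notebook, and storage names are placeholders.

```python
from notebookutils import mssparkutils

# File system utility: list files in an ADLS Gen2 path (placeholder account/container).
for f in mssparkutils.fs.ls("abfss://raw@mystorageaccount.dfs.core.windows.net/sales/"):
    print(f.name, f.size)

# Credentials utility: read a secret from a (placeholder) Key Vault.
sql_password = mssparkutils.credentials.getSecret("my-keyvault", "sql-password")

# Call another Synapse notebook with parameters and capture the value it exits with.
output = mssparkutils.notebook.run("/ChildNotebook", 600, {"run_date": "2024-01-01"})
print(output)

# Inside the child notebook, parameters arrive via a parameters cell and the result is
# returned with: mssparkutils.notebook.exit("<value>")
```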
12) Project: End-to-End Data Migration using Synapse Analytics