📘 Microsoft Fabric Interview Guide
✅ 100 Interview Questions with Answers, Examples & Diagrams
1. Basics of Microsoft Fabric
Q1. What is Microsoft Fabric?
A unified analytics platform combining data engineering, data science, real-time analytics, and BI. It brings together Power BI, Synapse, and Data Factory into one SaaS experience.
Q2. What is OneLake?
A single, logical, multi-cloud data lake for storing all organizational data. Acts as the storage backbone for Fabric.
Q3. How is Fabric different from Synapse?
Fabric is SaaS (integrated services in one UI); Synapse is PaaS (separate services that must be provisioned and managed individually).
Q4. Difference between Lakehouse and Warehouse?
• Lakehouse = Open Delta storage (raw, semi-structured, ML-ready).
• Warehouse = Relational SQL tables (structured BI-ready).
Diagram: Fabric Ecosystem
┌───────────────────────────┐
│ Microsoft Fabric │
└───────────────┬───────────┘
│
┌──────┴──────┐
│ OneLake │
└──────┬──────┘
┌─────────────┼─────────────┐
│ │ │
Lakehouse Warehouse Pipelines
│ │ │
Raw + Curated BI Tables Orchestration
│ │ │
Semantic Models + Power BI (Reports)
Q5. What workloads exist in Fabric?
• Data Engineering (Notebooks, Lakehouse)
• Data Science (ML)
• Real-Time Analytics
• Data Factory (Pipelines, Dataflows)
• Warehouse (SQL)
• Power BI (Reports)
2. Lakehouse & Delta Tables
Q11. How does Lakehouse store data?
As Delta tables (Parquet files + a transaction log).
Q12. Difference between Parquet vs Delta?
• Parquet = columnar file format
• Delta = Parquet + transaction log + ACID
Q13. What is schema evolution?
The ability to add/modify columns automatically when ingesting new data.
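The idea can be illustrated without Spark. In PySpark this is typically enabled via the `mergeSchema` write option; the dict-based sketch below is illustrative only, not Fabric API code:

```python
# Illustrative sketch of schema evolution: columns that are new in the
# incoming batch are appended to the existing table schema; existing
# columns keep their original types (Delta does this via mergeSchema).
def evolve_schema(existing: dict, incoming: dict) -> dict:
    merged = dict(existing)            # start from the current table schema
    for col, dtype in incoming.items():
        if col not in merged:          # only brand-new columns are added
            merged[col] = dtype
    return merged

table_schema = {"id": "int", "name": "string"}
batch_schema = {"id": "int", "name": "string", "email": "string"}
print(evolve_schema(table_schema, batch_schema))
# → {'id': 'int', 'name': 'string', 'email': 'string'}
```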
Q14. Create Delta Table Example (PySpark):
df.write.format("delta").mode("overwrite").save("Tables/myTable")
Q15. Query Delta Table (T-SQL, via the Lakehouse SQL analytics endpoint):
SELECT * FROM MyLakehouse.dbo.myTable;
Diagram: Medallion Architecture
┌────────────┐
│ Bronze │ → Raw ingestion (CSV, JSON, files)
└─────┬──────┘
│
┌─────▼─────┐
│ Silver │ → Cleaned & standardized data
└─────┬─────┘
│
┌─────▼─────┐
│ Gold │ → Aggregated, business-ready data
└───────────┘
Q19. How does Fabric ensure ACID?
Delta transaction logs provide snapshot isolation and commit consistency.
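A toy model makes the mechanism concrete. This plain-Python sketch (not the real Delta protocol implementation) shows how replaying an append-only commit log up to a given version yields a consistent snapshot of data files:

```python
# Toy model of a Delta transaction log: each commit adds/removes data
# files; readers replay commits up to a version to get a consistent
# snapshot. This mirrors how Delta provides snapshot isolation.
def snapshot(log, version=None):
    files = set()
    for i, commit in enumerate(log):
        if version is not None and i > version:
            break
        files |= set(commit.get("add", []))
        files -= set(commit.get("remove", []))
    return files

log = [
    {"add": ["part-0.parquet"]},
    {"add": ["part-1.parquet"]},
    {"remove": ["part-0.parquet"], "add": ["part-2.parquet"]},  # compaction
]
print(sorted(snapshot(log)))     # latest snapshot
print(sorted(snapshot(log, 1)))  # "time travel" to version 1
```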
3. Warehouse
Q21. How is Fabric Warehouse different from Azure SQL DB?
• Fabric Warehouse is SaaS + fully managed, tightly integrated with OneLake.
• Azure SQL DB is standalone PaaS, requires provisioning.
Q23. How to implement star schema?
Create dimension + fact tables in Warehouse.
Example:
CREATE TABLE DimCustomer (CustomerID INT PRIMARY KEY, Name VARCHAR(100));
CREATE TABLE FactSales (SaleID INT, CustomerID INT, Amount DECIMAL(10,2));
Diagram: Lakehouse vs Warehouse
Lakehouse → Stores raw + semi-structured + unstructured data
Warehouse → Stores structured data optimized for BI reporting
4. Pipelines & Data Ingestion
Q31. What is a Data Pipeline?
An orchestration tool in Fabric for ingesting and transforming data.
Q33. Ingest on-prem SQL data?
• Use Data Gateway
• Configure Copy Activity → Lakehouse/Warehouse
Q37. How to capture pipeline run status?
Store run logs in a metadata table (e.g., Metadata.IngestionBatch).
Q39. Incremental ingestion?
• Use watermark column (e.g., ModifiedDate)
• Store last loaded value in metadata table
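The watermark pattern above can be sketched in plain Python (illustrative only; in Fabric the watermark lookup and filter would live in a pipeline or notebook):

```python
# Sketch of watermark-based incremental ingestion: only rows modified
# after the last stored watermark are loaded, then the watermark advances.
from datetime import date

def incremental_load(rows, last_watermark):
    new_rows = [r for r in rows if r["ModifiedDate"] > last_watermark]
    new_watermark = max((r["ModifiedDate"] for r in new_rows),
                        default=last_watermark)
    return new_rows, new_watermark

source = [
    {"id": 1, "ModifiedDate": date(2024, 1, 1)},
    {"id": 2, "ModifiedDate": date(2024, 2, 1)},
    {"id": 3, "ModifiedDate": date(2024, 3, 1)},
]
loaded, wm = incremental_load(source, last_watermark=date(2024, 1, 15))
print([r["id"] for r in loaded], wm)  # → [2, 3] 2024-03-01
```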
Q40. Copy vs Dataflow Gen2?
• Copy → Data movement only
• Dataflow Gen2 → Transformation + ingestion
Diagram: Metadata-driven Ingestion
Source System → Pipeline (Copy + Metadata Control) → Bronze (Lakehouse)
↓
Logging Tables
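The control flow in the diagram can be sketched as a plain-Python loop (illustrative only; in Fabric this would be a pipeline ForEach iterating over a metadata query, with the Copy activity doing the actual movement):

```python
# Illustrative metadata-driven ingestion loop: a control table lists the
# source tables to copy; each run's outcome is recorded in a logging table.
control_table = [
    {"source": "dbo.Customers", "target": "bronze/customers"},
    {"source": "dbo.Orders",    "target": "bronze/orders"},
]
ingestion_log = []

def copy_table(entry):
    # placeholder for the real Copy activity
    return {"rows_copied": 100}

for entry in control_table:
    result = copy_table(entry)
    ingestion_log.append({"source": entry["source"],
                          "status": "Succeeded",
                          "rows": result["rows_copied"]})
print(ingestion_log)
```

Adding a new source then only requires a new row in the control table, not a new pipeline.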
5. Dataflow Gen2
Q41. What is Dataflow Gen2?
Low-code ETL in Fabric using the Power Query engine.
Q42. How to implement incremental load?
Filter rows using a last_modified column tracked in a metadata table.
Q43. Split header & line items?
• Import PO data → Split into two outputs → PO_Header & PO_Lines tables.
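The split above can be sketched in plain Python (illustrative; in Dataflow Gen2 this is two query outputs from the same source):

```python
# Sketch of splitting flat PO records into header and line-item outputs.
raw = [
    {"po": "PO1", "vendor": "Acme", "line": 1, "item": "Bolt",  "qty": 10},
    {"po": "PO1", "vendor": "Acme", "line": 2, "item": "Nut",   "qty": 20},
    {"po": "PO2", "vendor": "Beta", "line": 1, "item": "Screw", "qty": 5},
]
# one header row per PO (deduplicated), all rows become line items
po_header = {r["po"]: {"po": r["po"], "vendor": r["vendor"]} for r in raw}
po_lines = [{"po": r["po"], "line": r["line"],
             "item": r["item"], "qty": r["qty"]} for r in raw]
print(list(po_header.values()))
print(po_lines)
```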
Q47. Apply column mapping?
Map source → target columns in the Dataflow transformation UI.
Q49. Schema drift handling?
Enable schema drift handling to allow dynamic schema adaptation.
Diagram: Dataflow Transformation
Raw Data → Dataflow (Transform, Clean, Split) → Lakehouse/Warehouse Tables
6. Notebooks & Spark
Q51. Languages supported?
Python (PySpark), SQL, R, Scala.
Q53. Load CSV Example:
df = spark.read.format("csv").option("header","true").load("Files/data.csv")
Q55. Spark SQL vs T-SQL?
• Spark SQL = distributed, big data processing
• T-SQL = relational queries on structured tables
Q57. Merge Incremental Data Example:
spark.sql("""
MERGE INTO target t
4
USING staging s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET t.val = s.val
WHEN NOT MATCHED THEN INSERT *
""")
Q60. Orchestration?
Notebooks can be called inside Pipelines.
Diagram: Notebook Usage in Fabric
Raw Data → Notebook (PySpark/SQL Transformations) → Delta Tables in Lakehouse
7. CI/CD & DevOps
Q61. How to version control Fabric artifacts?
Use Git integration (Azure DevOps or GitHub repositories).
Q63. YAML Deployment Example:
trigger:
- main
jobs:
- job: Deploy
  steps:
  - task: PowerShell@2
    inputs:
      targetType: 'inline'
      script: |
        Write-Output "Deploying Fabric artifacts"
Q66. Environment configs?
Use deployment pipeline rules to parameterize connections per environment.
Q69. Git vs Deployment Pipeline?
• Git → Source control + branching
• Deployment Pipeline → Promote to TEST/PROD
Diagram: CI/CD Flow
DEV → Commit to Git → Deployment Pipeline → TEST → PROD
8. Security & Governance
Q71. How to implement RLS?
Define roles in the Warehouse or Semantic Model.
Q74. Example:
CREATE SECURITY POLICY SalesFilter
ADD FILTER PREDICATE dbo.fnRLS(CustomerID) ON dbo.FactSales;
Q75. Purview integration?
Fabric integrates with Microsoft Purview for data lineage & cataloging.
Q76. Data masking?
Use dynamic data masking in the Warehouse.
Q78. Default OneLake security?
Access is controlled via Fabric workspace roles + Microsoft Entra ID.
Diagram: RLS Security Flow
User → Role (defined in Semantic Model/Warehouse) → Filtered Query Result
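The flow in the diagram can be simulated in a few lines of Python (a toy model only; in Fabric the filtering is enforced by the security policy or semantic-model role, never by application code):

```python
# Toy simulation of row-level security: the role's predicate filters
# rows before the user ever sees the result set.
fact_sales = [
    {"SaleID": 1, "Region": "EU", "Amount": 100},
    {"SaleID": 2, "Region": "US", "Amount": 200},
]
role_predicates = {"EU_Analyst": lambda row: row["Region"] == "EU"}

def query_as(role):
    pred = role_predicates[role]
    return [r for r in fact_sales if pred(r)]

print(query_as("EU_Analyst"))  # only EU rows are returned
```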
9. Advanced Scenarios
Q81. How to implement Medallion?
• Bronze: Copy raw files → Lakehouse
• Silver: Transform with Dataflows/Notebooks
• Gold: Load curated tables into Warehouse
Q83. Implement SCD Type 2?
Use MERGE with valid_from and valid_to columns.
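The SCD Type 2 logic can be sketched in plain Python (illustrative only; in Fabric this would be a Delta MERGE in a notebook):

```python
# Sketch of SCD Type 2: when a tracked attribute changes, the current
# row is closed (valid_to set) and a new current row is inserted.
from datetime import date

def scd2_apply(dim, key, new_attrs, today):
    for row in dim:
        if row["key"] == key and row["valid_to"] is None:
            if row["attrs"] != new_attrs:
                row["valid_to"] = today                  # close old version
                dim.append({"key": key, "attrs": new_attrs,
                            "valid_from": today, "valid_to": None})
            return dim
    dim.append({"key": key, "attrs": new_attrs,
                "valid_from": today, "valid_to": None})  # brand-new member
    return dim

dim = [{"key": 1, "attrs": {"city": "Paris"},
        "valid_from": date(2023, 1, 1), "valid_to": None}]
scd2_apply(dim, 1, {"city": "Lyon"}, date(2024, 6, 1))
print(len(dim), dim[0]["valid_to"], dim[1]["attrs"]["city"])
# → 2 2024-06-01 Lyon
```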
Q85. Real-time streaming?
Use Eventstream in Fabric → Lakehouse → Power BI.
Q87. Detect duplicates Example (PySpark):
df.groupBy("id").count().filter("count > 1")
Q90. GDPR compliance?
Implement data retention policies + masking + auditing.
Diagram: Real-time Data Flow
Event Stream → Fabric Lakehouse (Delta Table) → Power BI Dashboard
10. Semantic Models & Power BI
Q91. What is a Semantic Model?
A centralized data model (like a Power BI dataset) in Fabric.
Q94. DAX Example (YTD Sales):
Sales YTD = TOTALYTD(SUM(FactSales[Amount]), DimDate[Date])
Q96. Import vs Direct Lake vs DirectQuery?
• Import → Cached data, fast
• Direct Lake → Query Lakehouse Delta tables directly (best for Fabric)
• DirectQuery → Query external DB at report time (slower)
Q98. RLS in Semantic Model?
Define roles in the model view and assign filters.
Q100. Direct Lake advantages?
• Near real-time
• High performance (no duplication)
• Lower cost
Diagram: Power BI Integration
Lakehouse/Warehouse → Semantic Model → Power BI Report
✅ Summary
• Lakehouse = Big data + ML + raw storage
• Warehouse = Structured BI reporting
• Pipelines & Dataflows = ETL orchestration
• Notebooks = Advanced transformations
• Deployment Pipelines + Git = CI/CD
• Semantic Models = Power BI integration
This guide condenses the 100 Q&A, with examples and diagrams across all Fabric components, into a concise revision sheet to review before interviews.