1. What is DBT and how does it work with Snowflake?
Answer:
DBT is a transformation tool that lets data analysts and engineers transform data inside the
warehouse (e.g., Snowflake) using SQL and Jinja. It compiles the Jinja-templated models into plain
SQL and executes the resulting statements on the configured Snowflake warehouse, in the dependency
order implied by ref().
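For illustration, a minimal sketch of what that compilation step looks like; the database and schema names (analytics.staging) and the stg_customers model are assumptions for this example:
sql
-- models/customers.sql (source model, with Jinja)
SELECT id, name FROM {{ ref('stg_customers') }}

-- after dbt compile, the rendered SQL under target/compiled/ resolves the ref:
SELECT id, name FROM analytics.staging.stg_customers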
2. How do you configure Snowflake in DBT?
Answer:
Add the following to your profiles.yml:
yaml
my_snowflake_profile:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: "<account_id>"
      user: "<user>"
      password: "<password>"
      role: "<role>"
      database: "<database>"
      warehouse: "<warehouse>"
      schema: "<schema>"
      threads: 4
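After saving the profile, the connection can be verified with DBT's built-in check:
bash
dbt debug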
3. How do you create a model in DBT?
Answer: Create a SQL file such as customers.sql inside the models/ folder.
sql
-- models/customers.sql
SELECT id, name, email FROM {{ ref('raw_customers') }}
4. What are materializations in DBT?
Answer: Materializations define how DBT builds a model in Snowflake:
table (persisted as a physical table)
view (a virtual view)
incremental (only new or changed rows are inserted or merged on each run)
ephemeral (not built in the warehouse; inlined as a CTE into downstream models)
Set at the model level (below) or per folder in dbt_project.yml (see the sketch after the example).
sql
-- models/orders.sql
{{ config(materialized='incremental') }}
SELECT * FROM {{ ref('stg_orders') }}
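Folder-level defaults can also be set in dbt_project.yml; a minimal sketch, assuming a project named my_project with staging/ and marts/ subfolders (names are assumptions):
yaml
models:
  my_project:
    staging:
      +materialized: view
    marts:
      +materialized: table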
5. How to implement incremental models in DBT with Snowflake?
Answer:
sql
-- models/incremental_orders.sql
{{ config(materialized='incremental', unique_key='order_id') }}
SELECT * FROM {{ ref('raw_orders') }}
{% if is_incremental() %}
-- guard with is_incremental(): on the first run the target table does not yet exist
WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})
{% endif %}
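If the incremental logic or the upstream schema changes, the table can be rebuilt from scratch:
bash
dbt run --select incremental_orders --full-refresh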
6. How to use Jinja templating in DBT with Snowflake?
Answer: Use Jinja for dynamic SQL:
sql
{% set limit = 100 %}
SELECT * FROM {{ ref('customers') }} LIMIT {{ limit }}
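Jinja loops are also useful for repetitive SQL. A minimal sketch, assuming an orders model with a status column (both names are assumptions):
sql
{% set statuses = ['placed', 'shipped', 'returned'] %}
SELECT
    order_id
    {% for status in statuses %}
    , SUM(CASE WHEN status = '{{ status }}' THEN 1 ELSE 0 END) AS {{ status }}_count
    {% endfor %}
FROM {{ ref('orders') }}
GROUP BY order_id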
7. How do you define tests in DBT for Snowflake tables?
Answer:
In schema.yml:
yaml
version: 2
models:
  - name: customers
    columns:
      - name: id
        tests:
          - not_null
          - unique
8. How to run and test a DBT project with Snowflake?
Answer:
bash
dbt run
dbt test
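In practice these commands are usually scoped or combined; a few common variants (the staging folder name is an assumption):
bash
dbt run --select staging        # run only models under models/staging/
dbt test --select customers     # run tests for a single model
dbt build                       # seed, run, snapshot, and test in DAG order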
9. How do you create snapshots in DBT for slowly changing dimensions (SCD)?
Answer:
sql
-- snapshots/snapshot_customers.sql
{% snapshot snapshot_customers %}

{{
    config(
        target_schema='snapshots',
        unique_key='id',
        strategy='timestamp',
        updated_at='updated_at'
    )
}}

SELECT * FROM {{ ref('raw_customers') }}

{% endsnapshot %}
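Snapshots are not built by dbt run; they have their own command:
bash
dbt snapshot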
10. How do you use macros in DBT for Snowflake?
Answer:
Create a macro:
sql
-- macros/get_latest_date.sql
{% macro get_latest_date(column) %}
MAX({{ column }})
{% endmacro %}
Use in a model:
sql
SELECT {{ get_latest_date('created_at') }} FROM {{ ref('orders') }}
11. How to write a custom test in DBT?
Answer:
sql
-- tests/generic/no_null_emails.sql
{% test no_null_emails(model, column_name) %}
-- the test fails if this query returns any rows
SELECT * FROM {{ model }} WHERE {{ column_name }} IS NULL
{% endtest %}
Reference it by name in schema.yml:
yaml
models:
  - name: customers
    columns:
      - name: email
        tests:
          - no_null_emails
(A standalone query saved directly under tests/, a singular test, is also picked up by dbt test automatically and needs no schema.yml entry.)
12. How do you use seeds in DBT with Snowflake?
Answer:
Place a CSV in the seeds/ folder and run:
bash
dbt seed
Use in model:
sql
SELECT * FROM {{ ref('country_codes') }}
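Column types for a seed can be pinned in dbt_project.yml instead of letting them be inferred; a sketch assuming a project named my_project and a country_codes.csv seed with these columns (all names are assumptions):
yaml
seeds:
  my_project:
    country_codes:
      +column_types:
        country_code: varchar(2)
        country_name: varchar(100)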
13. How do you use variables in DBT for Snowflake?
Answer:
In dbt_project.yml:
yaml
vars:
  target_country: 'USA'
In model:
sql
SELECT * FROM {{ ref('customers') }}
WHERE country = '{{ var("target_country") }}'
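Variables can also be overridden at run time from the CLI:
bash
dbt run --vars '{"target_country": "CAN"}'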
14. How do you version control DBT models?
Answer: Use Git to version control the project's .sql, .yml, and .md files; generated artifacts (compiled SQL, logs, installed packages) are typically excluded, as in the sketch below.
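A common .gitignore for a DBT project (a convention, not something DBT mandates):
# .gitignore
target/
dbt_packages/
logs/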
15. How to implement SCD Type 2 using DBT in Snowflake?
Answer: Use a snapshot with the check strategy:
sql
{% snapshot customer_snapshot %}

{{
    config(
        target_schema='snapshots',
        unique_key='customer_id',
        strategy='check',
        check_cols=['email', 'phone']
    )
}}

SELECT * FROM {{ ref('customers') }}

{% endsnapshot %}
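DBT adds dbt_valid_from and dbt_valid_to metadata columns to the snapshot table, which is what gives you SCD Type 2 history; the current version of each customer can be selected like this:
sql
SELECT * FROM {{ ref('customer_snapshot') }}
WHERE dbt_valid_to IS NULL  -- only the currently valid row per customer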
16. How do you debug SQL model in DBT for Snowflake?
Answer:
bash
dbt --debug run --select model_name
Use dbt compile to inspect the rendered SQL without running it against Snowflake.
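The rendered SQL is written under target/, so a single model can be inspected like this:
bash
dbt compile --select model_name
# the compiled query lands under target/compiled/<project_name>/models/...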
17. How do you structure a production DBT project in Snowflake?
Answer: A typical project layout:
models/staging/
models/marts/
models/intermediate/
snapshots/
macros/
tests/
seeds/
18. How do you use environment variables in DBT?
Answer:
Export the variable in the shell (the DBT_ENV_SECRET_ prefix keeps the value out of DBT logs):
bash
export DBT_ENV_SECRET_PASSWORD=yourpassword
In profiles.yml:
yaml
password: "{{ env_var('DBT_ENV_SECRET_PASSWORD') }}"
19. How to schedule DBT jobs with Snowflake and dbt Cloud or Airflow?
Answer: Use the dbt Cloud job scheduler, trigger dbt run from an Airflow DAG (e.g., with a
BashOperator), or call the dbt CLI from cron, as in the sketch below.
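A minimal cron sketch for a nightly run; the project path is an assumption:
bash
# crontab entry: run the project every day at 06:00
0 6 * * * cd /opt/dbt_project && dbt run >> /opt/dbt_project/logs/cron_dbt.log 2>&1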
20. How do you optimize large DBT models in Snowflake?
Answer:
Use incremental models instead of full rebuilds
Materialize heavy joins as tables rather than views
Avoid over-nesting CTEs in large joins
Use Snowflake clustering keys on large tables (Snowflake micro-partitions data automatically, so explicit partitioning is not needed); see the sketch below
Monitor expensive models with the Snowflake query profiler
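dbt-snowflake exposes clustering through the cluster_by model config; a hedged sketch, assuming a large orders model clustered by order_date (model and column names are assumptions):
sql
-- models/orders_clustered.sql
{{ config(
    materialized='incremental',
    unique_key='order_id',
    cluster_by=['order_date']
) }}

SELECT * FROM {{ ref('stg_orders') }}
{% if is_incremental() %}
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}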