Data Analyst Interview Prep - SQL, Power BI, and Behavioral Questions
SQL Questions & Sample Queries
1. Top 3 departments with highest salaries
SELECT department_id, SUM(salary) AS total_salary
FROM Employees
GROUP BY department_id
ORDER BY total_salary DESC
LIMIT 3;
2. Customers with orders > Rs.50,000 in 60 days
SELECT customer_id, SUM(order_amount) AS total_spent
FROM Orders
WHERE order_date >= CURRENT_DATE - INTERVAL 60 DAY
GROUP BY customer_id
HAVING total_spent > 50000;
3. First and recent transaction per customer
SELECT customer_id, MIN(transaction_date), MAX(transaction_date)
FROM Transactions
GROUP BY customer_id;
4. Orders placed during public holidays
SELECT o.* FROM Orders o JOIN Public_Holidays p ON o.order_date = p.holiday_date;
5. Revenue per product category
SELECT category, SUM(sales_amount) AS total_revenue FROM Sales GROUP BY category;
6. Third highest salary (no LIMIT/TOP/OFFSET)
SELECT salary FROM Employees e1
WHERE 2 = (SELECT COUNT(DISTINCT salary) FROM Employees e2 WHERE e2.salary > e1.salary);
7. Customers who never placed an order
SELECT c.* FROM Customers c LEFT JOIN Orders o ON c.customer_id = o.customer_id WHERE o.order_id IS NULL;
8. Best-selling product in each region
SELECT region, product_id, SUM(quantity) AS total_quantity
FROM Orders
GROUP BY region, product_id
QUALIFY RANK() OVER (PARTITION BY region ORDER BY SUM(quantity) DESC) = 1;
9. Discount % per order and rank
SELECT order_id, category, (discount_amount / original_price) * 100 AS discount_percentage,
RANK() OVER (PARTITION BY category ORDER BY (discount_amount / original_price) * 100 DESC) AS
rank_in_category
FROM Orders;
10. Orders > 75th percentile value
WITH order_values AS (
SELECT order_id, (quantity * unit_price) AS order_value FROM Orders
), percentile_75 AS (
SELECT PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY order_value) AS p75 FROM order_values
)
SELECT o.* FROM order_values o, percentile_75 p WHERE o.order_value > p.p75;
Power BI Questions & Approaches
1. Customer Churn Dashboard:
KPI cards, trend lines, cohort analysis, reasons for churn, drill-down filters.
2. DAX - Year-over-Year Revenue Growth:
YoY Growth = DIVIDE([Total Revenue] - CALCULATE([Total Revenue], SAMEPERIODLASTYEAR('Date'[Date])),
CALCULATE([Total Revenue], SAMEPERIODLASTYEAR('Date'[Date])))
3. DAX - Running Total Sales:
Running Total Sales = CALCULATE([Total Sales], FILTER(ALLSELECTED('Date'[Date]), 'Date'[Date] <=
MAX('Date'[Date])))
4. Dynamic Filtering:
Use slicers, What-If parameters, update visuals based on selections.
5. DAX - Customer Spend Categories:
SWITCH(TRUE(), [Total Purchase] >= 100000, "High", [Total Purchase] >= 50000, "Medium", "Low")
6. Row-Level Security (RLS):
Set Department-based roles and apply USERPRINCIPALNAME() filters.
7. Highlight Low-Performing Products:
Create measure with IF logic, apply conditional formatting.
8. Power BI Dataflow Creation:
ETL from multiple sources, centralized, reusable datasets.
9. Best Performance Practices:
Use star schema, avoid calculated columns, aggregate data, optimize visuals.
10. Power Query Tasks:
- Remove duplicates: Remove Duplicates tool.
- Split column: Split by Delimiter.
- Merge datasets: Merge Queries using Join Kind.
Behavioral & Scenario-Based Sample Answers
1. Data pipeline improving reporting:
Automated sales data cleanup with Python/SQL, reduced reporting time, improved accuracy.
2. Analysis leading to business decision:
Identified high churn segment via cohort analysis, suggested discounts, improved retention.
3. Handling conflicting priorities:
Prioritized by business impact, clear stakeholder communication, and weekly syncs.
4. Resolving inconsistent data:
Root cause analysis, data reconciliation reports, DB team collaboration.
5. Communicating technical insights:
Used simple visualizations, avoided jargon, shared actionable insights in layman terms.