KEMBAR78
Data Engineer | PDF | Gini Coefficient | Data Management
0% found this document useful (0 votes)
6 views13 pages

Data Engineer

Sql queries preparation by nitya

Uploaded by

nishu12345
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
6 views13 pages

Data Engineer

Sql queries preparation by nitya

Uploaded by

nishu12345
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 13

Nitya CloudTech Pvt Ltd.

http://www.nityacloudtech.com/ @nityacloudtech
Nitya CloudTech Pvt Ltd.
1. Calculate the Weighted Average of Scores for Each
Department
Interviewer:
How would you calculate the weighted average of scores for
each department?
Candidate:
Let’s assume the following table:
Table: Scores

Department StudentID Score Weight


HR 1 80 0.3
HR 2 90 0.7
IT 3 75 0.5
IT 4 85 0.5

SELECT
Department,
SUM(Score * Weight) / SUM(Weight) AS
WeightedAverage
FROM Scores
GROUP BY Department;

Result:
Department WeightedAverage
HR 87.0
IT 80.0

http://www.nityacloudtech.com/ @nityacloudtech
Nitya CloudTech Pvt Ltd.
2. Find the Mode (Most Frequent Value) of a Column in a
Dataset
Interviewer:
How would you find the mode of a column in a dataset?
Candidate:
Let’s assume the following table:
Table: SurveyResponses

ResponseID Response
1 Excellent
2 Good
3 Excellent
4 Good
5 Good

SELECT
Response,
COUNT(*) AS Frequency
FROM SurveyResponses
GROUP BY Response
ORDER BY Frequency DESC
LIMIT 1;

Result:
Response Frequency
Good 3

3. Group Data by a Time Interval and Calculate Metrics

http://www.nityacloudtech.com/ @nityacloudtech
Nitya CloudTech Pvt Ltd.
Interviewer:
How would you group data by a time interval (e.g., every 15
minutes) and calculate metrics?
Candidate:
Let’s assume the following table:
Table: SensorReadings

ReadingID Timestamp Value


1 2024-01-01 10:00:00 50
2 2024-01-01 10:10:00 60
3 2024-01-01 10:20:00 55

SELECT
DATEADD(MINUTE, DATEDIFF(MINUTE, 0,
Timestamp) / 15 * 15, 0) AS TimeInterval,
COUNT(*) AS ReadingCount,
AVG(Value) AS AverageValue
FROM SensorReadings
GROUP BY DATEADD(MINUTE, DATEDIFF(MINUTE,
0, Timestamp) / 15 * 15, 0);

Result:
TimeInterval ReadingCount AverageValue
2024-01-01 10:00:00 3 55.0

4. Aggregate Data with Multiple Levels of Grouping and


Custom Filters
Interviewer:

http://www.nityacloudtech.com/ @nityacloudtech
Nitya CloudTech Pvt Ltd.
How would you aggregate data with multiple levels of
grouping and apply custom filters?
Candidate:
Let’s assume the following table:
Table: Sales

Region Product SalesAmount


North Laptop 2000
North Phone 1500
South Laptop 2500
South Phone 3000

SELECT
Region,
Product,
SUM(SalesAmount) AS TotalSales
FROM Sales
WHERE SalesAmount > 1500
GROUP BY Region, Product;

Result:
Region Product TotalSales
North Laptop 2000
South Laptop 2500
South Phone 3000

5. Calculate the Gini Coefficient for Income Inequality


Using SQL

http://www.nityacloudtech.com/ @nityacloudtech
Nitya CloudTech Pvt Ltd.
Interviewer:
How would you calculate the Gini coefficient for income
inequality in SQL?
Candidate:
Let’s assume the following table:
Table: IncomeData

Individual ID Income
1 5000
2 7000
3 2000

WITH RankedIncome AS (
SELECT
Income,
SUM(Income) OVER () AS
TotalIncome,
SUM(Income) OVER (ORDER BY
Income) AS CumulativeIncome
FROM IncomeData
),
LorenzCurve AS (
SELECT
CumulativeIncome / TotalIncome AS
CumulativeShare
FROM RankedIncome
)
SELECT

http://www.nityacloudtech.com/ @nityacloudtech
Nitya CloudTech Pvt Ltd.
1 - 2 * SUM(CumulativeShare -
LAG(CumulativeShare, 1, 0) OVER (ORDER BY
CumulativeShare)) AS GiniCoefficient
FROM LorenzCurve;

6. Use ROLLUP to Calculate Subtotals and Grand Totals


Dynamically
Interviewer:
How would you use ROLLUP to calculate subtotals and grand
totals dynamically?
Candidate:
Let’s assume the following table:
Table: Sales

Region Product SalesAmount


North Laptop 2000
North Phone 1500
South Laptop 2500
South Phone 3000

SELECT
Region,
Product,
SUM(SalesAmount) AS TotalSales
FROM Sales
GROUP BY ROLLUP(Region, Product);

Result:
Region Product TotalSales

http://www.nityacloudtech.com/ @nityacloudtech
Nitya CloudTech Pvt Ltd.
North Laptop 2000
North Phone 1500
North NULL 3500
South Laptop 2500
South Phone 3000
South NULL 5500
NULL NULL 9000

7. Count Distinct Combinations of Two Columns


Interviewer:
How would you count the distinct combinations of two
columns?
Candidate:
Let’s assume the following table:
Table: Orders

OrderID CustomerID ProductID


1 101 P1
2 101 P2
3 102 P1
4 101 P1
5 102 P3

SELECT
COUNT(DISTINCT CONCAT(CustomerID, '-
', ProductID)) AS DistinctCombinations
FROM Orders;

http://www.nityacloudtech.com/ @nityacloudtech
Nitya CloudTech Pvt Ltd.
Result:
DistinctCombinations
4

8. Group Data by a Derived Column Generated from a


CASE Statement
Interviewer:
How would you group data based on a derived column created
using a CASE statement?

Candidate:
Let’s assume the following table:
Table: Transactions

TransactionID Amount
1 100
2 200
3 500
4 50
5 300

SELECT
CASE
WHEN Amount < 100 THEN 'Low'
WHEN Amount BETWEEN 100 AND 300
THEN 'Medium'
ELSE 'High'
END AS AmountCategory,

http://www.nityacloudtech.com/ @nityacloudtech
Nitya CloudTech Pvt Ltd.
COUNT(*) AS TransactionCount
FROM Transactions
GROUP BY
CASE
WHEN Amount < 100 THEN 'Low'
WHEN Amount BETWEEN 100 AND 300
THEN 'Medium'
ELSE 'High'
END;

Result:
Amount Category Transaction Count
Low 1
Medium 3
High 1

9. Find the Median Value of a Column Using SQL


Interviewer:
How would you calculate the median value of a column?
Candidate:
Let’s assume the following table:
Table: Scores

Student ID Score
1 70
2 80
3 90
4 85
5 75

http://www.nityacloudtech.com/ @nityacloudtech
Nitya CloudTech Pvt Ltd.

WITH RankedScores AS (
SELECT
Score,
ROW_NUMBER() OVER (ORDER BY
Score) AS RowNum,
COUNT(*) OVER () AS TotalRows
FROM Scores
)
SELECT
AVG(Score) AS MedianScore
FROM RankedScores
WHERE RowNum IN (
(TotalRows + 1) / 2,
(TotalRows + 2) / 2
);

Result:
MedianScore
80.0

10. Calculate the Percentage Share of Each Category


Within a Group
Interviewer:
How would you calculate the percentage share of each
category within a group?
Candidate:
Let’s assume the following table:

http://www.nityacloudtech.com/ @nityacloudtech
Nitya CloudTech Pvt Ltd.
Table: Sales

Region Product SalesAmount


North Laptop 2000
North Phone 1500
South Laptop 2500
South Phone 3000

SELECT
Region,
Product,
SalesAmount,
(SalesAmount * 100.0 /
SUM(SalesAmount) OVER (PARTITION BY
Region)) AS PercentageShare
FROM Sales;

Result:
Region Product SalesAmount PercentageShare
North Laptop 2000 57.14%
North Phone 1500 42.86%
South Laptop 2500 45.45%
South Phone 3000 54.55%

http://www.nityacloudtech.com/ @nityacloudtech
Nitya CloudTech Pvt Ltd.
If you like the content, please consider supporting us by
donating using the QR code below. Your support helps us
continue creating valuable resources! Feel free to share this
with others who may benefit. Thank you for your support!

http://www.nityacloudtech.com/ @nityacloudtech

You might also like