Mock Interview Questions and Answers

Uploaded by

Aemi Singh

1. Theoretical quest.: Difference between LEFT JOIN and INNER JOIN. How would these work in case of duplicates?

In SQL, both INNER JOIN and LEFT JOIN are used to combine rows from two
or more tables based on a related column between them. However, they
differ in how they handle unmatched rows between the tables.

1. INNER JOIN:
• INNER JOIN returns only the rows that have matching values in
both tables based on the join condition.
• If the join column contains duplicate values, every matching pair
of rows is returned: a value that appears m times in one table and
n times in the other produces m x n rows.
Example:
SELECT *
FROM table1
INNER JOIN table2 ON table1.column_name = table2.column_name;
2. LEFT JOIN:
• LEFT JOIN returns all the rows from the left table (the first
table mentioned in the query) and the matched rows from the
right table. If there are no matching rows in the right table, it
returns NULL values for the columns of the right table.
• Duplicates behave the same way as in an INNER JOIN: each left row
is repeated once for every matching right row. A left row with no
match still appears exactly once, with NULLs in the right-table
columns.
Example:

SELECT *
FROM table1
LEFT JOIN table2 ON table1.column_name = table2.column_name;

In summary, INNER JOIN and LEFT JOIN treat duplicates identically:
matching rows are multiplied in both. They differ only in how they handle
unmatched rows. INNER JOIN drops them, while LEFT JOIN keeps every row
from the left table and fills the right-table columns with NULL values
when there is no match.
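
The duplicate behaviour can be checked with a small sketch. The data below is hypothetical (it is not from the interview): key 1 appears twice on each side and key 2 appears only on the left.

```sql
-- Hypothetical sample data built inline so the query is self-contained.
WITH left_t AS (
    SELECT 1 AS k, 'L1a' AS l_val UNION ALL
    SELECT 1, 'L1b' UNION ALL
    SELECT 2, 'L2'
),
right_t AS (
    SELECT 1 AS k, 'R1a' AS r_val UNION ALL
    SELECT 1, 'R1b'
)
SELECT l.k, l.l_val, r.r_val
FROM left_t AS l
LEFT JOIN right_t AS r ON l.k = r.k
ORDER BY l.k, l.l_val, r.r_val;
-- LEFT JOIN: 5 rows (2 x 2 = 4 rows for key 1, plus key 2 with r_val = NULL).
-- Changing LEFT JOIN to INNER JOIN gives 4 rows: the key 2 row disappears.
```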

2. Practical quest.: From the table restaurant_transactions containing the columns id, date and
final_bill, find the highest revenue month in the year 2021.
Ans:-

1. Filter the data to include only transactions from the year 2021.
2. Group the data by month and calculate the total revenue for each
month.
3. Find the month with the highest total revenue.

Here's the SQL query to achieve this:

SELECT
    EXTRACT(MONTH FROM date) AS month,
    SUM(final_bill) AS total_revenue
FROM
    restaurant_transactions
WHERE
    EXTRACT(YEAR FROM date) = 2021
GROUP BY
    EXTRACT(MONTH FROM date)
ORDER BY
    total_revenue DESC
LIMIT 1;

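3. Practical quest.: From the above data set find out the highest revenue month in each year.

The approach is the same as above, with two changes: group by both year and month
instead of filtering on a single year, and rank the months within each year (for example
with RANK() partitioned by year) instead of using ORDER BY ... LIMIT 1, which would
return only one row for the whole table.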
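
A minimal sketch of the per-year version, assuming the same restaurant_transactions table and MySQL 8.0+ or PostgreSQL syntax. The CTE names and the aliases yr, mon and revenue_rank are illustrative choices, not part of the original answer.

```sql
WITH monthly_revenue AS (
    -- Total revenue per (year, month)
    SELECT
        EXTRACT(YEAR FROM date) AS yr,
        EXTRACT(MONTH FROM date) AS mon,
        SUM(final_bill) AS total_revenue
    FROM restaurant_transactions
    GROUP BY EXTRACT(YEAR FROM date), EXTRACT(MONTH FROM date)
),
ranked AS (
    -- Rank the months inside each year, highest revenue first
    SELECT
        yr,
        mon,
        total_revenue,
        RANK() OVER (PARTITION BY yr ORDER BY total_revenue DESC) AS revenue_rank
    FROM monthly_revenue
)
SELECT yr, mon, total_revenue
FROM ranked
WHERE revenue_rank = 1
ORDER BY yr;
```

RANK() keeps every month that ties for the top spot in a year; ROW_NUMBER() could be used instead if exactly one row per year is wanted.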

4. Theoretical quest.: What is the difference between dense_rank() and rank() functions.

In SQL, both RANK() and DENSE_RANK() are window functions used to assign ranks to
rows within a partition of a result set based on the values of a specified column. However,
they differ in how they handle ties (rows with equal values) in the ranking process:

1. RANK() Function:
• RANK() assigns a rank to each row within the partition; tied rows share the
same rank.
• After a tie, RANK() leaves gaps in the ranking sequence. For example, if two
rows tie for the first place, the next row would be ranked third (not second).
• The rank of a row is one plus the number of rows that sort strictly before it,
so RANK() counts rows, not distinct values.
2. DENSE_RANK() Function:
• DENSE_RANK() also assigns ranks to rows within the partition.
• If there are ties, DENSE_RANK() assigns the same rank to each tied row, but
does not leave gaps in the ranking sequence: the row after a tie gets the next
consecutive rank.
• The rank returned by DENSE_RANK() increments by one for each distinct
value in the ordered partition.

Here's a brief example to illustrate the difference:

Consider a set of scores: 90, 85, 85, 80, 75.

• With RANK(), the ranks would be: 1, 2, 2, 4, 5.
• With DENSE_RANK(), the ranks would be: 1, 2, 2, 3, 4.

Notice that RANK() skips rank 3 because two rows are tied for second place, while
DENSE_RANK() continues with rank 3 for the next distinct score.

In summary, RANK() leaves gaps in the ranking sequence when there are ties, while
DENSE_RANK() assigns consecutive ranks without any gaps.
5. Practical quest.: From two tables, one containing movies data and the other containing actors
data with movie_id as foreign key, find out the actors who have not had any movie for 3 or more
years.

To find the actors who have not appeared in any movie for 3 or more years, you would typically
use a combination of joins, date calculations, and filtering. Here's how you can achieve this:

Assuming you have two tables: movies containing movie data and actors containing actor
data with movie_id as the foreign key.

You can start by joining the actors and movies tables based on the movie_id foreign key.
Then, you can calculate the maximum release date of each actor's movies and compare it with
the current date minus 3 years. If an actor's latest movie release date is earlier than the
calculated date, it means they haven't appeared in any movie for 3 or more years.

Here's the SQL query to achieve this:

SELECT
    actors.actor_id,
    actors.actor_name
FROM
    actors
LEFT JOIN
    movies ON actors.movie_id = movies.movie_id
GROUP BY
    actors.actor_id,
    actors.actor_name
HAVING
    MAX(movies.release_date) IS NULL OR
    MAX(movies.release_date) <= DATE_SUB(CURRENT_DATE(), INTERVAL 3 YEAR);

Explanation:

• LEFT JOIN is used to join the actors and movies tables on the movie_id foreign key
described above.
• GROUP BY is used to group the result set by actor_id and actor_name.
• MAX(movies.release_date) calculates the maximum release date of each actor's
movies.
• HAVING clause filters the result set to include actors whose maximum movie release
date is either NULL (indicating they have not appeared in any movie) or is less than
or equal to the date 3 years ago (DATE_SUB(CURRENT_DATE(), INTERVAL 3 YEAR),
which is MySQL syntax).

This query will return the actor_id and actor_name of actors who have not appeared in
any movie for 3 or more years.

Other than these there were some random discussions about the query structure and CTEs.

Query structure:-

Understanding the structure of SQL queries is essential for writing
effective and efficient database queries. SQL queries generally consist of
several clauses that perform different functions in retrieving,
manipulating, and filtering data from a database. Here's a breakdown of
the typical structure of an SQL query:

1. SELECT Clause:
• The SELECT clause specifies which columns or expressions to
include in the query result set.
• It is usually the first clause in an SQL query.
• Example: SELECT column1, column2 FROM table_name;
2. FROM Clause:
• The FROM clause specifies the table or tables from which to
retrieve data.
• It comes after the SELECT clause.
• Example: SELECT column1, column2 FROM table_name;
3. JOIN Clause:
• The JOIN clause is used to combine rows from two or more
tables based on a related column between them.
• It is used when data needs to be retrieved from multiple
tables.
• Example: SELECT * FROM table1 JOIN table2 ON
table1.column_name = table2.column_name;
4. WHERE Clause:
• The WHERE clause is used to filter rows based on specified
conditions.
• It is optional but commonly used to restrict the number of
rows returned by the query.
• Example: SELECT * FROM table_name WHERE condition;
5. GROUP BY Clause:
• The GROUP BY clause is used to group rows that have the
same values into summary rows.
• It is typically used with aggregate functions (e.g., COUNT,
SUM, AVG) to perform calculations on groups of rows.
• Example: SELECT column1, SUM(column2) FROM table_name
GROUP BY column1;
6. HAVING Clause:
• The HAVING clause is used in combination with the GROUP BY
clause to filter group rows based on specified conditions.
• It is similar to the WHERE clause but operates on grouped
rows rather than individual rows.
• Example: SELECT column1, SUM(column2) FROM table_name
GROUP BY column1 HAVING condition;
7. ORDER BY Clause:
• The ORDER BY clause is used to sort the result set based on
one or more columns.
• It can sort in ascending (default) or descending order.
• Example: SELECT * FROM table_name ORDER BY column_name
ASC|DESC;
8. LIMIT Clause:
• The LIMIT clause is used to restrict the number of rows
returned by the query.
• It is commonly used with ORDER BY to retrieve a subset of
rows.
• Example: SELECT * FROM table_name LIMIT 10;

Understanding and effectively using these clauses will allow you to
construct powerful and precise SQL queries tailored to your specific data
retrieval needs.
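
As a rough illustration, the clauses above can all appear in one query. The tables orders(order_id, customer_id, amount, order_date) and customers(customer_id, country) are hypothetical and used only for this sketch; LIMIT is the MySQL/PostgreSQL spelling.

```sql
SELECT c.country, SUM(o.amount) AS total_amount       -- SELECT: columns and aggregates to return
FROM orders AS o                                      -- FROM: main source table
JOIN customers AS c ON c.customer_id = o.customer_id  -- JOIN: bring in related rows
WHERE o.order_date >= '2021-01-01'                    -- WHERE: filter individual rows
GROUP BY c.country                                    -- GROUP BY: one summary row per country
HAVING SUM(o.amount) > 1000                           -- HAVING: filter the groups
ORDER BY total_amount DESC                            -- ORDER BY: sort the result
LIMIT 10;                                             -- LIMIT: keep the first 10 rows
```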

CTEs:-

"CTE" stands for Common Table Expression. It's a temporary named result set, defined
with the WITH keyword, that you can reference within a SELECT, INSERT, UPDATE, or
DELETE statement. It exists only for the duration of that statement. CTEs
are particularly useful for making complex queries more readable and
manageable by breaking them down into smaller, logical parts.
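
A minimal sketch of a CTE, assuming a hypothetical employees table with department_id and salary columns. The threshold 100000 is an arbitrary illustrative value.

```sql
-- The CTE computes one total per department; the main query then filters it.
WITH department_totals AS (
    SELECT
        department_id,
        SUM(salary) AS total_salary
    FROM employees
    GROUP BY department_id
)
SELECT department_id, total_salary
FROM department_totals
WHERE total_salary > 100000
ORDER BY total_salary DESC;
```

The same result could be written with a subquery in the FROM clause; the CTE mainly helps readability once several such steps are chained.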

1.joins - how to implement with nulls

When working with SQL joins, handling NULL values can be important,
especially when dealing with outer joins where there might be unmatched
rows. Here's how you can handle NULLs in different types of joins:

1. INNER JOIN:
• In an INNER JOIN, only the rows that have matching values in
both tables are returned.
• If there are NULL values in the join columns of either table,
those rows will not be included in the result set, because a
comparison such as NULL = NULL evaluates to unknown rather
than true.
• So, there's no specific handling of NULLs needed in an INNER
JOIN unless NULL keys should be treated as matching each
other.
2. LEFT JOIN:
• In a LEFT JOIN, all the rows from the left table (the first table
mentioned in the query) are returned, along with matching
rows from the right table.
• If there are no matching rows in the right table, NULL values
are returned for the columns of the right table.
• You can use the IS NULL or IS NOT NULL operators to check for
NULL values in the columns from the right table.
Example:
SELECT *
FROM table1
LEFT JOIN table2 ON table1.column_name = table2.column_name
WHERE table2.column_name IS NULL; -- keeps only table1 rows with no match in table2
3. RIGHT JOIN:
• In a RIGHT JOIN, all the rows from the right table (the second
table mentioned in the query) are returned, along with
matching rows from the left table.
• If there are no matching rows in the left table, NULL values
are returned for the columns of the left table.
• Similarly, you can use the IS NULL or IS NOT NULL operators to
check for NULL values in the columns from the left table.
4. FULL OUTER JOIN:
• In a FULL OUTER JOIN, all rows from both tables are returned,
with NULL values for columns that do not have a match in the
other table.
• You can use IS NULL or IS NOT NULL operators to filter rows
based on NULL values in the joined columns.

In summary, when working with joins in SQL, handling NULLs often
involves checking for NULL values explicitly using the IS NULL or IS NOT
NULL operators, especially when dealing with outer joins where NULL
values might appear in unmatched rows.
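
If rows whose join key is NULL on both sides should be treated as a match, a NULL-safe comparison is needed. The sketch below reuses the placeholder names table1, table2 and column_name from the examples above; the operator shown is MySQL-specific.

```sql
-- MySQL: <=> is the NULL-safe equality operator, so NULL <=> NULL is true.
SELECT *
FROM table1
INNER JOIN table2 ON table1.column_name <=> table2.column_name;

-- PostgreSQL equivalent (shown as a comment because the syntax differs):
-- SELECT *
-- FROM table1
-- INNER JOIN table2
--     ON table1.column_name IS NOT DISTINCT FROM table2.column_name;
```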

Calculate rolling avg:-

To calculate a rolling average over a 5-day period from a given table, you
can use a window function, provided that your database system supports
it (e.g., PostgreSQL, MySQL 8.0+, SQL Server, etc.).

Here's an example SQL query to calculate a 5-day rolling average. The
frame covers the current row and the 4 rows before it, so it equals 5 days
only when the table has exactly one row per day with no gaps:

SELECT
    date_column,
    value_column,
    AVG(value_column) OVER (
        ORDER BY date_column
        ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
    ) AS rolling_avg
FROM
    your_table;

Window frames:-

AVG(value_column) OVER (ORDER BY date_column ROWS BETWEEN 4 PRECEDING AND CURRENT ROW)
AS rolling_avg

Next: 2nd highest salary with a different approach

To find the second-highest salary within each department, you can use a
combination of window functions and subqueries. Here's how you can do it:

SELECT
    department_id,
    MAX(salary) AS second_highest_salary
FROM (
    SELECT
        department_id,
        salary,
        DENSE_RANK() OVER (PARTITION BY department_id ORDER BY salary DESC)
            AS salary_rank
    FROM
        your_table
) ranked_salaries
WHERE
    salary_rank = 2
GROUP BY
    department_id;

In this query:

• Replace your_table with the name of your table.
• DENSE_RANK() OVER (PARTITION BY department_id ORDER BY
salary DESC) ranks the distinct salary values within each
department in descending order, so tied salaries share a rank.
ROW_NUMBER() would not work here: if two employees tie for the
highest salary, one of them would get row number 2.
• The outer query filters the results to only include rows where the
salary_rank is 2, indicating the second-highest salary.
• MAX(salary) together with GROUP BY department_id collapses any
ties at rank 2 into a single row per department.

This query will give you the second-highest distinct salary within each
department, even if there are ties for the highest salary. Departments
with only one distinct salary value do not appear in the result.

Window functions usage:-

UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING examples using aggregate functions:-

With the frame "ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING", the aggregate is
computed over all input rows (or over the whole partition, if PARTITION BY is used), so every row
shows the same single value.

Ranking functions:-

Note:- Ranking functions such as RANK() and DENSE_RANK() take no column argument; the column to
rank by goes in the ORDER BY of the OVER clause.

Need to get more understanding on the percent_rank.

Analytic functions:-

For LAST_VALUE to return the last value of the whole partition, we need to use "ROWS BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING". With the default frame, which ends at the current
row (and its peers), LAST_VALUE returns the current row's value instead. In the original worked
example, the full frame is what makes 700 appear in the first row of the LAST_VALUE column.

Using LEAD and LAG functions:-

Note:- LEAD and LAG need the column name as their first argument.

Very important: the second argument is the number of rows to lead or lag by, for example
LEAD(new_id, 2) or LAG(new_id, 2). If it is omitted, the offset defaults to 1.
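
The notes above can be tried on a tiny data set. The three rows below are hypothetical stand-ins for the lost screenshots (the column name new_id follows the note; amount and the values are assumptions).

```sql
WITH t AS (
    SELECT 1 AS new_id, 100 AS amount UNION ALL
    SELECT 2, 300 UNION ALL
    SELECT 3, 700
)
SELECT
    new_id,
    amount,
    -- Full frame: the same total (1100) on every row
    SUM(amount) OVER (
        ORDER BY new_id
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS total_amount,
    -- Default frame ends at the current row: 100, 300, 700
    LAST_VALUE(amount) OVER (ORDER BY new_id) AS last_default_frame,
    -- Full frame: 700 on every row
    LAST_VALUE(amount) OVER (
        ORDER BY new_id
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS last_full_frame,
    -- Two rows ahead: 3, NULL, NULL
    LEAD(new_id, 2) OVER (ORDER BY new_id) AS lead_2,
    -- Two rows back: NULL, NULL, 1
    LAG(new_id, 2) OVER (ORDER BY new_id) AS lag_2
FROM t
ORDER BY new_id;
```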
To find the nth highest salary using dense rank, you can use the DENSE_RANK() window
function along with a CTE. Here's how you can do it:

WITH ranked_salaries AS (
    SELECT
        salary,
        DENSE_RANK() OVER (ORDER BY salary DESC) AS salary_dense_rank
    FROM
        your_table
)
SELECT
    salary
FROM
    ranked_salaries
WHERE
    salary_dense_rank = @n;

In this query:

• Replace your_table with the name of your table.
• DENSE_RANK() OVER (ORDER BY salary DESC) assigns a dense rank to each
salary value, ordering them in descending order.
• The CTE (Common Table Expression) ranked_salaries is used to calculate the
dense rank for each salary.
• In the outer query, we select the salary from the ranked_salaries CTE where the
dense rank equals the desired value, represented by @n. Replace @n with the desired
rank you want to retrieve.

This query will return the salary corresponding to the nth dense rank value specified by @n.
If several employees share that salary, one row is returned for each of them. Adjust the
table name and column names as per your actual database schema.

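The NTH_VALUE() window function gives the nth salary within each department:

SELECT DISTINCT
    department_id,
    NTH_VALUE(salary, @n) OVER (
        PARTITION BY department_id
        ORDER BY salary DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS nth_salary
FROM
    your_table;

In this query:

• Replace your_table with the name of your table.
• NTH_VALUE(salary, @n) calculates the nth value of the salary
column. Replace @n with the desired rank you want to retrieve.
• OVER (PARTITION BY department_id ORDER BY salary DESC)
partitions the data by department_id and orders the salaries in
descending order within each partition.
• ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
makes the frame cover the whole partition. Without it, the default
frame ends at the current row, so rows before the nth one would
get NULL and DISTINCT would return extra NULL rows.
• DISTINCT is used to ensure that only distinct department_id and
nth_salary combinations are returned.

This query will return the nth salary value for each department (NULL
for departments with fewer than n rows). NTH_VALUE counts rows, not
distinct values, so tied salaries each occupy a position. Adjust the
table name and column names as per your actual database schema.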

Need to look into this question (reminder):-

Find the employees who work on weekends for each company.
Query:-

SELECT DISTINCT
    company_name,
    employee_name
FROM
    employees
WHERE
    DAYOFWEEK(work_day) IN (1, 7); -- MySQL: 1 = Sunday, 7 = Saturday
Hemanth document too to document all of the
questions:
order of execution:-
1. FROM: This clause specifies the tables from which the data will be
retrieved.
2. WHERE: This clause filters the rows returned by the FROM clause
based on the specified conditions.
3. GROUP BY: If grouping is specified, the rows are grouped based on
the columns specified in this clause.
4. HAVING: This clause filters the grouped rows based on the
specified conditions.
5. SELECT: This clause selects the columns that will be included in the
result set.
6. ORDER BY: This clause sorts the result set based on the specified
columns.
7. LIMIT / OFFSET: These clauses are used to limit the number of
rows returned or to skip a certain number of rows.
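It's important to note that this is the logical order of evaluation, not the order in
which the clauses are written, and that not all queries will include all of these
clauses. For example, if you're not grouping data, the GROUP BY and HAVING
clauses won't be present. Similarly, if you're not sorting data, the ORDER BY
clause won't be present. A practical consequence of this order is that a column
alias defined in SELECT can be used in ORDER BY but not in WHERE, because
WHERE is evaluated before SELECT.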
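
The effect of the evaluation order on aliases can be seen in a short sketch. It reuses the restaurant_transactions table from the earlier question; the 1.1 multiplier and the alias bill_with_tax are hypothetical.

```sql
-- Rejected by most databases (e.g. MySQL, PostgreSQL, SQL Server):
-- WHERE runs before SELECT, so the alias does not exist yet.
-- SELECT final_bill * 1.1 AS bill_with_tax
-- FROM restaurant_transactions
-- WHERE bill_with_tax > 100;

-- Works: repeat the expression in WHERE; the alias is available in ORDER BY.
SELECT final_bill * 1.1 AS bill_with_tax
FROM restaurant_transactions
WHERE final_bill * 1.1 > 100
ORDER BY bill_with_tax DESC;
```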

Cross join and it's relevance,:-

A cross join, also known as a Cartesian join, is a type of join operation in a
relational database management system. Unlike other types of joins (such
as inner joins, outer joins, etc.), a cross join does not have a join condition.
Instead, it produces the Cartesian product of the two tables involved in
the join. In other words, it combines each row from the first table with
every row from the second table, resulting in a potentially large result set.

For example, if Table A and Table B each contain 3 rows, each row from
Table A is combined with every row from Table B, resulting in a total of
3 x 3 = 9 rows in the output.
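
A small sketch of the 3 x 3 case. The sizes and colors data are hypothetical and stand in for Table A and Table B.

```sql
WITH sizes AS (
    SELECT 'S' AS size UNION ALL
    SELECT 'M' UNION ALL
    SELECT 'L'
),
colors AS (
    SELECT 'red' AS color UNION ALL
    SELECT 'green' UNION ALL
    SELECT 'blue'
)
-- No ON clause: every size is paired with every color (9 rows).
SELECT s.size, c.color
FROM sizes AS s
CROSS JOIN colors AS c
ORDER BY s.size, c.color;
```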

Cross joins are not as commonly used as other types of joins because they
can lead to large result sets, especially when dealing with tables that
contain a large number of rows. However, they can be useful in certain
scenarios, such as when you need to generate all possible combinations of
rows from two tables.

Some common use cases for cross joins include:

1. Generating test data or sample data for analysis.
2. Performing certain types of aggregation or summary calculations.
3. Combining dimensions in data warehousing or business intelligence
applications.

It's important to exercise caution when using cross joins, as they can
easily lead to unintended consequences if not used carefully.

What is relational vs non relational database :-

Relational Databases (SQL Databases):

Relational databases organize data into tables, where each table consists
of rows and columns. They follow the relational model based on the
principles of relational algebra and set theory. Relationships between
tables are established using keys (e.g., primary keys, foreign keys),
enabling efficient querying and data integrity enforcement.

Examples of relational databases include:

1. MySQL: A popular open-source relational database management
system (RDBMS) widely used in web applications.
2. PostgreSQL: Another powerful open-source RDBMS known for its
robust features, extensibility, and SQL compliance.
3. Oracle Database: An enterprise-grade RDBMS offering a wide
range of features, scalability, and reliability.
4. Microsoft SQL Server: A relational database management system
developed by Microsoft, commonly used in enterprise environments.

Non-Relational Databases (NoSQL Databases):

Non-relational databases, also known as NoSQL databases, diverge from
the tabular structure of relational databases. They are designed to handle
unstructured, semi-structured, or polymorphic data, providing greater
flexibility and scalability for certain types of applications. NoSQL
databases come in various forms, such as document-oriented, key-value
stores, column-family stores, and graph databases.

Examples of NoSQL databases include:

1. MongoDB: A popular document-oriented NoSQL database that
stores data in flexible JSON-like documents, offering high scalability
and performance.
2. Cassandra: A distributed, highly scalable NoSQL database designed
to handle large amounts of data across multiple commodity servers.
It uses a column-family data model.
3. Redis: A fast, in-memory data structure store often used as a
caching layer or message broker due to its low latency and high
throughput.
4. Neo4j: A graph database optimized for storing and querying
interconnected data, making it suitable for applications with
complex relationship structures.
5. Amazon DynamoDB: A fully managed NoSQL database service
provided by AWS, offering seamless scalability and low-latency
performance for various use cases.

These examples illustrate the fundamental difference between relational
and non-relational databases in terms of data organization, query
language, and scalability characteristics, among other factors. The choice
between them depends on the specific requirements and constraints of
your application.

Write a query to find profession with highest average income in each category:-

To find the profession with the highest average income in each category,
you can combine aggregation functions and window functions. Assuming
you have a table named employees with columns profession, category,
and income, the query works in three steps:

1. We first calculate the average income for each profession within
each category using the AVG() function along with a GROUP BY
clause.
2. We then use a window function ROW_NUMBER() to assign a rank to
each profession within each category based on their average
income. The PARTITION BY clause partitions the data by category,
and the ORDER BY clause orders the data by average income in
descending order.
3. Finally, we select the professions with the highest average income
in each category by filtering out only those rows where the rank is
equal to 1.

This query will give you the profession with the highest average income in
each category. Adjust the table and column names according to your
database schema.
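
The steps above can be sketched as follows, using the employees table and columns named in the text. The CTE names and the aliases avg_income and rn are illustrative.

```sql
WITH profession_income AS (
    -- Step 1: average income per profession within each category
    SELECT
        category,
        profession,
        AVG(income) AS avg_income
    FROM employees
    GROUP BY category, profession
),
ranked AS (
    -- Step 2: number the professions inside each category, highest average first
    SELECT
        category,
        profession,
        avg_income,
        ROW_NUMBER() OVER (PARTITION BY category ORDER BY avg_income DESC) AS rn
    FROM profession_income
)
-- Step 3: keep the top profession per category
SELECT category, profession, avg_income
FROM ranked
WHERE rn = 1;
```

ROW_NUMBER() picks one profession arbitrarily when two tie for the highest average; RANK() would return both.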

In the above table, find the percentage of female users in each
profession compared to total users:-
To find the percentage of female users in each profession compared to
the total users, you can use a SQL query with a combination of
aggregation functions and window functions. Assuming you have a table
named users with columns profession, gender, and user_id, one approach
works as follows:

1. We first calculate the count of female users, the total count of users,
and the total count of users per profession using the COUNT()
function along with a CASE statement to count only female users.
2. We use a window function ROW_NUMBER() to assign a rank to each
profession based on the total count of users. The PARTITION BY
clause partitions the data by profession, and the ORDER BY clause
orders the data by the total count of users in descending order.
3. We then join the result of the first CTE with the result of the second
CTE to get the total count of users per profession.
4. Finally, we select the profession, gender, and the percentage of
female users in each profession compared to the total users, where
the gender is female and the rank is 1.
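
A simpler single-query sketch than the multi-CTE steps described above. It assumes the users table from the text and that the gender column stores the literal value 'Female'. Because the question can be read two ways, both percentages are shown.

```sql
SELECT
    profession,
    COUNT(*) AS total_users_in_profession,
    SUM(CASE WHEN gender = 'Female' THEN 1 ELSE 0 END) AS female_users,
    -- Female users as a share of the users in the same profession
    100.0 * SUM(CASE WHEN gender = 'Female' THEN 1 ELSE 0 END) / COUNT(*)
        AS pct_female_in_profession,
    -- Female users in this profession as a share of all users in the table
    100.0 * SUM(CASE WHEN gender = 'Female' THEN 1 ELSE 0 END)
        / SUM(COUNT(*)) OVER () AS pct_female_of_all_users
FROM users
GROUP BY profession
ORDER BY profession;
```

SUM(COUNT(*)) OVER () is a window function applied on top of the grouped counts, which gives the overall user total without a separate subquery.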

1. Difference between row number, rank and dense rank:-

In SQL, ROW_NUMBER(), RANK(), and DENSE_RANK() are window functions
that assign a numerical value to each row within a partition of a result
set. They differ in how they assign these values and handle ties:

1. ROW_NUMBER():
• Assigns a unique integer to each row within the partition.
• The numbering starts at 1 for the first row in the partition and
increments by 1 for each subsequent row.
• It does not recognise ties; each row gets a distinct number even
if multiple rows have the same values, and the order among
tied rows is arbitrary unless the ORDER BY breaks the tie.
2. RANK():
• Assigns a rank to each row; rows with the same value receive
the same rank.
• After a tie, the following ranks are skipped.
• For example, if two rows tie for the first position, the next row
receives a rank of 3, not 2.
3. DENSE_RANK():
• Like RANK(), assigns the same rank to rows with the same
value.
• It does not skip ranks: the next distinct value gets the next
consecutive rank.
• For example, if two rows tie for the first position, the next row
receives a rank of 2, not 3.
Here's a comparison using an example:

Consider a dataset of students' scores:

Student Score
Alice 90
Bob 85
Carol 90
David 80
Eve 85
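
A sketch that applies all three functions to the student scores above. The scores CTE only reproduces that table inline; the alias names and the tie-breaker on student name for ROW_NUMBER() are assumptions added to make the output deterministic.

```sql
WITH scores AS (
    SELECT 'Alice' AS student, 90 AS score UNION ALL
    SELECT 'Bob', 85 UNION ALL
    SELECT 'Carol', 90 UNION ALL
    SELECT 'David', 80 UNION ALL
    SELECT 'Eve', 85
)
SELECT
    student,
    score,
    ROW_NUMBER() OVER (ORDER BY score DESC, student) AS row_num,
    RANK() OVER (ORDER BY score DESC) AS rnk,
    DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rnk
FROM scores
ORDER BY row_num;
-- student  score  row_num  rnk  dense_rnk
-- Alice    90     1        1    1
-- Carol    90     2        1    1
-- Bob      85     3        3    2
-- Eve      85     4        3    2
-- David    80     5        5    3
```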

In summary, the main difference lies in how they handle ties.
ROW_NUMBER() always assigns unique numbers to each row, while
RANK() and DENSE_RANK() handle ties differently by either skipping ranks
(RANK()) or maintaining consecutive ranks (DENSE_RANK()).

4. Subqueries and other windows functions


Subqueries and window functions are powerful features in SQL that serve different purposes
but are often used for similar tasks. Here's an overview of why each is used:

1. Subqueries:
• Filtering and Aggregation: Subqueries are often used to filter results based
on conditions that depend on the result of another query, for example
comparing each row against an aggregate such as an average.
• Nested Queries: Subqueries allow you to nest one query inside another. This
is useful when you need to perform a calculation or retrieve data based on the
result of another query.
• Subquery Expressions: Subqueries can also be used to return a single value
or a list of values that can be used in various parts of a query, such as
SELECT, WHERE, HAVING, and even as part of an expression.
• Correlated Subqueries: These are subqueries where the inner query
references a column from the outer query. Correlated subqueries can be used
to perform row-by-row processing or to filter data based on values from the
outer query.
2. Window Functions:
• Analytical Calculations: Window functions are used to perform calculations
across a set of rows related to the current row. They allow you to calculate
running totals, moving averages, rank items, and perform other analytical
tasks without collapsing the result set the way GROUP BY does.
• Avoiding Subqueries: Window functions can often replace subqueries and
are frequently more efficient and easier to read, although this depends on
the database and the query. They allow you to achieve similar results without
the need for nested queries.
• Partitioning Data: Window functions partition the result set into groups of
rows based on specified criteria, such as a particular column. This
allows you to perform calculations within each partition separately.
• ORDER BY Clause: Window functions also allow you to specify an order for
the rows within each partition, which is useful for calculating running totals or
finding the "top N" items within each group.

In summary, while subqueries and window functions can both be used for similar tasks, they
have distinct capabilities and use cases. Subqueries are primarily used for filtering,
aggregation, and nested queries, while window functions are used for analytical calculations
and partitioning data within a query result set. Depending on the specific requirements of
your query, you may choose to use one or both of these features to achieve your desired
result.
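
To make the comparison concrete, here is the same task written both ways: employees who earn more than the average of their department. The employees table with employee_name, department_id and salary columns is a hypothetical schema for this sketch.

```sql
-- Version 1: correlated subquery. The inner query is evaluated
-- with reference to the department of each outer row.
SELECT e.employee_name, e.salary
FROM employees AS e
WHERE e.salary > (
    SELECT AVG(e2.salary)
    FROM employees AS e2
    WHERE e2.department_id = e.department_id
);

-- Version 2: window function. The department average is attached to
-- every row first, then filtered in an outer query.
SELECT employee_name, salary
FROM (
    SELECT
        employee_name,
        salary,
        AVG(salary) OVER (PARTITION BY department_id) AS dept_avg
    FROM employees
) AS with_avg
WHERE salary > dept_avg;
```

The outer query in version 2 is needed because window functions cannot be used directly in a WHERE clause.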

Running Totals:-

You can find running totals using window functions in SQL. Let's say you
have a table named sales with columns date and amount, and you want to
calculate the running total of sales amount over time. The key expression
is SUM(amount) OVER (ORDER BY date) AS running_total:

• SUM(amount) OVER (ORDER BY date) calculates the running total of
the amount column. The ORDER BY date clause specifies the order in
which the rows are processed to calculate the running total. For
each row, the running total is the sum of the amount values from
the first row up to and including the current row, ordered by the
date column. Rows that share the same date are summed together
and show the same total.
• date and amount are columns from the sales table.

The result set has three columns: date, amount, and running_total. The
running_total column contains the cumulative sum of the sales amount
up to each date.
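
A minimal sketch of the running total query, using the sales table and the column names given in the text.

```sql
SELECT
    date,
    amount,
    SUM(amount) OVER (ORDER BY date) AS running_total
FROM sales
ORDER BY date;
-- For hypothetical amounts 100, 200 and 150 on three consecutive dates,
-- running_total is 100, 300 and 450.
```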

Similarly, to calculate the running total and percentage of running total
for each month in the sales table, you can use window functions along
with a CTE. Here's how it works:

1. We use a Common Table Expression (CTE) named MonthlySales to
calculate the running total for each month. The EXTRACT(MONTH
FROM date) function extracts the month from the date column.
2. Within the CTE, we use the SUM(amount) OVER (PARTITION BY
EXTRACT(MONTH FROM date) ORDER BY date) window function to
calculate the running total for each month. The PARTITION BY clause
ensures that the running total is calculated separately for each
month.
3. In the main query, we select the month, amount,
monthly_running_total, and calculate the percentage_running_total.
We use the SUM(amount) OVER (PARTITION BY month) window
function to calculate the total amount for each month, and then
divide the monthly_running_total by this total to get the percentage.

This query will give you the month, amount, monthly running total, and
percentage of running total for each month in the sales table.
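
The steps above can be sketched as follows. This is a reconstruction from the explanation, not the original query. The multiplication by 100.0 is an assumption about how the percentage was expressed.

```sql
WITH MonthlySales AS (
    SELECT
        EXTRACT(MONTH FROM date) AS month,
        date,
        amount,
        -- Running total that restarts for each month
        SUM(amount) OVER (
            PARTITION BY EXTRACT(MONTH FROM date)
            ORDER BY date
        ) AS monthly_running_total
    FROM sales
)
SELECT
    month,
    amount,
    monthly_running_total,
    -- Share of the month's total reached so far
    100.0 * monthly_running_total
        / SUM(amount) OVER (PARTITION BY month) AS percentage_running_total
FROM MonthlySales
ORDER BY month, date;
```

Partitioning by month alone merges the same month from different years; if the table spans several years, the year would need to be added to the partition.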

Moving average:-
To calculate the moving average of a specific column over a certain window of
rows, you can use the window function AVG() along with a ROWS or RANGE frame
clause. For a column named amount in the sales table, a moving average over
the last 3 rows uses the expression AVG(amount) OVER (ORDER BY date ROWS
BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_average:

• AVG(amount) OVER (ORDER BY date ROWS BETWEEN 2 PRECEDING
AND CURRENT ROW) calculates the moving average of the amount
column. The ORDER BY date clause specifies the order in which the
rows are processed to calculate the moving average. The ROWS
BETWEEN 2 PRECEDING AND CURRENT ROW clause specifies the
window of rows to consider for the moving average, which in this
case is the current row and the two preceding rows.
• date and amount are columns from the sales table.

The result set has three columns: date, amount, and moving_average. The
moving_average column contains the moving average of the amount column
over the window of the last 3 rows, including the current row.

Note that for the first two rows, where fewer than two preceding rows exist,
the result is not NULL: the average is taken over the rows that are available
(one row, then two). Adjust the window size and column names according to
your specific requirements and database schema.
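
A minimal sketch of the 3-row moving average, using the sales table and the column names from the text.

```sql
SELECT
    date,
    amount,
    AVG(amount) OVER (
        ORDER BY date
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS moving_average
FROM sales
ORDER BY date;
-- For hypothetical amounts 100, 200, 150 and 250 on four consecutive dates,
-- moving_average is 100, 150, 150 and 200 (the first two rows average
-- over 1 and 2 rows respectively).
```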

Find Moving Average for each month and percentage of the total for each month :-

To calculate the moving average for each month and the percentage of the total
for each month, you can use a combination of window functions, subqueries, and
common table expressions (CTEs) in SQL. Below is an example of how you can
achieve this:

Note:-Right side cut the remaining part :-
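
Since the original query is cut off, the following is only one possible sketch, under the assumption that "moving average for each month" means a 3-month moving average of monthly totals and that "percentage of the total" means each month's share of all sales. The CTE name and the aliases are illustrative.

```sql
WITH monthly_totals AS (
    -- One row per calendar month
    SELECT
        EXTRACT(YEAR FROM date) AS yr,
        EXTRACT(MONTH FROM date) AS mon,
        SUM(amount) AS monthly_total
    FROM sales
    GROUP BY EXTRACT(YEAR FROM date), EXTRACT(MONTH FROM date)
)
SELECT
    yr,
    mon,
    monthly_total,
    -- Average of the current month and the two months before it
    AVG(monthly_total) OVER (
        ORDER BY yr, mon
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS moving_avg_3_months,
    -- This month's share of the grand total
    100.0 * monthly_total / SUM(monthly_total) OVER () AS pct_of_total
FROM monthly_totals
ORDER BY yr, mon;
```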


Finding the "top N" items within each group:-
To find the "top N" items within each group in SQL, you can use window
functions such as ROW_NUMBER() or RANK().

Let's say you have a table named sales with columns product, category,
and sales_amount, and you want to find the top 3 products within each
category based on their sales amount. The query works as follows:

1. We use a Common Table Expression (CTE) named RankedProducts to rank the
products within each category based on their sales amount. The PARTITION BY
category clause ensures that ranking is done separately for each category, and the
ORDER BY sales_amount DESC clause specifies the order in which products are
ranked, with the highest sales amount first.
2. Within the CTE, we use the ROW_NUMBER() window function to assign a rank to
each product within its category.
3. In the main query, we select the product, category, and sales_amount columns
from the RankedProducts CTE.
4. We add a WHERE clause to filter the results to include only the top 3 products (rank
<= 3) within each category. ROW_NUMBER() returns at most 3 rows per category
even when sales amounts tie; RANK() would also include products tied with the third.

This query will give you the top 3 products within each category based on their sales amount.
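
The steps above can be sketched as the runnable example below. It is an illustration under assumptions: the sample rows are invented, the query runs on SQLite through Python's sqlite3 module, and the rank column is named rnk here because RANK is a reserved word in some databases (for example MySQL 8).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (product TEXT, category TEXT, sales_amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", "toys", 50), ("B", "toys", 40), ("C", "toys", 30),
     ("D", "toys", 20), ("E", "books", 15), ("F", "books", 25)],
)

top_n_sql = """
WITH RankedProducts AS (
    SELECT
        product,
        category,
        sales_amount,
        -- Numbering restarts at 1 for every category
        ROW_NUMBER() OVER (
            PARTITION BY category
            ORDER BY sales_amount DESC
        ) AS rnk
    FROM sales
)
SELECT product, category, sales_amount
FROM RankedProducts
WHERE rnk <= 3
ORDER BY category, rnk
"""
top_rows = conn.execute(top_n_sql).fetchall()
for row in top_rows:
    print(row)
```

ROW_NUMBER() breaks ties arbitrarily; if tied products should share a position, RANK() or DENSE_RANK() can be swapped in.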

Use cases on 3 queries: highest average salary in each domain, female percentage in each domain, and maximum aggregation:-
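
The notes do not include the queries for these use cases, so the following is only a hedged sketch. The employees table, its domain, gender and salary columns, and the sample rows are all hypothetical, and the first use case is read here as "average salary per domain, highest first".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, used only for illustration.
conn.execute("CREATE TABLE employees (domain TEXT, gender TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("data", "F", 100), ("data", "M", 80),
     ("web", "F", 60), ("web", "M", 70), ("web", "M", 50), ("web", "M", 60)],
)

# 1. Average salary per domain, highest average first.
avg_rows = conn.execute("""
    SELECT domain, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY domain
    ORDER BY avg_salary DESC
""").fetchall()

# 2. Female percentage per domain (conditional aggregation).
female_rows = conn.execute("""
    SELECT domain,
           100.0 * SUM(CASE WHEN gender = 'F' THEN 1 ELSE 0 END) / COUNT(*)
               AS female_pct
    FROM employees
    GROUP BY domain
    ORDER BY domain
""").fetchall()

# 3. Maximum aggregation: highest salary per domain.
max_rows = conn.execute("""
    SELECT domain, MAX(salary) AS max_salary
    FROM employees
    GROUP BY domain
    ORDER BY domain
""").fetchall()

print(avg_rows, female_rows, max_rows)
```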
Join concepts: already shown in the pages above.

DDL, DML, DCL, DQL concepts, and views :-


1. DDL (Data Definition Language):
 DDL is used to define the structure of the database and its
objects.
 It includes commands like CREATE, ALTER, DROP, TRUNCATE,
etc.
 Examples of DDL operations include creating tables, altering
table structures, dropping tables, etc.
2. DML (Data Manipulation Language):
 DML is used to manipulate data stored in the database.
 It includes commands like SELECT, INSERT, UPDATE, DELETE,
etc.
 Examples of DML operations include inserting new records,
updating existing records, deleting records, etc.
3. DCL (Data Control Language):
 DCL is used to control access to data within the database.
 It includes commands like GRANT and REVOKE.
 Examples of DCL operations include granting privileges to
users, revoking privileges from users, etc.
4. DQL (Data Query Language):
 DQL is used to retrieve data from the database.
 It includes commands like SELECT.
 Examples of DQL operations include retrieving records from
tables, filtering data based on certain conditions, joining
tables, etc.

Views:

 A view in a database is a virtual table generated based on the result
of a SELECT query.
 Views do not store data themselves; instead, they are stored
queries that dynamically produce a result set.
 Views can be used to simplify complex queries, provide an
additional layer of security by restricting access to certain columns
or rows, and present data in a more understandable format.
 They are created using the DDL command CREATE VIEW and can be
dropped using DROP VIEW.
 Views can be queried just like tables, and they can also be used as
the basis for other views.

Now, let's see how views relate to the aforementioned concepts:

 DDL: Views are created using the DDL command CREATE VIEW. This
command defines the structure of the view based on the SELECT
query provided.
 DML: Views can be queried using DML commands like SELECT. They
do not store data themselves but provide a way to query data
stored in tables.
 DCL: Permissions can be granted or revoked on views using DCL
commands like GRANT and REVOKE. This allows controlling access
to the underlying tables through the views.
 DQL: Views are primarily used for querying data, which falls under
DQL. They allow users to retrieve data from tables in a more
convenient or secure manner.

In summary, views provide a way to abstract complex queries, control
access to data, and present data in a more understandable format, and
they interact with the concepts of DDL, DML, DCL, and DQL accordingly.
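
As a small illustration of the points above, the sketch below creates a view, queries it, and drops it. The employees table, the view name engineering_staff, and the rows are assumptions made for the example. It runs on SQLite through Python's sqlite3 module; SQLite has no GRANT or REVOKE, so the DCL side is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Asha", "engineering", 120), ("Ben", "sales", 80)],
)

# DDL: the view stores the SELECT, not the rows, and hides the salary column.
conn.execute("""
    CREATE VIEW engineering_staff AS
    SELECT name, department
    FROM employees
    WHERE department = 'engineering'
""")

# DQL: a view is queried like a table.
view_rows = conn.execute(
    "SELECT name FROM engineering_staff ORDER BY name"
).fetchall()

# Because the view is a stored query, new base-table rows show up in it.
conn.execute("INSERT INTO employees VALUES ('Cara', 'engineering', 95)")
refreshed_rows = conn.execute(
    "SELECT name FROM engineering_staff ORDER BY name"
).fetchall()

# DDL: remove the view; the underlying table is untouched.
conn.execute("DROP VIEW engineering_staff")
print(view_rows, refreshed_rows)
```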

Deleting the duplicate rows from the table:-

To delete duplicate rows from a table, you can use the DELETE statement along
with a common table expression (CTE) to identify the duplicate rows. Here's a
general approach:
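
A minimal sketch of this approach is shown below, assuming a hypothetical customers table in which rows with the same name and email count as duplicates. It runs on SQLite through Python's sqlite3 module and uses SQLite's built-in rowid to tell the copies apart; other databases would use a primary key or their own row identifier instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, used only for illustration.
conn.execute("CREATE TABLE customers (name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ann", "ann@example.com"), ("Ann", "ann@example.com"),
     ("Bob", "bob@example.com"), ("Ann", "ann@example.com")],
)

dedupe_sql = """
WITH ranked AS (
    SELECT
        rowid AS rid,
        -- Number the copies of each (name, email) pair; the first keeps rn = 1
        ROW_NUMBER() OVER (
            PARTITION BY name, email
            ORDER BY rowid
        ) AS rn
    FROM customers
)
DELETE FROM customers
WHERE rowid IN (SELECT rid FROM ranked WHERE rn > 1)
"""
conn.execute(dedupe_sql)

remaining = conn.execute(
    "SELECT name, email FROM customers ORDER BY name"
).fetchall()
print(remaining)
```

The columns listed in PARTITION BY define what counts as a duplicate, so they should be adjusted to the table at hand.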
Asked about creating multiple tables under a database and how it
will perform:-

Creating multiple tables in a database is a common practice when
designing a relational database schema. Here's how you can create
multiple tables in a database, along with considerations for performance:

1. Creating Tables:
 Use the CREATE TABLE statement to create individual tables
within your database.
 Specify the table name, along with the columns and their data
types, constraints, indexes, and any other relevant attributes.
 Here's a basic example of creating two tables in a SQL
database:
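
A hedged sketch of such an example follows. The customers and orders tables, their columns, and the index name are assumptions for illustration, and the statements run on SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# First table: one row per customer.
conn.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    )
""")

# Second table: each order points at a customer through a foreign key.
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        order_date  TEXT NOT NULL
    )
""")

# Performance consideration: index the column used to join the two tables.
conn.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
)]
print(tables)

# The foreign key rejects an order for a customer that does not exist.
try:
    conn.execute(
        "INSERT INTO orders (customer_id, order_date) VALUES (999, '2024-01-01')"
    )
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
print(fk_enforced)
```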

Syntax for DDL,DML,DCL,VIEW :-
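
The syntax for this section is not included in the notes, so the following is only an illustrative sketch. The products table, the cheap_products view, and the analyst role are hypothetical. The statements run on SQLite through Python's sqlite3 module; SQLite does not implement GRANT or REVOKE, so the DCL statements are listed as text and not executed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and alter structure.
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.execute("ALTER TABLE products ADD COLUMN stock INTEGER DEFAULT 0")

# DML: change the data.
conn.execute("INSERT INTO products (name, price) VALUES ('pen', 1.5), ('book', 12.0)")
conn.execute("UPDATE products SET price = 2.0 WHERE name = 'pen'")
conn.execute("DELETE FROM products WHERE name = 'book'")

# VIEW: a stored query, created with DDL.
conn.execute(
    "CREATE VIEW cheap_products AS SELECT name, price FROM products WHERE price < 5"
)

# DQL: read the data, from the table and from the view.
product_rows = conn.execute("SELECT name, price, stock FROM products").fetchall()
view_rows = conn.execute("SELECT name, price FROM cheap_products").fetchall()

# DCL: typical syntax in server databases (not executed here; role is hypothetical).
dcl_examples = [
    "GRANT SELECT ON products TO analyst",
    "REVOKE SELECT ON products FROM analyst",
]

# DDL: drop the objects again.
conn.execute("DROP VIEW cheap_products")
conn.execute("DROP TABLE products")
remaining_objects = conn.execute("SELECT name FROM sqlite_master").fetchall()
print(product_rows, view_rows, remaining_objects)
```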


Nth highest salary :-
To find the Nth highest salary within each department, you can use a window
function along with a common table expression (CTE) in SQL. Here's how you can
do it:
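
One way to write this is sketched below, assuming a hypothetical employees table with name, department and salary columns and invented rows. DENSE_RANK() is used so that tied salaries share a rank and the Nth distinct salary is returned. The query runs on SQLite through Python's sqlite3 module, with N passed as a parameter.

```python
import sqlite3

N = 2  # which highest salary to return (assumption for the demo)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Asha", "eng", 120), ("Ben", "eng", 100), ("Cara", "eng", 100),
     ("Dev", "eng", 90), ("Eli", "ops", 70), ("Fay", "ops", 60)],
)

nth_salary_sql = """
WITH RankedSalaries AS (
    SELECT
        name,
        department,
        salary,
        -- Equal salaries get the same rank, with no gaps in the numbering
        DENSE_RANK() OVER (
            PARTITION BY department
            ORDER BY salary DESC
        ) AS salary_rank
    FROM employees
)
SELECT department, name, salary
FROM RankedSalaries
WHERE salary_rank = ?
ORDER BY department, name
"""
nth_rows = conn.execute(nth_salary_sql, (N,)).fetchall()
for row in nth_rows:
    print(row)
```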
Consecutive Orders :-

If you want to find consecutive orders in a table based on a certain
criterion, such as consecutive order IDs or consecutive timestamps, you
can use window functions to achieve this in SQL. Here's an example:

Let's say you have a table named orders with columns order_id and
order_date, and you want to find consecutive orders based on the order
date.
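
A possible sketch for this case is given below. It treats two orders as consecutive when an order is placed exactly one day after the previous one, which is an assumption about the criterion, and the sample rows are invented. LAG() fetches the previous order date; the date arithmetic uses SQLite's date() function because the example runs on SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2024-03-01"), (2, "2024-03-02"), (3, "2024-03-03"),
     (4, "2024-03-07"), (5, "2024-03-08")],
)

consecutive_sql = """
WITH OrderedOrders AS (
    SELECT
        order_id,
        order_date,
        -- Date of the order that comes immediately before this one
        LAG(order_date) OVER (ORDER BY order_date) AS prev_order_date
    FROM orders
)
SELECT order_id, prev_order_date, order_date
FROM OrderedOrders
-- Keep orders placed exactly one day after the previous order
WHERE order_date = date(prev_order_date, '+1 day')
ORDER BY order_date
"""
consecutive_rows = conn.execute(consecutive_sql).fetchall()
for row in consecutive_rows:
    print(row)
```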

Customers who order every month :-

-- "Every month" is read here as all 12 months of a calendar year.
WITH MonthlyOrders AS (
    -- One row per customer per calendar month with at least one order
    SELECT DISTINCT
        CustomerID,
        EXTRACT(YEAR FROM OrderDate) AS order_year,
        EXTRACT(MONTH FROM OrderDate) AS order_month
    FROM Orders
),
FullYearCustomers AS (
    -- Keep the customer-years in which all 12 months have an order
    SELECT
        CustomerID,
        order_year
    FROM MonthlyOrders
    GROUP BY CustomerID, order_year
    HAVING COUNT(*) = 12
)
SELECT
    o.CustomerID,
    MIN(o.OrderDate) AS start_date,
    MAX(o.OrderDate) AS end_date
FROM Orders o
JOIN FullYearCustomers f
    ON o.CustomerID = f.CustomerID
    AND EXTRACT(YEAR FROM o.OrderDate) = f.order_year
GROUP BY o.CustomerID, f.order_year
ORDER BY o.CustomerID, start_date;

Running average of a restaurant's orders for the current day and the next 3 days :-

To calculate the running average of a restaurant's orders over the current day
and the next 3 days, you can use a window function whose frame looks ahead from
the current row. The query assumes one row per restaurant per day, so that the
3 following rows correspond to the next 3 days.

WITH DailyOrders AS (
    SELECT
        order_date,
        restaurant_id,
        order_count,
        -- Frame = current day plus the next 3 days (4 rows at most)
        AVG(order_count) OVER (
            PARTITION BY restaurant_id
            ORDER BY order_date
            ROWS BETWEEN CURRENT ROW AND 3 FOLLOWING
        ) AS running_average
    FROM orders
)
SELECT
    order_date,
    restaurant_id,
    order_count,
    running_average
FROM DailyOrders
-- Filter after the window is computed so each day still sees its following rows
WHERE order_date BETWEEN CURRENT_DATE AND CURRENT_DATE + INTERVAL '3 days'
ORDER BY restaurant_id, order_date;
