Oracle Interview Prep Guide
Whether you are a fresher or an experienced professional, this guide will help you
start your preparation.
Both Varchar & Varchar2 are the Oracle data types which are used to store
character strings of variable length. To point out the major differences between
these,
Q2. What are the components of logical database structure in Oracle database?
A table is a basic unit of data storage in the Oracle database. A table basically
contains all the accessible information of a user in rows and columns.
To create a new table in the database, use the “CREATE TABLE” statement.
First, you have to name that table and define its columns and datatype for each
column.
Here,
Q4. Explain the relationship among database, tablespace and data file?
A join compares and combines rows from two or more tables in a database and
returns the specific rows that match.
There are three types of joins in SQL that are used to write the subqueries.
The RAW datatype in Oracle is used to store variable-length binary data or byte
string values. The maximum size of a RAW column in a table is 2000 bytes by
default, or 32767 bytes when extended data types are enabled
(MAX_STRING_SIZE = EXTENDED).
You might get confused as to when to use RAW, VARCHAR, and VARCHAR2. The major
difference is that Oracle does not interpret RAW data, and hence no character set
conversion is performed when RAW data is transferred between different systems.
This data type can only be queried or inserted in a table.
Average
Count
Sum
A view is a logical table based on one or more tables or views. A view is also
referred to as a user-defined database object that stores a SQL query whose
results can be referenced later. Views do not store the data physically; they act
as virtual tables, which is why a view can be referred to as a logical table. The
tables on which a view is defined are called base tables; the view itself does not
contain data.
It is possible to store pictures in the database by using the LONG RAW data type.
This data type is used to store binary data of up to 2 GB. However, a table can
have only one LONG RAW column.
Both DECODE and CASE work like an if-then-else statement, and each can be used as
an alternative to the other. These functions are used in Oracle for data value
transformation.
Example:
Decode function
SELECT OrderNum,
       DECODE(Status, 'O', 'Ordered', 'P', 'Packed', 'S', 'Shipped', 'A', 'Arrived')
FROM Orders;
Case function
SELECT OrderNum,
       CASE WHEN Status = 'O' THEN 'Ordered'
            WHEN Status = 'P' THEN 'Packed'
            WHEN Status = 'S' THEN 'Shipped'
            ELSE 'Arrived'
       END
FROM Orders;
Both these commands will display Order Numbers with their respective Statuses
like this,
Status O= Ordered
Status P= Packed
Status S= Shipped
Status A= Arrived
Q14. What do you mean by Merge in Oracle and how can you merge two
tables?
The Dual table is basically a one-column table that is present in the Oracle
database. This table has a single Varchar2(1) column called Dummy which has a
value of ‘X’.
Domain Integrity
Referential Integrity
Domain Integrity
1. Select: Data Retrieval
2. Insert, Update, Delete, Merge: Data Manipulation Language
(DML)
3. Create, Alter, Drop, Rename, Truncate: Data Definition Language
(DDL)
4. Commit, Rollback, Savepoint: Transaction Control Statements
5. Grant, Revoke: Data Control Language (DCL)
Q18. Briefly explain what is Literal? Give an example where it can be used?
Also note that Date and character literals must be enclosed within single quotes
(‘ ‘), whereas you don’t have to do that for the number literals.
To display row numbers along with the records, you can do this:
The above query will display the row numbers and the field values from the given
table.
SQL Functions are a very powerful feature of SQL. These functions can take
arguments but always return some value. There are two distinct types of SQL
functions available. They are:
1. Character
2. Number
3. Date
4. Conversion
5. General
1. avg
2. count
3. max
4. min
5. sum
6. stddev
7. variance
Q22. Describe different types of General Function used in SQL?
Types of subqueries:
Q24. What is the use of Double Ampersand (&&) in SQL Queries? Give an
example
You can use && if you want to reuse the variable value without prompting the
user each time.
Each Cursor in Oracle has a set of attributes that enables an application program
to test the state of the Cursor. The attributes can be used to check whether the
cursor is opened or closed, found or not found and also find row count.
Q28. What is the fastest query method to fetch data from the table?
The fastest query method to fetch data from the table is by using the Row ID. A
row can be fetched from a table by using RowID.
There is no difference between these joins. Cartesian join and cross join are
the same.
A cross join gives the Cartesian product of two tables, i.e., each row of the first
table is paired with every row of the second table.
A cross join without a WHERE clause returns the full Cartesian product.
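As a rough illustration of the Cartesian product, the sketch below uses Python's built-in sqlite3 module as a stand-in for an Oracle session. The sizes and colours tables are made up for the example.

```python
import sqlite3

# Hypothetical tables: 2 sizes and 3 colours.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sizes (size TEXT);
    CREATE TABLE colours (colour TEXT);
    INSERT INTO sizes VALUES ('S'), ('M');
    INSERT INTO colours VALUES ('red'), ('green'), ('blue');
""")

# With no WHERE clause, every row of sizes is paired with every row of colours.
pairs = conn.execute(
    "SELECT size, colour FROM sizes CROSS JOIN colours"
).fetchall()

print(len(pairs))  # 2 rows x 3 rows = 6 combinations
```

The row count of the result is the product of the two table sizes, which is why cross joins grow quickly on large tables.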
Using ON DELETE CASCADE, you can automatically delete the matching records in the
child table when a record is deleted from the parent table. It is specified as
part of a foreign key constraint.
Syntax:
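To see the cascade behaviour in action, here is a minimal sketch that assumes SQLite (through Python's sqlite3 module) in place of Oracle. The departments and employees tables are hypothetical, and the PRAGMA line is SQLite-specific; Oracle enforces foreign keys without it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (
        emp_id  INTEGER PRIMARY KEY,
        dept_id INTEGER REFERENCES departments(dept_id) ON DELETE CASCADE
    );
    INSERT INTO departments VALUES (10, 'Sales'), (20, 'HR');
    INSERT INTO employees VALUES (1, 10), (2, 10), (3, 20);
""")

# Deleting the parent row also removes its child rows (employees 1 and 2).
conn.execute("DELETE FROM departments WHERE dept_id = 10")
remaining = conn.execute(
    "SELECT emp_id FROM employees ORDER BY emp_id"
).fetchall()
print(remaining)
```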
Now let’s move on to the next part of this Oracle Interview Questions article.
There are a lot of characteristics of PL/SQL. Some notable ones among them are:
There are two data types available in PL/SQL. They are namely:
Composite datatypes
Triggers are the programs which are automatically executed when some events
occur:
Q35. Show how functions and procedures are called in a PL SQL block
Q36. What are the two virtual tables available at the time of database trigger
execution?
Q37. What are the differences between Primary Key and Unique Key?
Q38. Explain the purpose of %TYPE and %ROWTYPE data types with the
example?
%ROWTYPE and %TYPE are the attributes in PL/SQL which can inherit the
datatypes of a table that are defined in a database. The main purpose of using
these attributes in Oracle is to provide data independence and integrity. Also
note that, if any of the datatypes gets changed in the database, PL/SQL code gets
updated automatically including the change in the datatypes.
%TYPE: This is used for declaring a variable that needs to have the same data
type as of a table column.
%ROWTYPE: This is used to define a complete row of record having a
structure similar to the structure of a table.
Q41. What is the difference between COUNT (*), COUNT (expression), COUNT
(distinct expression)?
Q43. List out the difference between Commit, Rollback, and Savepoint?
Q45. Point out the difference between USER TABLES and DATA DICTIONARY?
Mirroring is the process of keeping copies of the redo log files. This is done by
creating a group of log files together. It ensures that LGWR automatically writes
to all the members of the current online redo log group. If the group fails, the
database automatically switches over to the next group, which diminishes the
performance of the database.
Q50. What is the difference between a hot backup and a cold backup in Oracle?
Explain about their benefits as well
A database can be defined as the structured form of data storage from which data can be retrieved
and managed based on requirements. Basically, a database consists of tables where data is stored
in an organized manner. Each table consists of rows and columns to store data. Data can be stored,
modified, updated, and accessed easily in a database. For instance, a bank management database
or school management database are a few examples of databases.
DBMS is the software that allows storing, modifying, and retrieving data from a database. It is a
group of programs that act as the interface between data and applications. DBMS supports
receiving queries from applications and retrieving data from the database.
Like DBMS, RDBMS is also software that allows storing, modifying, and retrieving data from a
database, but specifically a relational database. In a relational database, the data in the tables
are related to each other. Besides, RDBMS is useful when data in tables must be managed securely
and consistently.
3. What are Query and Query language?
A query is nothing but a request sent to a database to retrieve data or information. The required
data can be retrieved from a table or many tables in the database.
Query languages use various types of queries to retrieve data from databases. SQL, Datalog, and
AQL are a few examples of query languages; however, SQL is known to be the widely used query
language. SQL returns data as columns and rows in a table, whereas other languages return data in
other forms, like graphs, charts, etc.
A subquery is a query that exists inside statements such as SELECT, INSERT, UPDATE, and DELETE. It
may exist inside another subquery too. A subquery is also known as an inner query or inner select.
The statement that contains a subquery is the outer query or outer select.
For example, a subquery using the SELECT statement can return the maximum unit price, and the
outer query, also using the SELECT statement, can then return the orders that match that
price.
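A minimal sketch of such a subquery is given below, using Python's sqlite3 module so it can be run without a database server. The orders table and its columns are assumed for illustration and are not taken from the original example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, product TEXT, unit_price REAL);
    INSERT INTO orders VALUES (1, 'pen', 2.5), (2, 'laptop', 900.0), (3, 'desk', 150.0);
""")

# The inner SELECT finds the maximum unit price;
# the outer SELECT returns the orders that have that price.
top_orders = conn.execute("""
    SELECT order_id, product
    FROM orders
    WHERE unit_price = (SELECT MAX(unit_price) FROM orders)
""").fetchall()
print(top_orders)
```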
SQL is known as the query programming language. It uses SQL queries to store, modify and
retrieve data into and from databases. Briefly, SQL inserts, updates, and deletes data in databases;
creates new databases and new tables; creates views and stored procedures; and sets permissions
on the database objects.
Dynamic SQL is the programming method that allows building SQL statements during runtime. You
can use dynamic SQL when you do not know the full text of the SQL statements used in the
program until runtime. Moreover, dynamic SQL can execute SQL statements that are not supported
by static SQL programs. So, Dynamic SQL helps to build more flexible applications.
Tables are the database objects where data is stored logically. Like a spreadsheet, data is stored in
the form of rows and columns in a database table. A row in a table represents a record, and
columns represent the different fields. Fields have the data types such as text, dates, numbers, and
links.
For example, consider the below customer database in which rows consist of the company names
and columns consist of the various details of customers like first name, last name, age, location,
etc. Here, number 1 indicates a record, number 2 indicates a field, and number 3 indicates the field
value.
Partitioned tables
Temporary tables
System tables
Wide tables
Temporary tables only store data during the current session, and they will be dropped once the
session is over. With temporary tables, you can create, read, update and delete records like
permanent tables. Know that there are two types of temporary tables: local and global temporary
tables.
Local temporary tables are only visible to the user who created them, and they are deleted the
moment the user disconnects from the instance of the SQL server.
On the contrary, global temporary tables are visible to all users, and they are deleted only when all
the users who reference the tables get disconnected.
10. What do you mean by Primary Key and Foreign Key in SQL?
Primary Key: A primary key is a field or combination of fields that uniquely identifies records in
a table. Note that there can be only one primary key for a table. The table that has the primary
key is known as the parent table.
Foreign Key: A foreign key is the field or combination of fields of a table that refers to the
primary key of another table. A foreign key is used to create a connection between two tables.
Unlike a primary key, a table can have one or many foreign keys. The table that has a foreign key
is known as the child table.
For example, customer ID (1) is the primary key of the Customers table, and customer ID (2) in the
Orders table is identified as the foreign key to the Customers table.
A super key is a single attribute or a combination of attributes that identifies a record in a
table. A super key can contain extra attributes that are not strictly necessary to identify the
records.
A candidate key is a minimal super key: it can have one or more attributes, and, unlike a super
key, every attribute of a candidate key is needed to identify the records.
Note that every candidate key is a super key, but not every super key is a candidate key.
A composite key is a combination of two or more columns in a table used to identify a row in the
table. The combination is essential because a single column of a composite key cannot identify a
row on its own. A primary key made up of more than one column is a composite key.
JOIN is the logical operation used to retrieve data from two or more tables. It can be applied only
when there is a logical relationship between two tables. Moreover, the JOIN operator uses the data
of one table to retrieve data from another table.
INNER JOIN
LEFT (OUTER) JOIN
RIGHT (OUTER) JOIN
FULL (OUTER) JOIN
CROSS JOIN
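The sketch below contrasts INNER JOIN with LEFT OUTER JOIN on two small made-up tables, using Python's sqlite3 module as a stand-in database. RIGHT and FULL joins are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
    INSERT INTO orders VALUES (100, 1, 'book'), (101, 1, 'lamp'), (102, 2, 'chair');
""")

# INNER JOIN keeps only customers that have at least one matching order.
inner_rows = conn.execute("""
    SELECT c.name, o.item
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()

# LEFT OUTER JOIN keeps every customer; missing orders show up as NULL (None).
left_rows = conn.execute("""
    SELECT c.name, o.item
    FROM customers c
    LEFT OUTER JOIN orders o ON o.customer_id = c.customer_id
    ORDER BY c.customer_id, o.order_id
""").fetchall()

print(inner_rows)
print(left_rows)
```

Chen has no orders, so Chen appears only in the outer join result.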
In self-join operation, a table is joined with itself to retrieve the desired data. Every join operation
needs two tables as a basic rule. Therefore, in self-join, a table is joined with an instance of the
same table. By doing this, values of the two table columns are compared with each other, and the
desired data is retrieved as the result set.
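A common use of a self-join is pairing employees with their managers when both live in the same table. The sketch below assumes a hypothetical employees table and uses Python's sqlite3 module so that it runs anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES (1, 'Dana', NULL), (2, 'Eli', 1), (3, 'Fay', 1);
""")

# The same table is used twice under two aliases: e (employee) and m (manager).
emp_manager = conn.execute("""
    SELECT e.name, m.name
    FROM employees e
    JOIN employees m ON e.manager_id = m.emp_id
    ORDER BY e.emp_id
""").fetchall()
print(emp_manager)
```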
Cross Join is basically the Cartesian product type in which each row in a table is paired with all the
rows of another table. So, the result set will be the paired combinations of the rows of two tables.
Generally, cross join is not preferred by developers as it increases complexity in programs when
there are many rows in tables. But, it can be used in queries if you identify normal join operation
won’t be effective for your query.
16. What are the SQL constraints?
SQL constraints specify conditions for a column or table to manage the data stored in tables
effectively.
NOT NULL - This condition ensures columns won’t accept a NULL value.
UNIQUE - It ensures that all the values in a column must be unique.
CHECK - It ensures that all the column fields obey a specific condition.
DEFAULT - It provides a default value for the fields of a column when no value is specified
for the fields
CREATE INDEX - It ensures creating an index for tables so that retrieving data from the tables
becomes easier
PRIMARY KEY - It must identify every row of a table
FOREIGN KEY - It must link tables based on common attributes
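The following sketch shows a few of these constraints rejecting bad rows. It assumes SQLite through Python's sqlite3 module, and the accounts table is invented for the example; error classes and messages differ in Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        account_id INTEGER PRIMARY KEY,
        email      TEXT NOT NULL UNIQUE,
        balance    REAL DEFAULT 0 CHECK (balance >= 0)
    )
""")
conn.execute("INSERT INTO accounts (account_id, email) VALUES (1, 'a@example.com')")

def violates(sql):
    """Return True if the statement is rejected by a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# DEFAULT filled in the balance that the first INSERT did not supply.
default_balance = conn.execute(
    "SELECT balance FROM accounts WHERE account_id = 1"
).fetchone()[0]

null_rejected = violates("INSERT INTO accounts (account_id, email) VALUES (2, NULL)")
dup_rejected = violates("INSERT INTO accounts (account_id, email) VALUES (3, 'a@example.com')")
check_rejected = violates(
    "INSERT INTO accounts (account_id, email, balance) VALUES (4, 'b@example.com', -5)"
)
print(default_balance, null_rejected, dup_rejected, check_rejected)
```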
Local variables are declared inside a function so that only that function can call them. They only
exist until the execution of that specific function. Generally, local variables are stored in stack
memory and cleaned up automatically.
Global variables are declared outside of a function. They are available until the execution of the
entire program. Unlike local variables, global variables are stored in fixed memory and not cleaned
up automatically.
An index is used to retrieve data from a database quickly. Generally, indexes have keys taken from
the columns of tables and views. We can say, SQL indexes are similar to the indexes in books that
help to identify pages in the books quickly.
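As a small, hedged illustration, the sketch below creates an index on a hypothetical customers table in SQLite and asks the engine for its query plan. The exact plan text varies between SQLite versions, and Oracle uses EXPLAIN PLAN with a different output format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    INSERT INTO customers VALUES (1, 'Asha', 'Pune'), (2, 'Ben', 'Oslo'), (3, 'Chen', 'Pune');
    CREATE INDEX idx_customers_city ON customers (city);
""")

# Ask SQLite how it would run the query; the last column holds the description.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM customers WHERE city = 'Pune'"
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
print(plan_text)  # expected to mention idx_customers_city instead of a full scan
```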
Clustered indexes
Non-clustered indexes
There are five types of SQL commands offered in SQL. They are given as follows:
DQL: SELECT
CREATE: It allows the creation of database objects such as tables, views, and indexes.
TRUNCATE: This command helps to delete all the rows of a table permanently.
REVOKE: This command can be applied if you want to restrict the access of database
objects by other users.
ROLLBACK: This command helps undo the transactions made in a database with the
condition that the transactions shouldn't be saved yet.
SAVEPOINT: This command helps to roll the transactions up to a certain point but not the
entire transaction.
A stored procedure is a routine that consists of a group of statements that can be stored and
executed whenever required. Know that stored procedures are compiled only once. They are stored
as a 'Named Object' in the SQL Server database. Stored procedures can be called at any time
during program execution. Moreover, a stored procedure can call another stored procedure.
SQL offers the flexibility to developers to use built-in functions as well as user-defined functions.
Aggregate functions: They process a group of values and return a single value. They can
combine with GROUP BY, OVER, HAVING clauses and return values. They are deterministic
functions.
Analytic functions: They are similar to aggregate functions but return multiple rows as
result set after processing a group of values. They help calculate moving averages, running
totals, Top-N results, percentages, etc.
Ranking functions: They return ranking values for rows in a table based on the given
conditions. Here, the results are non-deterministic.
Rowset functions: They return an object used as the table reference.
Scalar functions: They operate on a single value and return a single value.
27. Mention the different types of operators used in SQL?
There are six types of operators used in SQL. They are given as follows:
Comparison Operators: Equal to, Not equal to, Greater than, Not greater than, Less than,
Not less than, etc.
Compound Operators: Add equals, Multiply equals, Subtract equals, Divide equals, and Modulo equals
Logical Operators: ALL, ANY/SOME, AND, BETWEEN, NOT, EXISTS, OR, IN, LIKE, and ISNULL
There are four types of set operators available in SQL. They are given as follows:
Union: This operator allows combining result sets of two or more SELECT statements.
Union All: This operator allows combining result sets of two or more SELECT statements
along with duplicates.
Intersect: This operator returns the common records of the result sets of two or more
SELECT statements.
Minus: This operator returns the exclusive records of the first table when two tables
undergo this operation.
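The four set operators can be compared side by side with a short sketch. It assumes two hypothetical customer tables and uses Python's sqlite3 module; note that SQLite spells MINUS as EXCEPT, so EXCEPT stands in for Oracle's MINUS here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store_customers (customer_id INTEGER);
    CREATE TABLE online_customers (customer_id INTEGER);
    INSERT INTO store_customers VALUES (1), (2), (3);
    INSERT INTO online_customers VALUES (2), (3), (4);
""")

def ids(operator):
    """Combine the two SELECTs with the given set operator and return sorted ids."""
    sql = f"""
        SELECT customer_id FROM store_customers
        {operator}
        SELECT customer_id FROM online_customers
        ORDER BY customer_id
    """
    return [row[0] for row in conn.execute(sql)]

union_ids = ids("UNION")          # duplicates removed
union_all_ids = ids("UNION ALL")  # duplicates kept
intersect_ids = ids("INTERSECT")  # rows present in both
minus_ids = ids("EXCEPT")         # rows only in the first query (MINUS in Oracle)
print(union_ids, union_all_ids, intersect_ids, minus_ids)
```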
29. What do you mean by buffer pool and mention its benefits?
A buffer pool in SQL is also known as a buffer cache. All the resources can store their cached data
pages in a buffer pool. The size of the buffer pool can be defined during the configuration of an
instance of SQL Server. The number of pages that can be stored in a buffer pool depends on its
size.
A tuple is a single row in a table that represents a single record of a relation. A tuple contains all the
data that belongs to a record. At the same time, tuple functions allow retrieving tuples from a
database table. They are extensively used in analysis services that have multidimensional
structures.
For example, the highlighted row in the below table shows all the data belonging to a customer,
which is nothing but a tuple.
31. What do you mean by dependency and mention the different dependencies?
Dependency is the relation between the attributes of a table. The following are the different types
of dependencies in SQL.
Functional dependency
Fully-functional dependency
Multivalued dependency
Transitive dependency
Partial dependency
Data integrity ensures the accuracy and consistency of data stored in a database. Data integrity, in
a way, represents the data quality. So, the data characteristics defined for a column should be
satisfied while storing data in the column. For instance, if a column in a table is supposed to store
numeric values, then it should not accept alphabetic values; otherwise, data integrity is lost in
the table.
33. What is Database Cardinality?
Database Cardinality denotes the uniqueness of values in the tables. It supports optimizing query
plans and hence improves query performance. There are three types of database cardinalities in
SQL, as given below:
Higher Cardinality
Normal Cardinality
Lower Cardinality
It is the process that reduces data redundancy and improves data integrity by restructuring the
relational database.
In general, the result set of a SQL statement is a set of rows. If we need to manipulate the result set,
we can act on a single row of the result set at a time. Cursors are the extensions to the result set
and help point a row in the result set. Here, the pointed row is known as the current row.
Forward Only: It is known as the firehose cursor that can make only a forward movement.
The modification made by the current user and other users is visible while using this cursor.
As it is the forward-moving cursor, it fetches rows of the result set from the start to end
serially.
Static: This cursor can move forward and backward on the result set. Here, only the same
result set is visible throughout the lifetime of the cursor. In other words, once the cursor is
open, it doesn’t show any changes made in the database that is the source for the result set.
Keyset: This cursor is managed by a set of identifiers known as keys or keysets. Here, the
keysets are built by the columns that derive the rows of a result set. When we use this
cursor, we can’t view the records created by other users. Similarly, if any user deletes a
record, we can’t access that record too.
Dynamic: Unlike static cursors, once the cursor is open, all the modifications performed in
the database are reflected in the result set. The UPDATE, INSERT and DELETE operations
made by other users can be viewed while scrolling the cursor.
Entities are real-world objects that are distinct and independent. Rows of a table represent
the members of the entity, and columns represent the attributes of the entity. For instance, a list
of employees of a company is an entity where employee name, ID, address, etc., are the attributes
of the entity.
A relationship indicates how entities in a database are related to each other. Simply put, how a row
in a table is related to row(s) of another table in a database. The relationship is made using the
primary key and the foreign key primarily.
One-to-one relationship
One-to-many relationship
Many-to-many relationship
Triggers are special stored procedures. When a matching event occurs in the SQL Server,
triggers are fired automatically.
LOGON triggers: They get fired when a user starts a Logon event
DML Triggers: They get fired when there is a modification in data due to DML
The schema represents the logical structures of data. Using schemas, the database objects can be
grouped logically in a database. Schema is useful for segregating database objects based on
different applications, controlling access permissions, and managing a database's security aspects.
Simply put, schemas ensure database security and consistency.
Advantages:
41. What is the difference between char and varchar data types?
Char data type is a fixed-length data type in which the length of the character cannot be changed
during execution. It supports storing normal and alphanumeric characters.
On the other hand, varchar is the variable-length data type in which the length of the character can
be changed during execution. That's why, it is known as a dynamic data type.
COUNT
SUM
AVG
MAX
MIN
LOWER
UPPER
INITCAP
CONCAT
SUBSTR
LENGTH
INSTR
LPAD
RPAD
TRIM
REPLACE
45. How would you differentiate single-row functions from multiple-row functions?
Single-row functions act on one row of a table at a time and return one result for each row
processed. Length and case conversions are examples of single-row functions.
Multiple-row functions act on multiple rows of a table at a time. They are also called group
functions and return a single output for the whole group of rows.
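The difference is easy to see on a tiny table. In the sketch below (Python's sqlite3 module, with an invented employees table), the single-row functions return one value per row, while the group functions collapse all rows into one result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, salary INTEGER);
    INSERT INTO employees VALUES ('asha', 100), ('ben', 200), ('chen', 300);
""")

# Single-row functions: one output row for every input row.
per_row = conn.execute(
    "SELECT UPPER(name), LENGTH(name) FROM employees ORDER BY salary"
).fetchall()

# Multiple-row (group) functions: one output row for the whole table.
grouped = conn.execute(
    "SELECT COUNT(*), SUM(salary), AVG(salary) FROM employees"
).fetchone()

print(per_row)
print(grouped)
```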
Experienced:
SQL vs NoSQL
SQL: Stores data in tables based on schemas so that data are organized and structured.
NoSQL: No specific method is followed for data storage, so it offers flexibility in storing data.
SQL: Scaling is performed vertically, increasing the processing power of servers.
NoSQL: Scaling is performed horizontally, adding more servers and nodes.
SQL MySQL
Generally, an index is created in a separate table. They are the pointers that indicate the address of
data in a database table. An index helps speed up querying and the data retrieval process in a
database.
On the other hand, a view is a virtual table created from the rows and columns of one or more
tables. The main thing about a view is that the rows and columns are grouped logically. With the
support of views, you can restrict access to the entire data in a database.
49. What is the use of Views, and mention its types in SQL?
Views are the virtual database tables created by selecting rows and columns from one or more
tables in a database. They support developers in multiple ways, such as simplifying complex
queries, restricting access to queries, and summarising data from many tables.
System-defined views: They can be used for specific purposes and perform specific actions
only. They provide information about the properties of databases and tables.
User-defined views: They are created as per the requirements of users.
LONG: Helps store large-scale semi-structured and unstructured data. Subqueries cannot select
LONG data types.
LOB: Known as Large Objects. It is used to store large-size data. Subqueries can select LOB
data types.
51. What is the difference between Zero and NULL values in SQL?
When a field in a column doesn’t have any value, it is said to be having a NULL value. Simply put,
NULL is the blank field in a table. It can be considered as an unassigned, unknown, or unavailable
value. On the contrary, zero is a number, and it is an available, assigned, and known value.
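To make the distinction concrete, the sketch below stores a zero and a NULL in a hypothetical scores table (using Python's sqlite3 module) and shows how comparisons and aggregates treat them differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (student TEXT, marks INTEGER);
    INSERT INTO scores VALUES ('Asha', 0), ('Ben', NULL), ('Chen', 10);
""")

# Zero is an ordinary value and can be compared with =.
zero_rows = conn.execute("SELECT student FROM scores WHERE marks = 0").fetchall()
# NULL must be tested with IS NULL; "= NULL" never matches any row.
null_rows = conn.execute("SELECT student FROM scores WHERE marks IS NULL").fetchall()
eq_null_rows = conn.execute("SELECT student FROM scores WHERE marks = NULL").fetchall()

# COUNT(*) counts rows, while COUNT(marks) and AVG(marks) skip NULLs but include zero.
count_all, count_marks, avg_marks = conn.execute(
    "SELECT COUNT(*), COUNT(marks), AVG(marks) FROM scores"
).fetchone()
print(zero_rows, null_rows, eq_null_rows, count_all, count_marks, avg_marks)
```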
52. What is the difference between INNER JOIN and OUTER JOIN?
Database testing is also known as back-end testing. It consists of the SQL queries executed to
validate database operations, data structures, and attributes of a database. It helps to ensure the
data integrity by eliminating duplicate entries of data in a database, failing which will create many
problems while managing the database. Besides, it deals with testable items hidden and not visible
to users.
Blackbox testing helps to examine the functionality of a database. It is performed by validating the
integration level of a database. The incoming and outgoing data are verified by various test cases
such as the cause-effect graphing technique, equivalence partitioning, and boundary value analysis.
This kind of testing can be performed at the early stages of development to ensure better
performance.
In a database, default values are substituted when no value is assigned to a field in a table column.
Basically, each column can be specified with a default value. In this way, SQL server management
studio specifies default values, which can be created only for the current databases. Note that if the
default value exceeds the size of the column field, it can be truncated.
56. What is SQL Injection, and how to avoid it?
SQL injection is an attack in which malicious SQL is inserted into input strings that are later
passed to the SQL server for execution. To avoid SQL injection, all user input must be validated
before it is allowed to reach a statement, and it should never be concatenated directly into SQL.
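One widely used defence is to bind user input as a parameter instead of concatenating it into the statement. The sketch below is a minimal demonstration with Python's sqlite3 module; the users table and the malicious input are made up, and other drivers use different placeholder styles (for example named binds in Oracle).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (username TEXT, secret TEXT);
    INSERT INTO users VALUES ('alice', 'a-secret'), ('bob', 'b-secret');
""")

malicious = "nobody' OR '1'='1"

# Unsafe: string concatenation lets the input rewrite the WHERE clause.
unsafe_sql = "SELECT username FROM users WHERE username = '" + malicious + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: the ? placeholder binds the input as a plain value, never as SQL.
safe_rows = conn.execute(
    "SELECT username FROM users WHERE username = ?", (malicious,)
).fetchall()

print(len(unsafe_rows))  # every user is returned
print(len(safe_rows))    # no user has that literal name
```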
In addition to that, the following methods can be applied to avoid SQL injections. They are given as
follows:
58. Write the SQL statements that can be used to return even number records and odd number
records?
You can use the following statement to retrieve even number records from a table.
You can use the following statement to retrieve odd number records from a table.
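A hedged sketch of both queries follows. It numbers the rows with ROW_NUMBER() and filters on the remainder after dividing by 2. It assumes SQLite 3.25 or later (through Python's sqlite3 module) and a made-up students table; in Oracle the same idea is usually written with ROWNUM and MOD(rn, 2).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (name TEXT);
    INSERT INTO students VALUES ('Asha'), ('Ben'), ('Chen'), ('Dev'), ('Eve');
""")

# Number the rows in name order, then keep those whose number has the wanted remainder.
numbered = """
    SELECT name FROM (
        SELECT name, ROW_NUMBER() OVER (ORDER BY name) AS rn FROM students
    )
    WHERE rn % 2 = ?
    ORDER BY rn
"""
even_rows = [row[0] for row in conn.execute(numbered, (0,))]  # rows 2, 4
odd_rows = [row[0] for row in conn.execute(numbered, (1,))]   # rows 1, 3, 5
print(even_rows, odd_rows)
```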
SQL aliases assign temporary names to a table or column. They are used to simplify table or
column names. Aliases exist only for the duration of the query. An alias can be created using the
'AS' keyword. Know that creating an alias does not affect the column names in the database.
Aliases are especially useful when more than one table is involved in a query.
60. What is the difference between OLAP and OLTP?
OLAP is known as Online Analytical Processing. It consists of tools used for data analysis that will be
used for making better decisions. It can work on multiple database systems' historical data and
provide valuable insights. For example, NETFLIX and SPOTIFY generate insights from past data.
On the other hand, OLTP is known as Online Transaction Processing, and it works on operational
data. OLTP manages ACID properties during transactions. Specifically, it performs faster than OLAP
so that it can be used in online ticket booking, messaging services, etc.
Data inconsistency occurs when the same data exists in many tables in different formats. In other
words, the same information about an object or person may be spread across the database in
various places creating duplication. It decreases the reliability of the data and decreases the query
performance significantly. To overcome this drawback, we can use constraints on the database.
Collation allows data to be sorted and compared with pre-defined rules. These rules help to store,
access and compare data effectively. The collation rules are applied while executing insert,
select, update and delete operations. SQL servers can store objects that have different collations
in a single database. Note that collation offers case sensitivity and accent sensitivity for datasets.
A copy of a table can be created from an existing table using a combination of the CREATE TABLE
and SELECT statements. Using these statements, you can select all the columns or specific columns
from an existing table. As a result, the new table will be populated with the selected values of
the existing table. Here, the WHERE clause can select specific rows from the table.
CREATE TABLE NEW_TABLE_NAME AS
SELECT [column1,column2,…..columnN]
FROM EXISTING_TABLE_NAME1
[WHERE]
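The pattern can be tried out with the short sketch below, which uses Python's sqlite3 module. The customers and pune_customers tables are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, name TEXT, city TEXT);
    INSERT INTO customers VALUES (1, 'Asha', 'Pune'), (2, 'Ben', 'Oslo'), (3, 'Chen', 'Pune');

    -- Copy two columns, and only the rows that satisfy the WHERE clause.
    CREATE TABLE pune_customers AS
        SELECT customer_id, name FROM customers WHERE city = 'Pune';
""")

copied = conn.execute(
    "SELECT * FROM pune_customers ORDER BY customer_id"
).fetchall()
print(copied)
```

Note that a copy made this way carries over the data but generally not constraints or indexes of the original table.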
We can fetch common records using the INTERSECT operator in SQL. It returns only the records that
are common to both queries, with duplicates removed.
The syntax for this statement is given as below:
SELECT CustomerID
FROM <first table>
INTERSECT
SELECT CustomerID
FROM <second table>
65. What are the common clauses used with SELECT Statements?
The common clauses such as FOR, ORDER BY, GROUP BY, and HAVING are used with SELECT
statements.
FOR Clause - it specifies the different formats for viewing result sets such as browser mode
cursor, XML, and JSON file.
ORDER BY Clause - It sorts the data returned by a query in a specific order. It helps to
determine the order for ranking functions.
GROUP BY Clause - It groups the result set of the SELECT statement. It returns one row per
group.
HAVING Clause – It is used with the GROUP BY clause and specifies a search condition for a
group.
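The sketch below combines GROUP BY, HAVING and ORDER BY in one query. It runs on SQLite through Python's sqlite3 module with an invented sales table; the FOR clause is specific to SQL Server and is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('east', 50), ('east', 70), ('west', 20), ('north', 200);
""")

totals = conn.execute("""
    SELECT region, SUM(amount) AS total
    FROM sales
    GROUP BY region            -- one row per region
    HAVING SUM(amount) > 100   -- keep only groups whose total exceeds 100
    ORDER BY total DESC        -- largest total first
""").fetchall()
print(totals)
```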
66. What is COALESCE and describe any two properties of COALESCE functions?
COALESCE is an expression that evaluates the arguments in a list in order and returns the first
non-NULL value. A statement whose argument list starts with NULL followed by 14 will return 14,
since the first value in the list is NULL.
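A quick way to check this behaviour is the sketch below, run with Python's sqlite3 module. The argument list NULL, 14, 15 is an assumed example chosen to match the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The first argument is NULL, so the next non-NULL argument is returned.
first_non_null = conn.execute("SELECT COALESCE(NULL, 14, 15)").fetchone()[0]

# When every argument is NULL, the result is NULL (None in Python).
all_null = conn.execute("SELECT COALESCE(NULL, NULL)").fetchone()[0]

print(first_non_null, all_null)
```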
Properties of COALESCE function:
MERGE allows combining the INSERT, DELETE and UPDATE operations in a single statement. This
statement can be applied when two tables have complex matching characteristics. Though the MERGE
statement seems complex, it offers many advantages to developers once they get familiar with it.
It reduces I/O operations significantly and allows data to be read only from the source.
Clauses are built-in parts of SQL statements. They help to retrieve data quickly and efficiently.
Clauses are much needed by developers when there is a large volume of data in a database. The
result set of a clause can be a matched pattern, a group, or an ordered list.
WHERE Clause
OR Clause
And Clause
Like Clause
Limit Clause
Order By
Group By
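If you need to rename a table in SQL, the statement depends on the database. Oracle uses
RENAME old_name TO new_name (or ALTER TABLE old_name RENAME TO new_name), while some
platforms, such as Azure Synapse, use the RENAME OBJECT statement.
You have to execute the following steps to change a table name using SQL.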
SQL | PL/SQL
SQL executes queries such as creating tables, deleting tables, and inserting into tables. | It is used to write program blocks, functions, procedures, triggers, packages, and cursors.
Mainly, it is used to retrieve data from databases and modify tables. | Used for creating web applications and server pages.
CHAR | VARCHAR
It is a fixed-length character string data type. | It is a variable-length character string data type.
The data type can be single-byte or multiple-byte. | It can accept character strings up to 255 bytes in some databases; Oracle VARCHAR2 columns accept up to 4000 bytes by default.
This data type can be used when the character length is known. | This data type is used when the character length is not clear.
This is used when the character length of the data is the same. | This is used when the character length of the data is variable.
75. List out the factors that affect the query performance?
The following are the factors that affect the performance of queries.
Workload
Throughput
Resources
Optimization
Contention
UNION: It is the operator that combines the results of two separate queries into a single result
set and removes duplicate rows. Both queries must return the same number of columns with
compatible data types.
INTERSECT: It is the operator that returns only the distinct rows that appear in the results of
both queries.
DROP | TRUNCATE
All the constraints will be removed after the execution of the DROP statement. | Constraints are not affected by the execution of this statement.
The structure of the table is also removed. | The structure of the table is not affected.
This statement is used to select distinct values from a table. The table might consist of many
duplicate records, whereas this statement helps to return only the distinct values.
FROM table_name1;
79. How can you differentiate the RANK and DENSE_RANK functions?
Both RANK and DENSE_RANK are ranking functions that rank rows based on the ordering you
specify. RANK assigns the same rank to rows with equal values and then skips the following
positions, so there is a gap in the numbering after a tie. DENSE_RANK, on the other hand, also
assigns the same rank to equal values but does not skip any position, so it returns continuous
ranking numbers.
The following example will explain the use of the RANK and DENSE_RANK functions.
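A minimal sketch of the difference, run through Python's sqlite3 module (window functions need SQLite 3.25 or later). The scores table and its rows are hypothetical, and SQLite merely stands in for the database server.

```python
import sqlite3

def rank_scores():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE scores (name TEXT, score INTEGER);
        INSERT INTO scores VALUES ('a', 90), ('b', 90), ('c', 80), ('d', 70);
    """)
    rows = conn.execute("""
        SELECT name,
               RANK()       OVER (ORDER BY score DESC) AS rnk,
               DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rnk
        FROM scores
        ORDER BY score DESC, name
    """).fetchall()
    conn.close()
    return rows

for row in rank_scores():
    print(row)
# ('a', 1, 1)
# ('b', 1, 1)
# ('c', 3, 2)   <- RANK skips position 2, DENSE_RANK does not
# ('d', 4, 3)
```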
Both IN and BETWEEN operators are used to return records for multiple values from a table. The IN
operator is used to return records from a table for the multiple values specified in the statement.
On the other side, BETWEEN operator is used to return records within a range of values specified in
the statement.
Both STUFF and REPLACE statements are used to replace characters in a string. The STUFF
statement inserts the specific characters in a string replacing existing characters. In comparison,
the REPLACE statement replaces existing characters with specific characters throughout the string.
Output: rajan
Output: ramarathar
The COMMIT statement saves the changes made in a transaction permanently. Once a
transaction is committed, its changes cannot be undone with ROLLBACK.
SELECT *
FROM Staff
sql>COMMIT;
This statement grants permissions for users to perform operations such as SELECT, UPDATE,
INSERT, DELETE, or any other operations on tables and views.
For example, if you would like to provide access to a user for updating tables, then the following
statement must be used. Because of WITH GRANT OPTION, that user can in turn grant the same
permission to other users.
GRANT UPDATE ON table_name TO user_name WITH GRANT OPTION
84. What is the difference between White Box Testing and Black Box Testing?
Black Box Testing | White Box Testing
Testing is known as outer or external software testing. | Testing is known as inner or internal software testing.
Programming knowledge is not required for testers. | Programming knowledge is a must for testers.
Extracting – It is about extracting data from the source, which can be a data warehouse, CRMs,
databases, etc.
Loading – It is the process of loading the transformed data into the new destination. There are two
types of loading data: full loading and incremental loading.
If a trigger fires another trigger while being executed, it is known as a NESTED trigger. Nested
triggers can be fired while executing DDL and DML operations such as INSERT, DROP and UPDATE.
Nested triggers help to back up the rows affected by the previous trigger. There are two types of
nested triggers: AFTER triggers and INSTEAD OF triggers.
We can use the INSERT INTO statement to insert multiple rows in a database table in SQL.
The following syntax can be used for this case:
When two processes repeat the same type of interaction continually without making any progress
in the query processing, it leads to a live-lock situation in the SQL server. There is no waiting state in
live-lock, but the processes are happening concurrently, forming a closed loop.
For example, let us assume process A holds a resource D1 and requests resource D2. At the same
time, assume that process B holds a resource D2 and requests resource D1. This situation cannot
progress any further until one of the processes either releases the resource it holds or withdraws
its request.
Equi-join creates a join operation to match the values of the relative tables. The syntax for this
operation can be given as follows:
SELECT column_list
On the other hand, a non-equi join matches rows using a comparison other than equality. It works
with operators such as <, >, >=, and <= in the join condition.
SELECT *
FROM table_name1,table_name2
There are three types of SQL sandboxes. They are given as follows:
Safe access sandbox
Unsafe access sandbox
External access sandbox
It is the process of converting row and page locks into table locks. Reducing lock escalation can
increase server performance. To improve performance, we need to keep transactions short and
keep the lock footprint of queries as low as possible. Besides, we can disable lock escalation at
the table and instance levels, but it is not recommended.
The UPDATE statement allows you to update a database table in SQL. After the execution, one or
more columns in a table will be replaced by new values.
UPDATE table_name
SET
Column1 = new_value1,
Column2 = new_value2,
..…..
WHERE
Condition;
This statement requires a table name, new values, and a condition to select the rows. Here, the
WHERE clause is not mandatory. If the WHERE clause is omitted, all the rows in the table will be
updated with the new values.
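The effect of the WHERE clause on UPDATE can be sketched as below. This is only an illustration using Python's sqlite3 module as a stand-in engine; the staff table and its rows are assumptions.

```python
import sqlite3

def update_demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE staff (id INTEGER, city TEXT);
        INSERT INTO staff VALUES (1, 'Pune'), (2, 'Delhi'), (3, 'Pune');
    """)
    # With WHERE: only the matching row is changed.
    cur = conn.execute("UPDATE staff SET city = 'Mumbai' WHERE id = 2")
    with_where = cur.rowcount
    # Without WHERE: every row in the table is changed.
    cur = conn.execute("UPDATE staff SET city = 'Chennai'")
    without_where = cur.rowcount
    cities = [r[0] for r in conn.execute("SELECT DISTINCT city FROM staff")]
    conn.close()
    return with_where, without_where, cities

print(update_demo())  # (1, 3, ['Chennai'])
```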
USE AdventureWorks2012;
GO
@LastName nvarchar(25),
@FirstName nvarchar(25)
AS
SET NOCOUNT ON
FROM HR.vEmployeeDivisionHistory
GO
You can use the following statement to run the newly created stored procedure.
When a foreign key is created under this option, and if a referenced row in the parent table is
deleted, the referencing row(s) in a child table also gets deleted.
On similar tracks, when a referenced row is updated in a parent table, the referencing row(s) in a
child table is also updated.
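The delete behavior described above can be sketched with Python's sqlite3 module. SQLite is only a stand-in here (it needs PRAGMA foreign_keys = ON to enforce foreign keys), and the dept and emp tables are hypothetical.

```python
import sqlite3

def cascade_delete_demo():
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE dept (dept_id INTEGER PRIMARY KEY);
        CREATE TABLE emp (
            emp_id  INTEGER PRIMARY KEY,
            dept_id INTEGER REFERENCES dept(dept_id) ON DELETE CASCADE
        );
        INSERT INTO dept VALUES (10), (20);
        INSERT INTO emp VALUES (1, 10), (2, 10), (3, 20);
    """)
    # Deleting the parent row also deletes the child rows that reference it.
    conn.execute("DELETE FROM dept WHERE dept_id = 10")
    remaining = [r[0] for r in conn.execute("SELECT emp_id FROM emp ORDER BY emp_id")]
    conn.close()
    return remaining

print(cascade_delete_demo())  # [3]
```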
Single-column indexes
Unique indexes
Composite indexes
Implicit indexes
96. What do you mean by auto-increment?
It is a unique number that will be generated when a new record is inserted into a table. Mainly, it
acts as the primary key for a table.
We can use the LIKE command in SQL to identify patterns in a database using character strings.
Generally, a pattern may be identified using wildcard characters or regular characters. So, pattern
matching can be performed using both wildcard characters and string comparison characters as
well. However, pattern matching through wildcard characters is more flexible than using string
comparison characters.
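A small sketch of wildcard matching with LIKE, using Python's sqlite3 module as a stand-in engine. The emp table and names are hypothetical. Note that SQLite's LIKE ignores ASCII case by default while Oracle's is case-sensitive, so the data and patterns here are all uppercase to avoid that difference.

```python
import sqlite3

def names_matching(pattern):
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE emp (ename TEXT);
        INSERT INTO emp VALUES ('SMITH'), ('SCOTT'), ('ALLEN'), ('ADAMS');
    """)
    rows = conn.execute(
        "SELECT ename FROM emp WHERE ename LIKE ? ORDER BY ename", (pattern,)
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]

print(names_matching("S%"))   # % matches any run of characters: ['SCOTT', 'SMITH']
print(names_matching("_L%"))  # _ matches exactly one character: ['ALLEN']
print(names_matching("%S"))   # names ending in S: ['ADAMS']
```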
Blocking is a phenomenon that occurs when a process 'P1' locks a resource 'A', and the same
resource is requested by another process 'P2'. Process 'P2' can access resource 'A' only when
process 'P1' releases the lock, so it has to wait until then. The SQL server does not interfere or
stop any process in this scenario.
On the contrary, deadlocking occurs when resource 'A' is locked by process 'P1' and requested by
process 'P2' while, at the same time, resource 'B' is locked by process 'P2' and requested by
process 'P1'. Neither process can proceed, so the wait would never end. Therefore, the SQL server
intervenes and stops one of the processes to remove the deadlock.
The COALESCE function returns the first value that is non-NULL in the expression list, whereas
ISNULL is used to replace a NULL value in the expression with a specified replacement value.
FROM table_name;
SELECT column(s),ISNULL(column_name,value_to_replace)
FROM table_name;
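As a rough illustration, the snippet below evaluates COALESCE in an in-memory SQLite database through Python. SQLite has no ISNULL function with this meaning, so its two-argument IFNULL is used as an assumed stand-in for ISNULL.

```python
import sqlite3

def coalesce_demo():
    conn = sqlite3.connect(":memory:")
    row = conn.execute("""
        SELECT COALESCE(NULL, 14, 15),  -- first non-NULL argument
               COALESCE(NULL, NULL),    -- all arguments NULL, so the result is NULL
               IFNULL(NULL, 0)          -- stand-in for ISNULL(value, replacement)
    """).fetchone()
    conn.close()
    return row

print(coalesce_demo())  # (14, None, 0)
```

COALESCE accepts any number of arguments, while the two-argument form only replaces a single NULL value.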
100. What is the difference between NVL and the NVL2 functions in SQL?
Both the functions are used to find whether the first argument in the expression is NULL. The NVL
function in the SQL query returns the second argument if the first argument is NULL. Otherwise, it
returns the first argument.
The NVL2 function in SQL query returns the third argument if the first argument is NULL.
Otherwise, the second argument is returned.
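The two rules can be modelled in a few lines of Python. This is a sketch of the semantics only, with Python's None playing the role of NULL; the function names nvl and nvl2 are illustrative and are not Oracle APIs callable from Python.

```python
def nvl(expr1, expr2):
    # NVL: return expr2 when expr1 is NULL, otherwise expr1 itself.
    return expr2 if expr1 is None else expr1

def nvl2(expr1, expr2, expr3):
    # NVL2: return expr3 when expr1 is NULL, otherwise expr2.
    return expr3 if expr1 is None else expr2

print(nvl(None, 0))                         # 0
print(nvl(500, 0))                          # 500
print(nvl2(None, "has value", "no value"))  # no value
print(nvl2(500, "has value", "no value"))   # has value
```

Note that NVL2 never returns its first argument; it only tests it.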
Conclusion
All of us know that knowledge is power. After reading this blog, we hope you have gathered
good knowledge about SQL and understood it in depth. Read through the questions and answers a
few more times. It will help you get familiar with the terminology and syntax used in this blog.
2. What is PL/SQL?
Oracle PL/SQL is a procedural language that has both interactive SQL and procedural
programming language constructs such as iteration and conditional branching.
PL/SQL uses a block structure as its basic structure. Anonymous blocks or nested blocks can be
used in PL/SQL.
A trigger is a database object that automatically executes in response to some events on the
tables or views. It is used to apply the integrity constraint to database objects.
A PL/SQL program unit associated with a particular database table is called a database trigger.
It is used for:
There are various kinds of data types present in PL/SQL. They are:
1. Scalar: The scalar data type is a one-dimensional data type with no internal components.
CHAR, DATE, LONG, VARCHAR2, NUMBER, and BOOLEAN are some examples of the scalar
data type.
2. Composite: The composite data type is made up of different data types that are easy to
update and have internal components that can be utilized and modified together. For
instance, RECORD, TABLE, VARRAY, and so on.
3. Reference: The reference data type stores pointers, which are values that relate to other
programs or data elements. REF CURSOR is an example of the reference data type.
4. Large Object: The large object data type stores locators, which define the location of large
items stored out of line such as video clips, graphic images, and so on. BLOB, BFILE, CLOB,
and NCLOB are examples of the large object data type.
6. What are the Basic Parts of a Trigger?
Syntax checking, binding, and P-code generation are all part of the compilation process. Syntax
checking looks for compilation issues in PL/SQL code. After all mistakes have been fixed, the
data holding variables are given a storage address. This process is referred to as binding. The
PL/SQL engine’s P-code is a set of instructions. For named blocks, P-code is saved in the
database and used the next time it is run.
8. What is a Join?
A join is a query that combines rows from two or more tables, views, or materialized views. A
join is performed by the Oracle Database whenever there are multiple tables in the FROM
clause of the query. Most of these queries contain at least one join condition, either in the
FROM or WHERE clause.
9. What is a View?
A view is created by joining one or more tables. It is a virtual table that is based on the result
set of an SQL statement; it contains rows and columns, just like a real table. A view can be
created with the CREATE VIEW statement.
Intermediate PL SQL Interview Questions
COMMIT: The COMMIT command saves changes to a database permanently during the current
transaction.
SAVEPOINT: The SAVEPOINT command saves the current point with a unique name during the
processing of a transaction.
Exception handling is a mechanism for dealing with runtime errors in PL/SQL. PL/SQL provides
an exception section in each block, where handlers catch the raised exceptions, thus helping
the programmer to find the fault and resolve it. When an error occurs, control transfers to the
matching handler in that section. There are two different types of exceptions defined in PL/SQL:
User-defined exception
System-defined exception
DDL: Data definition language (DDL) helps in the creation of a database structure or schema.
CREATE, DROP, ALTER, RENAME, and TRUNCATE are the five types of DDL commands in SQL.
DML: Data manipulation language (DML) allows you to insert, change, and delete data from
a database instance. DML is in charge of making all kinds of changes to a database’s data.
The database application and the user can insert data and information using three basic
commands—INSERT, UPDATE, and DELETE.
DCL: GRANT and REVOKE are the commands in the data control language (DCL) that are
used to grant and withdraw rights and permissions. Other permissions control the
parameters of the database system.
TCL: Transaction control language (TCL) commands deal with database transactions. Some
of the TCL commands are COMMIT, ROLLBACK, and SAVEPOINT.
DQL: Data query language (DQL) is used to retrieve data from the database. It just has one
command, which is SELECT.
16. What are the Different Methods to Trace the PL/SQL Code?
Tracing the code is a crucial technique to measure its performance during the runtime.
The different methods of tracing the code include:
DBMS_APPLICATION_INFO
DBMS_TRACE
DBMS_SESSION and DBMS_MONITOR
trcsess and tkprof utilities
IN: The IN parameter allows you to send values to the procedure that is being called. The IN
parameter can be set to default values. It behaves as a constant and cannot be changed.
OUT: The OUT parameter returns a value to the caller. The OUT parameter is an uninitialized
variable that cannot be used in expressions.
IN OUT: The IN OUT parameter sends starting values to a procedure and returns the
updated values to the caller. This parameter should be treated as an initialized variable and
given a value.
PL/SQL records are a collection of values. To put it another way, PL/SQL records are a collection
of many pieces of information, each of which is of a simpler type and can be associated with
one another as fields.
We use an index on a table to allow quick access to rows. For queries that return a small
percentage of a table's rows, an index allows quicker access to data.
Functions: The main purpose of PL/SQL functions is to compute and return a single value.
The functions have a return type in their specifications and must return a specified value in
that type.
Procedures: Procedures do not have a return type and should not return any value, but
they can have a return statement that simply stops its execution and returns to the caller.
Procedures are used to return multiple values; otherwise, they are generally similar to
functions.
Packages: Packages are schema objects that group logically related PL/SQL types, items,
and subprograms. You can also say that packages are a group of functions, procedures,
variables, and record TYPE statements. Packages provide modularity, which aids in
application development. Packages are used to hide information from unauthorized users.
A stored procedure is a sequence of statements or a named PL/SQL block that performs one or
more specific functions. It is similar to a procedure in other programming languages. It is
stored in the database and can be repeatedly executed. It is stored as a schema object and can
be nested, invoked, and parameterized.
When the same procedure name is declared more than once with parameters that differ in
number, order, or data type, it is referred to as procedure overloading.
Expressions are made up of a series of literals and variables that are separated by operators.
Operators are used in PL/SQL to manipulate, compare, and calculate data. Expressions are
made up of two parts, operators and operands.
25. Which Cursor Attributes are the Result of a Saved DML
Statement, when it is Executed?
The statement’s result is saved in four cursor attributes. The four attributes are:
SQL%FOUND
SQL%NOTFOUND
SQL%ROWCOUNT
SQL%ISOPEN
A cursor is a temporary work area that is created in system memory when an SQL statement is
executed. A cursor contains information on a select statement and the row of data accessed by
it. This temporary work area stores the data, which is retrieved from the database, to
manipulate it. A cursor can hold more than one row but can process only one row at a time. A
cursor is required to process rows individually for queries.
Explicit Cursor: A programmer declares and names an explicit cursor for the queries that
return more than one row. An explicit cursor is a SELECT statement that is declared explicitly
in the current block’s declaration section or in a package definition. The following are the
commands that are used for explicit cursors in PL/SQL:
o OPEN
o FETCH
o CLOSE
When the OPEN cursor command is used to open a cursor, it performs the following
operations:
Stored procedures have various advantages to help you design sophisticated database
systems. Some of the advantages of stored procedures as listed below:
Better performance
Higher productivity
Ease of use
Increased scalability
Interoperability
Advance security
Replication
31. What are the Various Types of Schema Objects that can be
Created by PL/SQL?
There are various types of schema objects that are created by PL/SQL. Some of them are
mentioned below:
Implicit records are handy since they do not require hard-coded descriptions. Because implicit
records are based on database table records, any changes to the database table records will be
reflected in the implicit records automatically.
In PL/SQL, comments help readability by describing the purpose and function of code
portions. Two types of comments are available in PL/SQL. They are as follows:
The %TYPE attribute is used to declare a variable that takes its data type from a column in a
table. The variable's data type is the same as the table's column.
35. What is %ROWTYPE?
The %ROWTYPE property is used to declare a variable that contains the structure of the records
in a table. The variable’s data type is the same as the table’s columns.
A temporary tablespace is used to store temporary items such as sort structures, while a
permanent tablespace is used to store things that will be used as the database’s genuine
objects.
A mutating table error (ORA-04091) occurs when a row-level trigger queries or modifies the same
table that the triggering statement is currently changing. It can be fixed by using views or
temporary tables so that the database selects from one and updates the other.
39. What does the PLVtab Enable you to do when you Show the
Contents of PL/SQL Tables?
PLVtab enables you to do following when you show the contents of PL/SQL tables:
To save a msg in a table, you either load the individual messages with calls to the add_text
procedure or load sets of messages from a database table using the load_from_dbms
procedure.
41. What are Pseudocolumns and how do they work? How can
Pseudocolumns be used in Procedure Statements?
Pseudocolumns aren’t genuine table columns but they behave like them. Pseudocolumns are
used to retrieve specific information in SQL statements. Although pseudocolumns are
recognized by PL/SQL as part of SQL statements, they cannot be used directly in a procedural
language. The following are the pseudocolumns that are used:
The strings are concatenated using the || operator. The || operator can be used in both
DBMS_OUTPUT.PUT_LINE calls and SELECT statements.
The value of the error number for the most recent error detected is returned by SQLCODE. The
SQLERRM function returns the actual error message for the most recent issue. They can be
used in exception handling to report or save the error that happened in the code in the error
log database. These are especially important for the exception WHEN OTHERS.
This procedure can be used to send user-defined error messages from stored subprograms.
Reporting failures to your application in this way avoids returning unhandled exceptions. It
can appear in two places, the executable section and the exception section.
The SQL%NOTFOUND attribute can be used to determine whether or not the UPDATE
statement successfully changed any records. If the last SQL statement run had no effect on any
rows, this attribute returns TRUE.
47. How can you locate a PL/SQL Block when a Cursor is Open?
The %ISOPEN variable cursor status can be used to find the PL/SQL block.
The key differences between stored procedure and stored function are:
Returning a value from a stored procedure is optional (through OUT parameters), while a
stored function must return a value with its RETURN statement.
A stored procedure can have IN, OUT, and IN OUT parameters, while a stored function normally
takes only IN parameters, because a function that has OUT or IN OUT parameters cannot be
called from a SQL statement.
A stored function can be called from a SQL statement (subject to some restrictions), whereas a
stored procedure cannot. In PL/SQL, both can contain an exception-handling section.
Use the following code to display the highest salary from an employee table:
Select max(sal) from emp;
The USER_SOURCE data dictionary view stores the source code of user-defined functions and
procedures. To examine them, the function and procedure names should be specified in
uppercase (in the select command). The following command is used to inspect the source code
of a user-defined function or procedure:
Select text from user_source where name='PROCEDURE_NAME';
Join is a keyword that is used to query data from multiple tables based on the relationship
between the fields of tables. Keys play a major role in Joins.
The PL/SQL output is shown on the screen using the DBMS_OUTPUT package. get_line, put_line,
new_line, and many more procedures are found in DBMS_OUTPUT. The put_line procedure, which
is a part of the DBMS_OUTPUT package, is used to display a line of information.
Use the EXECUTE keyword. The EXEC keyword can also be used.
Call the name of the procedure from a PL/SQL block.
Syntax
EXECUTE procedure_name;
Or
Exec procedure_name;
60. What are the differences between ANY and ALL operators?
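ALL Operator: The value is compared to every value returned by the subquery, and the
condition is true only if the comparison holds for all of them.
ANY Operator: The value is compared to each value returned by the subquery, and the
condition is true if the comparison holds for at least one of them. SOME is a synonym for the
ANY operator.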
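The difference can be modelled with Python's built-in any() and all(). This is a sketch of the comparison logic only, not of how a database executes the query: the list stands for the values a hypothetical subquery returns, and NULL handling is left out.

```python
def greater_than_any(value, subquery_values):
    # Models: value > ANY (subquery) -- true if it beats at least one value.
    return any(value > v for v in subquery_values)

def greater_than_all(value, subquery_values):
    # Models: value > ALL (subquery) -- true only if it beats every value.
    return all(value > v for v in subquery_values)

salaries = [1000, 2000, 3000]  # hypothetical subquery result

print(greater_than_any(1500, salaries))  # True  (1500 > 1000)
print(greater_than_all(1500, salaries))  # False (1500 is not > 2000)
print(greater_than_all(3500, salaries))  # True
```

In practice, > ANY behaves like "greater than the minimum" and > ALL like "greater than the maximum" of the subquery values.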
One can switch from the init.ora file (pfile) to an spfile by using the CREATE SPFILE FROM PFILE
command.
A subquery is a query within another query. The outer query is known as the main query and
the inner query is called the subquery. A subquery is executed first, and the result of the
subquery is passed to the main query.
Correlated
Non-correlated
64. How can you Read/Write Files in PL/SQL?
One can read/write operating system text files by using the UTL_FILE package. It provides a
restricted version of the operating system stream file I/O, and it is available for both client-side
and server-side PL/SQL.
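DECLARE
fileHandler UTL_FILE.FILE_TYPE;
BEGIN
-- Open the file for writing; valid open modes include 'r', 'w', and 'a'
fileHandler := UTL_FILE.FOPEN('/home/oracle/tmp', 'myoutput', 'w');
-- %s is replaced by the argument and \n writes a line terminator
UTL_FILE.PUTF(fileHandler, 'Value of func1 is %s\n', func1(2));
-- Close the same handle that was opened above
UTL_FILE.FCLOSE(fileHandler);
END;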
In PL/SQL, one of the collection types is nested tables. They can be made in a PL/SQL block or
at the schema level. These are similar to a 1D array, except their size can be dynamically
extended.
SQL stands for Structured Query Language. SQL is a language used to communicate with the server
to access, manipulate, and control data.
1. Data Retrieval: SELECT
An alias is a user-defined alternative name given to a column or table. By default, column alias
headings appear in uppercase. Enclose the alias in double quotation marks (" ") to make it
case-sensitive or to include spaces. The optional AS keyword before the alias name makes the
SELECT clause easier to read.
For example: Select emp_name AS name from employee; (Here AS is a keyword and "name" is an
alias).
A literal is a character, a number, or a date that is included in the SELECT list and that is not a
column name or a column alias. Date and character literals must be enclosed within single
quotation marks (' '); number literals need not be.
For exp: Select last_name||' is a '||job_id AS "emp details" from employee; (Here ' is a ' is a
literal).
SQL iSQL*Plus
Is a Language Is an Environment
3 Comparison conditions
5 [NOT] BETWEEN
8 OR logical condition
7) What are SQL functions? Describe in brief different types of SQL functions?
SQL Functions are a very powerful feature of SQL. SQL functions can take arguments but always
return some value.
There are two distinct types of SQL functions:
1) Single-Row functions: These functions operate on a single row to give one result per row.
1. Character
2. Number
3. Date
4. Conversion
5. General
2) Multiple-Row functions: These functions operate on groups of rows to give one result per
group of rows.
1. AVG
2. COUNT
3. MAX
4. MIN
5. SUM
6. STDDEV
7. VARIANCE
Character functions: accept character input and return both character and number values. Types
of character function are:
Number Functions: accept Numeric input and return numeric values. Number Functions are:
ROUND, TRUNC, and MOD
Date Functions: operate on values of the DATE data type. All date functions return a value of the
DATE data type except the MONTHS_BETWEEN function, which returns a number. Date functions
are MONTHS_BETWEEN, ADD_MONTHS, NEXT_DAY, LAST_DAY, ROUND, and TRUNC.
The dual table is owned by the user SYS and can be accessed by all users. It contains one
column, DUMMY, and one row with the value X. The Dual Table is useful when you want to return a
value only once. The value can be a constant, pseudocolumn, or expression that is not derived from
a table with user data.
Conversion Functions convert a value from one data type to another. Data type conversion is of
two types, implicit and explicit; the explicit conversion functions are:
1. TO_NUMBER
2. TO_CHAR
3. TO_DATE
TO_DATE function is used to convert a character string to a DATE value. The TO_DATE function can
use the fx modifier, which requires exact matching between the character argument and the date
format model. TO_DATE function format: TO_DATE(char[, 'format_model']).
For exp: Select TO_DATE('May 24, 2007', 'Mon DD, YYYY') from dual;
NVL: Converts a null value to an actual value. NVL (exp1, exp2). If exp1 is null, then the NVL
function returns the value of exp2.
NVL2: If exp1 is not null, nvl2 returns exp2, if exp1 is null, nvl2 returns exp3. The argument exp1
can have any data type. NVL2 (exp1, exp2, exp3)
NULLIF: Compares two expressions and returns null if they are equal or the first expression if they
are not equal. NULLIF (exp1, exp2)
COALESCE: Returns the first non-null expression in the expression list. COALESCE (exp1, exp2…
expn). The advantage of the COALESCE function over the NVL function is that the COALESCE
function can take multiple alternative values.
Conditional Expressions: Provide the use of IF-THEN-ELSE logic within a SQL statement. Example:
CASE Expression and DECODE Function.
12) What is the difference between COUNT (*), COUNT (expression), COUNT (distinct expression)?
(Where expression is any column name of Table)?
COUNT (*): Returns a number of rows in a table including duplicates rows and rows
containing null values in any of the columns.
COUNT (EXP): Returns the number of non-null values in the column identified by expression.
COUNT (DISTINCT EXP): Returns the number of unique, non-null values in the column
identified by expression.
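The three forms of COUNT can be compared side by side. The sketch below uses Python's sqlite3 module as a stand-in engine; the emp table and its comm column are assumptions chosen so that the column holds one NULL and one duplicate value.

```python
import sqlite3

def count_variants():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE emp (ename TEXT, comm INTEGER);
        INSERT INTO emp VALUES ('A', 100), ('B', 100), ('C', NULL), ('D', 200);
    """)
    row = conn.execute("""
        SELECT COUNT(*),             -- all rows, including NULLs and duplicates
               COUNT(comm),          -- rows where comm is not NULL
               COUNT(DISTINCT comm)  -- unique non-NULL comm values
        FROM emp
    """).fetchone()
    conn.close()
    return row

print(count_variants())  # (4, 3, 2)
```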
13) What is a Sub Query? Describe its Types?
Types of subqueries:
Single-Row Subquery: Queries that return only one row from the inner select statement.
Single-row comparison operators are: =, >, >=, <, <=, <>
Multiple-Row Subquery: Queries that return more than one row from the inner Select
statement. There are also multiple-column subqueries that return more than one column
from the inner select statement. Operators include: IN, ANY, ALL.
ANY Operator compares value to each value returned by the subquery. ANY operator has a
synonym SOME operator.
The MERGE statement inserts or updates rows in one table, using data from another table. It is
useful in data warehousing applications.
16) What is the difference between the “VERIFY” and the “FEEDBACK” command?
VERIFY Command: Use VERIFY Command to confirm the changes in the SQL statement (Old
and New values). Defined with SET VERIFY ON/OFF.
Feedback Command: Displays the number of records returned by a query.
17) What is the use of Double Ampersand (&&) in SQL Queries? Give an example?
Use “&&” if you want to reuse the variable value without prompting the user each time.
For ex: Select empno, ename, &&column_name from employee order by &column_name;
18) What are Joins and how many types of Joins are there?
Joins are used to retrieve data from more than one table.
Cartesian Join: When a Join condition is invalid or omitted completely, the result is a Cartesian
product, in which all combinations of rows are displayed. To avoid a Cartesian product, always
include a valid join condition in a “where” clause. To Join ‘N’ tables together, you need a minimum of
N-1 Join conditions.
For exp: to join four tables, a minimum of three joins is required. This rule may not apply if the
table has a concatenated primary key, in which case more than one column is required to uniquely
identify each row.
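To make the Cartesian product concrete, the sketch below counts the rows returned with and without a join condition. It relies on Python's sqlite3 module as a stand-in engine, and the emp and dept tables are hypothetical.

```python
import sqlite3

def join_row_counts():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dept (dept_id INTEGER, dname TEXT);
        CREATE TABLE emp (ename TEXT, dept_id INTEGER);
        INSERT INTO dept VALUES (10, 'SALES'), (20, 'HR');
        INSERT INTO emp VALUES ('A', 10), ('B', 10), ('C', 20);
    """)
    # No join condition: every emp row is paired with every dept row (3 x 2).
    cartesian = conn.execute("SELECT COUNT(*) FROM emp, dept").fetchone()[0]
    # One join condition for two tables (N - 1): each emp row matches one dept row.
    joined = conn.execute(
        "SELECT COUNT(*) FROM emp, dept WHERE emp.dept_id = dept.dept_id"
    ).fetchone()[0]
    conn.close()
    return cartesian, joined

print(join_row_counts())  # (6, 3)
```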
Equi Join: This type of Join involves primary and foreign key relations. Equi Join is also called Simple
or Inner Joins.
Non-Equi Joins: A Non-Equi Join is a join condition containing something other than an equality operator.
The relationship is obtained using an operator other than an equal operator (=). The conditions
such as <= and >= can be used, but BETWEEN is the simplest to represent Non-Equi Joins.
Outer Joins: Outer Join is used to fetch rows that do not meet the join condition. The outer join
operator is the plus sign (+), and it is placed on the side of the join that is deficient in information.
The Outer Join operator can appear on only one side of the expression, the side that has
information missing. It returns those rows from one table that has no direct match in the other
table. A condition involving an Outer Join cannot use IN and OR operators.
Cross Join:
The Cross Join clause produces the cross-product of two tables. This is the same as a Cartesian
product between the two tables.
Natural Joins:
This is used to join two tables automatically based on the columns which have matching data types
and names, using the keyword NATURAL JOIN. It is equal to the Equi-Join. If the columns have the
same names but different data types, then the Natural Join syntax causes an error.
If several columns have the same names but the data types do not match, then the NATURAL JOIN
clause can be modified with the USING clause to specify the columns that should be used for an
equi Join. Use the USING clause to match only one column when more than one column matches.
Do not use a table name or alias in the referenced columns. The NATURAL JOIN clause and USING
clause are mutually exclusive.
For example: SELECT a.city, b.dept_name FROM loc a JOIN dept b USING (loc_id) WHERE loc_id = 10;
Use the ON clause to specify a join condition. The ON clause makes the code easy to understand.
The ON clause can also be used to write Self Joins and to join columns that have different
names.
Left Outer Join displays all rows from the table to the left of the LEFT OUTER JOIN clause, Right
Outer Join displays all rows from the table to the right of the RIGHT OUTER JOIN clause, and Full
Outer Join displays all rows from the tables on both sides of the FULL OUTER JOIN clause.
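The ON clause and the outer join keywords can be sketched as follows. The emp and dept tables and the columns deptno, dname, empno, and mgr are assumed demo names used only for illustration.

```sql
-- ON clause: the join condition is stated explicitly.
SELECT e.ename, d.dname
FROM   emp e
JOIN   dept d ON e.deptno = d.deptno;

-- LEFT OUTER JOIN: every emp row, with a NULL dname where no department matches.
SELECT e.ename, d.dname
FROM   emp e
LEFT OUTER JOIN dept d ON e.deptno = d.deptno;

-- FULL OUTER JOIN: unmatched rows from both tables are kept.
SELECT e.ename, d.dname
FROM   emp e
FULL OUTER JOIN dept d ON e.deptno = d.deptno;

-- Self join through ON: each employee next to his or her manager.
SELECT w.ename AS worker, m.ename AS manager
FROM   emp w
JOIN   emp m ON w.mgr = m.empno;
```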
21) What is the difference between Entity, Attribute, and Tuple?
Entity: A significant thing about which some information is required. For example: EMPLOYEE (table).
Attribute: Something that describes the entity. For example: empno, emp name, emp address
(columns).
Tuple: A row in a relation is called a Tuple.
22) What is a Transaction? Describe common errors that can occur while executing any
Transaction?
A Transaction consists of a collection of DML statements that form a logical unit of work.
The common errors that can occur while executing any transaction are:
Locking prevents destructive interaction between concurrent transactions. Locks are held until
Commit or Rollback. Types of locking are:
COMMIT: Ends the current transaction by making all pending data changes permanent.
ROLLBACK: Ends the current transaction by discarding all pending data changes.
SAVEPOINT: Divides a transaction into smaller parts. You can roll back the transaction to a
particular named savepoint.
A column can be given a default value by using the DEFAULT option. This option prevents null
values from entering the column if a row is inserted without a value for that column. The DEFAULT
value can be a literal, an expression, or a SQL function such as SYSDATE and USER but the value
cannot be the name of another column or a pseudo column such as NEXTVAL or CURRVAL.
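A short sketch of the DEFAULT option follows. The table orders_demo and its columns are hypothetical and exist only to illustrate a literal default and two SQL function defaults.

```sql
-- orders_demo is a hypothetical table used only for illustration.
CREATE TABLE orders_demo (
  order_id   NUMBER(10)    NOT NULL,
  status     VARCHAR2(10)  DEFAULT 'NEW',    -- literal default
  created_on DATE          DEFAULT SYSDATE,  -- SQL function default
  created_by VARCHAR2(128) DEFAULT USER      -- SQL function default
);

-- status, created_on and created_by are filled from their defaults
-- because the INSERT supplies no value for them.
INSERT INTO orders_demo (order_id) VALUES (1);
```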
28) What is the difference between USER TABLES and DATA DICTIONARY?
USER TABLES: This is a collection of tables created and maintained by the user. They contain user
information.
DATA DICTIONARY: This is a collection of tables created and maintained by the Oracle
Server. It contains database information. All data dictionary tables are owned by the SYS
user.
A Data Type is a specific storage format used to store column values. A few data types used in SQL
are:
A LONG column is not copied when a table is created using a subquery. A LONG column cannot be
included in a GROUP BY or an ORDER BY clause. Only one LONG column can be used per table. No
constraint can be defined on a LONG column.
SET UNUSED option marks one or more columns as unused so that they can be dropped when the
demand on system resources is lower. Unused columns are treated as if they were dropped, even
though their column data remains in the table’s rows. After a column has been marked as unused,
you have no access to that column.
A select * query will not retrieve data from unused columns. In addition, the names and types of
columns marked unused will not be displayed during a DESCRIBE, and you can add to the table a
new column with the same name as an unused column. The SET UNUSED information is stored in
the USER_UNUSED_COL_TABS dictionary view.
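The SET UNUSED workflow described above might look like this. The table emp_demo and its commission column are hypothetical names chosen for the sketch.

```sql
-- Mark the column unused; it no longer appears in DESCRIBE or SELECT *.
ALTER TABLE emp_demo SET UNUSED (commission);

-- See which tables still carry unused columns.
SELECT * FROM user_unused_col_tabs;

-- Later, when the demand on system resources is lower,
-- physically remove the unused column data.
ALTER TABLE emp_demo DROP UNUSED COLUMNS;
```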
TRUNCATE: Removes all rows from a table and releases the storage space used by that table.
Database Triggers do not fire on TRUNCATE.
DELETE: Removes all rows from a table but does not release the storage space used by that table.
Database Triggers fire on DELETE.
CHAR pads blank spaces to a maximum length, whereas VARCHAR2 does not pad blank spaces.
35) What are Constraints? How many types of constraints are there?
Constraints are used to prevent invalid data entry or deletion if there are dependencies.
Constraints enforce rules at the table level. Constraints can be created either at the same time as
the table is created or after the table has been created. Constraints can be defined at the column or
table level. Constraints defined for a specific table can be viewed by querying the
USER_CONSTRAINTS data dictionary view. You can define any constraint at the table level except NOT
NULL, which is defined only at the column level. There are 5 types of constraints:
NOT NULL: NOT NULL Constraint ensures that the column contains no null values.
UNIQUE KEY: UNIQUE Key Constraint ensures that every value in a column or set of columns
must be unique, that is, no two rows of a table can have duplicate values in a specified
column or set of columns. If the UNIQUE constraint comprises more than one column, that
group of columns is called a Composite Unique Key. There can be more than one Unique key
on a table. A Unique Key Constraint allows the input of Null values. A Unique Key automatically
creates an index on the column or columns on which it is defined.
PRIMARY KEY: Uniquely identifies each row in the Table. Only one PRIMARY KEY can be
created for each table, but the table can have several UNIQUE constraints. PRIMARY KEY ensures
that no column that is part of the key can contain a NULL value. A Unique Index is automatically
created for a PRIMARY KEY column. PRIMARY KEY is called a Parent key.
FOREIGN KEY: This is also called Referential Integrity Constraint. FOREIGN KEY is one in
which a column or set of columns take references of the Primary/Unique key of the same or
another table. FOREIGN KEY is called a child key. A FOREIGN KEY value must match an
existing value in the parent table or be null.
CHECK: Defines a condition that each row must satisfy. A single column can have
multiple CHECK Constraints. In a CHECK constraint the following expressions are not allowed:
37) What is the main difference between Unique Key and Primary Key?
The main difference between Unique Key and Primary Key is:
Unique Key: A table can have more than one Unique Key. The unique key column can store NULL
values. It uniquely identifies each value in a column.
Primary Key: A table can have only one Primary Key. The primary key column cannot store NULL
values. It uniquely identifies each row in a table.
38) What is the difference between ON DELETE CASCADE and ON DELETE SET NULL?
ON DELETE CASCADE indicates that when a row in the parent table is deleted, the dependent
rows in the child table are also deleted. ON DELETE SET NULL converts the foreign key values to
null when the parent value is removed. Without the ON DELETE CASCADE or the ON DELETE SET NULL
options, the row in the parent table cannot be deleted if it is referenced in the child table.
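The two options can be compared side by side in a small sketch. All three tables below are hypothetical and created only for this illustration.

```sql
CREATE TABLE dept_demo (
  deptno NUMBER(4) PRIMARY KEY,
  dname  VARCHAR2(30)
);

CREATE TABLE emp_cascade_demo (
  empno  NUMBER(6) PRIMARY KEY,
  deptno NUMBER(4) REFERENCES dept_demo (deptno) ON DELETE CASCADE
);

CREATE TABLE emp_setnull_demo (
  empno  NUMBER(6) PRIMARY KEY,
  deptno NUMBER(4) REFERENCES dept_demo (deptno) ON DELETE SET NULL
);

-- Deleting a department removes its rows from emp_cascade_demo
-- and sets deptno to NULL in the matching rows of emp_setnull_demo.
DELETE FROM dept_demo WHERE deptno = 10;
```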
The columns in a table that can act as a Primary Key are called Candidate Keys.
A View logically represents subsets of data from one or more tables. A View is a logical table based
on a table or another view. A View contains no data of its own but is like a window through which
data from tables can be viewed or changed. The tables on which a view is based are called Base
Tables. The View is stored as a SELECT statement in the data dictionary. View definitions can be
retrieved from the data dictionary table: USER_VIEWS.
1. Group Functions
2. A Group By clause
3. The Distinct Keyword
4. The Pseudo column ROWNUM Keyword.
1. Group Functions
2. A Group By clause
3. The Distinct Keyword
4. The Pseudo column ROWNUM Keyword.
5. Columns defined by expressions (Ex; Salary * 12)
43) What is PL/SQL, Why do we need PL/SQL instead of SQL, Describe your experience working
with PLSQL and What are the difficulties faced while working with PL SQL and How did you
overcome them?
A Trigger is similar to a stored procedure, and it is automatically invoked whenever a DML
operation is performed against a table or view.
Statement Level Trigger: In a statement-level trigger, the trigger body is executed only once for
the DML statement.
Row Level Trigger: In a row-level trigger, the trigger body is executed once for each row affected
by the DML statement. For this reason we use the FOR EACH ROW clause in the trigger
specification. The values of the row being changed are available through the qualifiers old and
new, which are also called record type variables.
These qualifiers are used in the trigger specification and the trigger body.
Syntax
:old.column_name
Syntax
:new.column_name
When we use these qualifiers in the trigger specification, we are not allowed to put ":" in front
of the names of the qualifiers.
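A minimal sketch of a row-level trigger shows where the colon is and is not written. It assumes the emp table with empno and sal columns; the trigger name and the error number -20001 are illustrative choices.

```sql
-- Hypothetical trigger on the emp table (columns empno and sal assumed).
CREATE OR REPLACE TRIGGER trg_emp_sal_check
BEFORE UPDATE OF sal ON emp
FOR EACH ROW
WHEN (new.sal < old.sal)   -- trigger specification: no colon
BEGIN
  -- trigger body: the qualifiers are written with a leading colon
  RAISE_APPLICATION_ERROR(
    -20001,
    'Salary of employee ' || :old.empno || ' cannot be reduced from '
      || :old.sal || ' to ' || :new.sal);
END;
/
```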
declare
a exception;
begin
if to_char(sysdate, 'DY') = 'THU'
then
raise a;
end if;
exception
when a then
dbms_output.put_line('my exception raised on thursday');
end;
/
46) Write a PL/SQL program to retrieve the emp table and then display the salary?
declare
v_sal number(10);
begin
select max(sal) into v_sal
from emp;
dbms_output.put_line(v_sal);
end;
/
(or)
declare
A number(10);
B number(10);
C number(10);
begin
a:=70;
b:=30;
c:=greatest(a,b);
dbms_output.put_line(c);
end;
/
Output:70
47) Write a PL/SQL cursor program that is used to calculate total salary from emp table without
using sum() function?
Declare
cursor c1 is select sal from emp;
v_sal number(10);
n number(10) := 0;
begin
open c1;
loop
fetch c1 into v_sal;
exit when c1%notfound;
n := n + v_sal;
end loop;
dbms_output.put_line('total salary is' || ' ' || n);
close c1;
end;
/
48) Write a PL/SQL cursor program to display all employee names and their salary from the emp
table by using the %NOTFOUND attribute?
Declare
Cursor c1 is select ename, sal from emp;
v_ename varchar2(10);
v_sal number(10);
begin
open c1;
loop
fetch c1 into v_ename, v_sal;
exit when c1%notfound;
dbms_output.put_line(v_ename || ' ' || v_sal);
end loop;
close c1;
end;
/
In a row-level trigger based on a table, the trigger body cannot read data from the same
table, and we also cannot perform DML operations on the same table.
If we try this, the Oracle server returns the mutating error ORA-04091: table is mutating.
This error is called a mutating error, the trigger is called a mutating trigger, and the
table is called a mutating table.
Mutating errors do not occur in statement-level triggers, because the trigger body runs once,
before or after the whole DML statement, when the table is in a consistent state. A row-level
trigger fires while the statement is still changing rows, so when it reads data from the same
table the changes are only partly applied, and only then does the mutating error occur.
If we want to perform different operations for different triggering events, then we must test the
triggering event within the trigger body. This is done with the INSERTING, UPDATING, and DELETING
clauses. These clauses can be used in both statement-level and row-level triggers. They are also
called trigger predicate clauses.
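The predicates can be sketched in a single statement-level trigger. The emp table is assumed, and the trigger name and messages are illustrative.

```sql
-- Hypothetical statement-level trigger covering three DML events on emp.
CREATE OR REPLACE TRIGGER trg_emp_audit
AFTER INSERT OR UPDATE OR DELETE ON emp
BEGIN
  IF INSERTING THEN
    DBMS_OUTPUT.PUT_LINE('rows inserted into emp');
  ELSIF UPDATING THEN
    DBMS_OUTPUT.PUT_LINE('rows updated in emp');
  ELSIF DELETING THEN
    DBMS_OUTPUT.PUT_LINE('rows deleted from emp');
  END IF;
END;
/
```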
The discard file must be specified within the control file by using the DISCARDFILE clause.
The discard file stores the records that are rejected based on the WHEN clause condition within
the control file. This condition is specified in the INTO TABLE clause.
52) What is REF CURSOR (or) CURSOR VARIABLE (or) DYNAMIC CURSOR?
Oracle 7.2 introduced the ref cursor. This is a user-defined type that is used to process multiple
records, one record at a time.
With a static cursor, the database server executes only one SELECT statement for a single active set
area, whereas with a ref cursor the database server can execute a number of SELECT statements
dynamically for a single active set area. That is why these cursors are also called dynamic cursors.
Generally, we are not allowed to pass a static cursor as a parameter to subprograms, whereas we
can pass a ref cursor as a parameter, because a ref cursor is a user-defined type and Oracle
allows any user-defined type to be passed as a parameter to subprograms.
Generally, a static cursor does not return multiple records to the client application, whereas a
ref cursor is allowed to return multiple records to the client application (Java, .Net, PHP, VB, C++).
Because this is a user-defined type, we create it in a two-step process: first we create a type,
and then we create a variable of that type. That is why it is also called a cursor variable.
A strong ref cursor is a ref cursor that has a return type, whereas a weak ref cursor has no return
type.
Syntax:
Syntax
For a weak ref cursor, we must specify the SELECT statement by using the OPEN FOR clause. This
clause is used in the executable section of the PL/SQL block.
Syntax:
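A minimal sketch of both kinds of ref cursor follows. It assumes the emp table with ename and sal columns; the type and variable names are illustrative and not taken from the original syntax.

```sql
DECLARE
  -- Strong ref cursor: the return type is fixed.
  TYPE emp_cur_t IS REF CURSOR RETURN emp%ROWTYPE;
  -- Weak ref cursor: no return type, so any query can be opened.
  TYPE any_cur_t IS REF CURSOR;

  c_emp  emp_cur_t;   -- cursor variables created from the types
  c_any  any_cur_t;
  r_emp  emp%ROWTYPE;
  v_name VARCHAR2(50);
BEGIN
  OPEN c_emp FOR SELECT * FROM emp;
  LOOP
    FETCH c_emp INTO r_emp;
    EXIT WHEN c_emp%NOTFOUND;
    DBMS_OUTPUT.PUT_LINE(r_emp.ename || ' ' || r_emp.sal);
  END LOOP;
  CLOSE c_emp;

  -- The weak cursor variable is bound to its query only at OPEN FOR.
  OPEN c_any FOR SELECT ename FROM emp;
  LOOP
    FETCH c_any INTO v_name;
    EXIT WHEN c_any%NOTFOUND;
    DBMS_OUTPUT.PUT_LINE(v_name);
  END LOOP;
  CLOSE c_any;
END;
/
```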
54) What is the Difference Between the trim, delete collection methods?
declare
type t1 is table of number(10);
v_t t1 := t1(10,20,30,40,50,60);
begin
v_t.trim(2);
dbms_output.put_line('after deleting last two elements');
v_t.delete(2);
dbms_output.put_line('after deleting second element');
for i in v_t.first..v_t.last
loop
If v_t.exists(i) then
dbms_output.put_line(v_t(i));
end if;
end loop;
end;
/
Overloading means that the same name can be used for different purposes. In Oracle, we can
implement overloaded procedures through a package. Overloaded procedures have the
same name but different types or different numbers of parameters.
In Oracle, declaring a procedure in the package body before defining it is called a forward
declaration. Generally, before a public procedure calls a private procedure, the private procedure
must already be implemented in the package body; otherwise, use a forward declaration within the
package body.
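Both ideas can be sketched in one small package. The package demo_pkg and its procedures are hypothetical names invented for this illustration.

```sql
CREATE OR REPLACE PACKAGE demo_pkg AS
  -- Overloaded public procedures: same name, different parameter types.
  PROCEDURE show(p_value IN NUMBER);
  PROCEDURE show(p_value IN VARCHAR2);
END demo_pkg;
/

CREATE OR REPLACE PACKAGE BODY demo_pkg AS
  -- Forward declaration of a private procedure defined further down.
  PROCEDURE log_line(p_text IN VARCHAR2);

  PROCEDURE show(p_value IN NUMBER) IS
  BEGIN
    log_line('number: ' || p_value);
  END show;

  PROCEDURE show(p_value IN VARCHAR2) IS
  BEGIN
    log_line('text: ' || p_value);
  END show;

  -- Private procedure body; the procedures above can call it
  -- only because of the forward declaration.
  PROCEDURE log_line(p_text IN VARCHAR2) IS
  BEGIN
    DBMS_OUTPUT.PUT_LINE(p_text);
  END log_line;
END demo_pkg;
/
```

Without the forward declaration, the body would have to define log_line before either version of show.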
In Oracle, when we try to convert a string type to a number type, the Oracle server can return
two types of errors.
1. Invalid_number
2. Value_error (or) numeric_error
Invalid_number:
When a PL/SQL block contains a SQL statement and that SQL statement tries to convert a string type
to a number type, the Oracle server returns the error ORA-01722: invalid number.
For handling this error, Oracle provides the predefined exception Invalid_number.
Example:
begin
insert
into emp(empno, ename, sal)
values(1, 'gokul', 'abc');
exception when invalid_number then dbms_output.put_line('insert proper data only');
end;
/
value_error:
When a PL/SQL block contains procedural statements and those statements try to convert a
string type to a number type, the Oracle server returns the error ORA-06502: PL/SQL: numeric or
value error: character to number conversion error.
For handling this error, Oracle provides the predefined exception value_error.
Example:
declare
z number(10);
begin
z := '&x' + '&y';
dbms_output.put_line(z);
exception when value_error then dbms_output.put_line('enter numeric data value for x and y only');
end;
/
Output:
Flashback queries are handled by the Database Administrator. A flashback query allows the
content of a table to be retrieved as of a specific point in time by using the AS OF clause;
that is, flashback queries can retrieve accidentally changed data even after the transaction
has been committed.
Flashback queries generally use undo data; that is, they retrieve the old data as it was before
the transaction was committed. Oracle provides two methods for flashback queries:
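The AS OF clause mentioned above can be sketched with a timestamp. The emp table, the empno value, and the ten-minute window are assumptions made for illustration, and the query only works while the required undo data is still available.

```sql
-- Rows of emp as they were ten minutes ago.
SELECT *
FROM   emp AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '10' MINUTE)
WHERE  empno = 7369;   -- illustrative employee number
```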
A PL/SQL package consists of two major parts: the package specification and the package body.
1. Package Specification: It acts as a public interface for your application, which includes
procedures, types, etc.
2. Package Body: It contains the code required to implement the Package Specification.
Tracing code is a necessary technique to test the performance of the code during runtime. We
have different methods in PL/SQL to trace the code, which are:
DBMS_TRACE
DBMS_APPLICATION_INFO
TKPROF and trcsess utilities
DBMS_SESSION and DBMS_MONITOR
In PL/SQL, retrieving and processing more than one row requires a special resource, and that
resource is known as a Cursor. A cursor is defined as a pointer to the context area. The context
area is an area of memory that contains the SQL statement and the information needed for
processing it.
An implicit cursor is created automatically by Oracle for all SQL data manipulation statements in
PL/SQL. For an implicit cursor, operations such as open, fetch, and close are performed
automatically.
An explicit cursor is a cursor that is explicitly declared by the programmer for a SELECT
statement. It is used to process a multi-row SELECT, and the programmer controls when it is
opened, fetched from, and closed.
It is a program in PL/SQL, stored in the database, and executed immediately before or after the
UPDATE, INSERT and DELETE commands.
Triggers are programs that are automatically fired or executed when some events happen and are
used for:
Error handling part of PL/SQL is called an exception. We have two types of exceptions, and they are
User-defined and predefined.
The compilation process consists of syntax checking, binding, and p-code generation. The compiler
checks the PL/SQL code for errors. Once all errors are corrected, a storage address is allocated
to each variable that stores data. This process is called binding. P-code is a list of
instructions for the PL/SQL engine. It is stored in the database and executed the next time the
program is used.
PL/SQL was introduced to overcome the above disadvantages by retaining the power of SQL and
combining it with procedural statements. It is a block-structured language, and the statements
of a block are passed to the Oracle engine together, which increases processing speed because
traffic is reduced.
PL/SQL Basic Interview Questions
1. What are the features of PL/SQL?
PL/SQL provides the feature of decision making, looping, and branching by making use of its
procedural nature.
Multiple queries can be processed in one block by making use of a single command using PL/SQL.
The PL/SQL code can be reused by applications as they can be grouped and stored in databases as
PL/SQL units like functions, procedures, packages, triggers, and types.
PL/SQL supports exception handling by making use of an exception handling block.
Along with exception handling, PL/SQL also supports error checking and validation of data before
data manipulation.
Applications developed using PL/SQL are portable across computer hardware or operating system
where there is an Oracle engine.
PL/SQL tables are nothing but objects of type tables that are modeled as database tables. They are a
way to provide arrays that are nothing but temporary tables in memory for faster processing.
These tables are useful for moving bulk data thereby simplifying the process.
The basic structure of PL/SQL follows the BLOCK structure. Each PL/SQL code comprises SQL and
PL/SQL statement that constitutes a PL/SQL block.
Each PL/SQL block consists of 3 sections:
o The optional Declaration Section
o The mandatory Execution Section
o The optional Exception handling Section
[DECLARE]
--declaration statements (optional)
BEGIN
--execution statements
[EXCEPTION]
--exception handling statements (optional)
END;
4. What is a PL/SQL cursor?
A PL/SQL cursor is nothing but a pointer to an area of memory having SQL statements and the
information of statement processing. This memory area is called a context area. This special area
makes use of a special feature called cursor for the purpose of retrieving and processing more than
one row.
In short, the cursor selects multiple rows from the database and these selected rows are individually
processed within a program.
There are two types of cursors:
o Implicit Cursor:
Oracle automatically creates a cursor while running any of the commands - SELECT
INTO, INSERT, DELETE or UPDATE implicitly.
The execution cycle of these cursors is internally handled by Oracle and returns the
information and status of the cursor by making use of the cursor attributes-
ROWCOUNT, ISOPEN, FOUND, NOTFOUND.
o Explicit Cursor:
This cursor is a SELECT statement that was declared explicitly in the declaration block.
The programmer has to control the execution cycle of these cursors starting from
OPEN to FETCH and close.
The execution cycle while executing the SQL statement is defined by Oracle along
with associating a cursor with it.
Explicit Cursor Execution Cycle:
o Due to the flexibility of defining our own execution cycle, explicit cursors are used in many
instances. The following diagram represents the execution flow of an explicit cursor:
Cursor Declaration:
o The first step to use an explicit cursor is its declaration.
o Declaration can be done in a package or a block.
o Syntax: CURSOR cursor_name IS query; where cursor_name is the name of the cursor, the query
is the query to fetch data from any table.
Open Cursor:
o Before the process of fetching rows from cursor, the cursor has to be opened.
o Syntax to open a cursor: OPEN cursor_name;
o When the cursor is opened, the query and the bind variables are parsed by Oracle and the
SQL statements are executed.
o The execution plan is determined by Oracle and the result set is determined after associating
the cursor parameters and host variables and post these, the cursor is set to point at the first
row of the result set.
Fetch from cursor:
o FETCH statement is used to place the content of the current row into variables.
o Syntax: FETCH cursor_name INTO variable_list;
o In order to get all the rows of a result set, each row needs to be fetched.
Close Cursor:
o Once all the rows are fetched, the cursor needs to be closed using the CLOSE statement.
o Syntax: CLOSE cursor_name;
o The instructions tell Oracle to release the memory allocated to the cursor.
Cursors declared in procedures or anonymous blocks are by default closed post their
execution.
Cursors declared in packages need to be closed explicitly as the scope is global.
Closing a cursor that is not opened will result in INVALID_CURSOR exception.
We use this clause while referencing the current row from an explicit cursor. This clause allows
applying updates and deletion of the row currently under consideration without explicitly referencing
the row ID.
Syntax:
UPDATE table_name SET field=new_value WHERE CURRENT OF cursor_name
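A minimal sketch of the clause in use follows. It assumes the emp table with sal and deptno columns, and the 10 percent raise is an arbitrary illustration. The cursor is declared FOR UPDATE, which WHERE CURRENT OF requires.

```sql
DECLARE
  CURSOR c_emp IS
    SELECT sal FROM emp WHERE deptno = 10
    FOR UPDATE OF sal;          -- locks the rows so CURRENT OF can be used
  v_sal emp.sal%TYPE;
BEGIN
  OPEN c_emp;
  LOOP
    FETCH c_emp INTO v_sal;
    EXIT WHEN c_emp%NOTFOUND;
    UPDATE emp
    SET    sal = v_sal * 1.1
    WHERE  CURRENT OF c_emp;    -- the row most recently fetched
  END LOOP;
  CLOSE c_emp;
  COMMIT;
END;
/
```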
DECLARE
exception_name EXCEPTION;
PRAGMA EXCEPTION_INIT (exception_name, error_code);
BEGIN
-- PL/SQL Logic
EXCEPTION
WHEN exception_name THEN
-- Steps to handle exception
END;
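As a concrete sketch of the template above, the block below ties a named exception to an application error number. The exception name, the number -20001, and the balance check are all illustrative assumptions.

```sql
DECLARE
  e_low_balance EXCEPTION;
  PRAGMA EXCEPTION_INIT(e_low_balance, -20001);
  v_balance NUMBER := 50;   -- illustrative value
BEGIN
  IF v_balance < 100 THEN
    -- Raises error -20001, which is now bound to e_low_balance.
    RAISE_APPLICATION_ERROR(-20001, 'Balance below minimum');
  END IF;
EXCEPTION
  WHEN e_low_balance THEN
    DBMS_OUTPUT.PUT_LINE('Handled: ' || SQLERRM);
END;
/
```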
As the name indicates, ‘Trigger’ means to ‘activate’ something. In the case of PL/SQL, a trigger is a
stored procedure that specifies what action has to be taken by the database when an event related to
the database is performed.
Syntax:
TRIGGER trigger_name
trigger_event
[ restrictions ]
BEGIN
actions_of_trigger;
END;
In the above syntax, if the trigger trigger_name is in the enabled state, the trigger_event causes
the database to fire actions_of_trigger if the restrictions are TRUE or unavailable.
This statement is used by anonymous blocks of PL/SQL such as non-stored and stand-alone
procedures. When they are being used, the statement should come first in the stand-alone file.
Comments are those sentences that have no effect on the functionality and are used for the purpose
of enhancing the readability of the code. They are of two types:
o Single Line Comment: This can be created by using the symbol -- and writing what we want to
mention as a comment next to it.
o Multi-Line comment: These are the comments that can be specified over multiple lines and
the syntax goes like /* comment information */
Example:
WHEN clause specifies for what condition the trigger has to be triggered.
The PL/SQL engine does the process of compilation and execution of the PL/SQL blocks and
programs and can only work if it is installed on an Oracle server or any application tool that
supports Oracle such as Oracle Forms.
PL/SQL is one of the parts of Oracle RDBMS, and it is important to know that most of the Oracle
applications are developed using the client-server architecture. The Oracle database forms the
server-side and requests to the database form a part of the client-side.
So based on the above fact and the fact that PL/SQL is not a standalone programming language, we
must realize that the PL/SQL engine can reside in either the client environment or the server
environment. This makes it easy to move PL/SQL modules and sub-programs between server-side
and client-side applications.
Based on the architecture shown below, we can understand that PL/SQL engine plays an important
role in the process and execute the PL/SQL statements and whenever it encounters the SQL
statements, they are sent to the SQL Statement Processor.
Case 1: PL/SQL engine is on the server: In this case, the whole PL/SQL block gets passed to the
PL/SQL engine present on the Oracle server which is then processed and the response is sent.
Case 2: PL/SQL engine is on the client: Here the engine lies within the Oracle Developer tools and
the processing of the PL/SQL statements is done on the client-side.
o In case, there are any SQL statements in the PL/SQL block, then they are sent to the Oracle
server for SQL processing.
o When there are no SQL statements, then the whole block processing occurs at the client-side.
SYSDATE:
o This keyword returns the current time and date on the local database server.
o The syntax is SYSDATE.
o In order to extract part of the date, we use the TO_CHAR function on SYSDATE and specify
the format we need.
o Usage:
SELECT SYSDATE FROM dual;
SELECT id, TO_CHAR(SYSDATE, 'yyyy/mm/dd') from InterviewBitEmployeeTable where
customer_id < 200;
USER:
o This keyword returns the user id of the current session.
o Usage:
SELECT USER FROM dual;
Implicit Cursor: An implicit cursor is used when a query returns a single row value.
Explicit Cursor: When a subquery returns more than one row, an explicit cursor is used. These rows
are called the Active Set.
SQL: All SQL statements are executed one at a time by the database server, which is why it becomes
a time-consuming process. There is no error handling mechanism in SQL.
PL/SQL: PL/SQL statements are executed one block at a time, thereby reducing the network traffic.
This supports an error handling mechanism.
15. What is the importance of %TYPE and %ROWTYPE data types in PL/SQL?
%TYPE: This declaration is used for the purpose of anchoring by providing the data type of any
variable, column, or constant. It is useful during the declaration of a variable that has the same data
type as that of its table column.
o Consider the example of declaring a variable named ib_employeeid which has the data type
and its size same as that of the column employeeid in table ib_employee.
The syntax would be : ib_employeeid ib_employee.employeeid%TYPE;
%ROWTYPE: This is used for declaring a variable that has the same data type and size as that of a
row in the table. The row of a table is called a record and its fields would have the same data types
and names as the columns defined in the table.
o For example: In order to declare a record named ib_emprecord for storing an entire row in a
table called ib_employee, the syntax is:
ib_emprecord ib_employee%ROWTYPE;
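Both anchored declarations can be tried without creating any table by anchoring to Oracle's built-in DUAL table, which has a single column named DUMMY. This is a minimal sketch, and the variable names are illustrative.

```sql
DECLARE
  v_dummy dual.dummy%TYPE;   -- same data type as column DUMMY of DUAL
  r_dual  dual%ROWTYPE;      -- record with one field per column of DUAL
BEGIN
  SELECT dummy INTO v_dummy FROM dual;
  SELECT *     INTO r_dual  FROM dual;
  DBMS_OUTPUT.PUT_LINE(v_dummy || ' ' || r_dual.dummy);
END;
/
```

If the column definition changes later, both declarations pick up the new type without any change to the code.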
16. What are the various functions available for manipulating the character data?
The functions that are used for manipulating the character data are called String Functions.
o LEFT: This operation returns the specified number of characters from the left part of a string.
Oracle has no built-in LEFT function; SUBSTR starting at position 1 gives the same result.
Syntax: SUBSTR(string_value, 1, numberOfCharacters)
For example, SUBSTR('InterviewBit', 1, 9) will return 'Interview'.
o RIGHT: This operation returns the defined number of characters from the right part of a string.
Oracle has no built-in RIGHT function; SUBSTR with a negative start position counts from the end
of the string.
Syntax: SUBSTR(string_value, -numberOfCharacters)
For example, SUBSTR('InterviewBit', -3) would return 'Bit'.
o SUBSTR: This function selects the data from a specified start position through the
number of characters defined from any part of the string. Positions are counted from 1.
Syntax: SUBSTR(string_value, start_position, numberOfCharacters)
For example, SUBSTR('InterviewBit', 2, 4) would return 'nter'.
o LTRIM: This function would trim all the white spaces on the left part of the string.
Syntax: LTRIM(string_value)
For example, LTRIM('   InterviewBit') will return 'InterviewBit'.
o RTRIM: This function would trim all the white spaces on the right part of the string.
Syntax: RTRIM(string_value)
For example, RTRIM('InterviewBit   ') will return 'InterviewBit'.
o UPPER: This function is used for converting all the characters to the upper case in a string.
Syntax: UPPER(string_variable)
For example, UPPER('interviewBit') would return 'INTERVIEWBIT'.
o LOWER: This function is used for converting all the characters of a string to lowercase.
Syntax: LOWER(string_variable)
For example, LOWER('INterviewBit') would return 'interviewbit'.
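The functions can be checked in one query against DUAL. Oracle has no built-in LEFT or RIGHT function, so this sketch uses SUBSTR for both; the column aliases are illustrative.

```sql
SELECT SUBSTR('InterviewBit', 1, 9) AS left_part,    -- 'Interview'
       SUBSTR('InterviewBit', -3)   AS right_part,   -- 'Bit'
       SUBSTR('InterviewBit', 2, 4) AS middle_part,  -- 'nter'
       LTRIM('   InterviewBit')     AS ltrimmed,     -- 'InterviewBit'
       RTRIM('InterviewBit   ')     AS rtrimmed,     -- 'InterviewBit'
       UPPER('interviewBit')        AS upper_case,   -- 'INTERVIEWBIT'
       LOWER('INterviewBit')        AS lower_case    -- 'interviewbit'
FROM   dual;
```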
17. What is the difference between ROLLBACK and ROLLBACK TO statements in PL/SQL?
ROLLBACK command is used for rolling back all the changes from the beginning of the transaction.
ROLLBACK TO command is used for undoing the transaction only up to a SAVEPOINT. The
changes made before the SAVEPOINT are not rolled back, and the transaction remains active
even after the command is executed.
SYS.ALL_DEPENDENCIES is used for describing all the dependencies between procedures, packages,
triggers, functions that are accessible to the current user. It returns the columns like name,
dependency_type, type, referenced_owner etc.
19. What are the virtual tables available during the execution of the database trigger?
The OLD and NEW pseudorecords are the virtual tables that are available during the database trigger
execution. The table columns are referred to as OLD.column and NEW.column respectively.
Only the NEW.column values are available for INSERT-related triggers.
Only the OLD.column values are available for the DELETE-related triggers.
Both the virtual table columns are available for UPDATE triggers.
20. Differentiate between the cursors declared in procedures and the cursors declared in the package
specifications.
The cursors that are declared in the procedures will have the local scope and hence they cannot be
used by other procedures.
The cursors that are declared in package specifications are treated with global scope and hence they
can be used and accessed by other procedures.
These are the three transaction specifications that are available in PL/SQL.
COMMIT: Whenever any DML operations are performed, the data gets manipulated only in the
database buffer and not the actual database. In order to save these DML transactions to the
database, there is a need to COMMIT these transactions.
o COMMIT transaction action does saving of all the outstanding changes since the last commit
and the below steps take place:
The locks on the affected rows are released.
The transaction is marked as complete.
The details of the transaction would be stored in the data dictionary.
o Syntax: COMMIT;
ROLLBACK: In order to undo or erase the changes that were done in the current transaction, the
changes need to be rolled back. ROLLBACK statement erases all the changes since the last COMMIT.
o Syntax: ROLLBACK;
SAVEPOINT: This statement names and defines a point in the current transaction. When the
transaction is rolled back to that SAVEPOINT, any changes made before it are preserved, whereas
all the changes after that point are undone.
o Syntax: SAVEPOINT <savepoint_name>;
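The three statements can be seen working together in a short sketch. The table t_demo and the savepoint name after_first are hypothetical and used only for illustration.

```sql
-- t_demo is a hypothetical single-column table:
--   CREATE TABLE t_demo (id NUMBER);
INSERT INTO t_demo (id) VALUES (1);
SAVEPOINT after_first;
INSERT INTO t_demo (id) VALUES (2);

ROLLBACK TO after_first;  -- undoes only the second insert; the transaction stays open
COMMIT;                   -- makes the first insert permanent
```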
We can use DBMS_OUTPUT and DBMS_DEBUG statements for debugging our code:
o DBMS_OUTPUT prints the output to the standard console.
o DBMS_DEBUG prints the output to the log file.
23. What is the difference between a mutating table and a constraining table?
A table that is being modified by the usage of the DML statement currently is known as a mutating
table. It can also be a table that has triggers defined on it.
A table used for reading for the purpose of referential integrity constraint is called a constraining
table.
24. In which cursor attributes are the outcomes of DML statement execution saved?
The outcomes of the execution of a DML statement are saved in the following 4 implicit cursor attributes:
o SQL%FOUND: This returns TRUE if at least one row has been processed.
o SQL%NOTFOUND: This returns TRUE if no rows were processed.
o SQL%ISOPEN: This checks whether the cursor is open. For the implicit SQL cursor it always returns
FALSE, because Oracle closes that cursor automatically once the statement has run.
o SQL%ROWCOUNT: This returns the number of rows processed by the DML statement.
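The attributes are read immediately after the DML statement. The following is a small sketch, assuming the ib_employee table from this guide with a job_id column; the change is rolled back at the end so that the example leaves no trace.

```sql
-- Assumption: ib_employee(job_id, salary) exists, as in the later examples.
BEGIN
  UPDATE ib_employee SET salary = salary + 1000 WHERE job_id = 'AD_VP';

  -- Read the implicit cursor attributes before running any other SQL statement.
  IF SQL%FOUND THEN
    DBMS_OUTPUT.PUT_LINE('Rows updated: ' || SQL%ROWCOUNT);
  ELSE
    DBMS_OUTPUT.PUT_LINE('No rows matched.');
  END IF;

  ROLLBACK;  -- illustration only: discard the change
END;
/
```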
25. Is it possible to declare column which has the number data type and its scale larger than the precision?
For example defining columns like: column name NUMBER (10,100), column name NUMBER (10,-84)
PL/SQL Programs
26. Write a PL/SQL program using WHILE loop for calculating the average of the numbers entered by user.
Stop the entry of numbers whenever the user enters the number 0.
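DECLARE
  n       NUMBER;
  average NUMBER := 0;
  total   NUMBER := 0;  -- renamed from sum to avoid confusion with the SQL function
  cnt     NUMBER := 0;  -- renamed from count for the same reason
BEGIN
  -- Take input from user.
  -- Note: SQL*Plus resolves each &input_number once, before the block runs, so the block
  -- cannot re-prompt inside the loop; the second value supplied must be 0 for the loop to end.
  n := &input_number;
  WHILE (n <> 0)
  LOOP
    -- Increment cnt to find total elements
    cnt := cnt + 1;
    -- Sum of elements entered
    total := total + n;
    -- Take input from user
    n := &input_number;
  END LOOP;
  -- Average calculation (skipped when no numbers were entered, to avoid dividing by zero)
  IF cnt > 0 THEN
    average := total / cnt;
  END IF;
  DBMS_OUTPUT.PUT_LINE('Average of entered numbers is ' || average);
END;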
27. Write a PL/SQL procedure for selecting some records from the database using some parameters as
filters.
Consider that we are fetching details of employees from ib_employee table where salary is a
parameter for filter.
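One possible answer is sketched below. It assumes the ib_employee table shown later in this guide; the procedure name get_employees_by_salary and the parameter p_min_salary are hypothetical names chosen for illustration.

```sql
-- Assumption: ib_employee(employee_id, first_name, salary) exists.
CREATE OR REPLACE PROCEDURE get_employees_by_salary (
  p_min_salary IN NUMBER  -- filter: only employees earning at least this much
) AS
BEGIN
  -- A cursor FOR loop opens, fetches from and closes the cursor implicitly.
  FOR emp IN (SELECT employee_id, first_name, salary
                FROM ib_employee
               WHERE salary >= p_min_salary
               ORDER BY employee_id)
  LOOP
    DBMS_OUTPUT.PUT_LINE(emp.employee_id || ' ' || emp.first_name || ' ' || emp.salary);
  END LOOP;
END get_employees_by_salary;
/
```

From SQL*Plus it could be called with EXEC get_employees_by_salary(17000); after SET SERVEROUTPUT ON.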
28. Write a PL/SQL code to count the number of Sundays between the two inputted dates.
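-- declare 2 dates of type DATE
DECLARE
  start_date    DATE;
  end_date      DATE;
  sundays_count NUMBER := 0;
BEGIN
  -- input 2 dates (format assumed to be DD-MON-YY, matching the sample input)
  start_date := TO_DATE('&input_start_date', 'DD-MON-YY');
  end_date   := TO_DATE('&input_end_date', 'DD-MON-YY');
  /*
  NEXT_DAY returns the date of the first day after the mentioned date
  that matches the day specified in the second parameter. Subtracting 1
  first means a start_date that is itself a Sunday is counted.
  */
  start_date := NEXT_DAY(start_date - 1, 'SUNDAY');
  -- count one Sunday per week until the end date is passed
  WHILE (start_date <= end_date)
  LOOP
    sundays_count := sundays_count + 1;
    start_date := start_date + 7;
  END LOOP;
  DBMS_OUTPUT.PUT_LINE('Total number of Sundays between the two dates: ' || sundays_count);
END;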
Input:
start_date = '01-SEP-19'
end_date = '29-SEP-19'
Output:
Total number of Sundays between the two dates: 5
29. Write PL/SQL code block to increment the employee’s salary by 1000 whose employee_id is 102 from
the given table below.
EMPLOYEE_ID | FIRST_NAME | LAST_NAME | EMAIL_ID | PHONE_NUMBER | JOIN_DATE  | JOB_ID  | SALARY
100         | ABC        | DEF       | abef     | 9876543210   | 2020-06-06 | AD_PRES | 24000.00
101         | GHI        | JKL       | ghkl     | 9876543211   | 2021-02-08 | AD_VP   | 17000.00
102         | MNO        | PQR       | mnqr     | 9876543212   | 2016-05-14 | AD_VP   | 17000.00
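DECLARE
  employee_salary NUMBER(8,2);
  -- Adds 1000 to the salary passed in through the IN OUT parameter
  PROCEDURE update_salary (
    emp    NUMBER,
    salary IN OUT NUMBER
  ) IS
  BEGIN
    salary := salary + 1000;
  END;
BEGIN
  SELECT salary INTO employee_salary
    FROM ib_employee
   WHERE employee_id = 102;
  DBMS_OUTPUT.PUT_LINE
    ('Before update_salary procedure, salary is: ' || employee_salary);
  -- Call the procedure; without this call the salary is never incremented
  update_salary(102, employee_salary);
  -- Write the new value back so that the table row is actually updated
  UPDATE ib_employee SET salary = employee_salary WHERE employee_id = 102;
  DBMS_OUTPUT.PUT_LINE
    ('After update_salary procedure, salary is: ' || employee_salary);
END;
/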
Result:
30. Write a PL/SQL code to find whether a given string is palindrome or not.
DECLARE
  -- input_string is the original string, letter holds one character at a time,
  -- and reverse_string accumulates the reversed text.
  input_string   VARCHAR2(10) := 'abccba';
  letter         VARCHAR2(20);
  reverse_string VARCHAR2(10);
BEGIN
  FOR i IN REVERSE 1..LENGTH(input_string) LOOP
    letter := SUBSTR(input_string, i, 1);
    -- concatenate letter to reverse_string variable
    reverse_string := reverse_string || letter;
  END LOOP;
  IF reverse_string = input_string THEN
    DBMS_OUTPUT.PUT_LINE(input_string || ' is palindrome');
  ELSE
    DBMS_OUTPUT.PUT_LINE(input_string || ' is not palindrome');
  END IF;
END;
31. Write PL/SQL program to convert each digit of a given number into its corresponding word format.
DECLARE
-- declare necessary variables
-- num represents the given number
-- number_to_word represents the word format of the number
-- digit_str, len and digit are the intermediate variables used for program execution
num INTEGER;
number_to_word VARCHAR2(100);
digit_str VARCHAR2(100);
len INTEGER;
digit INTEGER;
BEGIN
num := 12345;
len := LENGTH(num);
dbms_output.PUT_LINE('Input: ' ||num);
-- Iterate through the number one by one
FOR i IN 1..len LOOP
digit := SUBSTR(num, i, 1);
-- Using DECODE, get the str representation of the digit
SELECT Decode(digit, 0, 'Zero ',
1, 'One ',
2, 'Two ',
3, 'Three ',
4, 'Four ',
5, 'Five ',
6, 'Six ',
7, 'Seven ',
8, 'Eight ',
9, 'Nine ')
INTO digit_str
FROM dual;
-- Append the str representation of digit to final result.
number_to_word := number_to_word || digit_str;
END LOOP;
dbms_output.PUT_LINE('Output: ' ||number_to_word);
END;
Input: 12345
Output: One Two Three Four Five
Input: 9874
Output: 28
PL/SQL Conclusion
33. PL SQL Interview
Below are some common basic and advanced PL/SQL interview questions and answers
that interviewers often ask.
Answer:
SQL | PL/SQL
SQL is a query language to interact with the database. | It is an extension of SQL which supports procedures, functions and many more features.
SQL statements can be executed only one at a time, thereby making it a time-consuming process. | The entire block of statements is sent to the database server at once to be executed, saving time and increasing efficiency.
No provision for error handling. | Customized error handling is possible.
Answer:
[DECLARE]
BEGIN
--execution statements
[EXCEPTION]
END;
[before | after]
on [table_name]
[trigger_body]
Question: How do you compile PL/SQL code?
Answer: Firstly, the syntax check is performed. When the developer corrects any syntax
errors, Oracle binds all the variables holding data with a storage address. Finally, the p-
code generation process takes place.
Scalar types – primitive data types like CHAR, DATE, LONG, VARCHAR2 etc…
Composite – these are made up of other data types and can be easily updated.
Example, RECORD, TABLE etc…
Reference data types like CURSOR
Large object types – BLOB, CLOB etc…
Answer:
%TYPE | %ROWTYPE
Example – | Example –
DECLARE | DECLARE
studentId students.student_id%TYPE; | stud_rec students%ROWTYPE;
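Both attributes can be tried without creating a table. In the sketch below, a cursor over DUAL stands in for the students table; the cursor name, column aliases and sample values are assumptions made for illustration only.

```sql
DECLARE
  -- Hypothetical row source: a one-row cursor over DUAL replaces a real students table.
  CURSOR student_cur IS
    SELECT 1 AS student_id, 'Asha' AS student_name FROM dual;

  stud_rec   student_cur%ROWTYPE;       -- record with one field per cursor column
  student_no stud_rec.student_id%TYPE;  -- scalar with the same type as that field
BEGIN
  OPEN student_cur;
  FETCH student_cur INTO stud_rec;
  CLOSE student_cur;

  student_no := stud_rec.student_id;
  DBMS_OUTPUT.PUT_LINE(student_no || ' ' || stud_rec.student_name);
END;
/
```

The benefit of both attributes is that the variable declarations follow the source definition automatically if a column type changes later.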
Answer: Packages are schema objects that place functions, procedures, variables, etc… in
one place. Packages should have –
Package specifications
Package body
Question: List some schema objects that are created using PL/SQL.
Answer: Predefined exceptions are internally defined exceptions that occur during the
execution of a program. For example, PL/SQL raises NO_DATA_FOUND when there are no
rows returned upon a select operation, and if more than one row is returned using a
select statement, TOO_MANY_ROWS error is generated. Some more examples:
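The two exceptions named above can be handled as in this minimal sketch. It relies only on the built-in DUAL table; the variable name v_dummy is illustrative.

```sql
DECLARE
  v_dummy VARCHAR2(1);
BEGIN
  -- DUAL has exactly one row, so a filter that is never true returns no rows
  -- and SELECT ... INTO raises NO_DATA_FOUND.
  SELECT dummy INTO v_dummy FROM dual WHERE 1 = 0;
EXCEPTION
  WHEN NO_DATA_FOUND THEN
    DBMS_OUTPUT.PUT_LINE('No rows returned');
  WHEN TOO_MANY_ROWS THEN
    DBMS_OUTPUT.PUT_LINE('More than one row returned');
END;
/
```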
Answer:
Syntax (compile-time) errors | Runtime errors
These are compile-time errors found by the compiler. | These are not detected by the compiler and cause the program to give an incorrect result.
The code doesn't build and run until these issues are resolved. | The code is compiled and run, and if an error occurs, the program stops halfway.
Example: int x = 9 String name = null; In the first line, a semicolon is missing, which the compiler will catch. | Example: String name = null; if(name.equals("hackr.io")){….} Since name is null, the exception is raised at runtime when the code is executed.
Question: What are the various packages available for PL-SQL Developers?
DBMS_ALERT: alerts an application using triggers when particular database values change. The alerts
are transaction-based and asynchronous.
DBMS_OUTPUT: displays output from PL/SQL blocks, packages, subprograms and triggers. Mostly used for
displaying PL/SQL debugging information.
DBMS_PIPE: different sessions communicate over named pipes using this package. The procedures
PACK_MESSAGE and SEND_MESSAGE pack a message into a pipe, then send it to another session.
UTL_HTTP: allows your PL/SQL programs to make hypertext transfer protocol (HTTP) callouts. The
package has two entry points, each of which accepts a URL (uniform resource locator) string,
contacts the specified site, and returns the requested data, which is usually in HTML format.
Source: Oracle docs
Answer: Character functions are functions that manipulate character data. These are
more popularly called as string functions. Example:
SUBSTR: selects data from any part of the string. Syntax: SUBSTR(value, StartPosition, NoOfChars).
Example: SUBSTR('hackr.io', 1, 5) returns hackr.
LTRIM: trims white spaces from the left. Example: LTRIM(' hackr.io') returns hackr.io.
RTRIM: trims white spaces from the right. Example: RTRIM('hackr.io ') returns hackr.io.
UPPER: converts all the characters to uppercase. Example: UPPER('hackr.io') returns HACKR.IO.
LOWER: converts all the characters to lowercase. Example: LOWER('HACKR.IO') returns hackr.io.
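The functions can be tried together in one anonymous block. This is a small sketch that uses the same sample string; the '|' marker is added only to make trimmed whitespace visible in the output.

```sql
BEGIN
  DBMS_OUTPUT.PUT_LINE(SUBSTR('hackr.io', 1, 5));        -- hackr
  DBMS_OUTPUT.PUT_LINE(LTRIM('   hackr.io') || '|');     -- hackr.io|
  DBMS_OUTPUT.PUT_LINE(RTRIM('hackr.io   ') || '|');     -- hackr.io|
  DBMS_OUTPUT.PUT_LINE(UPPER('hackr.io'));               -- HACKR.IO
  DBMS_OUTPUT.PUT_LINE(LOWER('HACKR.IO'));               -- hackr.io
END;
/
```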
Question: What is the use of SYSDATE and USER keywords? Explain with
examples.
Answer: SYSDATE: returns the current date and time on the local database server. The
syntax is SYSDATE. If we have to extract part of the date, then we use the TO_CHAR
function. Examples:
Example:
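Both keywords can be used directly inside PL/SQL. The sketch below formats SYSDATE with TO_CHAR and prints USER, which returns the name of the current session user; the format masks are just one reasonable choice.

```sql
BEGIN
  -- Full date and time on the database server
  DBMS_OUTPUT.PUT_LINE('Now:  ' || TO_CHAR(SYSDATE, 'DD-MON-YYYY HH24:MI:SS'));
  -- Extract only part of the date
  DBMS_OUTPUT.PUT_LINE('Year: ' || TO_CHAR(SYSDATE, 'YYYY'));
  -- Name of the user connected in this session
  DBMS_OUTPUT.PUT_LINE('User: ' || USER);
END;
/
```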
Answer:
SGA | PGA
Contains data and control information for one Oracle database instance. | Contains data and control information exclusively for a single Oracle process.
Example: cached data blocks and SQL areas. | Example: session memory, SQL work area.
Question: Explain the uses of Merge with Syntax in PL-SQL.
Answer: Merge reduces the number of table scans and performs parallel operations if
required. MERGE inserts or updates data conditionally from one table to another. For
example,
In this example, if a record with the matching condition is found, then the address of the
same record is updated, else a new row is inserted.
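A statement matching that description could look like the following sketch. The tables customers and customer_updates and their columns are hypothetical; they are not defined anywhere in this guide.

```sql
-- Assumption: customers and customer_updates both have
-- customer_id, name and address columns.
MERGE INTO customers c
USING customer_updates u
   ON (c.customer_id = u.customer_id)
 WHEN MATCHED THEN
   -- A row with the same id already exists: refresh its address.
   UPDATE SET c.address = u.address
 WHEN NOT MATCHED THEN
   -- No such row yet: insert it.
   INSERT (customer_id, name, address)
   VALUES (u.customer_id, u.name, u.address);
```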
Answer: The ROLLBACK command rolls back all the changes from the beginning of the
transaction. With ROLLBACK TO, the transaction is rolled back (or undone) only as far as a point
known as the SAVEPOINT. Changes made before the SAVEPOINT are kept,
and the transaction remains active even after the command is given.
Question: Explain the difference between procedure and function.
Answer:
Function | Procedure
Can be called from SQL statements. | Cannot be called from SQL statements.
The function has to return a value. | Need not return any value.
Generally used for computation purpose. | Used for executing complex business logic.
Exception handling is possible through an EXCEPTION section. | Exception handling is possible through an EXCEPTION section (PL/SQL has no try/catch block).
Answer:
PROCEDURE | TRIGGER
Called explicitly by a user, trigger or an application. | Executed by the DBMS whenever an event occurs in the database.
Implicit cursor – PL/SQL applies implicit cursors for INSERT, UPDATE, DELETE and
SELECT statements returning a single row.
Explicit cursor – created by a programmer for queries returning more than one row.
Syntax–
CURSOR <cursor_name> IS
SELECT statement;
OPEN <cursor_name>;
FETCH <cursor_name> INTO <variable_list>;
CLOSE <cursor_name>;
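The four steps of an explicit cursor (declare, open, fetch, close) can be sketched as follows. The query reads from DUAL so that the block runs without any user tables; the cursor and variable names are illustrative.

```sql
DECLARE
  -- Declare: a two-row result set built from DUAL
  CURSOR day_cur IS
    SELECT 'Saturday' AS day_name FROM dual
    UNION ALL
    SELECT 'Sunday' FROM dual;
  v_day VARCHAR2(10);
BEGIN
  OPEN day_cur;
  LOOP
    FETCH day_cur INTO v_day;
    EXIT WHEN day_cur%NOTFOUND;  -- stop once no more rows are returned
    DBMS_OUTPUT.PUT_LINE(v_day);
  END LOOP;
  CLOSE day_cur;
END;
/
```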
Answer:
Not NULL
Unique
Primary key
Foreign key
Check
Answer:
TRIGGERS | CONSTRAINTS
Trigger is for the entire table. | The constraint is for a column of the table.
They are just stored procedures that get automatically executed, hence don't check for data integrity. | Prevent duplicate and invalid data entries.
Named blocks are functions and procedures which are stored in the database server and
can be reused. Anonymous blocks are for one time use and are not stored in the server.
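Example:
DECLARE
  message VARCHAR2(50) := 'Inside an anonymous block';  -- placeholder text
  byzero  NUMBER;
BEGIN
  DBMS_OUTPUT.put_line (message);
  byzero := 1/0;
EXCEPTION
  WHEN ZERO_DIVIDE THEN
    DBMS_OUTPUT.put_line (SQLERRM);
END;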
Answer: Records contain a set of data of various data types that can be related to each
other as fields. Three types of records that are supported in PL/SQL are table-based
records, programmer-defined records, and cursor-based records.
Answer:
COMMIT – is used to make the database changes permanent. All the save points are
erased and the transaction ends. Once committed, a transaction cannot be rolled back.
SAVEPOINT – is used to set points during a transaction to which a programmer can roll-
back later. it is helpful when there is a series of transactions that can be divided into
groups having a savepoint.
Question: What is the difference between actual and formal parameters?
Answer: The parameters that are used to call a procedure are called as actual
parameters. Example –
The variables declared in a procedure header used in the body are called formal
parameters. Example –
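The distinction can be seen in a single block. In this sketch the local procedure show_discount and all variable names are hypothetical: p_amount and p_rate are the formal parameters, while v_price and the literal 10 are the actual parameters.

```sql
DECLARE
  v_price NUMBER := 250;

  -- p_amount and p_rate are formal parameters (declared in the header).
  PROCEDURE show_discount (p_amount IN NUMBER, p_rate IN NUMBER) IS
  BEGIN
    DBMS_OUTPUT.PUT_LINE('Discount: ' || (p_amount * p_rate / 100));
  END show_discount;
BEGIN
  -- v_price and 10 are actual parameters (supplied in the call).
  show_discount(v_price, 10);  -- prints Discount: 25
END;
/
```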
Answer: DECLARE is used as the first statement for stand-alone files that consist of
anonymous block of code which are not stored procedures, functions or triggers.
Example –
DECLARE
num1 NUMBER(2);
num2 NUMBER(3);
BEGIN
NULL; -- a block needs at least one executable statement
END;
Answer: SQLCODE and SQLERRM are used to trace exceptions that are not explicitly
handled in the program. These are built-in functions. SQLCODE returns the error
code while SQLERRM returns the corresponding error message.
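A typical use is inside a WHEN OTHERS handler. The following is a minimal sketch in which a division by zero is forced on purpose, just to have an exception to report.

```sql
DECLARE
  v_result NUMBER;
BEGIN
  v_result := 1 / 0;  -- deliberately raises ZERO_DIVIDE
EXCEPTION
  WHEN OTHERS THEN
    -- SQLCODE gives the numeric error code, SQLERRM the message text.
    DBMS_OUTPUT.PUT_LINE('Error code:    ' || SQLCODE);
    DBMS_OUTPUT.PUT_LINE('Error message: ' || SQLERRM);
END;
/
```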
Answer: Rollback erases all the database changes including all the savepoints. It ends a
transaction.
‘Rollback to’ rollbacks the changes up to the savepoint mentioned in the code. The
transaction will still be active.
Answer: Yes, it is possible. Use ACCEPT keyword to take inputs from the user. Example –
Answer: By using ROWID. It is not a physical column but the logical address of a row. It
contains the block number, file number and row number thereby reducing I/O time
hence making query execution faster.
DBMS_APPLICATION_INFO
DBMS_TRACE
DBMS_SESSION and DBMS_MONITOR
Answer: Use CHAR (NUMBER) to get fixed length for a variable. Example – CHAR (10). If
the length of the string is less than the specified number, it will be padded with white
spaces.
Answer: By using this package, developers can get the code read and write files to and
from the computer. For doing this, the developer will need access grant from DBA user.
Question: What are DBMS_OUTPUT and DBMS_DEBUG?
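Answer: Both can be used for debugging the code. DBMS_OUTPUT writes messages to a buffer that
the client displays on the console, whereas DBMS_DEBUG is an interface to the server-side
debugger that debugging tools use to set breakpoints and inspect variables.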
Answer:
Answer: NVL lets the programmer substitute a value for a NULL value. Example –
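A short sketch of NVL in PL/SQL follows; the variable name v_commission is illustrative and is left NULL on purpose.

```sql
DECLARE
  v_commission NUMBER;  -- never assigned, so it is NULL
BEGIN
  -- NVL returns the second argument when the first one is NULL.
  DBMS_OUTPUT.PUT_LINE('Commission: ' || NVL(v_commission, 0));  -- prints Commission: 0
END;
/
```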
Answer: We can achieve consistency by setting the appropriate isolation level. For
example, to give read consistency, the isolation level can be set to READ COMMITTED.
Question: Write a simple procedure to select some records from the database
using some parameters.
Answer: Example code –
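CREATE OR REPLACE PROCEDURE get_customer_details (p_age IN NUMBER, p_city IN VARCHAR2)
AS
BEGIN
  -- Procedure and parameter names are illustrative; PL/SQL does not use the @name notation
  FOR rec IN (SELECT * FROM customers WHERE age = p_age AND city = p_city) LOOP
    DBMS_OUTPUT.PUT_LINE(rec.age || ' ' || rec.city);
  END LOOP;
END;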
Answer: Yes, we can do so using the DECODE keyword in versions 9 and above. Example
–
SELECT day_of_week,
1, 'Monday',
2, 'Tuesday',
3, 'Wednesday',
4, 'Thursday',
5, 'Friday',
6, 'Saturday',
PL SQL Collections
A collection is a group of elements of homogenous data types. It generally comprises arrays, lists,
sets, and so on. Each of the elements has a particular subscript which reflects its position.
Collection Methods
Pl/SQL has some built-in methods under collection which are listed below.
Sl. No. Name Descriptions
3 exists(m) Returns true if mth element present in the collection else returns false.
12 delete(m) Deletes mth element from collection, if mth element is NULL, then no action is performed.
Collection Exceptions
Some of the common collection exceptions are as follows:
1. VALUE_ERROR: This exception is thrown if a subscript cannot be converted to the key type
or is NULL. This exception is normally raised if a key is of type PLS_INTEGER range and the
subscript resides beyond this range.
2. NO_DATA_FOUND: This exception is thrown by PL/SQL if either a SELECT statement
fetches no rows or a program points to an element that is deleted in a nested table. This
exception can also be raised by an element which is uninitialized in an index-by table.
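3. COLLECTION_IS_NULL: This exception is thrown by PL/SQL when a collection method (other than
EXISTS) or an element assignment is applied to a nested table or varray that is still NULL
because it has not been initialized.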
4. SUBSCRIPT_BEYOND_COUNT: This exception is thrown when a subscript is more than the
total count of the number of elements in the collection.
5. SUBSCRIPT_OUTSIDE_LIMIT: This exception is thrown when a subscript is beyond the
threshold range.
Nested Tables In PL/SQL
The nested tables are like a single column database table or a 1-dimensional array where the array
size is dynamic. Its subscript is of numeric type. We can get the nested table into a variable by
giving the rows a subscript that begins with 1. This feature makes it similar in nature like an array.
A nested table can be held in a column of a database. It can also be used for manipulating SQL
operations by joining tables. Since it is like a dynamic array so the upper limit can be of any size.
A nested table can have both dense and sparse collection characteristics which means any element
can be deleted randomly (making it sparse) with the help of the DELETE procedure. The deletion of
data causes a discontinuity in the index but the NEXT function helps to iterate to the next subscripts.
Since the data is stored in the form of a table, it can be retrieved with the help of SELECT
statements.
A nested table can be built at the schema level or in PL/SQL block. It is like a database object which
is accessible within the database or subprogram.
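The sparse behaviour described above can be sketched in a PL/SQL block. The type name name_list and the sample values are made up for illustration.

```sql
DECLARE
  TYPE name_list IS TABLE OF VARCHAR2(20);  -- nested table type local to this block
  names name_list := name_list('Ann', 'Bob', 'Cid', 'Dee');
  idx   PLS_INTEGER;
BEGIN
  names.DELETE(2);  -- removes the second element, making the collection sparse

  -- FIRST and NEXT skip the deleted subscript, so a plain 1..COUNT loop is avoided.
  idx := names.FIRST;
  WHILE idx IS NOT NULL LOOP
    DBMS_OUTPUT.PUT_LINE(idx || ': ' || names(idx));
    idx := names.NEXT(idx);
  END LOOP;
END;
/
```

The loop visits subscripts 1, 3 and 4; referencing names(2) directly after the DELETE would raise NO_DATA_FOUND.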
An associative array is represented by a key-value pair. Each of the unique keys is used to identify
the value in the array. The data type of the key can be a string or an integer defined while creating
it. A key is added to the index-by table by simply assigning a value for the first time. To modify the
same entry, we have to use the same key.
The key should be a unique one either as a primary key in a table or by combining strings together
to develop unique value. This type of collection has an array size that is dynamic and has either
sparse or dense characteristics. One difference between the index-by table and the nested table is
that the former cannot be stored in the column of the database but the nested table can be stored.
The associative arrays provide easy maintenance of subscript and are created within a PL/SQL
block. It is like a SQL table where values are obtained with the help of the primary key. This is
generally used for temporary data storage and can be used instead of SQL tables for avoiding
network traffic and disk storage required by SQL tables.
As associative arrays do not store persistent data, they cannot be used with SQL statements like
SELECT and INSERT. However, they can be made to persist for the duration of a database session by
declaring their type in a package specification and populating them inside the body of the package.
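An associative array with string keys might be used as in the following sketch. The type name stock_t, the keys and the quantities are hypothetical sample data.

```sql
DECLARE
  TYPE stock_t IS TABLE OF NUMBER INDEX BY VARCHAR2(30);  -- key: item name, value: quantity
  stock stock_t;
  item  VARCHAR2(30);
BEGIN
  stock('pens')  := 40;  -- first assignment adds the key
  stock('books') := 12;
  stock('pens')  := 35;  -- same key again: the existing entry is modified

  -- Keys are visited in sorted order, not in insertion order.
  item := stock.FIRST;
  WHILE item IS NOT NULL LOOP
    DBMS_OUTPUT.PUT_LINE(item || ' = ' || stock(item));
    item := stock.NEXT(item);
  END LOOP;
END;
/
```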
The maximum size of a Varray is defined in its type definition. Its elements are arranged one
after another in memory, with subscripts beginning at 1: the lowest location address points to the
first element and the highest location address points to the last element. All the elements of a
Varray are identified with an index.
This type of collection has numeric subscript and has dense characteristics. Thus the array
elements cannot be deleted in between. Either the entire Varray should be deleted or its end can be
trimmed. Due to its dense characteristics, it has less flexibility of use.
The Varray can be created either within a PL/SQL block or at the level of schema. It is treated as a
database object which can be accessed within the database or within a subprogram. Varray is used
more frequently when the size of the array is known to us. It should be initialized prior to using them
and it can be initialized with the help of a constructor. Its value is NULL when declared and should
be initialized before referencing its elements.
Syntax of Varray:
TYPE <<type>> IS {VARRAY | VARYING ARRAY} (<<size>>) OF <<element>> [NOT NULL];
Here,
‘type’ is the type specifier.
‘element’ is the data type.
‘size’ is the maximum number of elements in an array. It is a positive integer.
Varray Variables Declaration And Initialization
After creating a Varray, we can declare it in the way described below:
Syntax:
name type_n [:= type_n(...)];
Here,
‘name’ is the Varray name.
‘type_n’ is the type of Varray.
‘type_n(…)’ is the constructor of type Varray. The argument lists are mentioned by a comma
separator and of type Varray.
We have to initialize a Varray variable before using it else it gives uninitialized collection error. The
initialization is done in the way described below.
Syntax:
name type_n := type_n();
This will initialize the variable with zero elements. In order to populate elements in the varray
variables, the syntax is:
name type_n := type_n(e1, e2, ...);
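The declaration, initialization and resizing steps can be put together in one block. This is a minimal sketch; the type name score_list and the values are illustrative.

```sql
DECLARE
  TYPE score_list IS VARRAY(5) OF NUMBER;   -- at most 5 elements
  scores score_list := score_list(70, 85);  -- constructor call with two elements
BEGIN
  scores.EXTEND;     -- add room for a third element
  scores(3) := 90;
  scores.TRIM;       -- remove the last element again (only the end can be trimmed)

  DBMS_OUTPUT.PUT_LINE('Count: ' || scores.COUNT || ', limit: ' || scores.LIMIT);
  -- prints Count: 2, limit: 5
END;
/
```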
Oracle PL/SQL provides the functionality of fetching the records in bulk rather than
fetching one-by-one. This BULK COLLECT can be used in ‘SELECT’ statement to
populate the records in bulk or in fetching the cursor in bulk. Since the BULK
COLLECT fetches the record in BULK, the INTO clause should always contain a
collection type variable. The main advantage of using BULK COLLECT is it increases
the performance by reducing the interaction between database and PL/SQL engine.
Syntax:
SELECT <column1> BULK COLLECT INTO <bulk_variable> FROM <table name>;
FETCH <cursor_name> BULK COLLECT INTO <bulk_variable>;
In the above syntax, BULK COLLECT is used to collect the data from the ‘SELECT’ and
‘FETCH’ statements.
FORALL Clause
LIMIT Clause
BULK COLLECT Attributes
FORALL Clause
FORALL allows DML operations to be performed on data in bulk. It is similar to the
FOR loop statement, except that in a FOR loop things happen at the record level, whereas
in FORALL there is no LOOP concept. Instead, the entire data present in the given
range is processed at the same time.
Syntax:
FORALL <loop_variable> IN <lower range> .. <higher range>
<DML operations>;
In the above syntax, the given DML operation will be executed for the entire data that
is present between lower and higher range.
LIMIT Clause
The bulk collect concept loads the entire data into the target collection variable as a
bulk i.e. the whole data will be populated into the collection variable in a single-go.
But this is not advisable when the total record that needs to be loaded is very large,
because when PL/SQL tries to load the entire data it consumes more session
memory. Hence, it is always good to limit the size of this bulk collect operation.
However, this size limit can be easily achieved by introducing the ROWNUM condition
in the ‘SELECT’ statement, whereas in the case of cursor this is not possible.
To overcome this Oracle has provided ‘LIMIT’ clause that defines the number of
records that needs to be included in the bulk.
Syntax:
FETCH <cursor_name> BULK COLLECT INTO <bulk_variable> LIMIT <size>;
In the above syntax, the cursor fetch statement uses BULK COLLECT statement along
with the LIMIT clause.
DECLARE
  CURSOR guru99_det IS SELECT emp_name FROM emp;
  TYPE lv_emp_name_tbl IS TABLE OF VARCHAR2(50);
  lv_emp_name lv_emp_name_tbl;
BEGIN
  OPEN guru99_det;
  FETCH guru99_det BULK COLLECT INTO lv_emp_name LIMIT 5000;
  FOR c_emp_name IN lv_emp_name.FIRST .. lv_emp_name.LAST
  LOOP
    Dbms_output.put_line('Employee Fetched:'||lv_emp_name(c_emp_name));
  END LOOP;
  FORALL i IN lv_emp_name.FIRST .. lv_emp_name.LAST
    UPDATE emp SET salary=salary+5000 WHERE emp_name=lv_emp_name(i);
  COMMIT;
  Dbms_output.put_line('Salary Updated');
  CLOSE guru99_det;
END;
/
Output
Employee Fetched:BBB
Employee Fetched:XXX
Employee Fetched:YYY
Salary Updated
Code Explanation:
Code line 2: Declaring the cursor guru99_det for statement ‘SELECT emp_name
FROM emp’.
Code line 3: Declaring lv_emp_name_tbl as table type of VARCHAR2(50)
Code line 4: Declaring lv_emp_name as lv_emp_name_tbl type.
Code line 6: Opening the cursor.
Code line 7: Fetching the cursor using BULK COLLECT with the LIMIT size as
5000 into the lv_emp_name variable.
Code line 8-11: Setting up FOR loop to print all the record in the collection
lv_emp_name.
Code line 12: Using FORALL updating the salary of all the employee by 5000.
Code line 14: Committing the transaction.
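When the result set is larger than the LIMIT, the fetch is usually repeated in a loop so that every batch is processed. The sketch below generates 12 rows from DUAL so that it runs without the emp table; the cursor name, type name and batch size of 5 are assumptions for illustration.

```sql
DECLARE
  -- 12 generated rows stand in for a large table.
  CURSOR num_cur IS SELECT LEVEL AS n FROM dual CONNECT BY LEVEL <= 12;
  TYPE num_tbl IS TABLE OF NUMBER;
  batch num_tbl;
BEGIN
  OPEN num_cur;
  LOOP
    FETCH num_cur BULK COLLECT INTO batch LIMIT 5;
    -- Test COUNT rather than %NOTFOUND so that a final partial batch is not skipped.
    EXIT WHEN batch.COUNT = 0;
    DBMS_OUTPUT.PUT_LINE('Fetched a batch of ' || batch.COUNT || ' rows');
  END LOOP;
  CLOSE num_cur;
END;
/
```

With these numbers the loop reports batches of 5, 5 and 2 rows, so session memory never holds more than 5 rows at a time.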
A managed NoSQL database service, Amazon DynamoDB offers quick and predictable performance
along with seamless scalability. By using DynamoDB, you can delegate the administrative tasks
associated with running and scaling a distributed database, freeing you from having to worry about
hardware provisioning, setup, software patching, replication or cluster scalability. Additionally,
DynamoDB provides encryption at rest, which removes the operational complexity and burden of
protecting sensitive data.
Now that you know what AWS DynamoDB is, let us learn some less-known facts about it:
You can create a new DynamoDB table and import data directly from Amazon S3 without
DynamoDB is supported by AWS Glue Elastic Views as a source to continuously integrate and
To query, update, insert and delete table data in DynamoDB, use the PartiQL query
language.
You can record item-level changes in the DynamoDB tables using Amazon Kinesis Data
Streams.
Even more quickly restore DynamoDB tables.
Export your DynamoDB data to Amazon Simple Storage Service (Amazon S3), then use
Amazon Athena and other AWS services to analyze the data and derive useful insights.
The demand for AWS DynamoDB is quite high and therefore, the opportunities are quite vast. We
assure you that the interview questions will assist you in acing the interview and getting the dream
job and role.
The four scalar data types that DynamoDB supports are as follows:
Numbers
Strings
Binary
Boolean.
Quick in-place atomic updates are supported by Amazon DynamoDB, allowing you to add to or
remove from lists, sets or maps while also incrementing or decrementing a numeric attribute with
just one API call.
GET/PUT operations that use the user-defined unique identifier are supported.
By enabling querying of a non-primary key characteristic using both local and global
quickly.
It enables you to access all the items for a single aggregate partition-sort key along a number
You can specify a condition that must be satisfied for an operation to be completed on an item. A
condition expression can be built from the following:
Comparison operators: =, <>, <, >, <=, >=, BETWEEN, and IN
Logical operators: NOT, AND, and OR
You can also create a conditional expression that is free-form and combines several
5. What are some of the differences between Amazon SimpleDB and Amazon DynamoDB?
Designed for internet-scale applications, Amazon DynamoDB is a fast and scalable NoSQL
database service. It maintains predictable high performance and
is extremely cost-effective for workloads of any scale.
Although it has scaling restrictions, Amazon SimpleDB is a good choice for smaller workloads that
demand query flexibility.
At the expense of performance and scale, it supports query flexibility and automatically indexes
all item attributes.
UpdateTable
CreateTable
DescribeTable
DeleteTable
PutItem
ListTables
UpdateItem
BatchWriteItem
GetItem
Query
DeleteItem
Scan
BatchGetItem.
The term "global secondary index" refers to an index with a partition key and sort key that can
differ from those on the table.
It is regarded as being "global" because queries on the index can cover every item in a table
across all partitions.
A global secondary index is one that has a partition key, or a partition and sort key, that is
distinct from the table's primary key. Because queries on the index can cover all the items in a
Local secondary index: An index with the same partition key as the table but a different sort
key. Since every index partition is bound to a table partition with the same partition key, it is
regarded as being "local."
The console or an API call can be used to remove a Global Secondary Index.
The Global Secondary index can be removed from a table by selecting it on the console,
going to the "Table items" section and choosing the "indexes" tab, and then clicking the
The UpdateTable API call can also be used to delete a Global Secondary Index.
10. How many global secondary indexes are created for each table?
11. What types of API calls does a global secondary index support?
The API calls that Global Secondary Index supports are "Query" and "Scan."
12. On how many different tables can local secondary indexes be created?
Currently, once local secondary indexes are created, Amazon DynamoDB is unable to remove them
from the table; however, the entire table can be deleted.
14. Can I add local secondary indexes to a table that already exists?
No. Adding local secondary indexes to an existing table is not possible at present; a local
secondary index must be defined when the table is created.
When you create a table, you can define a local secondary index on a sort key element that isn't
currently used, so that the index is available for use in the future.
You should therefore try specifying the two following parameters later on when you
Attributes that are copied from the table into the secondary index are known as projected
attributes. The local secondary index only contains primary keys and secondary keys when
The set of attributes that are copied, or projected, from a table into an index is called a
projection. They are in addition to the automatically projected index key attributes and primary
key attributes. The attributes that are projected into the index must always be specified when
defining a local secondary index. Each index has a minimum of three of the following attributes:
It supports GET/PUT operations using the user-specified primary key. The ability to query a non-
primary key attribute using both local and global secondary indexes promotes flexible querying.
DynamoDB permanently stores data (although it is less fast than Redis because we don't
Atomic counters are a feature of DynamoDB that let you increment or decrement the value of an
existing attribute without interfering with other write requests, by using the UpdateItem
operation. The value of the attribute changes each time the operation is executed.
AWS DynamoDB Interview Questions For Experienced
20. What are the main advantages of using DynamoDB over an established MySQL-style SQL-
based database?
DynamoDB has several advantages over conventional SQL databases. First, you don't have to worry
about provisioning or managing servers because it is a completely managed service. Second, you can
quickly increase or decrease capacity as needed because it is highly scalable. Finally, you can
be sure that your data is secure because it has built-in compliance and security features.
21. How is Amazon's NoSQL implementation different from other well-known ones like
Cassandra or MongoDB?
A managed NoSQL database service, DynamoDB provides quick, predictable performance with easy
scalability. Several significant aspects set DynamoDB apart from other well-liked NoSQL
implementations:
DynamoDB provides a managed service rather than needing setup and management by the
user.
DynamoDB utilizes a proprietary query language rather than SQL.
DynamoDB employs a proprietary storage format rather than JSON.
Any application that requires low-latency access to data and is willing to give up some data modeling
flexibility in exchange for performance should consider DynamoDB. Additionally, it is a wise choice
for programs that need to be highly available and are willing to give up some performance to
achieve it.
23. Do you have any restrictions when using DynamoDB? If so, what exactly are they?
Although DynamoDB is a strong tool, it does have some drawbacks. Its inability to handle large
amounts of information is one of its limitations. You might want to think about a different solution
if you need to store a lot of data. You should compare the costs of using DynamoDB with the
advantages it offers because it can be costly to use.
24. What different methods are there for accessing data in DynamoDB?
The Query or Scan APIs can be used to access data stored in DynamoDB. The Scan API reads through
the whole table and returns the items that meet specific criteria, whereas the Query API finds
items by using primary key values.
25. How well do you comprehend DynamoDB Streams?
You can record data changes in your DynamoDB table in almost real-time using DynamoDB
Streams, a feature of DynamoDB. This can be helpful for a variety of purposes, such as auditing or
maintaining a backup copy of your data in a different location for disaster recovery.
You can record changes made to items in a DynamoDB table using a feature called
DynamoDB Streams. Then, you can take action with that data by retrieving it, filtering it, or
exporting it to a different DynamoDB table, among other actions.
It is possible to use DynamoDB to connect data stored in AWS S3. But to do that, you'll need to use
an API designed specifically for DynamoDB.
28. Could you provide me with a few instances of real-world applications that employ the use of
DynamoDB as their main database?
The Amazon.com website, the Kindle Fire tablet line, and the Amazon Web Services cloud
computing service are a few real examples of applications that use DynamoDB as their main
database.
If you are looking for a managed NoSQL database that scales well, DynamoDB is a fantastic
choice. Additionally, if you require precise control over your data, it is a wise choice. If you want a
managed NoSQL database that is feature-rich and simple to use, Firebase is a good choice.
30. What do you know about DynamoDB's partition keys and sort keys?
Sort keys determine the order in which items are stored within a partition, while partition keys
determine which partition an item is stored in. A DynamoDB table can be searched for items using
partition keys and sort keys combined.
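The roles of the two keys can be sketched with a toy in-memory table. This is only an illustration:
the MD5 hash, the fixed count of four partitions, and the ToyTable class are assumptions made for the
example, since DynamoDB's real hash function and partition management are internal to the service.

```python
import bisect
import hashlib

NUM_PARTITIONS = 4  # assumption: the real service manages partition counts itself


def partition_for(partition_key):
    """Map a partition key to a partition number with a stable hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS


class ToyTable:
    def __init__(self):
        # Each partition maps a partition key to a list of (sort_key, value) pairs.
        self.partitions = [{} for _ in range(NUM_PARTITIONS)]

    def put(self, partition_key, sort_key, value):
        rows = self.partitions[partition_for(partition_key)].setdefault(partition_key, [])
        # An item with the same full primary key is replaced, not duplicated.
        rows[:] = [row for row in rows if row[0] != sort_key]
        bisect.insort(rows, (sort_key, value))  # keep rows ordered by sort key

    def query(self, partition_key):
        """Return the values for one partition key, ordered by sort key."""
        rows = self.partitions[partition_for(partition_key)].get(partition_key, [])
        return [value for _, value in rows]


table = ToyTable()
table.put("user#1", "2024-03", "march order")
table.put("user#1", "2024-01", "january order")
table.put("user#2", "2024-01", "other user")
print(table.query("user#1"))
```

The partition key alone decides where an item lives, and the sort key alone decides its position
among the items that share that partition key.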
Local secondary indexes provide the capacity to query the DynamoDB data using a different sort key
from the one used to store the data in the table. This can be helpful if you want to query the data
in a variety of ways.
32. What do you think provisioned throughput means?
DynamoDB's provisioned throughput feature enables users to specify the read and write capacity
needs for their table. The user can then make sure that their table can accommodate the volume
of traffic they anticipate.
The eventual consistency model is a technique for making sure that all replicas of a data item are
eventually updated. It is frequently used in distributed applications, where it may take some time
for changes to reach every node. According to the eventual consistency model, if enough
time passes without new writes, all copies of a data item will have been updated.
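The behaviour described above can be imitated with a toy model. The ReplicatedItem class, its three
replicas, and the explicit propagate() step are assumptions made for illustration; they are not a
description of DynamoDB's actual replication protocol.

```python
class ReplicatedItem:
    """One item stored on several replicas, with delayed propagation."""

    def __init__(self, replicas=3, value=None):
        self.values = [value] * replicas
        self.pending = []  # writes accepted by the leader but not yet copied

    def write(self, value):
        self.values[0] = value  # replica 0 plays the role of the leader
        self.pending.append(value)

    def propagate(self):
        """Copy the latest accepted write to every replica."""
        if self.pending:
            latest = self.pending[-1]
            self.values = [latest] * len(self.values)
            self.pending.clear()

    def read(self, replica=1, consistent=False):
        # A strongly consistent read is served by the leader in this model;
        # an eventually consistent read may be served by any replica.
        return self.values[0] if consistent else self.values[replica]


item = ReplicatedItem(value="v1")
item.write("v2")
stale = item.read(replica=2)         # a follower has not seen the write yet
fresh = item.read(consistent=True)   # the leader already has it
item.propagate()
converged = item.read(replica=2)     # after propagation every replica agrees
```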
34. What happens if an application tries to make more read or write requests per second than is
permitted?
Applications will encounter throttling errors if they try to read or write more than the maximum
number of request units per second allowed. The application will need to increase its allotted
request units or decrease the number of requests it is currently making.
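One common way to decrease the request rate when throttled is to retry with exponential backoff and
jitter. The sketch below shows the idea in plain Python. ThrottledError and call_with_backoff are
hypothetical names invented for this example, and the delays are arbitrary; the AWS SDKs ship their
own retry logic, which this does not reproduce.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throughput-exceeded error from the service."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Retry `operation` with exponential backoff and jitter while it is throttled."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            # Wait a random time between 0 and base_delay * 2^attempt seconds.
            sleep(random.uniform(0, base_delay * (2 ** attempt)))


attempts = {"count": 0}


def flaky_write():
    # Simulated write that is throttled twice before it succeeds.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ThrottledError("too many requests")
    return "written"


delays = []
result = call_with_backoff(flaky_write, sleep=delays.append)
```

Passing sleep as a parameter keeps the example fast to run, because the demonstration records the
delays instead of actually waiting.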
35. Can you describe how conditional writes are used by DynamoDB to enhance performance?
By enabling you to clearly state conditions on write operations that need to be met in order for the
write to be successful, DynamoDB uses conditional writes to help enhance efficiency. By doing this,
you can prevent overwriting data which has already been revised by another process or writing
duplicate data.
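A minimal sketch of the idea, using an ordinary dictionary in place of a table: the write goes
through only if the stored version number matches the one the writer last saw. The function
conditional_put, the ConditionFailed error, and the version attribute are assumptions for the
example. In DynamoDB the condition is evaluated by the service as part of the write itself, whereas
this in-memory check is not safe against concurrent writers.

```python
class ConditionFailed(Exception):
    """Hypothetical error raised when the write condition is not met."""


def conditional_put(table, key, new_item, expected_version):
    """Write new_item only if the stored version equals expected_version."""
    current = table.get(key)
    current_version = current["version"] if current else None
    if current_version != expected_version:
        raise ConditionFailed(
            f"expected version {expected_version}, found {current_version}"
        )
    # The condition holds: store the item with an incremented version number.
    table[key] = dict(new_item, version=(expected_version or 0) + 1)
    return table[key]


table = {}
conditional_put(table, "user#1", {"name": "Ann"}, expected_version=None)  # first write
conditional_put(table, "user#1", {"name": "Anna"}, expected_version=1)    # update

try:
    # A second writer still holding version 1 is rejected instead of overwriting.
    conditional_put(table, "user#1", {"name": "Bob"}, expected_version=1)
    rejected = False
except ConditionFailed:
    rejected = True
```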
move from SQL to NoSQL. Basically, it charges for data reading, writing, and stashing as well
availability zones in a region or it may be made available across multiple regions. All data
servers.
Simple Administration: Because Amazon DynamoDB is a fully managed service, you don't
have to worry about setting up and configuring hardware or software, applying software
It is a well-designed NoSQL database service that offers fast and predictable performance
and scales easily. Additionally, it allows users to delegate the running and scaling of a
distributed database to AWS, so they won't need to worry about
setup, configuration, hardware provisioning, replication, throughput capacity planning, cluster
scaling, or software patching.
NoSQL databases are non-relational databases. These databases are divided into four
groups, which are as follows:
Key-value stores
Graph stores
Column stores
Document stores
4. Is AWS DynamoDB free?
You pay only for the resources you provision in Amazon DynamoDB. You can start out with DynamoDB's
free tier, whose limits are enough to power many applications. Beyond that, monthly prices vary
depending on the kind of resources you need.
It serves as DynamoDB's entry point. The DynamoDBMapper class allows users to connect to a
DynamoDB endpoint, allowing them to execute queries and scans against tables as well as carry
out CRUD operations on items and access their data in various tables.
It is a database service that offers and facilitates the storing, updating, and querying of objects that
are recognised using key and value pairs and make up the actual material that is being stored.
With DynamoDB auto scaling, a table or a global secondary index can automatically scale its read
and write capacity up and down.
Multiple items across multiple tables can be put or deleted using the Amazon DynamoDB
BatchWriteItem operation in just one request, but not in a single transaction. It supports
combinations of up to 25 put or delete requests with a total combined request size of up to 16 MB.
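Because of the 25-request limit, a larger set of writes has to be split into several batches before
it is sent. A minimal sketch of that splitting step is shown below; chunk_write_requests is a
hypothetical helper, and sending each batch to the service is left out.

```python
def chunk_write_requests(requests, batch_size=25):
    """Split a list of write requests into batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    return [requests[i:i + batch_size] for i in range(0, len(requests), batch_size)]


# 60 hypothetical put requests become three batches.
pending = [{"PutRequest": {"Item": {"id": {"N": str(n)}}}} for n in range(60)]
batches = chunk_write_requests(pending)
batch_sizes = [len(batch) for batch in batches]
```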
You can generate an access key and a secret key by creating a user in AWS IAM which is Identity
Access Management.
This class is thread-safe and can be shared across threads. DynamoDBMapper throws
DynamoDBMappingException from the load, save, and delete methods to
indicate that domain classes are incorrectly annotated or otherwise incompatible with this class.
Key Upshots
Key-value and document data models are supported by the NoSQL database Amazon DynamoDB.
The use of DynamoDB by developers allows them to create serverless, modern applications that
can scale globally and support petabytes of data as well as tens of millions of read and write
requests per second. High-performance, internet-scale applications that would tax conventional
relational databases can run on DynamoDB.
DynamoDB is a fully managed NoSQL database service. It is backed by AWS and provides exciting features
like seamless scalability, fast performance, and high reliability over data. DynamoDB supports both key-value
and document data structures. In addition, this service comes with different pricing tiers to suit varying user
requirements.
DynamoDB is quite effective at data storing and retrieval in all traffic levels and allows the users to create
tables for the database.
NoSQL or non-relational databases focus on different data storing models rather than a tabular structure.
There are four types of NoSQL databases:
1. Key-value stores
2. Document stores
3. Graph stores
4. Column stores
DynamoDB supports both document and key-value structures.
We can store any amount of data with the unlimited storage provided by the
DynamoDB service.
The data we store is replicated across multiple Availability Zones. This makes it easier to
cope with global-scale applications and keeps sensitive information highly
available.
It is highly cost-effective, and the users have to pay only for what they use.
Easy administration and the user doesn't have to worry about software patching,
setup, and configuration because AWS fully manages DynamoDB.
DynamoDB has an advanced system for reporting and highly secured user
authentication mechanisms to provide maximum security over sensitive data.
Read more: DynamoDB Pros & Cons.
05. What are the disadvantages of DynamoDB?
The DynamoDBMapper class is an entry point that allows access to DynamoDB endpoints, enabling users
to handle the database. Users can perform CRUD operations, run queries, and scan against tables. And this
class is only available for Java.
Read more: DynamoDBMapper.
08. What is meant by Partition Key in DynamoDB?
A partition key is a primary key composed of only a single attribute. In DynamoDB, the value of the partition
key works as the input to an internal hash function. The output of that function determines the
partition in which the item is stored.
DynamoDB provides two options for fetching data from tables: Query and Scan. When using Scan,
DynamoDB looks through the complete table for records that match the criteria, while Query uses key
constraints to perform a direct lookup for a particular data set.
In addition to the primary key, DynamoDB uses global secondary indexes, local secondary indexes, and the
partition key to help improve flexibility and the speed of read/write operations.
As a result, Query is fast and time-effective compared to the DynamoDB Scan operation and is recommended
for most data fetching scenarios.
10. What are the key differences between Amazon DynamoDB and Amazon Aurora?
GetItem
Query
Scan
BatchGet
TransactRead
As its name suggests, projections are the attributes of a table that are copied (projected) into an index,
similar to the column list of a SELECT statement in SQL. Projecting only the needed attributes excludes the
unnecessary ones and reduces the overall size of the payload returned by the API.
We have to define the projected attributes each time we create a local secondary index. Each index must
have at least three attributes: table partition key, index sort key, and table sort key.
There is long-term storage and a two-tier backup system in DynamoDB to keep data loss at a minimal level.
There are three nodes for each partition, and each node contains the same data from the partition. In
addition, there is a B-tree for locating data and a replication log to track the changes in each node.
DynamoDB takes snapshots of these and stores them in another AWS database for a month for data
restoration when necessary.
DynamoDB supports quick in-place atomic updates enabling users to add or remove values to sets or lists at
the atomic level using Transactions.
DynamoDB Streams allow capturing the time-ordered sequence of item-level modifications made to a
DynamoDB table. This information is stored in a log for up to 24 hours, and each modification is
recorded in the order in which it was made.
Read more: DynamoDB Streams.
16. What are the DynamoDB pricing tiers?
400KB
Including both the attribute name length and the value lengths in binary format, 400KB is the maximum item
size.
Most of the time, it is hard to predict the database's workload beforehand. So, DynamoDB introduced
DynamoDB Auto Scaling to scale the read and write capacity in response to the traffic.
Scaling up gives the tables or the global secondary indexes more read and write capacity, whereas scaling
down automatically based on the traffic makes DynamoDB cost-effective to use.
DynamoDB Local is a downloadable version of DynamoDB. It allows developing and testing applications in
the local environment without using the DynamoDB web service. Once the application is ready for
deployment, the local endpoint can be changed and redirected to the DynamoDB web service.
20. How many Global Secondary Indexes can you create on a single table?
Encryption at Rest is a security mechanism DynamoDB uses to protect sensitive data. This database service
uses the AWS KMS (AWS Key Management Service) keys to encrypt all the data at rest. There are three
types of AWS KMS keys to select from:
DAX (Amazon DynamoDB Accelerator) is a type of in-memory cache. Even when it is millions of requests per
second, DynamoDB Accelerator provides a performance up to 10 times the original rate. In addition, it is fully
managed and is highly available.
Read more: DAX.
23. What are DynamoDB Global Tables?
DynamoDB Global Tables allow users to replicate their data over different regions of choice. That makes the
data highly available and quickly delivered across global applications of enormous size. Every data write
made to a global table is replicated over all the regions having replicas of the same table.
24. What does BatchGetItem do in DynamoDB?
BatchGetItem allows retrieving attributes of one or more items from one or more tables using the primary
key. There is a limitation of 16MB up to which this operation can return items.
Read more: BatchGetItem.
25. What are Indexes and Secondary Indexes in DynamoDB?
An index is a data structure that enhances the data retrieval speed from the database. However, it costs
some storage space and additional writes to the database to maintain the index data structure.
Closing Thoughts
This article discussed 25 of the top questions you may face about DynamoDB in an interview. DynamoDB is
a developing technology becoming a huge asset amongst the NoSQL database services. Because of that,
positions are opening up for individuals passionate about working with this fantastic service.
So, I hope you find this article helpful to crush your next interview with confidence!
3. What is NoSQL?
Ans: NoSQL refers to non-relational databases that do not use SQL for querying and
are designed to handle large amounts of unstructured data.
Ans: A partition key is the primary key for a DynamoDB table and is used to partition
data across multiple servers.
Ans: A sort key is a secondary key for a DynamoDB table and is used to sort items
within a partition.
Ans: DynamoDB optimizes queries by selecting the most efficient index and partition to
retrieve data.
Ans: A secondary index is an index that allows you to query a DynamoDB table using
an alternate key.
Ans: A local secondary index is an index that allows you to query a DynamoDB table
using an alternate key and must be created when the table is created.
Ans: Eventual consistency in DynamoDB means that it may take some time for all
replicas of an item to be updated after a write operation.
Ans: Strong consistency in DynamoDB means that a read returns the most up-to-date data,
reflecting all writes that succeeded before the read was performed.
Ans: A DynamoDB table stores data while a DynamoDB stream captures changes to
the data.
18. What is the maximum number of DynamoDB streams allowed per table?
Ans: DynamoDB Auto Scaling automatically adjusts the read and write capacity of a
DynamoDB table based on traffic patterns.
Ans: DynamoDB provides encryption at rest and in transit using AWS Key Management
Service (KMS).
Ans: A strongly-typed attribute in DynamoDB has a defined data type while a weakly-
typed attribute does not.
Ans: A PUT operation in DynamoDB adds a new item or replaces an existing item while
an UPDATE operation modifies an existing item.
25. What is the maximum number of items that can be retrieved in a single
DynamoDB query?
Ans: A single DynamoDB query is limited by data size rather than item count: it returns at
most 1 MB of data per call, and the remaining results are retrieved by paginating.
26. What is the difference between a query and a scan operation in
DynamoDB?
Ans: A query operation retrieves items based on a specified partition key and sort key
while a scan operation retrieves all items in a table.
29. What is the maximum number of items that can be retrieved in a single
BatchGetItem operation?
Ans: A BatchGetItem operation retrieves multiple items from one or more tables while
a Query operation retrieves items from a single table based on a partition key and sort
key.
31. What is the maximum number of tables that can be created in a single
AWS account?
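Ans: The number of tables is governed by a per-Region account quota rather than a hard limit.
The default was historically 256 tables per Region, and the quota can be raised on request.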
32. What is the maximum number of secondary indexes that can be created on
a single DynamoDB table?
33. What is the difference between a local and a global secondary index in
DynamoDB?
Ans: A local secondary index has the same partition key as the table but a different sort key,
while a global secondary index can have a partition key and sort key that differ from the table's.
34. What is the difference between a strong and eventual consistent read in
DynamoDB?
Ans: A strongly consistent read in DynamoDB returns the most up-to-date data while an
eventually consistent read may return stale data.
Binary
Boolean
Numbers
Strings
37. What is the maximum number of tags that can be applied to a single
DynamoDB table?
Ans: The maximum number of tags that can be applied to a single DynamoDB table is
50.
Ans: A conditional write in DynamoDB only performs the write operation if a specified
condition is met while a non-conditional write always performs the write operation.
Ans: The DynamoDB CLI is a command-line interface for DynamoDB that allows you to
manage DynamoDB tables and data.
40. What is the purpose of the DynamoDB data mapper for JavaScript?
Ans: The DynamoDB data mapper for JavaScript is a library that allows you to map
JavaScript objects to DynamoDB tables.
41. What is the difference between a hash key and a range key in DynamoDB?
Ans: A hash key is a required attribute for every item in a DynamoDB table and is used
to determine the partition where the item is stored. A range key is an optional attribute
that is used to sort items with the same partition key.
43. What is the difference between a stream view type of NEW_IMAGE and
OLD_IMAGE?
Ans: A stream view type of NEW_IMAGE includes the new values of the modified item
while a stream view type of OLD_IMAGE includes the old values of the modified item.
Ans: A stream view type of NEW_AND_OLD_IMAGES includes both the new and old
values of the modified item while a stream view type of NEW_IMAGE only includes the
new values.
49. What is the difference between a table backup and a table export in
DynamoDB?
Ans: A table backup in DynamoDB includes all of the table’s attributes and indexes
while a table export only includes the attributes that you specify.
Ans: The PutItem operation in DynamoDB writes a single item to a table while the
BatchWriteItem operation writes multiple items to one or more tables in a single
request.
Apache Spark is an open-source framework engine that is known for its speed, easy-to-use nature
in the field of big data processing and analysis. It also has built-in modules for graph processing,
machine learning, streaming, SQL, etc. The spark execution engine supports in-memory
computation and cyclic data flow and it can run either on cluster mode or standalone mode and can
access diverse data sources like HBase, HDFS, Cassandra, etc.
2. What are the features of Apache Spark?
High Processing Speed: Apache Spark helps in the achievement of a very high processing speed of
data by reducing read-write operations to disk. The speed is almost 100x faster while performing in-
memory computation and 10x faster while performing disk computation.
Dynamic Nature: Spark provides 80 high-level operators which help in the easy development of
parallel applications.
In-Memory Computation: The in-memory computation feature of Spark due to its DAG execution
engine increases the speed of data processing. This also supports data caching and reduces the time
required to fetch data from the disk.
Reusability: Spark codes can be reused for batch-processing, data streaming, running ad-hoc queries,
etc.
Fault Tolerance: Spark supports fault tolerance using RDD. Spark RDDs are the abstractions designed
to handle failures of worker nodes which ensures zero data loss.
Stream Processing: Spark supports stream processing in real-time. The problem in the earlier
MapReduce framework was that it could process only already existing data.
Lazy Evaluation: Spark transformations done using Spark RDDs are lazy. Meaning, they do not
generate results right away, but they create new RDDs from existing RDD. This lazy evaluation
increases the system efficiency.
Support Multiple Languages: Spark supports multiple languages like R, Scala, Python, Java which
provides dynamicity and helps in overcoming the Hadoop limitation of application development only
using Java.
Hadoop Integration: Spark also supports the Hadoop YARN cluster manager thereby making it
flexible.
Supports Spark GraphX for graph parallel execution, Spark SQL, libraries for Machine learning, etc.
Cost Efficiency: Apache Spark is considered a better cost-efficient solution when compared to
Hadoop as Hadoop required large storage and data centers while data processing and replication.
Active Developer’s Community: Apache Spark has a large developers base involved in continuous
development. It is considered to be the most important project undertaken by the Apache
community.
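The lazy-evaluation point in the feature list above can be illustrated without Spark at all. The
sketch below uses plain Python generators as an analogy: building the pipeline does no work, and only
the final step that consumes it triggers the computation. This is an analogy under that assumption,
not Spark code; in Spark the same split exists between transformations and actions.

```python
log = []


def numbers():
    # Stand-in for a data source; every element read is recorded in the log.
    for n in range(1, 6):
        log.append(f"read {n}")
        yield n


# "Transformations": nothing is computed yet, only a pipeline is described.
squared = (n * n for n in numbers())
even_squares = (n for n in squared if n % 2 == 0)
work_done_before_action = len(log)

# "Action": consuming the pipeline finally triggers the work.
result = list(even_squares)
work_done_after_action = len(log)
```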
3. What is RDD?
RDD stands for Resilient Distributed Dataset. It is a fault-tolerant collection of elements that
can be operated on in parallel. The partitioned data of an RDD is distributed and immutable. There
are two types of datasets:
DAG stands for Directed Acyclic Graph with no directed cycles. There would be finite vertices and
edges. Each edge from one vertex is directed to another vertex in a sequential manner. The vertices
refer to the RDDs of Spark and the edges represent the operations to be performed on those
RDDs.
Client Mode: The deploy mode is said to be in client mode when the spark driver component runs on
the machine node from where the spark job is submitted.
o The main disadvantage of this mode is if the machine node fails, then the entire job fails.
o This mode supports both interactive shells or the job submission commands.
o The performance of this mode is worst and is not preferred in production environments.
Cluster Mode: If the spark job driver component does not run on the machine from which the spark
job has been submitted, then the deploy mode is said to be in cluster mode.
o The spark job launches the driver component within the cluster as a part of the sub-process
of ApplicationMaster.
o This mode supports deployment only using the spark-submit command (interactive shell
mode is not supported).
o Here, since the driver programs are run in ApplicationMaster, in case the program fails, the
driver program is re-instantiated.
o In this mode, there is a dedicated cluster manager (such as stand-alone, YARN, Apache Mesos,
Kubernetes, etc) for allocating the resources required for the job to run as shown in the below
architecture.
Apart from the above two modes, if we have to run the application on our local machines for unit
testing and development, the deployment mode is called “Local Mode”. Here, the jobs run on a
single JVM in a single machine which makes it highly inefficient as at some point or the other there
would be a shortage of resources which results in the failure of jobs. It is also not possible to scale
up resources in this mode due to the restricted memory and space.
Receivers are those entities that consume data from different data sources and then move them to
Spark for processing. They are created by using streaming contexts in the form of long-running
tasks that are scheduled for operating in a round-robin fashion. Each receiver is configured to use
up only a single core. The receivers are made to run on various executors to accomplish the task of
data streaming. There are two types of receivers depending on how the data is sent to Spark:
Reliable receivers: Here, the receiver sends an acknowledgement to the data sources after successful
reception of data and its replication on the Spark storage space.
Unreliable receiver: Here, there is no acknowledgement sent to the data sources.
Repartition | Coalesce
Usage: repartition can increase/decrease the number of data partitions. | Spark coalesce can only reduce the number of data partitions.
Repartition creates new data partitions and performs a full shuffle of evenly distributed data. | Coalesce makes use of already existing partitions to reduce the amount of shuffled data unevenly.
Repartition internally calls coalesce with shuffle parameter thereby making it slower than coalesce. | Coalesce is faster than repartition. However, if there are unequal-sized data partitions, the speed might be slightly slower.
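The difference between the two operations can be sketched with plain Python lists standing in for
partitions. The functions coalesce and repartition below are simplified models written for this
example (merging whole partitions versus redistributing every record round-robin) and are not Spark's
implementation.

```python
def coalesce(partitions, n):
    """Merge existing partitions into n groups without splitting any partition."""
    n = min(n, len(partitions))  # coalesce can only reduce the partition count
    merged = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        merged[i % n].extend(part)
    return merged


def repartition(partitions, n):
    """Redistribute every record round-robin over n new partitions (full shuffle)."""
    records = [record for part in partitions for record in part]
    out = [[] for _ in range(n)]
    for i, record in enumerate(records):
        out[i % n].append(record)
    return out


skewed = [[1, 2, 3, 4, 5, 6], [7], [8], [9]]
coalesced_sizes = [len(p) for p in coalesce(skewed, 2)]
repartitioned_sizes = [len(p) for p in repartition(skewed, 2)]
```

In this model coalesce keeps the skew of the input because whole partitions are moved together, while
repartition evens out the sizes at the cost of touching every record.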
Spark supports both raw files and structured file formats for efficient reading and
processing. File formats like Parquet, JSON, XML, CSV, RC, Avro, TSV, etc. are supported by Spark.
The process of redistribution of data across different partitions which might or might not cause
data movement across the JVM processes or the executors on the separate machines is known as
shuffling/repartitioning. Partition is nothing but a smaller logical division of data.
It is to be noted that Spark has no control over what partition the data gets distributed across.
Spark can run on YARN, which provides a central resource management
platform for delivering scalable operations throughout the cluster.
YARN is a cluster management technology, whereas Spark is a tool for data processing.
Spark Interview Questions for Experienced
11. How is Apache Spark different from MapReduce?
MapReduce | Apache Spark
MapReduce highly depends on disk which makes it to be a high latency framework. | Spark supports in-memory data storage and caching and makes it a low latency computation framework.
12. Explain the working of Spark with the help of its architecture.
Spark applications are run in the form of independent processes that are well coordinated by the
Driver program by means of a SparkSession object. The cluster manager or the resource manager
entity of Spark assigns the tasks of running the Spark jobs to the worker nodes as per one task per
partition principle. There are various iterations algorithms that are repeatedly applied to the data to
cache the datasets across various iterations. Every task applies its unit of operations to the dataset
within its partition and results in the new partitioned dataset. These results are sent back to the
main driver application for further processing or to store the data on the disk. The following
diagram illustrates this working as described above:
13. What is the working of DAG in Spark?
DAG stands for Directed Acyclic Graph, which has a finite set of vertices and edges. The vertices
represent RDDs and the edges represent the operations to be performed on RDDs sequentially.
The DAG created is submitted to the DAG Scheduler which splits the graphs into stages of tasks
based on the transformations applied to the data. The stage view has the details of the RDDs of
that stage.
The working of DAG in spark is defined as per the workflow diagram below:
The first task is to interpret the code with the help of an interpreter. If you use the Scala code, then
the Scala interpreter interprets the code.
Spark then creates an operator graph when the code is entered in the Spark console.
When the action is called on Spark RDD, the operator graph is submitted to the DAG Scheduler.
The operators are divided into stages of task by the DAG Scheduler. The stage consists of detailed
step-by-step operation on the input data. The operators are then pipelined together.
The stages are then passed to the Task Scheduler, which launches the tasks via the cluster manager. The
Task Scheduler works on each stage independently, without knowledge of the dependencies between stages.
The worker nodes then execute the task.
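The stage-splitting step in the workflow above can be sketched in a few lines. The example assumes a
simple linear chain of operations and a hand-picked set of operations that need a shuffle; the real
DAG Scheduler works on a graph of RDD dependencies, so this is a simplification for illustration.

```python
# Assumption for the sketch: these operations require a shuffle between stages.
WIDE = {"reduceByKey", "groupByKey", "join", "repartition"}


def split_into_stages(operations):
    """Group a linear chain of operations into stages, cutting before each shuffle."""
    stages, current = [], []
    for op in operations:
        if op in WIDE and current:
            stages.append(current)  # close the current stage at the shuffle boundary
            current = []
        current.append(op)
    if current:
        stages.append(current)
    return stages


word_count = ["textFile", "flatMap", "map", "reduceByKey", "filter"]
stages = split_into_stages(word_count)
```

Operations that stay inside one stage can be pipelined on the same partition, which is why only the
shuffle operations start a new stage here.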
Each RDD keeps track of the pointer to one/more parent RDD along with its relationship with the
parent. For example, consider the operation val childB=parentA.map() on RDD, then we have the RDD
childB that keeps track of its parentA which is called RDD lineage.
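A minimal sketch of lineage tracking, mirroring the parentA and childB example above. ToyRDD is a
hypothetical class written for this illustration, not part of the Spark API: each child stores only a
pointer to its parent and the function to apply, and the data is recomputed from the source on demand.

```python
class ToyRDD:
    """A minimal stand-in for an RDD that remembers how it was derived."""

    def __init__(self, data=None, parent=None, func=None, name="source"):
        self._data = data
        self.parent = parent
        self.func = func
        self.name = name

    def map(self, func, name="map"):
        # No data is copied: the child only records its parent and the function.
        return ToyRDD(parent=self, func=func, name=name)

    def lineage(self):
        """Walk the parent pointers back to the source."""
        node, chain = self, []
        while node is not None:
            chain.append(node.name)
            node = node.parent
        return chain[::-1]

    def collect(self):
        """Recompute the data by replaying the lineage from the source."""
        if self.parent is None:
            return list(self._data)
        return [self.func(x) for x in self.parent.collect()]


parentA = ToyRDD([1, 2, 3], name="parentA")
childB = parentA.map(lambda x: x * 10, name="childB")
print(childB.lineage(), childB.collect())
```

Because a lost result can always be rebuilt by replaying the chain, the lineage is what makes
recomputation after a failure possible.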
14. Under what scenarios do you use Client and Cluster modes for deployment?
In case the client machines are not close to the cluster, then the Cluster mode should be used for
deployment. This is done to avoid the network latency caused by communication between the driver and
the executors, which would occur in the Client mode. Also, in Client mode, the entire process is lost
if the machine goes offline.
If we have the client machine inside the cluster, then the Client mode can be used for deployment.
Since the machine is inside the cluster, there won’t be issues of network latency and since the
maintenance of the cluster is already handled, there is no cause of worry in cases of failure.
Spark Streaming is one of the most important features provided by Spark. It is nothing but a Spark
API extension for supporting stream processing of data from different sources.
Data from sources like Kafka, Kinesis, Flume, etc are processed and pushed to various destinations
like databases, dashboards, machine learning APIs, or as simple as file systems. The data is divided
into various streams (similar to batches) and is processed accordingly.
Spark streaming supports highly scalable, fault-tolerant continuous stream processing which is mostly
used in cases like fraud detection, website monitoring, website click baits, IoT (Internet of Things)
sensors, etc.
Spark Streaming first divides the data from the data stream into batches of X seconds which are
called Dstreams or Discretized Streams. They are internally nothing but a sequence of multiple RDDs.
The Spark application does the task of processing these RDDs using various Spark APIs and the
results of this processing are again returned as batches. The following diagram explains the workflow
of the spark streaming process.
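The batching step can be sketched in plain Python. The discretize function below is a hypothetical
helper that groups timestamped events into consecutive windows of a fixed length, which is the idea
behind a discretized stream; it is an illustration only and does not use the Spark Streaming API.

```python
from collections import defaultdict


def discretize(events, batch_seconds):
    """Group (timestamp, value) events into consecutive batches of batch_seconds."""
    batches = defaultdict(list)
    for timestamp, value in events:
        # Events whose timestamps fall in the same window share a batch index.
        batches[int(timestamp // batch_seconds)].append(value)
    return [batches[index] for index in sorted(batches)]


events = [(0.5, "a"), (1.2, "b"), (2.1, "c"), (3.9, "d"), (4.0, "e")]
micro_batches = discretize(events, batch_seconds=2)
print(micro_batches)
```

Each inner list plays the role of one RDD in the sequence, and the processing logic is then applied
batch by batch.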
16. Write a spark program to check if a given keyword exists in a huge text file or not?
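from operator import add
from pyspark import SparkContext

sparkContext = SparkContext.getOrCreate()

def keywordExists(line):
    # Return 1 if the keyword occurs in this line, otherwise 0.
    return 1 if line.find("my_keyword") > -1 else 0

lines = sparkContext.textFile("test_file.txt")
isExist = lines.map(keywordExists)
total = isExist.reduce(add)  # add up the per-line flags across all partitions
print("Found" if total > 0 else "Not Found")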
Spark Datasets are those data structures of SparkSQL that provide JVM objects with all the
benefits (such as data manipulation using lambda functions) of RDDs alongside Spark SQL-
optimised execution engine. This was introduced as part of Spark since version 1.6.
Spark datasets are strongly typed structures that represent the structured queries along with their
encoders.
They provide type safety to the data and also give an object-oriented programming interface.
The datasets are more structured and have the lazy query expression which helps in triggering the
action. Datasets have the combined powers of both RDD and Dataframes. Internally, each dataset
symbolizes a logical plan which informs the computational query about the need for data production.
Once the logical plan is analyzed and resolved, then the physical query plan is formed that does the
actual query execution.
Optimized Query feature: Spark datasets provide optimized queries using Tungsten and Catalyst
Query Optimizer frameworks. The Catalyst Query Optimizer represents and manipulates a data flow
graph (graph of expressions and relational operators). The Tungsten improves and optimizes the
speed of execution of Spark job by emphasizing the hardware architecture of the Spark execution
platform.
Compile-Time Analysis: Datasets have the flexibility of analyzing and checking the syntaxes at the
compile-time which is not technically possible in RDDs or Dataframes or the regular SQL queries.
Interconvertible: Type-safe datasets can be converted to “untyped” Dataframes, and back, by
making use of the following methods provided by the DatasetHolder:
o toDS(): Dataset[T]
o toDF(): DataFrame
o toDF(colNames: String*): DataFrame
Faster Computation: Datasets implementation are much faster than those of the RDDs which helps
in increasing the system performance.
Persistent storage qualified: Since the datasets are both queryable and serializable, they can be easily
stored in any persistent storages.
Less Memory Consumed: Spark uses the feature of caching to create a more optimal data layout.
Hence, less memory is consumed.
Single Interface Multiple Languages: Single API is provided for both Java and Scala languages. These
are widely used languages for using Apache Spark. This results in a lesser burden of using libraries for
different types of inputs.
Spark Dataframes are the distributed collection of datasets organized into columns similar to SQL.
It is equivalent to a table in the relational database and is mainly optimized for big data operations.
Dataframes can be created from an array of data from different data sources such as external
databases, existing RDDs, Hive Tables, etc. Following are the features of Spark Dataframes:
Spark Dataframes have the ability of processing data in sizes ranging from Kilobytes to Petabytes on
a single node to large clusters.
They support different data formats like CSV, Avro, Elasticsearch, etc., and various storage systems
like HDFS, Cassandra, MySQL, etc.
By making use of the SparkSQL Catalyst optimizer, state-of-the-art optimization is achieved.
It is possible to easily integrate Spark Dataframes with major Big Data tools using SparkCore.
Every Spark application has the same fixed core count and fixed heap size defined for its Spark
executors. The heap size refers to the memory of the Spark executor, which is controlled by
the property spark.executor.memory, which corresponds to the --executor-memory flag. Every
Spark application has one allocated executor on each worker node it runs on. The executor memory
is a measure of the memory on the worker node that the application utilizes.
SparkCore is the main engine that is meant for large-scale distributed and parallel data processing.
The Spark core consists of the distributed execution engine that offers various APIs in Java,
Python, and Scala for developing distributed ETL applications.
Spark Core does important functions such as memory management, job monitoring, fault-tolerance,
storage system interactions, job scheduling, and providing support for all the basic I/O
functionalities. There are various additional libraries built on top of Spark Core which allows
diverse workloads for SQL, streaming, and machine learning. They are responsible for:
Fault recovery
Memory management and Storage system interactions
Job monitoring, scheduling, and distribution
Basic I/O functions
Worker nodes are those nodes that run the Spark application in a cluster. The Spark driver program
listens for and accepts incoming connections from its executors and sends tasks to them for
execution on the worker nodes. A worker node is like a slave node: it gets work from its
master node and actually executes it. The worker nodes do the data processing and report the
resources used to the master. The master decides what amount of resources needs to be allocated
and then, based on their availability, the tasks are scheduled for the worker nodes by the master.
22. What are some of the demerits of using Spark in applications?
Despite Spark being the powerful data processing engine, there are certain demerits to using
Apache Spark in applications. Some of them are:
Spark uses more memory than Hadoop MapReduce, which may lead to certain memory-based
problems.
Care must be taken by the developers while running the applications. The work should be distributed
across multiple nodes of the cluster instead of running everything on a single node.
Since Spark makes use of “in-memory” computations, they can be a bottleneck to cost-efficient big
data processing.
While using files present on the path of the local filesystem, the files must be accessible at the same
location on all the worker nodes when working on cluster mode as the task execution shuffles
between various worker nodes based on the resource availabilities. The files need to be copied on all
worker nodes or a separate network-mounted file-sharing system needs to be in place.
One of the biggest problems while using Spark is handling a large number of small files. When
Spark is used with Hadoop, HDFS is designed for a limited number of large files rather than a
large number of small files. When there is a large number of small gzipped files, Spark needs to
uncompress these files while holding them in memory and moving them over the network. So a large
amount of time is spent burning core capacity on unzipping the files in sequence and repartitioning
the resulting RDDs to get data in a manageable format, which requires extensive shuffling overall.
This impacts the performance of Spark as much time is spent preparing the data instead of
processing it.
Spark doesn’t work well in multi-user environments as it is not capable of handling many users
concurrently.
23. How can the data transfers be minimized while working with Spark?
Data transfers correspond to the process of shuffling. Minimizing these transfers results in faster
and more reliable Spark applications. There are various ways in which they can be minimized.
They are:
Usage of Broadcast Variables: Broadcast variables increase the efficiency of joins between large
and small RDDs.
Usage of Accumulators: These help to update variable values in parallel during execution.
Another common way is to avoid the operations that trigger these reshuffles.
SchemaRDD is an RDD consisting of row objects (wrappers around arrays of integers or strings)
together with schema information regarding the data type of each column. SchemaRDDs were designed
to ease the lives of developers while debugging the code and while running unit test cases on the
SparkSQL modules. They represent the description of the RDD, which is similar to the schema of
relational databases. SchemaRDD also provides the basic functionalities of the common RDDs
along with some relational query interfaces of SparkSQL.
Consider an example. If you have an RDD named Person that represents a person’s data, then
SchemaRDD describes what data each row of the Person RDD represents. If the Person has attributes
like name and age, then they are represented in SchemaRDD.
Spark provides a powerful module called SparkSQL which performs relational data processing
combined with the power of the functional programming feature of Spark. This module supports
querying data by means of either SQL or Hive Query Language. It also provides support for different
data sources and helps developers write powerful SQL queries using code transformations.
The four major libraries of SparkSQL are:
Spark SQL supports the usage of structured and semi-structured data in the following ways:
Spark supports DataFrame abstraction in various languages like Python, Scala, and Java along with
providing good optimization techniques.
SparkSQL supports data read and write operations in various structured formats like JSON, Hive,
Parquet, etc.
SparkSQL allows data querying inside the Spark program and via external tools that do the
JDBC/ODBC connections.
It is recommended to use SparkSQL inside the Spark applications as it empowers the developers to
load the data, query the data from databases and write the results to the destination.
Spark persists intermediary data from different shuffle operations automatically. But it is
recommended to call the persist() method on the RDD. There are different persistence levels for
storing the RDDs on memory or disk or both with different levels of replication. The persistence
levels available in Spark are:
MEMORY_ONLY: This is the default persistence level and is used for storing the RDDs as the
deserialized version of Java objects on the JVM. In case the RDDs are huge and do not fit in the
memory, then the partitions are not cached and they will be recomputed as and when needed.
MEMORY_AND_DISK: The RDDs are stored again as deserialized Java objects on JVM. In case the
memory is insufficient, then partitions not fitting on the memory will be stored on disk and the data
will be read from the disk as and when needed.
MEMORY_ONLY_SER: The RDD is stored as serialized Java objects, as one byte array per partition.
MEMORY_AND_DISK_SER: This level is similar to MEMORY_ONLY_SER but the difference is that the
partitions not fitting in the memory are saved on the disk to avoid recomputations on the fly.
DISK_ONLY: The RDD partitions are stored only on the disk.
OFF_HEAP: This level is the same as the MEMORY_ONLY_SER but here the data is stored in the off-
heap memory.
The syntax for using persistence levels in the persist() method is:
df.persist(StorageLevel.<level_value>)
MEMORY_AND_DISK_SER Low High Some Some
Number of nodes = 10
Number of cores in each node = 15 cores
RAM of each node = 61GB
Number of Cores = number of concurrent tasks that can be run in parallel by the executor. As a
general rule of thumb, the optimal value is 5.
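The arithmetic behind this kind of sizing exercise can be sketched as follows. This is a minimal sketch, assuming the simplest form of the rule of thumb: every core on a node is handed to executors and the node's RAM is divided evenly among them. The function name size_executors is hypothetical, and real deployments usually also reserve cores and memory for the operating system, the cluster manager, and executor memory overhead, which this sketch ignores.

```python
def size_executors(nodes, cores_per_node, ram_gb_per_node, cores_per_executor=5):
    """Naive executor sizing: nothing is reserved for the OS or for memory overhead."""
    # How many executors of the chosen width fit on one node
    executors_per_node = cores_per_node // cores_per_executor
    # Executors available to one Spark job across the whole cluster
    total_executors = nodes * executors_per_node
    # RAM is split evenly between the executors on a node
    ram_gb_per_executor = ram_gb_per_node / executors_per_node
    return {
        "executors_per_node": executors_per_node,
        "total_executors": total_executors,
        "ram_gb_per_executor": ram_gb_per_executor,
    }


# Figures from the example above: 10 nodes, 15 cores and 61GB of RAM per node
plan = size_executors(nodes=10, cores_per_node=15, ram_gb_per_node=61)
print(plan)
```

In practice the memory figure would be rounded down further before being passed to spark.executor.memory, since part of each executor's share goes to overhead.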
Broadcast variables let the developers maintain read-only variables cached on each machine
instead of shipping a copy of them with every task. They are used to give every node a copy of a
large input dataset efficiently. These variables are broadcast to the nodes using different
algorithms to reduce the cost of communication.
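The idea can be illustrated without a cluster. The following is a plain-Python sketch, not Spark code: the small lookup table stands in for a broadcast variable, the nested lists stand in for partitions of a large dataset, and all names and data are hypothetical. Each partition joins against the shared read-only table locally, so no rows of the large dataset have to be shuffled.

```python
# Small, read-only lookup table; in Spark this is what would be broadcast once per node
country_names = {"IN": "India", "US": "United States"}

# Large dataset, already split into partitions that live on different workers
partitions = [
    [("alice", "IN"), ("bob", "US")],
    [("carol", "IN")],
]


def join_partition(rows, lookup):
    """Map-side join: enrich each row using the local copy of the lookup table."""
    return [(name, lookup.get(code, "unknown")) for name, code in rows]


# Every partition is processed independently against the same lookup table
joined = [row for part in partitions for row in join_partition(part, country_names)]
print(joined)
```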
29. Differentiate between Spark Datasets, Dataframes and RDDs.
Criteria | Spark Datasets | Spark Dataframes | Spark RDDs
Schema Projection | Datasets find out schema automatically using SQL Engine. | Dataframes also find the schema automatically. | Schema needs to be defined manually in RDDs.
30. Can Apache Spark be used along with Hadoop? If yes, then how?
Yes! The main feature of Spark is its compatibility with Hadoop. This makes it a powerful
framework as using the combination of these two helps to leverage the processing capacity of
Spark by making use of the best of Hadoop’s YARN and HDFS features.
Hadoop can be integrated with Spark in the following ways:
HDFS: Spark can be configured to run atop HDFS to leverage the feature of distributed replicated
storage.
MapReduce: Spark can also be configured to run alongside the MapReduce in the same or different
processing framework or Hadoop cluster. Spark and MapReduce can be used together to perform
real-time and batch processing respectively.
YARN: Spark applications can be configured to run on YARN which acts as the cluster management
framework.
31. What are Sparse Vectors? How are they different from dense vectors?
Sparse vectors consist of two parallel arrays where one array is for storing indices and the other for
storing values. These vectors are used to store non-zero values for saving space.
In the above example, we have the vector of size 5, but the non-zero values are there only at indices
0 and 4.
Sparse vectors are particularly useful when there are very few non-zero values. If there are cases that
have only a few zero values, then it is recommended to use dense vectors as usage of sparse vectors
would introduce the overhead of indices which could impact the performance.
Dense vectors can be defined as follows:
Usage of sparse or dense vectors does not impact the results of calculations but when used
inappropriately, they impact the memory consumed and the speed of calculation.
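The two layouts can be sketched in plain Python. This is an illustrative sketch, not the Spark MLlib API: the SparseVector class below is hypothetical, and the values 1.0 and 2.0 are assumed, since the text only states that a vector of size 5 has non-zero entries at indices 0 and 4.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SparseVector:
    """Two parallel arrays: positions of the non-zero entries and their values."""
    size: int
    indices: List[int]
    values: List[float]

    def to_dense(self) -> List[float]:
        # A dense vector stores every position, zeros included
        dense = [0.0] * self.size
        for index, value in zip(self.indices, self.values):
            dense[index] = value
        return dense

    def stored_numbers(self) -> int:
        # One index plus one value is kept per non-zero entry
        return len(self.indices) + len(self.values)


sparse = SparseVector(size=5, indices=[0, 4], values=[1.0, 2.0])
dense = sparse.to_dense()
print(dense, sparse.stored_numbers(), len(dense))
```

The index overhead is visible here: with two non-zero entries the sparse form stores four numbers against five for the dense form, so the saving only becomes significant when the vector is much longer than its count of non-zero values.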
32. How are automatic clean-ups triggered in Spark for handling the accumulated metadata?
Spark Streaming involves the division of data stream’s data into batches of X seconds called
DStreams. These DStreams let the developers cache the data into the memory which can be very
useful in case the data of DStream is used for multiple computations. The caching of data can be
done using the cache() method or using persist() method by using appropriate persistence levels.
The default persistence level value for input streams receiving data over the networks such as
Kafka, Flume, etc is set to achieve data replication on 2 nodes to accomplish fault tolerance.
Cost efficiency: Since Spark computations are expensive, caching lets data be reused, and this
reuse of computations saves the cost of operations.
Time-efficient: Reusing computations saves a lot of time.
More Jobs Achieved: By saving computation time, the worker nodes can
perform/execute more jobs.
Apache Spark provides the pipe() method on RDDs, which gives the opportunity to compose
different parts of jobs that can be written in any language, as per the UNIX Standard
Streams. Using the pipe() method, an RDD transformation can be written that reads each
element of the RDD as a String. These can be manipulated as required and the results
are returned as Strings.
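The mechanism behind pipe() can be sketched in plain Python with the standard subprocess module. This is not Spark code: pipe_partition and UPPERCASE_CMD are hypothetical names, and a Python one-liner is assumed as the external program only so that the sketch runs anywhere Python does. Each element is written as one line to the program's standard input, and each line of its standard output becomes an element of the result.

```python
import subprocess
import sys


def pipe_partition(elements, command):
    """Feed elements to an external command line by line and collect its output lines."""
    stdin_text = "".join(element + "\n" for element in elements)
    result = subprocess.run(
        command,
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,  # raise if the external program exits with an error
    )
    return result.stdout.splitlines()


# Stand-in for an external script in any language: upper-cases every input line
UPPERCASE_CMD = [
    sys.executable,
    "-c",
    "import sys\nfor line in sys.stdin:\n    print(line.strip().upper())",
]

print(pipe_partition(["spark", "rdd"], UPPERCASE_CMD))
```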
35. What API is used for Graph Implementation in Spark?
Spark provides a powerful API called GraphX that extends Spark RDD for supporting graphs and
graph-based computations. The extension of Spark RDD is called the Resilient Distributed
Property Graph, which is a directed multi-graph that can have multiple parallel edges. Each edge
and vertex has associated user-defined properties. The presence of parallel edges indicates multiple
relationships between the same set of vertices. GraphX has a set of operators such as subgraph,
mapReduceTriplets, joinVertices, etc. that can support graph computation. It also includes a large
collection of graph builders and algorithms for simplifying tasks related to graph analytics.
Spark provides a very robust, scalable machine learning library called MLlib. This library aims
at making common ML algorithms easy to use and scalable, and has features like
classification, clustering, dimensionality reduction, regression, collaborative filtering, etc. More
information about this library can be obtained in detail from Spark’s official documentation site
here: https://spark.apache.org/docs/latest/ml-guide.html
37. Conclusion
In this article, we have seen the most commonly asked Spark interview questions. Apache Spark is
the fastest-growing cluster computational platform that was designed to process big data in a
faster manner along with the compatibility to previously existing big data tools and support to
various libraries. These integrations help to build seamlessly fast and powerful applications with the
power of different computational models. Due to these reasons, Spark has become a hot and
lucrative technology, and knowing Spark will open doors to new, better, and challenging career
opportunities for Software Developers and Data Engineers.
Top 50 Azure Interview Questions You Must Prepare In 2023
In infrastructure as a service, you get the raw hardware from your cloud provider as a service,
i.e. you get a server which you can configure with your own will.
Platform as a Service gives you a platform to publish without giving the access to the
underlying software or OS.
You get software as a service in Azure, i.e. no infrastructure, no platform, simple software that
you can use without purchasing it.
Public Cloud: The infrastructure is owned by your cloud provider and the server that you are
using could be a multi-tenant system.
Hybrid Cloud: When you use Public Cloud and Private Cloud together, it is called Hybrid
Cloud. For example: using your in-house servers for confidential data, and the public cloud for
hosting your company’s public-facing website. This type of setup would be a hybrid cloud.
4. I have some private servers on my premises, also I have distributed some of my workload on
the public cloud, what is this architecture called?
Explanation: This type of architecture would be a hybrid cloud. Why? Because we are using
both the public cloud and on-premises servers, i.e. the private cloud. To make this hybrid
architecture easy to use, wouldn’t it be better if your private and public cloud were all on the
same network (virtually)? This is established by including your public cloud servers in a virtual
private cloud, and connecting that virtual private cloud with your on-premises servers using a
VPN (Virtual Private Network).
A. Application Insights
B. Azure Resource Manager
C. Azure Portal
D. Log Analytics
A. ASP.NET
B. PHP
C. WCF
D. All of the mentioned
Explanation: Microsoft also has released SDKs for both Java and Ruby to allow applications
written in those languages to place calls to the Azure Service Platform API to the AppFabric
Service.
Web Role
Worker Role
VM Role
Web Role – A web role is basically used to deploy a website, using languages supported by the
IIS platform like PHP, .NET, etc. It is configured and customized to run web applications.
Worker Role – A worker role is more like a helper to the Web role. It is used to execute background
processes, unlike the Web Role which is used to deploy the website.
VM Role – The VM role is used by a user to schedule tasks and other Windows services. This role
can be used to customize the machines on which the web and worker roles are running.
9. A _________ role is a virtual machine instance running Microsoft IIS Web server that can accept
and respond to HTTP or HTTPS requests.
A. Web
B. Server
C. Worker
D. Client
Answer: A. Web
Explanation: The answer should be Web Roles, there are no roles such as Server or Client
roles. Also, Worker roles can only communicate with Azure Storage or through direct
connections to clients.
10. Is it possible to create a Virtual Machine using Azure Resource Manager in a Virtual Network
that was created using classic deployment?
Explanation: This is not supported. You cannot use Azure Resource Manager to deploy a
virtual machine into a virtual network that was created using classic deployment.
20. What happens when you exhaust the maximum failed attempts for authenticating yourself via Azure AD?
Explanation: We use a more sophisticated strategy to lock accounts. This is based on the IP
address of the request and the passwords entered. The duration of the lockout also increases
based on the likelihood that it is an attack.
21. Where can I find a list of applications that are pre-integrated with Azure AD and their capabilities?
Explanation: Azure AD has around 2600 pre-integrated applications. All pre-integrated
applications support single sign-on (SSO). SSO lets you use your organizational credentials to
access your apps. Some of the applications also support automated provisioning and de-
provisioning.
22. How can I use applications with Azure AD that I’m using on-premises?
Explanation: Azure AD gives you an easy and secure way to connect to the web applications
you choose. You can access these applications in the same way you access your SaaS apps in
Azure AD, with no need for a VPN or for changes to your network infrastructure.
25. What are the differences between Subscription Administrator and Directory Administrator?
Explanation: By default, one is assigned the Subscription Administrator role when he/she
signs up for Azure. A subscription admin can use either a Microsoft account or a work or school
account from the directory that the Azure subscription is associated with. This role is
authorized to manage services in the Azure portal. If others need to sign in and access services
by using the same subscription, you can add them as co-admins.
Azure AD has a different set of admin roles to manage the directory and identity-related
features. These admins will have access to various features in the Azure portal or the Azure
classic portal. The admin’s role determines what they can do, like create or edit users, assign
administrative roles to others, reset user passwords, manage user licenses, or manage
domains.
26. Are there any scale limitations for customers using managed disks?
Explanation: Managed Disks eliminates the limits associated with storage accounts. However,
the number of managed disks per subscription is limited to 2000 by default.
27. What is the difference between Service Bus Queues and Storage Queues?
Explanation: The Azure Storage Queue is simple and the developer experience is quite good. It
uses the local Azure Storage Emulator and debugging is made quite easy. The tooling for Azure
Storage Queues allows you to easily peek at the top 32 messages, and if the messages are in
XML or JSON, you’re able to visualize their contents directly from Visual Studio. Furthermore,
these queues can be purged of their contents, which is especially useful during development
and QA efforts.
The Azure Service Bus Queues are evolved and surrounded by many useful mechanisms that
make them enterprise-worthy! They are built into the Service Bus and are able to forward
messages to other Queues and Topics. They have a built-in dead-letter queue and messages
have a time to live that you control, hence messages don’t automatically disappear after 7 days.
Furthermore, Azure Service Bus Queues have the ability to delete themselves after a
configurable amount of idle time. This feature is very practical when you create Queues for
each user, because if a user hasn’t interacted with a Queue for the past month, it automatically
gets cleaned up. It’s also a great way to drive costs down. You shouldn’t have to pay for storage
that you don’t need. These Queues are limited to a maximum of 80 GB. Once you’ve reached
this limit your application will start receiving exceptions.
29. Why doesn’t Azure Redis Cache have an MSDN class library reference like some of the other Azure services?
Explanation: Microsoft Azure Redis Cache is based on the popular open source Redis Cache
and can be accessed by a wide variety of Redis clients for many programming languages. Each
client has its own API that makes calls to the Redis cache instance using Redis commands.
Because each client is different, there is not one centralized class reference on MSDN, and each
client maintains its own reference documentation. In addition to the reference documentation,
there are several tutorials showing how to get started with Azure Redis Cache using different
languages and cache clients. To access these tutorials, see How to use Azure Redis Cache and
click the desired language from the language switcher at the top of the article.
Azure Managed Disks are the new and recommended disk storage offerings for use with Azure
Virtual Machines for persistent storage of data. You can use multiple Managed Disks with each
Virtual Machine. Managed Disks offer two types of durable storage options: Premium and
Standard Managed Disks.
Azure storage accounts can also provide storage for the operating system disk and any data
disks. Each disk is a .vhd file stored as a page blob.
37. How to create a new storage account and container using PowerShell?
$storageName = "st" + (Get-Random)
New-AzureRmStorageAccount -ResourceGroupName "myResourceGroup" -AccountName $storageName `
    -Location "West US" -SkuName "Standard_LRS" -Kind Storage
$accountKey = (Get-AzureRmStorageAccountKey -ResourceGroupName myResourceGroup `
    -Name $storageName).Value[0]
$context = New-AzureStorageContext -StorageAccountName $storageName `
    -StorageAccountKey $accountKey
New-AzureStorageContainer -Name "templates" -Context $context -Permission Container
38. How can one create a VM in Azure CLI?
az vm create `
    --resource-group myResourceGroup `
    --name myVM `
    --image win2016datacenter `
    --admin-username azureuser `
    --admin-password myPassword12
Client-side causes
o The client application was redeployed.
o The client application performed a scaling operation.
o In the case of Cloud Services or Web Apps, this may be due to auto-scaling.
o The networking layer on the client side changed.
o Transient errors occurred in the client or in the network nodes between the client and the
server.
o The bandwidth threshold limits were reached.
o CPU bound operations took too long to complete.
Server-side causes
o On the standard cache offering, the Azure Redis Cache service initiated a fail-over from
the primary node to the secondary node.
o Azure was patching the instance where the cache was deployed
o This can be for Redis server updates or general VM maintenance.
44. My web app still uses an old Docker container image after I’ve updated the image on Docker Hub. Does
Azure support continuous integration/deployment of custom containers?
Explanation: Yes, it does. For private registries, you can update the container by stopping and
then re-starting your web app. Alternatively, you can also change or add a dummy application
setting to force an update of your container.
45. What are the expected values for the Startup File section when I configure the runtime stack?
Explanation: For Node.js, you specify the PM2 configuration file or your script file. For .NET
Core, specify your compiled DLL name. For Ruby, you can specify the Ruby script that you want
to initialize your app with.
Pricing will vary based on product types. ISV software charges and Azure infrastructure costs
are charged separately through your Azure subscription. Pricing models include:
BYOL Model: Bring-your-own-license. You obtain outside of the Azure Marketplace, the right to
access or use the offering and are not charged Azure Marketplace fees for use of the offering in
the Azure Marketplace.
Free: Free SKU. Customers are not charged Azure Marketplace fees for use of the offering.
Free Software Trial: Full-featured version of the offer that is promotionally free for a limited
period of time. You will not be charged Azure Marketplace fees for use of the offering during a
trial period. Upon expiration of the trial period, customers will automatically be charged based
on standard rates for use of the offering.
Usage-Based: You are charged or billed based on the extent of your use of the offering. For
Virtual Machines Images, you are charged an hourly Azure Marketplace fee. For Data Services,
Developer services, and APIs, you are charged per unit of measurement as defined by the
offering.
Monthly Fee: You are charged or billed a fixed monthly fee for a subscription to the offering
(from the date of subscription start for that particular plan). The monthly fee is not prorated for
mid-month cancellations or unused services.
47. What is the difference between “price,” “software price,” and “total price” in the cost structure for Virtual
Machine offers in the Azure Marketplace?
Explanation: “Price” refers to the cost of the Azure Virtual Machine to run the software.
“Software price” refers to the cost of the publisher software running on an Azure Virtual
Machine. “Total price” refers to the combined total cost of the Azure Virtual Machine and the
publisher software running on an Azure Virtual Machine.
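The relationship between the three figures is a simple sum, sketched below. The hourly rates are hypothetical values chosen only for illustration, not actual Azure Marketplace prices, and total_price is a hypothetical helper name. Decimal is used so that currency amounts add up exactly.

```python
from decimal import Decimal


def total_price(vm_price, software_price):
    """Total price = cost of the Azure Virtual Machine + cost of the publisher software."""
    return vm_price + software_price


# Hypothetical hourly rates, for illustration only
vm_price_per_hour = Decimal("0.20")        # "price"
software_price_per_hour = Decimal("0.05")  # "software price"

hourly_total = total_price(vm_price_per_hour, software_price_per_hour)
print(hourly_total)
```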
2. What is a Cloud?
A cloud is a collaboration of networks, hardware, services, storage, and interfaces that help in
delivering computing as a service. It has three users:
1. End users
2. Business management users
3. Cloud service providers
It is an advanced-stage technology implemented so that the cloud provides its services globally
as per the user requirements. It provides a method to access several servers worldwide.
Reliable
Scalable
Agile
Location Independent
Multi-tenant
Cloud Controller
Storage Services
Object
NoSQL
Relational
Block storage
Simple-to-scale applications
Easier recovery from failure
FaaS provides users with a fully functional platform where they can create, manage and run
their applications without having to worry about maintaining the infrastructure.
Cloud
SQL Azure
App Fabric: Allows fabric cloud
People and teams who use different types of cloud services, within your organization.
12. Who are the direct consumers in a cloud ecosystem?
The individuals who utilize the service provided by your company, built within a cloud
environment.
Cloud service providers are the companies that sell their cloud services to others. Sometimes
these companies also provide cloud services internally to their partners, employees, etc.
Azure Agent
Two virtual machines are in a single fault domain if a single piece of hardware can bring
down both virtual machines.
Azure automatically distributes instances of a role across fault domains.
Use of Upgrade Domains:
When a new version of the software is rolled out, only one upgrade domain is upgraded
at a time.
It ensures that any instance of the service is always available.
There is an availability of the applications in multiple instances.
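The effect of upgrade domains can be sketched in plain Python. This is an illustrative model, not Azure's actual placement logic: the round-robin assignment, the instance and domain counts, and the function names assign_domains and rolling_upgrade are all assumptions. The point it shows is that when domains are upgraded one at a time, instances in the other domains keep serving.

```python
def assign_domains(instance_count, domain_count):
    """Spread instance ids across domain ids in round-robin order (assumed policy)."""
    domains = {domain: [] for domain in range(domain_count)}
    for instance in range(instance_count):
        domains[instance % domain_count].append(instance)
    return domains


def rolling_upgrade(domains):
    """Yield, per step, the domain being upgraded and the instances still serving."""
    all_instances = [i for members in domains.values() for i in members]
    for domain, members in domains.items():
        still_serving = [i for i in all_instances if i not in members]
        yield domain, members, still_serving


domains = assign_domains(instance_count=4, domain_count=2)
for domain, upgrading, serving in rolling_upgrade(domains):
    print(f"upgrading domain {domain}: {upgrading}, still serving: {serving}")
```

The same model applies to fault domains: a hardware failure takes out the members of one domain, and the instances placed in the others remain available.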
Cloud Computing Architecture brings together two components of cloud computing – the front-
end and the back-end. It is important to bring the correct services together for the benefit of
both internal and external people. If need be, cloud management should be able to quickly
make the required changes.
Files
Blocks
Datasets
Objects
These components allow you to create apps without the stress of managing the infrastructure.
Advantages | Disadvantages
Cost-effective | Can cause late responses
Increases productivity | Not ideal for high-computing operations
Scalable | More vulnerable when it comes to security
No server management | Debugging is challenging
20. Give the best example of the open-source Cloud Computing.
OpenStack
Microservices help create apps that consist of codes that are independent of one another and
the platform they were developed on. Microservices are important in the cloud because of the
following reasons:
Each of them is built for a particular purpose. This makes app development simpler.
They make changes easier and quicker.
Their scalability makes it easier to adapt the service as needed.
AMI is Amazon Machine Image, which basically is a copy of your root file system. It feeds the
information required to launch an instance.
It includes launch permissions that decide which AWS accounts can use the AMI for launching
instances. It also needs a block device mapping that specifies the volumes to attach
to the instances whenever they are launched.
Cloud Bursting:
Access capacity and specialized software are available in the public cloud and not in the private
cloud.
Examples: Virtual Amazon and Dynamo
vCloud:
It is a VMware cloud.
It is expensive.
It gives enterprise quality.
OpenStack:
24. List the platforms that are used for large-scale Cloud Computing.
The platforms that are used for large-scale Cloud Computing are:
Apache Hadoop
MapReduce
Private Cloud
Public Cloud
Community Cloud
Hybrid Cloud
The full form of ‘Eucalyptus’ is ‘Elastic Utility Computing Architecture for Linking Your Programs
to Useful Systems’.
Google Bigtable
Amazon Simple Database
Cloud-based SQL (Structured Query Language)
Edge computing is a part of the distributed computing structure. It brings computation and
applications closer to the sources of data. This benefits businesses by giving them better
insights, good response time and better bandwidth.
APIs (Application Programming Interfaces) are used to eliminate the necessity to write
complete programs.
They provide the instructions that let two or more applications communicate with each other.
They make it easier to create applications and to link cloud services with other
systems.
Professional cloud
Personal cloud
Performance cloud
In cloud, the hardware requirement is fulfilled as per the demand created for cloud
architecture.
Cloud architecture is capable of scaling up resources when there is a demand.
Cloud architecture is capable of managing and handling dynamic workloads without any
point of failure.
Reference architecture
Technical architecture
Deployment operation architecture
37. Explain AWS.
AWS stands for Amazon Web Services which is a collection of remote computing services also
known as Cloud Computing. This technology is also known as IaaS or Infrastructure as a
Service.
AWS Route 53: AWS Route 53 is a DNS (Domain Name System) web service.
Simple E-mail Service: Sending of e-mail is done by using a RESTful API call or via regular
SMTP (Simple Mail Transfer Protocol).
Identity and Access Management: Improved security and identity management are
provided for an AWS account.
Simple Storage Service (S3): It is a huge storage medium, widely used for AWS services.
Elastic Compute Cloud (EC2): It allows on-demand computing resources for hosting
applications and is especially useful for unpredictable workloads.
Elastic Block Store (EBS): These are storage volumes attached to EC2 that allow data to
persist beyond the lifespan of a single EC2 instance.
CloudWatch: Amazon CloudWatch is used to monitor AWS resources, and it allows
administrators to view and collect key metrics. One can also set a
notification alarm in case of trouble.
39. Explain how you can vertically scale an Amazon instance.
This is one of the essential features of AWS and cloud virtualization. We spin up a new,
larger instance, stop it, then detach its root EBS volume and discard it.
Next, we stop the live instance and detach its root volume.
We note down the unique device ID, attach that root volume to the new server,
and start it again. This results in a vertically scaled Amazon instance.
In Amazon, backups of EBS volumes are maintained by using the snapshot
facility via an API call or via a GUI like Elasticfox.
Performance is improved by using Linux software RAID and striping across four volumes.
Resource Replication creates duplicates of the same resource. Replication is employed when a
resource is needed more and more. The resource is virtualized to replicate cloud-based
resources.
CaaS is a system that allows developers to run, scale, manage, upload, and organize containers
by using virtualization.
A container is a software pack. It allows teams to scale their apps to highly available cloud
infrastructures.
AWS Basic Interview Questions
1. What is AWS?
AWS (Amazon Web Services) is a platform to provide secure cloud services, database
storage, offerings for computing power, content delivery, and other services to help the
business scale and develop.
An Elastic Load Balancer ensures that the incoming traffic is distributed optimally across
various AWS instances. A buffer synchronizes different components and makes the
arrangement more elastic to a burst of load or traffic. Without it, components tend to
receive and process requests at uneven rates. The buffer balances the
components so that they work at the same rate and deliver
faster service.
Both Spot Instance and On-demand Instance are models for pricing.
Spot Instance | On-demand Instance
With Spot Instance, customers can purchase compute capacity with no upfront commitment at all. | With On-demand Instance, users can launch instances at any time based on the demand.
Spot Instances are spare Amazon instances that you can bid for. | On-demand Instances are suitable for the high-availability needs of applications.
When the bidding price exceeds the spot price, the instance is automatically launched, and the spot price fluctuates based on supply and demand for instances. | On-demand Instances are launched by users only with the pay-as-you-go model.
When the bidding price is less than the spot price, the instance is immediately taken away by Amazon. | On-demand Instances will remain persistent without any automatic termination from Amazon.
Spot Instances are charged on an hourly basis. | On-demand Instances are charged on a per-second basis.
Creating subnets means dividing a large network into smaller ones. These subnets can be
created for several reasons. For example, creating and using subnets can help reduce
congestion by making sure that the traffic destined for a subnet stays in that subnet. This helps
in efficiently routing the traffic, which reduces the network’s load.
Yes, it is possible by using the multipart upload utility from AWS. With the multipart upload
utility, larger files can be uploaded in multiple parts that are uploaded independently. You can
also decrease upload time by uploading these parts in parallel. After the upload is done, the
parts will be merged into a single object or file to create the original file from which the parts
were created.
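The split, parallel upload, and merge steps can be sketched in plain Python. This is only a local simulation of the idea, not the S3 API: `upload_part` is a hypothetical stand-in for a network call, and the 5-byte part size is chosen purely to keep the demo small.

```python
from concurrent.futures import ThreadPoolExecutor

PART_SIZE = 5  # bytes per part; tiny, hypothetical value for the demo


def split_into_parts(data, part_size=PART_SIZE):
    """Cut data into numbered parts so each one can be sent independently."""
    return [
        (number, data[offset:offset + part_size])
        for number, offset in enumerate(range(0, len(data), part_size), start=1)
    ]


def upload_part(part):
    """Hypothetical stand-in for a network call that would send one part."""
    return part


def multipart_upload(data):
    parts = split_into_parts(data)
    # Parts are independent, so they can be sent in parallel.
    with ThreadPoolExecutor(max_workers=4) as pool:
        uploaded = list(pool.map(upload_part, parts))
    # Completion step: merge the parts by part number to rebuild the object.
    return b"".join(chunk for _, chunk in sorted(uploaded))


original = b"a large file uploaded in several parts"
assembled = multipart_upload(original)
```

Sorting by part number before joining is what makes the merge safe even if the parts finish out of order.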
7. What is the maximum number of S3 buckets you can create?
9. When should you use the classic load balancer and the
application load balancer?
The classic load balancer is used for simple load balancing of traffic across multiple EC2
instances.
The application load balancer is used for more intelligent load balancing, based on the
multi-tier architecture or container-based architecture of the application. Application load
balancing is mostly used when there is a need to route traffic to multiple services.
10. How many total VPCs per account/region and subnets per
VPC can you have?
By default, we can have 5 VPCs per region in each account and 200 subnets per VPC.
These are default quotas that can be raised on request.
A hybrid cloud. The hybrid cloud architecture is where an organization can use the public cloud
for shared resources and the private cloud for confidential workloads.
12. Which one of the storage solutions offered by AWS would you
use if you need extremely low pricing and data archiving?
AWS Glacier is an extremely low-cost storage service offered by Amazon that is used for data
archiving and backup purposes. The longer you store data in Glacier, the less it will cost you.
Auto-scaling groups
EBS-backed instances. EBS-backed instances use EBS volume as their root volume. EBS volume
consists of virtual drives that can be easily backed up and duplicated by snapshots.
The biggest advantage of EBS-backed volumes is that the data can be configured to be stored
for later retrieval even if the virtual machine or the instances are shut down.
15. How will you configure an Amazon S3 bucket to serve static
assets for your public web application?
By configuring the bucket policy to provide public read access to all objects.
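As a rough sketch, a public-read bucket policy can be built as a JSON document like the one below. The bucket name `example-static-assets` and the `Sid` label are hypothetical, and the bucket's Block Public Access settings would also need to permit public policies for this to take effect.

```python
import json

BUCKET_NAME = "example-static-assets"  # hypothetical bucket name

# Allow anyone to read (but not list, write, or delete) objects in the bucket.
public_read_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",  # free-form label, chosen for the demo
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{BUCKET_NAME}/*",
        }
    ],
}

policy_json = json.dumps(public_read_policy, indent=2)
print(policy_json)
```

Granting only `s3:GetObject` keeps the policy limited to reads of individual objects.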
That is all we have in our section on basic Amazon Web Services interview questions section.
Let’s move on to the next section on AWS interview questions for experienced professionals.
No, standby instances are automatically launched in different availability zones than the
primary, making them physically independent infrastructures. This is because the whole
purpose of standby instances is to prevent infrastructure failure. So, in case the primary goes
down, the standby instance will help recover all of the data.
19. What is the name of Amazon's Content Delivery Network?
Amazon CloudFront
20. Which Amazon solution will you use if you want to accelerate
moving petabytes of data in and out of AWS, using storage
devices that are designed to be secure for data transfer?
Amazon Snowball. AWS Snowball is the data transport solution for large amounts of data that
need to be moved into and out of AWS using physical storage devices.
21. If you are running your DB instance as Multi-AZ deployment,
can you use standby DB instances along with your primary DB
instance?
No, the standby DB instance cannot be used along with the primary DB instances since the
standby DB instances are supposed to be used only if the primary instance goes down.
DynamoDB will be the right choice here since it is designed to be highly scalable, more than
RDS or any other relational database service.
23. You accidentally stopped an EC2 instance in a VPC with an
associated Elastic IP. If you start the instance again, what will be
the result?
Elastic IP will only be disassociated from the instance if it’s terminated. If it’s stopped and
started, there won’t be any change to the instance, and no data will be lost.
It is possible using AWS IAM groups, by adding users in the groups as per their roles and by
simply applying the policy to the groups.
A bigger RDS instance type needs to be opted for handling large amounts of traffic, creating
manual or automated snapshots to recover data in case the RDS instance goes down.
It can be done by creating an autoscaling group to deploy more instances when the CPU
utilization approaches 100 percent and distributing traffic among instances by creating a load
balancer and registering the Amazon EC2 instances with it.
AWS CloudTrail can be used in this case as it is designed for logging and tracking API calls, and
it has also been made available for storage solutions.
The data and the key should be in the same region. That is, the data that has to be encrypted
should be in the same region as the one in which the key was created. In this case, the data is
in the Oregon region, whereas the key was created in the North Virginia region.
Application Load Balancer: It supports path-based routing of the traffic and hence helps in
enhancing the performance of the application structured as smaller services.
Using an application load balancer, the traffic can be routed based on the requests made. In
this case scenario, the traffic where requests are made for rendering images can be directed to
the servers only deployed for rendering images, and the traffic where requests are made for
computing can be directed to the servers deployed only for general computing purposes.
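Path-based routing can be illustrated with a small lookup. This is a toy model of the idea rather than the load balancer's actual configuration; the path prefixes and target group names are hypothetical.

```python
# Each rule maps a URL path prefix to a (hypothetical) target group.
ROUTING_RULES = [
    ("/images/", "image-rendering-servers"),
    ("/compute/", "general-computing-servers"),
]
DEFAULT_TARGET = "general-computing-servers"


def route(path):
    """Return the target group for a request path; first matching rule wins."""
    for prefix, target in ROUTING_RULES:
        if path.startswith(prefix):
            return target
    return DEFAULT_TARGET


print(route("/images/banner.png"))
print(route("/compute/report"))
```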
Elastic IP
Private IP
Public IP
Internet Gateway
Private IP. Private IP is automatically assigned to the instance as soon as it is launched. While
elastic IP has to be set manually, Public IP needs an Internet Gateway which again has to be
created since it’s a new VPC.
33. Your organization has four instances for production and
another four for testing. You are asked to set up a group of IAM
users that can only access the four production instances and not
the other four testing instances. How will you achieve this?
We can achieve this by defining tags on the test and production instances and then adding a
condition to the IAM policy that allows access to specific tags.
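A minimal sketch of such a policy follows, assuming the instances carry a tag with the hypothetical key `Environment` and the value `production` or `test`. The `condition_matches` helper is only a toy illustration of how the condition narrows access; the real evaluation is done by IAM.

```python
import json

production_only_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["ec2:StartInstances", "ec2:StopInstances", "ec2:RebootInstances"],
            "Resource": "arn:aws:ec2:*:*:instance/*",
            # Only instances tagged Environment=production match this statement.
            "Condition": {
                "StringEquals": {"ec2:ResourceTag/Environment": "production"}
            },
        }
    ],
}


def condition_matches(policy, instance_tags):
    """Toy check of the StringEquals condition against an instance's tags."""
    condition = policy["Statement"][0]["Condition"]["StringEquals"]
    return all(
        instance_tags.get(key.split("/", 1)[1]) == value
        for key, value in condition.items()
    )


print(json.dumps(production_only_policy, indent=2))
print(condition_matches(production_only_policy, {"Environment": "production"}))
print(condition_matches(production_only_policy, {"Environment": "test"}))
```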
34. Your organization wants to monitor the read and write IOPS
for its AWS MySQL RDS instance and then send real-time alerts
to its internal operations team. Which service offered by
Amazon can help your organization achieve this scenario?
Amazon CloudWatch would help us achieve this. Since Amazon CloudWatch is a monitoring
tool offered by Amazon, it’s the right service to use in the above-mentioned scenario.
Enabling CloudTrail for your load balancer. AWS CloudTrail is an inexpensive log monitoring
solution provided by Amazon. It can provide logging information for load balancers or any
other AWS resources. The provided information can be further used for analysis.
Database servers should be ideally launched on private subnets. Private subnets are ideal for
the backend services and databases of all applications since they are not meant to be accessed
by the users of the applications, and private subnets are not routable from the Internet.
38. Can you change the instance type of the instances that are
running in your application tier and are also using autoscaling?
If yes, then how? (Choose one of the following)
Yes, the instance type of such instances can be changed by modifying the autoscaling launch
configuration. The tags configuration is used to add metadata to the instances.
39. Can you name the additional network interface that can be
created and attached to your Amazon EC2 instance launched in
your VPC?
Elastic Network Interface (ENI). Amazon Direct Connect, by contrast, is an AWS networking
service that acts as an alternative to using the Internet to connect on-premise sites with AWS.
We can deploy ElastiCache in-memory cache running in every availability zone. This will help in
creating a cached version of the website for faster access in each availability zone. We can also
add an RDS MySQL read replica in each availability zone that can help with efficient and better
performance for read operations. So, there will not be any increased workload on the RDS
MySQL instance, hence resolving the contention issue.
43. Your company wants you to propose a solution so that the
company’s data center can be connected to the Amazon cloud
network. What would your proposal be?
The data center can be connected to the Amazon cloud network by establishing a virtual
private network (VPN) between the VPC and the data center. A virtual private network lets you
establish a secure pathway or tunnel from your premise or device to the AWS global network.
RDS
Redshift
ElastiCache
DynamoDB
Amazon RDS
45. You want to modify the security group rules while it is being
used by multiple EC2 instances. Will you be able to do that? If
yes, will the new rules be implemented on all previously running
EC2 instances that were using that security group?
Yes, the security group that is being used by multiple EC2 instances can be modified. The
changes will be implemented immediately and applied to all the previously running EC2
instances without restarting the instances.
46. Which one of the following is a structured data store that
supports indexing and data queries to both EC2 and S3?
DynamoDB
MySQL
Aurora
SimpleDB
SimpleDB
47. Which service offered by Amazon will you choose if you want
to collect and process e-commerce data for near real-time
analysis? (Choose any two)
DynamoDB
Redshift
Aurora
SimpleDB
DynamoDB. DynamoDB is a fully managed NoSQL database service that can be fed any type of
unstructured data. Hence, DynamoDB is the best choice for collecting data from e-commerce
websites. For near-real-time analysis, we can use Amazon Redshift.
48. If in CloudFront the content is not present at an edge
location, what will happen when a request is made for that
content?
CloudFront will deliver the content directly from the origin server. It will also store the content
in the cache of the edge location where the content was missing.
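The cache-miss behaviour can be modelled in a few lines. This is a simplified sketch, not CloudFront itself: the `EdgeLocation` class and the origin content are hypothetical, and real edge caches also handle expiry and invalidation.

```python
ORIGIN = {"/index.html": "<h1>Hello</h1>"}  # hypothetical origin content


class EdgeLocation:
    def __init__(self, origin):
        self.origin = origin
        self.cache = {}
        self.origin_fetches = 0

    def get(self, path):
        if path not in self.cache:
            # Cache miss: fetch from the origin and keep a copy at the edge.
            self.origin_fetches += 1
            self.cache[path] = self.origin[path]
        # Cache hit (or freshly filled): serve from the edge.
        return self.cache[path]


edge = EdgeLocation(ORIGIN)
first = edge.get("/index.html")   # goes to the origin
second = edge.get("/index.html")  # served from the edge cache
print(edge.origin_fetches)
```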
50. Which of the following options will you use if you have to
move data over long distances using the Internet, from
instances that are spread across countries to your Amazon S3
bucket?
Amazon CloudFront
Amazon Transfer Acceleration
Amazon Snowball
Amazon Glacier
Amazon Transfer Acceleration. It speeds up the data transfer by up to 300 percent using optimized
network paths and Amazon Content Delivery Network. Snowball cannot be used here as this
service does not support cross-region data transfer.
51. Which of the following services is a data storage system that
also has a REST API interface and uses secure HMAC-SHA1
authentication keys?
Amazon S3
Amazon S3. It gets various requests from applications, and it has to identify which requests are
to be allowed and which are to be denied. Amazon S3 REST API uses a custom HTTP scheme
based on a keyed HMAC for the authentication of requests.
Launched in 2006, EC2 is a virtual machine that you can use to deploy your own servers in the
cloud, giving you OS-level control. It helps you have control over the hardware and updates,
similar to the case of on-premise servers. EC2 instances can run Microsoft Windows or Linux
operating systems. They can also host software such as Python, PHP, Apache, and more.
Snowball is an application designed for transferring terabytes of data into and outside of the
AWS cloud. It uses secured physical storage to transfer the data. Snowball is considered a
petabyte-scale data transport solution that helps with cost and time savings.
The Amazon CloudWatch is used for monitoring and managing data and getting actionable
insights for AWS, on-premise applications, etc. It helps you to monitor your entire task stack
that includes the applications, infrastructure, and services. Apart from this, CloudWatch also
assists you in optimizing your resource utilization and cost by providing analytics-driven
insights.
In the AWS cloud, the Elastic Transcoder is used for converting media files into versions that
can be run/played on devices such as Tablets, PCs, Smartphones, etc. It consists of advanced
transcoding features with conversion rates starting from $ 0.0075 per minute.
VPC is the abbreviated form of Virtual Private Cloud. It allows you to launch AWS resources that
can be defined by you and fully customize the network configurations. Through VPC, you can
define and take full control of your virtual network environment. For example- you can have a
private address range, internet gateways, subnets, etc.
Single or multiple Amazon Elastic Block Store (Amazon EBS) snapshots. Basically, templates
for the root volume of the instance.
Launch permissions that let AWS accounts use AMI to launch instances.
A block device mapping to specify what volumes to be attached to the instance during its
launch.
S3 Standard- It is by and large the default storage class. In cases where no specification
about the storage class is provided while uploading the object, Amazon S3 assigns the S3
Standard storage class by default.
Reduced Redundancy- It is assigned when non-critical, reproducible data needs to be stored.
The Reduced Redundancy Storage class is designed in a way that the above data categories
can be stored with less redundancy.
The native AWS security logging capabilities include AWS CloudTrail, AWS Config, AWS detailed
billing reports, Amazon S3 access logs, Elastic Load Balancing access logs, Amazon CloudFront
access logs, Amazon VPC Flow logs, etc.
When connecting to an Amazon EC2 instance, you need to prove your identity. Key pairs are
used to execute this. Basically, a key pair is a set of security credentials that are used during
identity proofing. It consists of a public key and a private key.
61. What are policies and what are the different types of
policies?
Policies define the permissions required to execute an operation, irrespective of the method
used to perform it. AWS supports six types of policies:
Identity-based policies
Resource-based policies
Permissions boundaries
Organizations SCPs
ACLs
Session policies
1- Identity-based policies- These are JSON permissions policy documents that control what
actions an identity can perform, under what conditions, and on which resources. These policies
are further classified into 2 categories:
3- IAM permissions boundaries- They actually refer to the maximum level of permissions that
identity-based policies can grant to the specific entity.
4- Service Control Policies (SCPs)- SCPs are the maximum level of permissions for an
organization or organizational unit.
5- Access Control lists- They define and control which principals in another AWS account can
access the particular resource.
6- Session policies- They are advanced policies that are passed as a parameter when a
temporary session is programmatically created for a role or federated user.
62. What kind of IP address can you use for your customer
gateway (CGW) address?
We can use the Internet routable IP address, which is a public IP address of your NAT device.
63. Which of the following is not an option in security groups?
List of users
Ports
IP addresses
List of protocols
List of users
Create an AMI of the server running in the North Virginia region. Once the AMI is created,
the administrator will need the 12-digit account number of the #2 AWS account. This is
required for copying the AMI which we have created.
Once the AMI is successfully copied into the Mumbai region, you can launch the instance using
copied AMI in the Mumbai region. Once the instance is running and if it’s completely
operational, the server in the North Virginia region could be terminated. This is the best way to
migrate a server to a different account without any hassle.
If the client is able to access the website from his/her end, it means the connection is perfect
and there is no issue with connectivity and the Security Group configuration also seems
correct.
We can check the internal firewall of the Windows 2019 IIS server. If it is blocking ICMP traffic,
we should enable it.
66. A start-up company has a web application based in the us-
east-1 Region with multiple Amazon EC2 instances running
behind an Application Load Balancer across multiple Availability
Zones. As the company's user base grows in the us-west-1
region, the company needs a solution with low latency and
improved high availability. What should a solutions architect do
to achieve it?
Note that the web application is currently in us-east-1, while the user
base is growing in the us-west-1 region. As a first step, provision multiple EC2 instances (web
application servers) and configure an Application Load Balancer in us-west-1. Then create
an accelerator in AWS Global Accelerator that uses endpoint groups including the
load balancer endpoints in both regions.
Configure a Cross-Region Replication rule on the Ohio region bucket and select the destination
bucket in the London region to replicate the data, storing it in the destination using the S3 One
Zone-IA storage class to save cost.
69. You are an AWS Architect in your company, and you are
asked to create a new VPC in the N.Virginia Region with two
Public and two Private subnets using the following CIDR blocks:
Public Subnet
Subnet01 : 10.10.10.0/26
Subnet02 : 10.10.10.64/26
Private Subnet
Subnet03: 10.10.10.128/26
Subnet04: 10.10.10.192/26
Using the above CIDRs you created a new VPC, and you launched EC2 instances in all
subnets as per the need.
Now, you are facing an issue in private instances that you are unable to update
operating systems from the internet. So, what architectural changes and configurations
will you suggest to resolve the issue?
Create a NAT gateway in one of the public subnets, then add a route to the NAT gateway in the
route table associated with the private subnets to provide internet access to private instances.
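The subnet layout and the routing change can be checked with Python's `ipaddress` module. This is a sketch under assumptions: the question lists only the four /26 subnets, so the enclosing VPC range `10.10.10.0/24` is assumed, and the route targets are plain labels rather than real gateway IDs.

```python
import ipaddress

vpc = ipaddress.ip_network("10.10.10.0/24")  # assumed VPC range
subnets = {
    "Subnet01 (public)": ipaddress.ip_network("10.10.10.0/26"),
    "Subnet02 (public)": ipaddress.ip_network("10.10.10.64/26"),
    "Subnet03 (private)": ipaddress.ip_network("10.10.10.128/26"),
    "Subnet04 (private)": ipaddress.ip_network("10.10.10.192/26"),
}
all_inside_vpc = all(net.subnet_of(vpc) for net in subnets.values())

# Route tables as destination CIDR -> target label (labels are hypothetical).
public_route_table = {"10.10.10.0/24": "local", "0.0.0.0/0": "internet-gateway"}
private_route_table = {"10.10.10.0/24": "local", "0.0.0.0/0": "nat-gateway"}


def next_hop(route_table, destination):
    """Pick the most specific route (longest prefix) that contains the address."""
    address = ipaddress.ip_address(destination)
    matches = [
        ipaddress.ip_network(cidr)
        for cidr in route_table
        if address in ipaddress.ip_network(cidr)
    ]
    best = max(matches, key=lambda net: net.prefixlen)
    return route_table[str(best)]


print(all_inside_vpc)
print(next_hop(private_route_table, "8.8.8.8"))
print(next_hop(private_route_table, "10.10.10.70"))
```

Traffic between subnets stays on the local route, while anything else from a private subnet goes out through the NAT gateway.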
EBS-backed instances or instances with EBS Volume. EBS-backed instances use EBS volume as
their root volume. These volumes contain Operating Systems, Applications, and Data. We can
create Snapshots from these volumes or AMI from Snapshots.
The main advantage of EBS-backed volumes is that the data can be configured to be stored for
later retrieval even if the virtual machine or instances are shut down.
It can be done by creating an autoscaling group to deploy more instances when the CPU
utilization of the EC2 instance exceeds 80 percent and distributing traffic among instances by
creating an application load balancer and registering EC2 instances as target instances.
72. In AWS, three different storage services are available, such
as EFS, S3, and EBS. When should I use Amazon EFS vs. Amazon
S3 vs. Amazon Elastic Block Store (EBS)?
Amazon Web Services (AWS) offers cloud storage services to support a wide range of storage
workloads.
Amazon EFS is a file storage service for use with Amazon compute (EC2, containers, and
serverless) and on-premises servers. Amazon EFS provides a file system interface, file system
access semantics (such as strong consistency and file locking), and concurrently accessible
storage for up to thousands of Amazon EC2 instances.
Amazon EBS is a block-level storage service for use with Amazon EC2. Amazon EBS can deliver
performance for workloads that require the lowest latency for access to data from a single EC2
instance.
Amazon S3 is an object storage service. Amazon S3 makes data available through an Internet
API that can be accessed anywhere.
Create an Application Load Balancer with AWS Auto Scaling groups across multiple Availability
Zones. Store data on Amazon EFS and mount a target on each instance.
74. An application running on AWS uses an Amazon Aurora
Multi-AZ deployment for its database. When evaluating
performance metrics, a solutions architect discovered that the
database reads were causing high I/O and adding latency to the
write requests against the database. What should the solution
architect do to separate the read requests from the write
requests?
Create a read replica and modify the application to use the appropriate endpoint.
75. A client reports that they wanted to see an audit log of any
changes made to AWS resources in their account. What can the
client do to achieve this?
76. Usually, you have noticed that one EBS volume can be
connected with one EC2 instance, our company wants to run a
business-critical application on multiple instances in a single
region and needs to store all instances output in single storage
within the VPC. Instead of using EFS, our company is
recommending the use of multi-attach volume with instances.
As an architect, you need to suggest to them what instance type
and EBS volumes they should use.
The instance type should be EC2 Nitro-based instances and Provisioned IOPS io1 multi-attach
EBS volumes.
77. A company is using a VPC peering connection option to
connect its multiple VPCs in a single region to allow for cross
VPC communication. A recent increase in account creation and
VPCs has made it difficult to maintain the VPC peering strategy,
and the company expects to grow to hundreds of VPCs. There
are also new requests to create site-to-site VPNs with some of
the VPCs. A solutions architect has been tasked with creating a
central networking setup for multiple accounts and VPNs. Which
networking solution would you recommend to resolve it?
Configure a transit gateway with AWS Transit Gateway and connect all VPCs and VPNs.
A DBMS allows a user to interact with the database. The data stored in the
database can be modified, retrieved and deleted and can be of any type like
strings, numbers, images, etc.
A JOIN clause is used to combine rows from two or more tables, based on a
related column between them. It is used to merge two tables or retrieve data
from there. There are 4 joins in SQL namely:
Inner Join
Right Join
Left Join
Full Join
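A minimal sketch of the inner and left joins, run against an in-memory SQLite database through Python's `sqlite3` module. The `employees` and `departments` tables and their rows are hypothetical sample data; right and full joins follow the same pattern but are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept_id INTEGER);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Support');
    INSERT INTO employees VALUES (10, 'Asha', 1), (11, 'Ben', NULL);
""")

# Inner join: only employees that have a matching department.
inner_rows = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employees e
    INNER JOIN departments d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()

# Left join: every employee, with NULL where no department matches.
left_rows = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employees e
    LEFT JOIN departments d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()

print(inner_rows)
print(left_rows)
```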
Q6. What is the difference between CHAR and VARCHAR2 datatype in SQL?
Both Char and Varchar2 are character datatypes, but Varchar2 is used for
character strings of variable length whereas Char is used for strings of fixed
length. For example, char(10) always stores exactly 10 characters, padding shorter
strings with trailing blanks, whereas varchar2(10) can store a string of any length
up to 10, e.g. 6, 8 or 2 characters.
Constraints are used to specify the limit on the data type of the table. It can be
specified while creating or altering the table statement. The sample of
constraints are:
NOT NULL
CHECK
DEFAULT
UNIQUE
PRIMARY KEY
FOREIGN KEY
SQL is a standard language which stands for Structured Query Language based
on the English language whereas MySQL is a database management system. SQL
is the core of the relational database which is used for accessing and managing
the database, while MySQL is an RDBMS (Relational Database Management System),
like SQL Server, Informix, etc.
Data Integrity defines accuracy as well as the consistency of the data stored in a
database. It also defines integrity constraints to enforce business rules on the
data when it is entered into an application or a database.
Q13. What is the difference between clustered and non clustered index in SQL?
The differences between the clustered and non clustered index in SQL are :
1. The clustered index is used for easy retrieval of data from the
database and it is faster, whereas reading from a non clustered index
is relatively slower.
2. Clustered index alters the way records are stored in a database as it
sorts out rows by the column which is set to be clustered index
whereas in a non clustered index, it does not alter the way it was
stored but it creates a separate object within a table which points
back to the original table rows after searching.
3. One table can only have one clustered index whereas it can have
many non clustered indexes.
Q14. Write a SQL query to display the current date?
In SQL Server, there is a built-in function called GetDate() which helps to return the
current timestamp/date. In Oracle, SYSDATE serves the same purpose.
There are various types of joins which are used to retrieve data between the
tables. There are four types of joins, namely:
Entities: A person, place, or thing in the real world about which data can be
stored in a database. Tables store data that represents one type of entity. For
example — A bank database has a customer table to store customer information.
Customer table stores this information as a set of attributes (columns within the
table) for each customer.
Unique Index:
This index does not allow the field to have duplicate values if the column is
unique indexed. If a primary key is defined, a unique index can be applied
automatically.
Clustered Index:
This index reorders the physical order of the table and searches on the
basis of key values. Each table can only have one clustered index.
Non-Clustered Index:
Non-Clustered Index does not alter the physical order of the table and maintains
a logical order of the data. Each table can have many nonclustered indexes.
DROP command removes a table and it cannot be rolled back from the database
whereas TRUNCATE command removes all the rows from the table.
Triggers in SQL are a special type of stored procedure defined to
execute automatically in place of or after data modifications. They allow you to
execute a batch of code when an insert, update or any other query is executed
against a specific table.
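A small sketch of an AFTER INSERT trigger, using SQLite through Python's `sqlite3` module. The `orders` and `audit_log` tables and the trigger name are hypothetical, and trigger syntax differs slightly between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE audit_log (order_id INTEGER, note TEXT);

    -- Runs automatically after every insert into orders.
    CREATE TRIGGER log_new_order AFTER INSERT ON orders
    BEGIN
        INSERT INTO audit_log VALUES (NEW.order_id, 'order inserted');
    END;
""")

conn.execute("INSERT INTO orders VALUES (1, 99.5)")
audit_rows = conn.execute("SELECT order_id, note FROM audit_log").fetchall()
print(audit_rows)
```

The audit row appears without any explicit insert into `audit_log`, which is the point of a trigger.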
1. Arithmetic Operators
2. Logical Operators
3. Comparison Operators
Q26. Are NULL values the same as that of zero or a blank space?
A NULL value is not at all the same as zero or a blank space. The NULL value
represents a value which is unavailable, unknown, unassigned or not applicable,
whereas zero is a number and a blank space is a character.
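The difference can be checked with a small sketch. It uses Python's built-in sqlite3 module as a stand-in database (an assumption made purely for illustration; Oracle in particular treats an empty string as NULL, which SQLite does not).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparing NULL with "=" yields NULL (unknown), which Python receives as None.
null_equals_zero = conn.execute("SELECT NULL = 0").fetchone()[0]
null_equals_blank = conn.execute("SELECT NULL = ' '").fetchone()[0]

# IS NULL is the proper test for a missing value.
null_is_null = conn.execute("SELECT NULL IS NULL").fetchone()[0]

# Zero and a blank space are ordinary values and compare normally.
zero_equals_zero = conn.execute("SELECT 0 = 0").fetchone()[0]

print(null_equals_zero, null_equals_blank, null_is_null, zero_equals_zero)
```

This is why queries test for missing values with IS NULL rather than = NULL.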
Q27. What is the difference between cross join and natural join?
The cross join produces the cross product or Cartesian product of two tables
whereas the natural join is based on all the columns having the same name and
data types in both the tables.
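As a rough illustration of the two joins, the sketch below runs both against two tiny tables in SQLite through Python's sqlite3 module. The emp and dept tables and their rows are hypothetical, made up only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp  (dept_id INTEGER, emp_name  TEXT);
    CREATE TABLE dept (dept_id INTEGER, dept_name TEXT);
    INSERT INTO emp  VALUES (1, 'Asha'), (2, 'Ravi'), (3, 'Meena');
    INSERT INTO dept VALUES (1, 'Sales'), (2, 'HR');
""")

# CROSS JOIN pairs every emp row with every dept row: 3 x 2 rows.
cross_rows = conn.execute("SELECT * FROM emp CROSS JOIN dept").fetchall()

# NATURAL JOIN matches on the shared column name (dept_id) and
# returns that column only once.
natural_rows = sorted(conn.execute("SELECT * FROM emp NATURAL JOIN dept").fetchall())

print(len(cross_rows))
print(natural_rows)
```

Meena has no matching department, so she appears in the cross product but not in the natural join.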
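To count the number of records in a table, you can use the below commands. The
first returns every row, so the count is the number of rows fetched; the third
is specific to SQL Server:
SELECT * FROM table1
SELECT COUNT(*) FROM table1
SELECT rows FROM sysindexes WHERE id = OBJECT_ID('table1') AND indid < 2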
Q31. Write a SQL query to find the names of employees that begin with ‘A’?
To display the name of the employees that begin with ‘A’, type in the below
command:
SELECT * FROM Table_name WHERE EmpName like 'A%'
Q32. Write a SQL query to get the third-highest salary of an employee from
employee_table?
SELECT TOP 1 salary
FROM(
SELECT TOP 3 salary
FROM employee_table
ORDER BY salary DESC) AS emp
ORDER BY salary ASC;
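The TOP syntax above is specific to SQL Server. A portable variant, sketched here with SQLite through Python's sqlite3 module, uses DISTINCT with LIMIT and OFFSET. The sample rows are hypothetical, and this version counts tied salaries as a single level, which is an assumption about what "third-highest" should mean.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee_table (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee_table VALUES (?, ?)",
    [("A", 5000), ("B", 7000), ("C", 7000), ("D", 6000), ("E", 4000)],
)

# Distinct salaries in descending order are 7000, 6000, 5000, 4000.
# Skipping two of them leaves the third-highest.
third_highest = conn.execute("""
    SELECT DISTINCT salary
    FROM employee_table
    ORDER BY salary DESC
    LIMIT 1 OFFSET 2
""").fetchone()[0]

print(third_highest)
```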
Group functions work on the set of rows and return one result per group. Some
of the commonly used group functions are AVG, COUNT, MAX, MIN, SUM,
VARIANCE.
Relation or links are between entities that have something to do with each other.
Relationships are defined as the connection between the tables in a database.
There are various relationships, namely:
Q35. How can you insert NULL values in a column while inserting the data?
Q36. What is the main difference between ‘BETWEEN’ and ‘IN’ condition
operators?
Example of BETWEEN:
SELECT * FROM Students where ROLL_NO BETWEEN 10 AND 50;
Example of IN:
SELECT * FROM students where ROLL_NO IN (8,15,25);
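To see the two operators side by side, here is a minimal sketch using SQLite through Python's sqlite3 module. The roll numbers are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (ROLL_NO INTEGER)")
conn.executemany(
    "INSERT INTO Students VALUES (?)",
    [(n,) for n in (8, 10, 15, 25, 50, 60)],
)

# BETWEEN selects an inclusive range of values.
between_rows = [r[0] for r in conn.execute(
    "SELECT ROLL_NO FROM Students WHERE ROLL_NO BETWEEN 10 AND 50 ORDER BY ROLL_NO")]

# IN selects only the values listed explicitly.
in_rows = [r[0] for r in conn.execute(
    "SELECT ROLL_NO FROM Students WHERE ROLL_NO IN (8, 15, 25) ORDER BY ROLL_NO")]

print(between_rows)
print(in_rows)
```

Note that BETWEEN includes both end points, so roll numbers 10 and 50 are returned.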
A recursive stored procedure is a stored procedure which calls itself until
it reaches some boundary condition. This recursive function or procedure helps
programmers use the same set of code any number of times.
SQL clause helps to limit the result set by providing a condition to the query. A
clause helps to filter the rows from the entire set of records.
The HAVING clause can be used only with the SELECT statement. It is usually used
together with a GROUP BY clause to filter groups; when GROUP BY is not used,
HAVING behaves like a WHERE clause. The WHERE clause, by contrast, is applied to
each row before the rows become part of a group.
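A small sketch may make the order of evaluation clearer. It uses SQLite through Python's sqlite3 module, and the emp table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("HR", 3000), ("HR", 5000), ("IT", 6000), ("IT", 8000), ("Ops", 2000)],
)

# WHERE drops individual rows first (the Ops row goes away),
# then GROUP BY forms the groups, then HAVING drops whole groups.
rows = conn.execute("""
    SELECT dept, AVG(salary)
    FROM emp
    WHERE salary > 2500
    GROUP BY dept
    HAVING AVG(salary) > 4500
    ORDER BY dept
""").fetchall()

print(rows)
```

HR survives the WHERE filter but its average of 4000 fails the HAVING condition, so only IT is returned.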
Q44. How can you fetch common records from two tables?
You can fetch common records from two tables using INTERSECT. For example:
Select StudentID from Student INTERSECT Select StudentID from Exam
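The same idea can be tried out with SQLite through Python's sqlite3 module. The table contents in this sketch are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (StudentID INTEGER);
    CREATE TABLE Exam    (StudentID INTEGER);
    INSERT INTO Student VALUES (1), (2), (3), (4);
    INSERT INTO Exam    VALUES (3), (4), (5);
""")

# INTERSECT keeps only the values present in both result sets.
common_ids = sorted(r[0] for r in conn.execute(
    "SELECT StudentID FROM Student INTERSECT SELECT StudentID FROM Exam"))

print(common_ids)
```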
Some of the available set operators are — Union, Intersect or Minus operators.
An alias name can be given to any table or column. The alias can then be
referred to in the WHERE clause to identify a particular table or column.
For example:
Select emp.empID, dept.Result from employee emp, department dept where
emp.empID=dept.empID
In the above example, emp is the alias for the employee table and dept
is the alias for the department table.
Scalar functions return a single value based on the input value. For example,
UCASE() converts a string to upper case and NOW() returns the current date and
time.
You can fetch alternate records i.e both odd and even row numbers. For
example- To display even numbers, use the following command:
Q50. Name the operator which is used in the query for pattern matching?
You can select unique records from a table by using the DISTINCT keyword.
Select DISTINCT studentID from Student
Using this command, it will print unique student id from the table Student.
There are a lot of ways to fetch characters from a string. For example:
Select SUBSTRING(StudentName,1,5) as studentname from student
SQL is a query language that allows you to issue a single query or execute a single
insert/update/delete whereas PL/SQL is Oracle’s “Procedural Language” SQL,
which allows you to write a full program (loops, variables, etc.) to accomplish
multiple operations such as selects/inserts/updates/deletes.
A view refers to a logical snapshot based on a table or another view. It is used for
the following reasons:
Restricting access to data.
Making complex queries simple.
Ensuring data independence.
Providing different views of the same data.
Advantages:
Disadvantages:
The only disadvantage of Stored Procedure is that it can be executed only in the
database and utilizes more memory in the database server.
Scalar Functions
Inline Table-valued functions
Multi-statement table-valued functions
A scalar function returns a single value of the type defined in the RETURNS
clause. The other two types of user-defined functions return a table.
Collation is defined as a set of rules that determine how data can be sorted as
well as compared. Character data is sorted using the rules that define the correct
character sequence along with options for specifying case-sensitivity, character
width, etc.
Case Sensitivity
Kana Sensitivity
Width Sensitivity
Accent Sensitivity
Local variables:
These variables can be used or exist only inside the function. These variables are
not used or referred by any other function.
Global variables:
These variables can be accessed throughout the program. A global variable is
not created anew each time a function is called.
Q62. What is Auto Increment in SQL?
Q64. What are the different authentication modes in SQL Server? How can it be
changed?
Windows mode and Mixed Mode — SQL and Windows. You can go to the below
steps to change authentication mode in SQL Server:
With the amount of data present in the world, it is next to impossible to
manage data without proper databases. In today's market, there are different
kinds of databases, and deciding on the best database for your business can be
an overwhelming task. So, in this article on SQL vs NoSQL, I will compare these
two types of databases to help you choose which type can help you and your
organization.
What is SQL?
What is NoSQL?
SQL vs NoSQL
Examples of SQL and NoSQL
What is MySQL?
What is MongoDB?
MySQL vs MongoDB
Demo: Insert values into tables and collections
What is SQL?
SQL aka Structured Query Language is the core of the relational database which
is used for accessing and managing the databases. This language is used to
manipulate and retrieve data from a structured data format in the form of tables
and holds relationships between those tables. The relations could be as follows:
A One-to-One Relationship is when a single row in Table A is
related to a single row in Table B.
A One-to-Many Relationship is when a single row in Table A is
related to many rows in table B.
A Many-to-Many Relationship is when many rows in table A can be
related to many rows in table B.
A Self-Referencing Relationship is when a record in table A is
related to the same table itself.
What is NoSQL?
NoSQL, most commonly expanded as Not only SQL, provides a mechanism for storage
and retrieval of unstructured data. This type of database can handle a huge
amount of data and has a dynamic schema. So, a NoSQL database has no single
standard query language and few or no relationships, and stores data in the
form of collections and documents.
So, a database can have 'n' number of collections and each collection can have
'm' number of documents. Consider the example below.
As you can see from the above image, there is an Employee Database which has 2
collections, i.e. the Employee and Projects collections. Each of these
collections has documents, which hold the data values. So, you can think of the
collections as your tables and the documents as the rows in those tables.
Alright, So, now that you know what is SQL & NoSQL, let us now see, how these
databases stand against each other.
SQL vs NoSQL
So, in this face off, I will be comparing both these databases based on the
following grounds:
1. Type of Database
2. Schema
3. Database Categories
4. Complex Queries
5. Hierarchical Data Storage
6. Scalability
7. Language
8. Online Processing
9. Base Properties
10. External Support
Type of database
Schema
SQL needs a predefined schema for structured data. So, before you start using
SQL to extract and manipulate data, you need to make sure that your data
structure is pre-defined in the form of tables.
The SQL databases are table-based. So, you can have 'n' number of tables related
to each other and each table can have rows and columns which store data in each
cell of the table.
Now, if we talk about NoSQL Databases, then NoSQL databases have the
following categories of databases:
So, SQL databases store data in the form of tables and NoSQL databases store
data in the form of key-value pair, documents, graph databases or wide-column
stores.
Complex Queries
Now, the reason why NoSQL databases aren't a good fit for complex
queries is that NoSQL databases aren't queried in a standard language
like SQL.
Well, when we compare the databases on this factor, NoSQL fits better for
hierarchical storage when compared to SQL databases.
Scalability
The SQL databases are vertically scalable. You can handle more load on a single
server by upgrading hardware such as CPU, RAM, or SSD.
Online Processing
On comparing SQL and NoSQL based on this factor, SQL databases are used
for heavy-duty transactional applications. This is because SQL databases
provide atomicity, integrity, and stability of the data. You can also use
NoSQL for transactional purposes, but it is still not stable enough under high
load and for complex transactional applications. So, you can understand that
SQL is mainly used for OLTP (Online Transactional Processing) and NoSQL is
mainly used for OLAP (Online Analytical Processing).
Base Properties
Brewer's CAP Theorem states that a distributed database can achieve at most two
out of three guarantees: Consistency, Availability and Partition Tolerance. Here:
Consistency: All the nodes see the same data at the same time.
Availability: Guarantees that every request receives a response about
whether it succeeded or failed.
Partition Tolerance: Guarantees that the system continues to
operate despite message loss or failure of part of the system.
External Support
All the SQL vendors offer excellent support, since SQL has been in existence for
more than 40 years. However, for some NoSQL databases, only a limited number of
experts are available, and you still have to rely on community support for
your large-scale NoSQL deployments. This is because NoSQL came into
existence in the late 2000s and has not yet been explored as much.
So, if I have to summarize the differences for SQL and NoSQL in this article on
SQL vs NoSQL, you can refer to the below table.
So, folks, with this we come to an end of this face-off between SQL and NoSQL.
Now, that we have discussed so much about SQL and NoSQL, let me show you
some examples of the same.
What is MySQL?
Alright, So, now that you know what is MySQL & MongoDB, let us now see, how
these databases stand against each other.
MySQL vs MongoDB
So, in this face off, I will be comparing both these databases based on the
following grounds:
1. Query Language
2. Flexibility of Schema
3. Relationships
4. Security
5. Performance
6. Support
7. Key Features
8. Replication
9. Usage
10. Active Community
Query Language
MySQL uses the Structured Query Language (SQL). This language is simple and
consists mainly of DDL, DML, DCL & TCL commands to retrieve and manipulate
data. MongoDB, on the other hand, uses an unstructured query language. So, the
query language is basically the MongoDB query language. Refer to the image below.
Flexibility of Schema
MySQL has good flexibility of schema for structured data, as you just
need to clearly define tables and columns. MongoDB, in contrast, has no
restrictions on schema design. You can directly place a couple of documents
inside a collection without having any relations between those documents. The
only problem with MongoDB is that you need to optimize your schema based on how
you want to access the data.
Relationships
Security
Performance
However, MongoDB has the ability to handle large unstructured data. So, it is
faster than MySQL where large databases are concerned, as it allows users to
query in such a way that the load on servers is reduced.
NOTE: There is no hard and fast rule that MongoDB will be faster for your data
all the time. It completely depends on your data and infrastructure.
Support
Key Features
You can refer to the following image for the key features of MySQL and
MongoDB:
Replication
You can refer to the following image for understanding where to use MySQL and
MongoDB:
Active Community
So, if I have to summarize the differences between MySQL and MongoDB, you
can refer to the below table.
So, folks, with this we come to an end of this face-off between MySQL and
MongoDB. Now, knowing so much more about MySQL and MongoDB might
have raised a question in your mind, i.e. whether businesses should go for
MySQL or MongoDB.
Well, there is no clear winner between the two. The choice of database
depends entirely on the schema of your database and how you wish to
access it. Nevertheless, you can use MySQL when you have a fixed schema, high
transaction volume, low maintenance, and data security with a limited budget,
and MongoDB when you have an unstable schema and need high availability, cloud
computing, and in-built sharding.
So, there won't be any final verdict as to which of them is the best, as each
one excels based on your requirements.
Now, that you know the differences between MySQL and MongoDB, next in this
article on SQL vs NoSQL let me show you how to insert data into tables and
collections in MySQL Workbench and MongoDB Compass respectively.
To insert data into tables using MySQL Workbench, you can follow the below
steps:
Step 2: Now, once your connection has been created, open your connection and
then you will be redirected to the following dashboard.
Step 3: Now to create a database and a table, follow the below queries:
-- Create Database
CREATE DATABASE Employee_Info;
-- Use Database
USE Employee_Info;
-- Create Table
CREATE TABLE Employee
(EmpID int,
EmpFname varchar(255),
EmpLname varchar(255),
Age int,
EmailID varchar(255),
PhoneNo int8,
Address varchar(255));
Step 4: Now, once your table is created, to insert values into the table, use the
INSERT INTO syntax as below:
-- Insert Data into a Table
INSERT INTO Employee(EmpID, EmpFname, EmpLname, Age, EmailID, PhoneNo,
Address)
VALUES (1, 'Vardhan', 'Kumar', 22, 'vardy@abc.com', 9876543210,
'Delhi');
Step 5: When you view your table, you will see the output as below.
Now, next in this article on SQL vs NoSQL, let us see how to create database and
collections in MongoDB Compass.
To insert data into collections using MongoDB Compass, you can follow the below
steps:
Step 3: Now, open your database, and choose the collection. Here I have chosen
samplecollection. To add documents into the collection, choose the Insert
Document option and mention the parameters. Here I have mentioned the
EmpID and EmpName.
Q2. Explain the terms database and DBMS. Also, mention the different types of
DBMS.
Query optimization is the phase that identifies, among the possible plans for
evaluating a query, the one with the least estimated cost. This phase comes into
the picture when there are many algorithms and methods available to execute the
same task.
Q6. Do we consider NULL values the same as that of blank space or zero?
A NULL value is not at all the same as zero or a blank space. The NULL value
represents a value that is unavailable, unknown, unassigned, or not applicable,
whereas zero is a number and a blank space is a character.
This is a feature of the E-R model which allows a relationship set to participate in
another relationship set.
This property states that a database modification must either be applied
completely or not at all. So, if one part of the transaction fails, then the
entire transaction fails.
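The all-or-nothing behaviour can be sketched with SQLite through Python's sqlite3 module. The accounts table, its CHECK constraint and the amounts are assumptions made only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

try:
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance + 500 WHERE id = 2")
        # This debit would make the balance negative and violates the CHECK.
        conn.execute("UPDATE accounts SET balance = balance - 500 WHERE id = 1")
except sqlite3.IntegrityError:
    pass

balances = [r[0] for r in conn.execute("SELECT balance FROM accounts ORDER BY id")]
print(balances)
```

The credit to account 2 had already run when the debit failed, yet it is undone as well, so both balances stay at their original values.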
Q10. What do you understand by the terms Entity, Entity Type, and Entity Set in
DBMS?
A relationship in DBMS is the scenario where two entities are related to each
other. In such a scenario, one table contains a foreign key that references the
primary key of the other table.
Q14. What is normalization and what are the different types of normalization?
The process of organizing data to avoid any duplication of data and redundancy
is known as Normalization. There are many successive levels of normalization
which are known as normal forms. Each consecutive normal form depends on
the previous one. The following are the first three normal forms. Apart from
these, you have higher normal forms such as BCNF.
You can also understand correlated subqueries as those queries, which are used
for row-by-row processing by the parent statement. Here, the parent statement
can be SELECT, UPDATE or DELETE statement.
A checkpoint is a mechanism where all the previous logs are removed from the
system and are permanently stored on the storage disk. So, basically,
checkpoints are those points from where the transaction log record can be used
to recover all the committed data up to the point of crash.
Next, let us discuss one of the most commonly asked DBMS interview questions,
that is:
Q22. Mention the differences between Trigger and Stored Procedures
Q23. What are the differences between Hash join, Merge join and Nested
loops?
Indexes are data structures responsible for improving the speed of data retrieval
operations on a table. This data structure uses more storage space to maintain
extra copies of data by using additional writes. So, indexes are mainly used for
searching algorithms, where you wish to retrieve data in a quick manner.
Q27. What do you understand by cursor? Mention the different types of cursor
A cursor is a database object which helps in manipulating data, row by row, and
represents a result set.
The types of the cursor are as follows:
When you say an application has data independence, it implies that the
application is independent of the storage structure and data access strategies of
data.
Q30. What are the different integrity rules present in the DBMS?
Entity Integrity: This rule states that the value of the primary key
can never be NULL. So, all the tuples in the column identified as
the primary key should have a value.
Referential Integrity: This rule states that the value of a foreign key
is either NULL or matches the primary key value of some row in the
referenced relation.
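Both rules can be observed in a small sketch using SQLite through Python's sqlite3 module. The department and employee tables are hypothetical, and SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, which is a quirk of this stand-in database rather than of DBMSs in general.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (dept_code TEXT PRIMARY KEY NOT NULL, name TEXT);
    CREATE TABLE employee (
        emp_id    INTEGER PRIMARY KEY,
        dept_code TEXT REFERENCES department(dept_code)
    );
    INSERT INTO department VALUES ('SAL', 'Sales');
""")

def try_insert(sql, params):
    """Return True if the row is accepted, False if an integrity rule rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

valid_fk = try_insert("INSERT INTO employee VALUES (?, ?)", (10, "SAL"))  # matches a primary key
null_fk = try_insert("INSERT INTO employee VALUES (?, ?)", (11, None))    # NULL foreign key is allowed
bad_fk = try_insert("INSERT INTO employee VALUES (?, ?)", (12, "XXX"))    # no such department
null_pk = try_insert("INSERT INTO department VALUES (?, ?)", (None, "Ghost"))  # NULL primary key

print(valid_fk, null_fk, bad_fk, null_pk)
```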
Q31. What does Fill Factor concept mean with respect to indexes?
Fill Factor specifies the percentage of space on every leaf-level page that is
filled with data, leaving the rest free for future growth. Usually, the default
value is 100.
Q32. What is Index hunting and how does it help in improving query
performance?
Q33. What are the differences between the network and hierarchical database
model?
Q34. Explain what is a deadlock and mention how it can be resolved?
Deadlock is a situation that occurs when two transactions each wait on a
resource that the other transaction holds locked. Deadlocks can be prevented by
making all the transactions acquire all their locks at the same instant. Once a
deadlock occurs, the only cure is to abort one of the transactions and
remove its partially completed work.
Q35. What are the differences between an exclusive lock and a shared lock?
Next, in this article on DBMS interview questions, let us discuss the top
questions asked about SQL.
Q1. What are the differences between DROP, TRUNCATE and DELETE
commands?
SQL aka Structured Query Language is the core of the relational database which
is used for accessing and managing the databases. This language is used to
manipulate and retrieve data from a structured data format in the form of tables
and holds relationships between those tables. So, in layman's terms, you can use
SQL to communicate with the database.
CLAUSE in SQL is used to limit the result set by mentioning a condition to the
query. So, you can use a CLAUSE to filter rows from the entire set of records.
Example:
Syntax: LOWER(‘string’)
Syntax: INITCAP(‘string’)
Q9. What are joins in SQL and what are the different types of joins?
A JOIN clause is used to combine rows from two or more tables, based on a
related column between them. It is used to merge two tables or retrieve data
from there. There are 4 joins in SQL namely:
Inner Join
Right Join
Left Join
Full Join
Q10. What do you understand by the view and mention the steps to create,
update and drop a view?
A view in SQL is a virtual table, which is derived from other tables. So, a view
contains rows and columns similar to a real table and has fields from one or
more tables.
Next, in this article on DBMS interview questions, let us discuss the most
frequently asked queries about SQL.
Q1. Write a query to create a duplicate table with and without data present?
Consider you have a table named Customers, having details such as CustomerID,
CustomerName and so on. Now, if you want to create a duplicate table named
‘DuplicateCustomer’ with the data present in it, you can mention the following
query:
CREATE TABLE DuplicateCustomer AS SELECT * FROM Customers;
Similarly, if you want to create a duplicate table without the data present,
mention the following query:
CREATE TABLE DuplicateCustomer AS SELECT * FROM Customers WHERE 1=2;
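Both variants can be tried in SQLite through Python's sqlite3 module. In this sketch the two copies get different names (DuplicateWithData and DuplicateEmpty, both hypothetical) so that they can coexist, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER, CustomerName TEXT);
    INSERT INTO Customers VALUES (1, 'Asha'), (2, 'Ravi');

    -- Copy structure and data.
    CREATE TABLE DuplicateWithData AS SELECT * FROM Customers;

    -- WHERE 1=2 is never true, so only the column structure is copied.
    CREATE TABLE DuplicateEmpty AS SELECT * FROM Customers WHERE 1=2;
""")

rows_with_data = conn.execute("SELECT COUNT(*) FROM DuplicateWithData").fetchone()[0]
rows_empty = conn.execute("SELECT COUNT(*) FROM DuplicateEmpty").fetchone()[0]

# PRAGMA table_info lists one row per column; index 1 holds the column name.
empty_columns = [r[1] for r in conn.execute("PRAGMA table_info(DuplicateEmpty)")]

print(rows_with_data, rows_empty, empty_columns)
```

Keep in mind that CREATE TABLE ... AS SELECT copies columns and data but generally not constraints such as primary keys or indexes.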
Q2. Mention a query to calculate the even and odd records from a table
To write a query to calculate the even and odd records from a table, you can write
two different queries by using the MOD function.
So, if you want to retrieve the even records from a table, you can write a query as
follows:
SELECT CustomerID FROM (SELECT ROWNUM AS rowno, CustomerID FROM Customers) WHERE
MOD(rowno,2)=0;
Similarly, if you want to retrieve the odd records from a table, you can write a
query as follows:
SELECT CustomerID FROM (SELECT ROWNUM AS rowno, CustomerID FROM Customers) WHERE
MOD(rowno,2)=1;
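Outside Oracle, the same even and odd split can be sketched with the standard ROW_NUMBER() window function. The example below uses SQLite through Python's sqlite3 module, assumes a SQLite build with window-function support (3.25 or later), and uses made-up customer IDs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerID INTEGER)")
conn.executemany(
    "INSERT INTO Customers VALUES (?)", [(n,) for n in (101, 102, 103, 104, 105)]
)

# Number the rows in a subquery, then filter on the parity of that number.
query = """
    SELECT CustomerID
    FROM (SELECT ROW_NUMBER() OVER (ORDER BY CustomerID) AS rowno, CustomerID
          FROM Customers)
    WHERE rowno % 2 = ?
    ORDER BY rowno
"""
even_rows = [r[0] for r in conn.execute(query, (0,))]
odd_rows = [r[0] for r in conn.execute(query, (1,))]

print(even_rows)
print(odd_rows)
```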
Q3. Write a query to remove duplicate rows from a table?
To remove duplicate rows from a table, you have to initially select the duplicate
rows from the table without using the DISTINCT keyword. So, to select the
duplicate rows from the table, you can write a query as follows:
SELECT CustomerNumber FROM Customers O WHERE O.ROWID <> (SELECT MAX(C.ROWID) FROM
Customers C WHERE C.CustomerNumber = O.CustomerNumber);
Now, to delete the duplicate records from the Customers table, mention the
following query:
DELETE FROM Customers O WHERE O.ROWID <> (SELECT MAX(C.ROWID) FROM Customers C WHERE
C.CustomerNumber = O.CustomerNumber);
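The keep-the-highest-rowid idea can be tested in SQLite, which also exposes a rowid for ordinary tables. This sketch uses Python's sqlite3 module with invented rows; it illustrates the technique and is not Oracle's exact syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerNumber INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [(1, "A"), (2, "B"), (1, "A"), (3, "C"), (2, "B")],
)

# For each CustomerNumber keep only the row with the highest rowid
# and delete every other copy.
conn.execute("""
    DELETE FROM Customers
    WHERE rowid <> (SELECT MAX(C.rowid)
                    FROM Customers C
                    WHERE C.CustomerNumber = Customers.CustomerNumber)
""")

remaining = sorted(r[0] for r in conn.execute("SELECT CustomerNumber FROM Customers"))
print(remaining)
```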
Well, there are multiple ways to validate the emails stored in your database,
but one of them is to list the values that do not match an email pattern:
SELECT Email FROM Customers WHERE NOT
REGEXP_LIKE(Email, '^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$', 'i');
Q5. Write a query to retrieve the last day of next month in Oracle.
To write a query to retrieve the last day of the next month in Oracle, you can
write a query as follows:
SELECT LAST_DAY (ADD_MONTHS (SYSDATE,1)) from dual;
So this brings us to the end of the DBMS Interview Questions article. I hope this
set of DBMS Interview Questions will help you ace your job interview. All the
best for your interview! If you wish to check out more articles on the
market’s most trending technologies like Artificial Intelligence, DevOps, Ethical
Hacking, then you can refer to Edureka’s official site.
Do look out for other articles in this series that will explain the various other
aspects of SQL.
SQL Interview Questions
mayuri budake
Tables.
Q.1. Write a SQL query to fetch the count of employees working in
project ‘P1’.
Ans. Here, we use the aggregate function COUNT() with the SQL WHERE clause.
SELECT COUNT(*)
FROM [sampleDB].[dbo].[EmployeeSalary]
WHERE Project='P1';
Fig 1
Query:
SELECT EmpFN
FROM [sampleDB].[dbo].[EmployeeDetails]
WHERE Empid IN
Fig 2
Query:
FROM [sampleDB].[dbo].[EmployeeSalary]
Group By Project
Order By EmpProjectCount DESC;
fig 3
Q.4. Write a query to fetch only the first name(string before space)
from the FullName column of EmployeeDetails table.
Ans. In this question, we are required to first fetch the location of the space
character in the FullName field and then extract the first name out of the
FullName field. For finding the location we will use LOCATE method in mySQL
and CHARINDEX in SQL SERVER and for fetching the string before space, we
will use SUBSTRING OR MID method.
Query:
SELECT SUBSTRING(EmpFN, 0, CHARINDEX(' ', EmpFN)) FirstName
FROM [sampleDB].[dbo].[EmployeeDetails];
FROM [sampleDB].[dbo].[EmployeeDetails];
// LEFT returns the left part of a string
fig 4
Query:
FROM [sampleDB].[dbo].[EmployeeDetails]
ON E.Empid = S.Empsid;
Fig 5
Q.6. Write a SQL query to fetch all the Employees who are also
managers from EmployeeDetails table.
Ans. Here, we have to use Self-Join as the requirement wants us to analyze the
EmployeeDetails table as two different tables, each for Employee and manager
records.
Query:
SELECT E.EmpFN
FROM [sampleDB].[dbo].[EmployeeDetails] E
ON E.EmpiD = M.ManagrID;
Fig 6
Query:
WHERE EXISTS
Query:
FROM [sampleDB].[dbo].[EmployeeSalary]
GROUP BY Project
Query:
FROM [sampleDB].[dbo].[EmployeeSalary]
GROUP BY Project)
Original Table.
Q.10. Write a SQL query to fetch only odd and even rows from the
table.
Ans. This can be achieved by using Row_number in SQL server.
Query:
WHERE E.RowNumber % 2 = 1
WHERE E.RowNumber % 2 = 0
Q.11. Write a SQL query to create a new table with data and structure
copied from another table.
Ans. Using SELECT INTO command.
Query:
Query:
SELECT * FROM My_Schema.Tables;
SELECT Student_ID FROM STUDENT;
If you want to display all the attributes from a particular table, this is the right
query to use:
SELECT * FROM STUDENT;
SELECT EMP_ID, NAME FROM EMPLOYEE_TBL WHERE EMP_ID = '0000';
SELECT EMP_ID, LAST_NAME FROM EMPLOYEE
WHERE CITY = 'Seattle' ORDER BY EMP_ID;
The ordering of the result can also be set manually, using “asc ” for ascending
and “desc” for descending.
SELECT EMP_ID, LAST_NAME FROM EMPLOYEE_TBL
WHERE CITY = 'INDIANAPOLIS' ORDER BY EMP_ID asc;
GROUP BY Age ORDER BY Name;
SELECT COUNT(CustomerID), Country FROM Customers GROUP BY Country;
SELECT AVG(Price)FROM Products;
SELECT * FROM My_Schema.views;
SELECT S_NAME, Student_ID
FROM STUDENT
SELECT * FROM Failing_Students;
FROM Products
WHERE Discontinued = No;
SELECT * FROM Sys.objects WHERE Type='u'
SELECT * from Sys.Objects WHERE Type='PK'
SELECT * FROM Sys.Objects WHERE Type='uq'
SELECT * FROM Sys.Objects WHERE Type='f'
18. Displaying Triggers
A Trigger is sort of an ‘event listener’ — i.e, it’s a pre-specified set of
instructions that execute when a certain event occurs. The list of defined
triggers can be viewed using the following query.
SELECT * FROM Sys.Objects WHERE Type='tr'
SELECT * FROM Sys.Objects WHERE Type='it'
SELECT * FROM Sys.Objects WHERE Type='p'
With this in mind, we can easily imagine an Orders table which likewise contains
the indexed customer ID field, along with details of each order placed by the
customer. This table will include the order Number, Quantity, Date, Item, and
Price. In our first one of SQL examples, imagine a situation where the zip and
phone fields were transposed and all the phone numbers were erroneously
entered into the zip code field. We can easily fix this problem with the following
SQL statement:
UPDATE Customers SET Zip=Phone, Phone=Zip
SELECT DISTINCT ID FROM Customers
SELECT TOP 25 * FROM Customers WHERE Customer_ID IS NOT NULL;
24. Searching for SQL Tables with Wildcards
Wildcard characters or operators like “%” make it easy to find particular strings
in a large table of thousands of records. Suppose we want to find all of our
customers who have names beginning with “Herb” including Herberts, and
Herbertson. The % wildcard symbol can be used to achieve such a result. The
following SQL query will return all rows from the Customers table where
the Name field begins with "Herb":
SELECT * From Customers WHERE Name LIKE 'Herb%'
SELECT ID FROM Orders WHERE
Date BETWEEN '01/12/2018' AND '01/13/2018'
SELECT Customers.ID FROM Customers INNER
JOIN Orders ON Customers.ID = Orders.ID
The point of INNER JOIN, in this case, is to select records in the Customers table
which have matching customer ID values in the Orders table and return only
those records. Of course there are many types of JOIN, such as FULL, SELF, and
LEFT, but for now, let's keep things interesting and move on to more diverse
types of queries.
SELECT phone FROM Customers
UNION SELECT item FROM Orders
The UNION keyword combines the result sets of two or more SELECT statements into
a single result set and removes duplicate rows. It can be used together with
JOINs and other criteria to generate powerful new result tables.
SELECT Item FROM Orders
WHERE id = ALL
(SELECT ID FROM Orders
CREATE DATABASE AllSales
CREATE TABLE Customers (
ID varchar(80),
Name varchar(80),
Phone varchar(20),
....
);
Although most databases are created using a UI such as Access or OpenOffice, it
is important to know how to create and delete databases and tables
programmatically via code with SQL statements. This is especially so when
installing a new web app and the UI asks new users to enter names for DBs to be
added during installation.
ALTER TABLE Customers ADD Birthday varchar(80)
If a table becomes corrupted with bad data you can quickly delete it like this:
DROP TABLE table_name
CREATE TABLE Customers (
ID int NOT NULL,
Name varchar(80) NOT NULL,
PRIMARY KEY (ID)
);
ID int NOT NULL AUTO_INCREMENT
SELECT * FROM Customers
Address, Zip FROM Customers
Performance pitfalls can be avoided in many ways. For example, avoid the time
sink of forcing SQL Server to check the master database first on every call:
never prefix a stored procedure name with sp_. Also, setting NOCOUNT ON saves
the time SQL Server spends reporting the number of rows affected by INSERT,
DELETE, and other commands. Using INNER JOIN with a condition can be faster
than an equivalent WHERE clause with conditions, although many optimizers treat
the two forms the same, so measure before relying on it. We advise developers
to learn SQL Server queries to an advanced level for this purpose. For
production purposes, these tips may be crucial to adequate performance. Notice
that our tutorial examples tend to favor the INNER JOIN.
SELECT Name FROM Customers WHERE EXISTS
(SELECT Item FROM Orders
In this example above, the SELECT returns a value of TRUE when a customer
has orders valued at less than $50.
INSERT INTO Yearly_Orders
SELECT * FROM Orders
WHERE Date <= '1/1/2018'
This example will add any records from the year 2018 to the archive.
SELECT Item, Price *
FROM Orders
SELECT COUNT(ID), Region
FROM Customers
GROUP BY Region
HAVING COUNT(ID) > 0;
40. Tie things up with Strings!
Let’s have a look at processing the contents of field data using functions.
Substring is probably the most valuable of all built-in functions. It gives you
some of the power of Regex, but it’s not so complicated as Regex. Suppose you
want to find the substring left of the dots in a web address. Here’s how to do it
with an SQL Select query:
This line will return everything to the left of the second occurrence of “. ” and so,
in this case, it will return
Dataset
Let’s assume we have a table named “employees” with the following columns:
Here’s the MySQL script to create the employees table and insert the sample
data:
Employee Table
Queries:
1. Write a query to find the average salary of male and female employees in each
department.
Solution Query:
2. Write a query to find the name and salary of the employee with the highest
salary in each department.
Solution Query:
3. Write a query to find the names of employees who earn more than the average
salary in their department.
Solution Query:
SELECT name, salary, department
FROM employees
WHERE salary > (
SELECT AVG(salary)
FROM employees AS e2
WHERE e2.department = employees.department
);
Solution Query:
5. Find the names of employees who have a salary greater than the average
salary of their department.
Solution Query:
SELECT e.name
FROM employees e
JOIN (
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
) AS dept_avg ON e.department = dept_avg.department
WHERE e.salary > dept_avg.avg_salary;
Query Solution:
WITH max_salary AS (
SELECT department, MAX(salary) AS highest_salary
FROM employees
GROUP BY department
)
SELECT m.department, e.name, e.salary
FROM employees e
JOIN max_salary m ON e.department = m.department AND e.salary = m.highest_salary;
SELECT
employee_id,
last_name,
first_name,
salary,
RANK() OVER (ORDER BY salary DESC) as ranking
FROM employee
ORDER BY ranking
In the above query, we use the function RANK(). It is a window function that returns each
row’s position in the result set, based on the order defined in the OVER clause (1 for the
highest salary, 2 for the second-highest, and so on). We need to use an ORDER BY ranking
clause at the end of the query to indicate the order on which the result set will be shown.
If you want to know more about ranking functions in SQL, I recommend our article What
Is the RANK() Function in SQL, and How Do You Use It?
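As a small illustration of how RANK() treats ties, here is a sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or later). The table contents are invented sample data, not taken from the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample rows: two employees share the highest salary.
conn.executescript("""
    CREATE TABLE employee (employee_id INTEGER, last_name TEXT, salary INTEGER);
    INSERT INTO employee VALUES
        (1, 'Adams', 5000), (2, 'Brown', 7000),
        (3, 'Clark', 7000), (4, 'Davis', 3000);
""")

# RANK() gives tied rows the same position and then skips the next one.
ranked = conn.execute("""
    SELECT last_name, salary,
           RANK() OVER (ORDER BY salary DESC) AS ranking
    FROM employee
    ORDER BY ranking, last_name
""").fetchall()

for row in ranked:
    print(row)
```

Both top earners receive rank 1 and the next employee receives rank 3, which is why a filter such as ranking = 2 can return no rows when there is a tie for first place.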
WITH employee_ranking AS (
SELECT
employee_id,
last_name,
first_name,
salary,
RANK() OVER (ORDER BY salary DESC) as ranking
FROM employee
)
SELECT
employee_id,
last_name,
first_name,
salary
FROM employee_ranking
WHERE ranking <= 5
ORDER BY ranking
Finally, in the WHERE of the main query we ask for those rows with a ranking value less
than or equal to 5. This lets us obtain only the top 5 rows by ranking value. Again, we use
an ORDER BY clause to show the result set, which is ordered by rank ascending.
WITH employee_ranking AS (
SELECT
employee_id,
last_name,
first_name,
salary,
RANK() OVER (ORDER BY salary ASC) as ranking
FROM employee
)
SELECT
employee_id,
last_name,
first_name,
salary
FROM employee_ranking
WHERE ranking <= 5
ORDER BY ranking
In the main query, we use WHERE ranking <= 5 to filter the rows with the 5 lowest salaries.
After that, we use ORDER BY ranking to order the rows of the report by ranking value.
WITH employee_ranking AS (
SELECT
employee_id,
last_name,
first_name,
salary,
RANK() OVER (ORDER BY salary DESC) as ranking
FROM employee
)
SELECT
employee_id,
last_name,
first_name,
salary
FROM employee_ranking
WHERE ranking = 2
The WHERE condition ranking = 2 is used to filter the rows with the salary in position 2.
Note that we can have more than one employee in position 2 if they have the same
salary.
WITH employee_ranking AS (
SELECT
employee_id,
last_name,
first_name,
salary,
dept_id,
RANK() OVER (PARTITION BY dept_id ORDER BY salary DESC) as ranking
FROM employee
)
SELECT
dept_id,
employee_id,
last_name,
first_name,
salary
FROM employee_ranking
WHERE ranking = 2
ORDER BY dept_id, last_name
The main change introduced in this query is the PARTITION BY dept_id clause in OVER. This
clause groups rows with the same dept_id, ordering the rows in each group by
salary DESC. Then the RANK() function is calculated for each department.
In the main query, we return the dept_id and the employee data for those employees in
position 2 of their departmental ranking.
For those readers who want to find out more about finding the Nth highest row in a
group, I recommend the article How to Find the Nth-Highest Salary by Department with
SQL.
WITH employee_ranking AS (
SELECT
employee_id,
last_name,
first_name,
salary,
NTILE(2) OVER (ORDER BY salary) as ntile
FROM employee
)
SELECT
employee_id,
last_name,
first_name,
salary
FROM employee_ranking
WHERE ntile = 1
ORDER BY salary
The above query returns only the rows in the first half of a report of employees ordered
by salary in ascending order. We use the condition ntile = 1 to filter only those rows in
the first half of the report. If you are interested in the NTILE() window function, see the
article Common SQL Window Functions: Using Partitions With Ranking Functions.
WITH employee_ranking AS (
SELECT
employee_id,
last_name,
first_name,
salary,
NTILE(4) OVER (ORDER BY salary) as ntile
FROM employee
)
SELECT
employee_id,
last_name,
first_name,
salary
FROM employee_ranking
WHERE ntile = 4
ORDER BY salary
The WHERE ntile = 4 condition filters only the rows in the last quarter of the report. The
last clause ORDER BY salary orders the result set to be returned by the query, while OVER
(ORDER BY salary) orders the rows before dividing them into 4 subsets using NTILE(4).
SELECT
employee_id,
last_name,
first_name,
salary,
FROM employee
If you want to learn about different advanced ranking functions, I recommend the
article Overview of Ranking Functions in SQL.
We have a product table with 3 records (corn flakes, sugared corn flakes and rice flakes)
and another table called box_sizes with 3 records: one each for boxes of 1, 3
and 5 pounds. If we want to create a report with the price list for our nine
combinations, we can use the following query:
SELECT
grain.product_name,
box_size.description,
grain.price_per_pound * box_size.box_weight
FROM product grain
CROSS JOIN box_sizes box_size
The CROSS JOIN clause without any condition produces a table with all row combinations
from both tables. Note we calculate the price based on the per-pound price stored in
the product table and the weight from box_sizes with the expression:
grain.price_per_pound * box_size.box_weight
A deep dive into the CROSS JOIN can be found in An Illustrated Guide to the SQL CROSS
JOIN.
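The price list described above can be sketched with Python's built-in sqlite3 module. The per-pound prices and box descriptions are invented sample values; only the table and column names follow the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical prices and box descriptions for illustration.
conn.executescript("""
    CREATE TABLE product (product_name TEXT, price_per_pound REAL);
    CREATE TABLE box_sizes (description TEXT, box_weight REAL);
    INSERT INTO product VALUES
        ('corn flakes', 2.0), ('sugared corn flakes', 2.5), ('rice flakes', 3.0);
    INSERT INTO box_sizes VALUES
        ('1 pound box', 1), ('3 pound box', 3), ('5 pound box', 5);
""")

# CROSS JOIN pairs every product with every box size: 3 x 3 = 9 rows.
price_list = conn.execute("""
    SELECT grain.product_name,
           box_size.description,
           grain.price_per_pound * box_size.box_weight AS box_price
    FROM product AS grain
    CROSS JOIN box_sizes AS box_size
    ORDER BY grain.product_name, box_size.box_weight
""").fetchall()

for row in price_list:
    print(row)
```

Note that the aliases grain and box_size must be declared in the FROM clause for the qualified column names to resolve.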
Example #10 – Join a Table to Itself
In some cases, we need to join a table to itself. Think about the employee table. Every row
has a column called manager_id with the ID of the manager supervising this employee.
Using a self-join we can obtain a report with the columns employee_name and manager_name;
this will show us who manages each employee. Here is the query:
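SELECT
e2.first_name || ' ' || e2.last_name AS employee_name,
e1.first_name || ' ' || e1.last_name AS manager_name
FROM employee e1
JOIN employee e2
ON e1.employee_id = e2.manager_id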
In the above query, we can see the table employee is referenced twice as e1 and e2, and
the join condition is e1.employee_id = e2.manager_id. This condition links each employee
row with the manager row. The article What Is a Self Join in SQL? An Explanation With
Seven Examples will give you more ideas about when you can apply self joins in your
SQL queries.
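A minimal runnable sketch of the self join, using Python's built-in sqlite3 module. The employee names and IDs are invented sample data, and the select list is one plausible way to build the employee_name and manager_name columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical hierarchy: 110 manages 120 and 130; 120 manages 140.
conn.executescript("""
    CREATE TABLE employee (
        employee_id INTEGER, first_name TEXT, last_name TEXT, manager_id INTEGER
    );
    INSERT INTO employee VALUES
        (110, 'Grace', 'Hill',  NULL),
        (120, 'Omar',  'Reyes', 110),
        (130, 'Lena',  'Park',  110),
        (140, 'Tom',   'Webb',  120);
""")

# e1 plays the manager role, e2 the employee role.
pairs = conn.execute("""
    SELECT e2.first_name || ' ' || e2.last_name AS employee_name,
           e1.first_name || ' ' || e1.last_name AS manager_name
    FROM employee e1
    JOIN employee e2 ON e1.employee_id = e2.manager_id
    ORDER BY e2.employee_id
""").fetchall()

for row in pairs:
    print(row)
```

Because this is an inner join, the employee without a manager (manager_id is NULL) does not appear in the employee_name column; a LEFT JOIN from e2 would keep that row.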
SELECT
first_name,
last_name,
salary
FROM employee
WHERE salary > (SELECT AVG(salary) FROM employee)
You can see the subquery that obtains the average salary in the WHERE clause. In the
main query, we select the employee name and salary. You can read more about
subqueries in the article How to practice SQL subqueries.
SELECT
first_name,
last_name,
salary
FROM employee e1
WHERE salary >
(SELECT AVG(salary)
FROM employee e2
WHERE e1.department_id = e2.department_id)
SELECT
first_name,
last_name
FROM employee e1
WHERE department_id IN (
SELECT department_id
FROM department
WHERE manager_name = 'John Smith')
The previous subquery is a multi-row subquery: it returns more than one row. In fact, it
will return several rows because John Smith manages many departments. When working
with multi-row subqueries, you need to use specific operators (like IN) in the WHERE
condition involving the subquery.
SELECT
employee_id,
last_name,
first_name,
dept_id,
manager_id,
salary
FROM employee
GROUP BY
employee_id,
last_name,
first_name,
dept_id,
manager_id,
salary
HAVING COUNT(*) > 1
The rows that are not duplicated will have a COUNT(*) equal to 1, but those rows that exist
many times will have a COUNT(*) returning the number of times that the row exists. I
suggest the article How to Find Duplicate Values in SQL if you want to find more details
about this technique.
SELECT
employee_id,
last_name,
first_name,
dept_id,
manager_id,
salary,
COUNT(*) AS number_of_rows
FROM employee
GROUP BY
employee_id,
last_name,
first_name,
dept_id,
manager_id,
salary
HAVING COUNT(*) > 1
Again, you can find valuable information about how to manage duplicate records in the
article How To Find Duplicate Records in SQL.
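The duplicate-detection pattern can be tried out with Python's built-in sqlite3 module. This is a sketch with a reduced, hypothetical column set and invented rows, one of which is stored twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: the Brown row is inserted twice on purpose.
conn.executescript("""
    CREATE TABLE employee (
        employee_id INTEGER, last_name TEXT, dept_id TEXT, salary INTEGER
    );
    INSERT INTO employee VALUES
        (1, 'Adams', 'IT', 5000),
        (2, 'Brown', 'HR', 4000),
        (2, 'Brown', 'HR', 4000),
        (3, 'Clark', 'IT', 6000);
""")

# Grouping by every column collapses identical rows; HAVING keeps
# only the groups that contain more than one row.
duplicates = conn.execute("""
    SELECT employee_id, last_name, dept_id, salary,
           COUNT(*) AS number_of_rows
    FROM employee
    GROUP BY employee_id, last_name, dept_id, salary
    HAVING COUNT(*) > 1
""").fetchall()

print(duplicates)
```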
SELECT
last_name,
first_name
FROM employee
INTERSECT
SELECT
last_name,
first_name
FROM employee_2020_jan
As a result, we will obtain a list of employees that appear in both tables. Perhaps they’ll
have different values on the columns like salary or dept_id. In other words, we are
obtaining those employees who worked for the company in Jan 2020 and who are still
working for the company.
If you are interested in finding more about set operators, I suggest the article Introducing
SQL Set Operators: Union, Union All, Minus, and Intersect.
Example #17 – Grouping Data with ROLLUP
The GROUP BY clause in SQL is used to aggregate rows in groups and apply functions to
all the rows in the group, returning a single result value. For example, if we want to obtain
a report with the total salary amount per department and expertise level, we can do the
following query:
SELECT
dept_id,
expertise,
SUM(salary) total_salary
FROM employee
GROUP BY dept_id, expertise
SELECT
dept_id,
expertise,
SUM(salary) total_salary
FROM employee
GROUP BY ROLLUP (dept_id, expertise)
IT Senior 250000
IT NULL 250000
The rows in the result set with a NULL are the extra rows added by the ROLLUP clause.
A NULL value in the column expertise means a group of rows for a specific value
of dept_id but without a specific expertise value. In other words, it is the total amount of
salaries for each dept_id. In the same way, the last row of the result having a NULL for
columns dept_id and expertise means the grand total for all departments in the
company.
If you want to learn more about the ROLLUP clause and other similar clauses like CUBE, the
article Grouping, Rolling, and Cubing Data has lots of examples.
SELECT
SUM (CASE
WHEN dept_id IN ('SALES', 'HUMAN RESOURCES')
THEN salary
ELSE 0 END) AS total_salary_sales_and_hr,
SUM (CASE
WHEN dept_id IN ('IT', 'SUPPORT')
THEN salary
ELSE 0 END) AS total_salary_it_and_support
FROM employee
The query returns a single row with two columns. The first column shows the total salary
for the Sales and Human Resources departments. This value is calculated using
the SUM() function on the salary column – but only when the employee belongs to the
Sales or Human Resources department. A zero is added to the sum when the employee
belongs to any other department. The same idea is applied for
the total_salary_it_and_support column.
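The conditional sums can be checked on a small data set with Python's built-in sqlite3 module. The rows below are invented; the LEGAL department is included only to show that salaries outside both lists contribute zero.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample rows.
conn.executescript("""
    CREATE TABLE employee (last_name TEXT, dept_id TEXT, salary INTEGER);
    INSERT INTO employee VALUES
        ('Adams', 'SALES', 3000),
        ('Brown', 'HUMAN RESOURCES', 4000),
        ('Clark', 'IT', 6000),
        ('Davis', 'SUPPORT', 2500),
        ('Evans', 'LEGAL', 7000);
""")

# Each SUM adds the salary only when the CASE condition matches.
totals = conn.execute("""
    SELECT
        SUM(CASE WHEN dept_id IN ('SALES', 'HUMAN RESOURCES')
                 THEN salary ELSE 0 END) AS total_salary_sales_and_hr,
        SUM(CASE WHEN dept_id IN ('IT', 'SUPPORT')
                 THEN salary ELSE 0 END) AS total_salary_it_and_support
    FROM employee
""").fetchone()

print(totals)
```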
SELECT
CASE
END AS salary_category,
COUNT(*) AS number_of_employees
FROM employee
GROUP BY
CASE
END
In this query, we use CASE to define the salary range for each employee. You can see
the same CASE statement twice. The first one defines the ranges, as we just said; the
second one in the GROUP BY aggregates records and applies the COUNT(*) function to
each group of records. You can use the CASE statement in the same way to compute
counts or sums for other custom-defined levels.
How to Use CASE in SQL explains other examples of CASE statements like the one
used in this query.
Example #20 – Compute a Running Total in SQL
A running total is a very common SQL pattern, one that’s used frequently in finance and
in trend analysis.
When you have a table that stores any daily metric, such as a sales table with the
columns day and daily_amount, you can calculate the running total as the cumulative sum
of all previous daily_amount values. SQL provides a window function called SUM() to do
just that.
In the following query, we’ll calculate the cumulative sales for each day:
SELECT
day,
daily_amount,
SUM(daily_amount) OVER (ORDER BY day) AS running_total
FROM sales
The SUM() function uses the OVER() clause to define the order of the rows; all rows
previous to the current day are included in the SUM(). Here’s a partial result:
The first two columns day and daily_amount are values taken directly from the
table sales. The column running_total is calculated by the expression
SUM(daily_amount) OVER (ORDER BY day).
If you wish to go deeper on this topic, I suggest the article What Is a SQL Running Total
and How Do You Compute It?, which includes many clarifying examples.
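To see the running total on concrete numbers, here is a sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or later). The four sales rows are invented, and days are stored as ISO text so that they sort chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical daily sales figures.
conn.executescript("""
    CREATE TABLE sales (day TEXT, daily_amount INTEGER);
    INSERT INTO sales VALUES
        ('2023-01-01', 100), ('2023-01-02', 150),
        ('2023-01-03', 50),  ('2023-01-04', 200);
""")

# SUM() as a window function accumulates all rows up to the current day.
running = conn.execute("""
    SELECT day, daily_amount,
           SUM(daily_amount) OVER (ORDER BY day) AS running_total
    FROM sales
    ORDER BY day
""").fetchall()

for row in running:
    print(row)
```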
Let’s calculate the moving average for the last 7 days using the sales table from the
previous example:
SELECT
day,
daily_amount,
AVG(daily_amount) OVER (ORDER BY day ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
AS moving_average
FROM sales
In the above query, we use the AVG() window function to calculate the average using the
current row (today) and the previous 6 rows. As the rows are ordered by day, the current
row and the 6 previous rows define a period of 1 week.
The article What a Moving Average Is and How to Compute it in SQL goes into detail
about this subject; check it out if you want to learn more.
Example #22 – Compute a Difference (Delta) Between Two Columns on Different Rows
There’s more than one way to calculate the difference between two rows in SQL. One
way to do it is by using the window functions LEAD() and LAG(), as we will do in this
example.
Let’s suppose we want to obtain a report with the total amount sold on each day, but we
also want to obtain the difference (or delta) related to the previous day. We can use a
query like this one:
SELECT
day,
daily_amount,
daily_amount - LAG(daily_amount) OVER (ORDER BY day)
AS delta_yesterday_today
FROM sales
Both elements of the arithmetic difference come from different rows. The first element
comes from the current row and LAG(daily_amount) comes from the previous day
row. LAG() returns the value of any column from the previous row (based on the ORDER
BY specified in the OVER clause).
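A small sketch of the day-over-day delta, using Python's built-in sqlite3 module (SQLite 3.25 or later for LAG()). The sales rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical daily sales figures.
conn.executescript("""
    CREATE TABLE sales (day TEXT, daily_amount INTEGER);
    INSERT INTO sales VALUES
        ('2023-01-01', 100), ('2023-01-02', 150),
        ('2023-01-03', 50),  ('2023-01-04', 200);
""")

# LAG() reads daily_amount from the previous row in day order.
deltas = conn.execute("""
    SELECT day, daily_amount,
           daily_amount - LAG(daily_amount) OVER (ORDER BY day)
               AS delta_yesterday_today
    FROM sales
    ORDER BY day
""").fetchall()

for row in deltas:
    print(row)
```

The first day has no previous row, so LAG() yields NULL there and the delta is NULL (None in Python) as well.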
In this example, we will use the sales table, which has data in a daily granularity. We first
need to aggregate the data to the year or month, which we will do by creating a CTE with
amounts aggregated by year. Here’s the query:
WITH year_metrics AS (
SELECT
extract(year from day) as year,
SUM(daily_amount) as year_amount
FROM sales
GROUP BY year)
SELECT
year,
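year_amount,
year_amount - LAG(year_amount) OVER (ORDER BY year),
((year_amount - LAG(year_amount) OVER (ORDER BY year)) / LAG(year_amount) OVER (ORDER BY year))
FROM year_metrics
ORDER BY 1
The expression year_amount - LAG(year_amount) OVER (ORDER BY year) calculates the
difference (as a value) between the amount of the current year and the previous year,
using the LAG() window function and ordering the data by year. The difference relative
to the previous year's amount is calculated with the expression:
((year_amount - LAG(year_amount) OVER (ORDER BY year)) / LAG(year_amount) OVER (ORDER BY year))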
In the article How to Compute Year-Over-Year Differences in SQL, you can find several
examples of calculating year-to-year and month-to-month differences.
When we have this sort of organization, we can have a hierarchy of various levels. In
each row, the column manager_id refers to the row on the immediate upper level in the
hierarchy. In these cases, a frequent request is to obtain a list of all employees reporting
(directly or indirectly) to the CEO of the company (who, in this case, has
the employee_id of 110). The query to use is:
WITH RECURSIVE subordinate AS (
SELECT
employee_id,
first_name,
last_name,
manager_id
FROM employee
WHERE employee_id = 110
UNION ALL
SELECT
e.employee_id,
e.first_name,
e.last_name,
e.manager_id
FROM employee e
JOIN subordinate s
ON e.manager_id = s.employee_id
)
SELECT
employee_id,
first_name,
last_name,
manager_id
FROM subordinate ;
In this query, we created a recursive CTE called subordinate. It’s the key part of this
query because it traverses the data hierarchy going from one row to the rows in the
hierarchy immediately below it.
There are two subqueries connected by a UNION ALL; the first subquery returns the top
row of the hierarchy and the second query returns the next level, adding those rows to
the intermediate result of the query. Then the second subquery is executed again to
return the next level, which again will be added to the intermediate result set. This
process is repeated until no new rows are added to the intermediate result.
Finally, the main query consumes the data in the subordinate CTE and returns data in
the way we expect. If you want to learn more about recursive queries in SQL, I suggest
the article How to Find All Employees Under Each Manager in SQL.
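The traversal can be observed on a tiny hierarchy with Python's built-in sqlite3 module, which supports WITH RECURSIVE. The employees below are invented; a second, unrelated branch (IDs 200 and 210) is included to show that only rows reachable from employee 110 are returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical hierarchy: 110 -> 120 -> 130, plus a separate branch 200 -> 210.
conn.executescript("""
    CREATE TABLE employee (
        employee_id INTEGER, first_name TEXT, last_name TEXT, manager_id INTEGER
    );
    INSERT INTO employee VALUES
        (110, 'Grace', 'Hill',  NULL),
        (120, 'Omar',  'Reyes', 110),
        (130, 'Lena',  'Park',  120),
        (200, 'Sam',   'Stone', NULL),
        (210, 'Ida',   'Frost', 200);
""")

subordinates = conn.execute("""
    WITH RECURSIVE subordinate AS (
        -- anchor member: the top row of the hierarchy
        SELECT employee_id, first_name, last_name, manager_id
        FROM employee
        WHERE employee_id = 110
        UNION ALL
        -- recursive member: employees whose manager is already in the result
        SELECT e.employee_id, e.first_name, e.last_name, e.manager_id
        FROM employee e
        JOIN subordinate s ON e.manager_id = s.employee_id
    )
    SELECT employee_id FROM subordinate ORDER BY employee_id
""").fetchall()

print(subordinates)
```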
id   day           Registered users
1    Jan 25 2023   51
2    Jan 26 2023   46
3    Jan 27 2023   41
4    Jan 30 2023   59
5    Jan 31 2023   73
6    Feb 1 2023    34
7    Feb 2 2023    56
8    Feb 4 2023    34
There are 3 different data series shown in different colors. We are looking for a query to
obtain the length of each data series. The first data series starts on Jan 25 and has a
length of 3 elements, the second one starts on Jan 30 and its length is 4, and so on.
WITH data_series AS (
SELECT
day,
FROM user_registration )
SELECT
MIN(day) AS series_start_day,
MAX(day) AS series_end_day,
MAX(day) - MIN(day) + 1 AS series_length
FROM data_series
GROUP BY series_id
ORDER BY series_start_day
In the previous query, the CTE has the column series_id, which is a value intended to be
used as an ID for the rows in the same data series. In the main query, the GROUP BY
series_id clause is used to aggregate rows of the same data series. Then we can obtain
the start of the series with MIN(day) and its end with MAX(day). The length of the series is
calculated with the expression:
MAX(day) - MIN (day) + 1
If you want to go deeper with this topic, the article How to Calculate the Length of a
Series with SQL provides a detailed explanation of this technique.
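The expression that builds series_id is not visible in the query as reproduced here. A common way to obtain such an ID, shown below as an assumption rather than as the article's exact code, is to subtract a row number from the day number: consecutive days then share the same difference. The sketch uses Python's built-in sqlite3 module (SQLite 3.25 or later) and the registration data from the table above, with days stored as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_registration (id INTEGER, day TEXT, registered_users INTEGER);
    INSERT INTO user_registration VALUES
        (1, '2023-01-25', 51), (2, '2023-01-26', 46), (3, '2023-01-27', 41),
        (4, '2023-01-30', 59), (5, '2023-01-31', 73), (6, '2023-02-01', 34),
        (7, '2023-02-02', 56), (8, '2023-02-04', 34);
""")

series = conn.execute("""
    WITH data_series AS (
        SELECT day,
               -- assumed technique: day number minus row number is constant
               -- within a run of consecutive days
               CAST(julianday(day) AS INTEGER)
                   - ROW_NUMBER() OVER (ORDER BY day) AS series_id
        FROM user_registration
    )
    SELECT MIN(day) AS series_start_day,
           MAX(day) AS series_end_day,
           COUNT(*) AS series_length
    FROM data_series
    GROUP BY series_id
    ORDER BY series_start_day
""").fetchall()

for row in series:
    print(row)
```

COUNT(*) is used for the length because the days are stored as text in this sketch; for a run of consecutive days with one row per day it equals MAX(day) - MIN(day) + 1.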
If you want to continue learning SQL, I suggest our advanced SQL courses: Window
Functions, Recursive Queries, and GROUP BY Extensions in SQL. All of them cover
complex areas of the SQL language in simple words and with plenty of examples.
Increase your skill and invest in yourself with SQL!
If you want to display all the attributes from a particular table, this is the right query to use:
SELECT * FROM STUDENT;
The ordering of the result can also be set manually, using “asc ” for ascending and “desc” for
descending.
Ascending (ASC) is the default condition for the ORDER BY clause. In other words, if users don’t
specify ASC or DESC after the column name, then the result will be ordered in ascending order
only.
SELECT EMP_ID, LAST_NAME FROM EMPLOYEE_TBL
WHERE CITY = 'INDIANAPOLIS' ORDER BY EMP_ID ASC;
5. SQL Query for Outputting Sorted Data Using ‘Group By’
The ‘Group By’ property groups the resulting data according to the specified attribute.
The SQL query below selects the Name and Age columns from the Patients table, filters
them to include only records where Age is more than 40, groups the records that share the
same Name and Age, and finally outputs them sorted by Name. The basic rule is that
the GROUP BY clause follows the WHERE clause in a SELECT statement and
precedes the ORDER BY clause.
SELECT Name, Age FROM Patients WHERE Age > 40
GROUP BY Name, Age ORDER BY Name;
Another example of GROUP BY: this statement selects records with a price less than
70 from the Orders table, groups records with the same price, sorts the output by price,
and adds the column COUNT(price), which displays how many records with that
price were found:
SELECT COUNT(price), price FROM orders
WHERE price < 70 GROUP BY price ORDER BY price
Note: every non-aggregated column in the SELECT list should also appear in the GROUP BY
clause, otherwise most databases will raise an error. Many thanks to Sachidannad for pointing it out!
PRIMARY KEY, UNIQUE, and FOREIGN KEY are among the constraints available in SQL. Constraints
are essential to the reliability, consistency, and integrity of the data. They enforce
particular rules on the columns of database tables, restricting the kind of data that can be
stored so that it adheres to the conditions outlined. This assures the accuracy and validity
of the data in the database.
18. Displaying Triggers
A Trigger is sort of an ‘event listener’ – i.e, it’s a pre-specified set of instructions that execute
when a certain event occurs. The list of defined triggers can be viewed using the following
query.
SELECT * FROM Sys.Objects WHERE Type='tr'
The point of INNER JOIN, in this case, is to select records in the Customers table which have
matching customer ID values in the Orders table and return only those records. Of course,
there are many types of JOIN, such as FULL, SELF, and LEFT, but for now, let’s keep things
interesting and move on to more diverse types of advanced SQL commands.
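The UNION keyword combines the result sets of two or more SELECT statements, which may
themselves contain JOINs and other criteria, into a single result. Each SELECT must return
the same number of columns with compatible data types, and UNION removes duplicate rows
unless UNION ALL is used.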
Although most databases are created using a UI such as Access or OpenOffice, it is important
to know how to create and delete databases and tables programmatically via code with SQL
statements. This is especially so when installing a new web app and the UI asks new users to
enter names for DBs to be added during installation.
If a table becomes corrupted with bad data you can quickly delete it like this:
DROP TABLE table_name
We can extend the functionality of the Primary Key so that it automatically increments from a
base. Change the ID entry above to add the AUTO_INCREMENT keyword as in the following
statement:
ID int NOT NULL AUTO_INCREMENT
Performance pitfalls can be avoided in many ways. For example, do not prefix stored
procedure names with sp_: SQL Server looks for such procedures in the master database
first, which wastes time on every call. Also, SET NOCOUNT ON suppresses the “rows affected”
message that SQL Server otherwise sends after INSERT, DELETE, and other commands, which
reduces network traffic. An INNER JOIN with an explicit ON condition is often recommended
over joining tables through conditions in the WHERE clause; the explicit form is easier to
read, although modern optimizers usually produce the same plan for both, so check the
execution plan rather than assuming a speedup. We advise developers to learn SQL Server
queries to an advanced level for this purpose. For production purposes, these tips may be
crucial to adequate performance. Notice that our tutorial examples tend to favor the INNER JOIN.
37. Copying Selections from Table to Table
There are a hundred and one uses for this SQL tool. Suppose you want to archive your yearly
Orders table into a larger archive table. This next example shows how to do it.
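INSERT INTO Yearly_Orders
SELECT * FROM Orders
WHERE Date <= '2018-01-01'
This example copies every order dated on or before January 1, 2018 into the archive. The
date must be written as a quoted literal; an unquoted 1/1/2018 would be evaluated as an
arithmetic division.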
Syntax
SELECT COALESCE(NULL,NULL,'ByteScout',NULL,'Byte')
Output
ByteScout
Output
27
44. Query_partition_clause
The query_partition_clause divides the result set into partitions, or groups, of data. The
operation of the analytic function is restricted to the boundaries imposed by these partitions,
similar to the way a GROUP BY clause affects the behavior of an aggregate function. If the
query_partition_clause is omitted, the entire result set is treated as a single
partition.
The following query uses an empty OVER clause, so the average displayed is based on all the
records of the result set.
SELECT eno, dno, salary,
       AVG(salary) OVER () AS avg_sal
FROM employee;

       ENO        DNO     SALARY    AVG_SAL
---------- ---------- ---------- ----------
      7364         20        900 2173.21428
      7494         30       1700 2173.21428
      7522         30       1350 2173.21428
      7567         20       3075 2173.21428
      7652         30       1350 2173.21428
      7699         30       2950 2173.21428
      7783         10       2550 2173.21428
      7789         20       3100 2173.21428
      7838         10       5100 2173.21428
      7845         30       1600 2173.21428
      7877         20       1200 2173.21428
      7901         30       1050 2173.21428
      7903         20       3100 2173.21428
      7935         10       1400 2173.21428
45. Finding the last eight records from the table
Fetching the last records of a table can be awkward when the table holds a large amount of
data. To get the last 8 records from the employee table, sort the rows in descending order
in an inline view and limit the result with rownum. The rownum is a pseudo column that
numbers the rows of a result set as they are returned.
For example,
SELECT * FROM (SELECT * FROM Employee ORDER BY rowid DESC) WHERE rownum <= 8;
The above SQL query returns the eight rows with the highest rowid values from the employee
table. Because rowid reflects physical storage rather than a guaranteed insertion order,
order by a timestamp or sequence column instead when one is available.
46. LAG
LAG is an analytic function that returns data from a prior row. For example, the
following query returns the salary from the previous row, which can be used to compute the
difference between the salary of the current row and that of the previous row. The ORDER BY
in the OVER clause determines which row counts as the previous one. The offset defaults to 1
if you do not specify it. The optional default value is returned when the offset goes beyond
the scope of the window; it is NULL if you do not specify one.
Syntax
SELECT dtno,
       eno,
       empname,
       job,
       salary,
       LAG(salary, 1, 0) OVER (PARTITION BY dtno ORDER BY salary) AS salary_prev
FROM employee;
Output
47. LEAD
LEAD is also an analytic function; it returns data from rows further down the
result set. The following query returns the salary from the next row and computes the
difference between the salary of the current row and that of the following row. The offset
defaults to 1 if you do not specify it. The optional default value is returned when the
offset goes beyond the scope of the window; it is NULL if you do not specify one.
SELECT eno,
       empname,
       job,
       salary,
       LEAD(salary, 1, 0) OVER (ORDER BY salary) AS salary_next,
       LEAD(salary, 1, 0) OVER (ORDER BY salary) - salary AS salary_diff
FROM employee;

       ENO EMPNAME    JOB           SALARY SALARY_NEXT SALARY_DIFF
---------- ---------- --------- ---------- ----------- -----------
      7369 STEVE      CLERK            800         950         150
      7900 JEFF       CLERK            950        1100         150
      7876 ADAMS      CLERK           1100        1250         150
      7521 JOHN       SALESMAN        1250        1250           0
      7654 MARK       SALESMAN        1250        1300          50
      7934 TANTO      CLERK           1300        1500         200
      7844 MATT       SALESMAN        1500        1600         100
      7499 ALEX       SALESMAN        1600        2450         850
      7782 BOON       MANAGER         2450        2850         400
      7698 BLAKE      MANAGER         2850        2975         125
      7566 JONES      MANAGER         2975        3000          25
      7788 SCOTT      ANALYST         3000        3000           0
      7902 FORD       ANALYST         3000        5000        2000
      7839 KING       PRESIDENT       5000           0       -5000
48. PERCENT_RANK
PERCENT_RANK is an analytic function that returns the relative rank of a row, calculated as
(rank - 1) / (number of rows in the partition - 1). The ORDER BY clause is mandatory.
Omitting the partitioning clause from the OVER clause means the entire result set is treated
as a single partition. The first row of the ordered set is assigned 0 and, absent ties, the
last row of the set is assigned 1. For example, the following query ranks products by their
total sales amount.
Syntax
SELECT
   prdid, SUM(amount),
   PERCENT_RANK() OVER (ORDER BY SUM(amount) DESC) AS percent_rank
   FROM sales
   GROUP BY prdid
   ORDER BY prdid;
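Since no output is reproduced for the PERCENT_RANK query, here is a small sketch that can be run with Python's built-in sqlite3 module (SQLite 3.25 or later). The sales rows are invented, and the totals are computed in a CTE first, which is an equivalent formulation chosen here for portability rather than the exact query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sales rows: product totals are 150, 300 and 50.
conn.executescript("""
    CREATE TABLE sales (prdid INTEGER, amount INTEGER);
    INSERT INTO sales VALUES (1, 100), (1, 50), (2, 300), (3, 20), (3, 30);
""")

pct_ranks = conn.execute("""
    WITH totals AS (
        SELECT prdid, SUM(amount) AS total
        FROM sales
        GROUP BY prdid
    )
    SELECT prdid, total,
           -- (rank - 1) / (rows - 1), ranking by total descending
           PERCENT_RANK() OVER (ORDER BY total DESC) AS pct_rank
    FROM totals
    ORDER BY prdid
""").fetchall()

for row in pct_ranks:
    print(row)
```

The best-selling product gets 0, the weakest gets 1, and the product in between gets (2 - 1) / (3 - 1) = 0.5.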
50. MAX
Using an empty OVER clause turns MAX into an analytic function. The lack of a
partitioning clause means the entire result set is treated as a single
partition, so the query returns the maximum salary for all employees alongside their
original data. For example, the following query shows the use of MAX in a select query.
SELECT eno,
       empname,
       dtno,
       salary,
       MAX(salary) OVER () AS max_result
FROM employee;

       ENO EMPNAME          DTNO     SALARY MAX_RESULT
---------- ---------- ---------- ---------- ----------
      7369 SMITH              20        800       3000
      7499 ALLEN              30       1600       3000
Example
SELECT empid,
       name,
       dno,
       salary,
       job,
       CORR(SYSDATE - joiningdate, salary) OVER () AS my_corr_val
FROM employee;
Example
SELECT empid,
       name,
       dno,
       salary,
       NTILE(6) OVER (ORDER BY salary) AS container_no
FROM employee;
54. VARIANCE, VAR_POP, and VAR_SAMP Query
VARIANCE, VAR_POP, and VAR_SAMP are aggregate functions. They calculate the
variance, population variance, and sample variance of a set of data
respectively. As aggregate functions, they reduce the number of rows, hence
the term “aggregate”. If the data isn’t grouped, all the rows in the Employee
table are reduced to a single row with the aggregated values. For example, the following query is
displaying the use of these functions:
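The query itself is not reproduced here. To make the difference between the population and sample variants concrete, the following sketch computes both with Python's standard statistics module on three invented salaries; it illustrates the arithmetic behind VAR_POP and VAR_SAMP and is not Oracle code.

```python
import statistics

# Hypothetical salaries; the mean is 2000.
salaries = [1000, 2000, 3000]

# Population variance divides the squared deviations by n (like VAR_POP).
var_pop = statistics.pvariance(salaries)

# Sample variance divides by n - 1 (like VAR_SAMP).
var_samp = statistics.variance(salaries)

print(var_pop, var_samp)
```

The squared deviations sum to 2,000,000, so the population variance is 2,000,000 / 3 and the sample variance is 2,000,000 / 2.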
If more than one row remains after nulls are discarded, the STDDEV function returns the same
result as STDDEV_SAMP. Using an empty OVER clause turns STDDEV into an
analytic function. The absence of a partitioning clause means the entire result set is treated
as a single partition, so we get the standard deviation of the salary alongside the original data.
Syntax
Example
DEFINE
   UP AS UP.products_sold > PREV(UP.products_sold),
   FLAT AS FLAT.products_sold = PREV(FLAT.products_sold),
   DOWN AS DOWN.products_sold < PREV(DOWN.products_sold)
57. FIRST_VALUE
The simplest way to understand analytic functions is to begin by studying aggregate functions. An
aggregate function collects data from numerous rows into a single result row. For
instance, users might apply the AVG function to get an average of all the salaries in the
EMPLOYEE table. Let’s take a look at how FIRST_VALUE can be used. The basic syntax of
the FIRST_VALUE analytic function is displayed below.
Syntax:
FIRST_VALUE
   { (expr) [ { RESPECT | IGNORE } NULLS ]
   | (expr [ { RESPECT | IGNORE } NULLS ])
   }
   OVER (analytic_clause)
Example
SELECT eno,
       dno,
       salary,
       FIRST_VALUE(salary) IGNORE NULLS
       OVER (PARTITION BY dno ORDER BY salary) AS lowest_salary_in_dept
FROM employee;
58. LAST_VALUE
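The basic syntax of the LAST_VALUE analytic function is displayed below.
Syntax:
LAST_VALUE
   { (expr) [ { RESPECT | IGNORE } NULLS ]
   | (expr [ { RESPECT | IGNORE } NULLS ])
   }
   OVER (analytic_clause)
The LAST_VALUE analytic function is similar to the LAST analytic function. It returns
the last value from an ordered set of rows. The result under the default windowing clause
can be surprising: when ORDER BY is present and no window is specified, the window ends at
the current row, so LAST_VALUE returns the value of the current row (or of its last peer)
rather than the last value in the partition. For example,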
SELECT eno,
       dno,
       salary,
       LAST_VALUE(salary) IGNORE NULLS
       OVER (PARTITION BY dno ORDER BY salary) AS highest_salary_in_dept
FROM employee;
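The effect of the default window can be seen side by side in this sketch, which uses Python's built-in sqlite3 module (SQLite 3.25 or later). The three rows are invented, and IGNORE NULLS is left out because SQLite does not support it; the windowing behavior shown is the standard one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: one department with three distinct salaries.
conn.executescript("""
    CREATE TABLE employee (eno INTEGER, dno INTEGER, salary INTEGER);
    INSERT INTO employee VALUES (1, 10, 1000), (2, 10, 2000), (3, 10, 3000);
""")

rows = conn.execute("""
    SELECT eno, salary,
           -- default window: from the partition start up to the current row
           LAST_VALUE(salary) OVER (
               PARTITION BY dno ORDER BY salary
           ) AS default_frame,
           -- explicit window covering the whole partition
           LAST_VALUE(salary) OVER (
               PARTITION BY dno ORDER BY salary
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS full_frame
    FROM employee
    ORDER BY eno
""").fetchall()

for row in rows:
    print(row)
```

With the default window the function simply echoes each row's own salary; only the explicit full-partition window yields the highest salary in the department on every row.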
59. Prediction
The sample model predicts the gender and age of clients who are most likely to use an
affinity card (target = 1). The PREDICTION function takes into account the cost matrix
associated with the model and uses marital status and house size as predictors. The syntax of the
PREDICTION function can also use an optional GROUPING hint when scoring a
partitioned model.
SELECT client_gender, COUNT(*) AS ct, ROUND(AVG(age)) AS average_age
   FROM mining_data_shop
   WHERE PREDICTION(sample COST MODEL
      USING client_marital_status, house_size) = 1
   GROUP BY client_gender
   ORDER BY client_gender;

CUST_GENDER         CNT    AVG_AGE
------------ ---------- ----------
F                   270         40
M                   585         41
60. CLUSTER_SET
CLUSTER_SET can score the data in one of two ways: it can apply a mining model object to the
data, or it can mine the data dynamically by executing an analytic clause that builds and applies one
or more transient mining models.
This example lists the attributes that have the greatest impact on cluster assignment
for client ID 1000. The query invokes the CLUSTER_DETAILS and CLUSTER_SET functions, which
apply the clustering model my_sample.
Example
Syntax
WITH all_emp
AS
(
SELECT empId, BossId, FirstName, LastName
FROM Emp
WHERE BossId IS NULL
UNION ALL
SELECT e.empId, e.BossId, e.FirstName, e.LastName
FROM Emp e INNER JOIN all_emp r
ON e.BossId = r.empId
)
SELECT * FROM all_emp
62. NANVL
NANVL(n2, n1) returns the alternative value n1 if the input value n2 is NaN (not a
number), and returns n2 if n2 is not NaN. This function is useful only for floating-point
numbers of type BINARY_FLOAT or BINARY_DOUBLE. The following query is displaying its use:
Example
63. WIDTH_BUCKET
This function is used to obtain a bucket number for equiwidth histograms. For a given
expression, it returns the number of the bucket into which the value of the expression would
fall after being evaluated. The following query is displaying its use:
Example
64. COSH
This function returns the hyperbolic cosine of a number. It accepts as an argument any
numeric data type, or any non-numeric data type that can be implicitly converted to a
numeric data type. The following query is displaying its use:
Example
65. SOUNDEX
The SOUNDEX function returns a character string containing the phonetic representation of
char. It allows users to compare words that are spelled differently but sound alike in
English. It does not support CLOB. The following query is displaying its use:
Example
66. TZ_OFFSET
TZ_OFFSET returns the time zone offset corresponding to its argument, based on the date the
statement is executed. The following query is displaying its use:
Example
67. CARDINALITY
CARDINALITY returns the number of elements in a nested table. It is supported
in different versions. The following query is displaying its use:
Example
68. DUMP
DUMP is one of the important string/char functions. It returns a VARCHAR2 value containing
the data type code, length in bytes, and internal representation of the expression. The
following query is displaying its use:
Example
69. PATH
PATH is an ancillary function used only with the UNDER_PATH and EQUALS_PATH conditions. It returns
the relative path that leads to the resource specified in the parent condition. The following
query shows its use:
Example
70. UNISTR
UNISTR accepts a text literal or an expression that resolves to character data and returns it in
the national character set. It supports Unicode string literals by letting users specify the
Unicode encoding value of characters in the string. The following query shows its use:
Example
3. Create a view with the name “VIEW1” that can be used to query data from table1. The view is
created on columns column1 and column2. It must return the same number of rows as the
underlying table, and it must return the same data type. In this case, we will return the maximum
value for each column in the underlying table when queried against the view. The following
query will be used to populate our view:
1. Write an SQL query to fetch the EmpId and FullName of all the employees working under
the Manager with id – ‘986’.
We can use the EmployeeDetails table to fetch the employee details with a where clause for the
manager-
SELECT EmpId, FullName
FROM EmployeeDetails
WHERE ManagerId = 986;
2. Write an SQL query to fetch the different projects available from the EmployeeSalary table.
While referring to the EmployeeSalary table, we can see that this table contains project values
corresponding to each employee, or we can say that we will have duplicate project values while
selecting Project values from this table.
So, we will use the distinct clause to get the unique values of the Project.
SELECT DISTINCT(Project)
FROM EmployeeSalary;
3. Write an SQL query to fetch the count of employees working in project ‘P1’.
Here, we would be using aggregate function count() with the SQL where clause-
SELECT COUNT(*)
FROM EmployeeSalary
WHERE Project = 'P1';
4. Write an SQL query to find the maximum, minimum, and average salary of the employees.
We can use the aggregate function of SQL to fetch the max, min, and average values-
SELECT Max(Salary),
Min(Salary),
AVG(Salary)
FROM EmployeeSalary;
5. Write an SQL query to find the employee id whose salary lies in the range of 9000 and
15000.
Here, we can use the ‘Between’ operator with a where clause.
SELECT EmpId, Salary
FROM EmployeeSalary
WHERE Salary BETWEEN 9000 AND 15000;
6. Write an SQL query to fetch those employees who live in Toronto and work under the
manager with ManagerId – 321.
Since we have to satisfy both the conditions – employees living in ‘Toronto’ and working under the
manager with ManagerId 321 – we will use the AND operator here-
SELECT EmpId, City, ManagerId
FROM EmployeeDetails
WHERE City = 'Toronto' AND ManagerId = 321;
7. Write an SQL query to fetch all the employees who either live in California or work under a
manager with ManagerId – 321.
This interview question requires us to satisfy either of the conditions – employees living in
‘California’ or working under the Manager with ManagerId 321. So, we will use the OR operator
here-
SELECT EmpId, City, ManagerId
FROM EmployeeDetails
WHERE City = 'California' OR ManagerId = 321;
FROM EmployeeSalary
FROM EmployeeSalary
For the difference between NOT and <> SQL operators, check this link – Difference between the
NOT and != operators.
9. Write an SQL query to display the total salary of each employee adding the Salary with
Variable value.
Here, we can simply use the ‘+’ operator in SQL.
SELECT EmpId,
Salary+Variable as TotalSalary
FROM EmployeeSalary;
10. Write an SQL query to fetch the employees whose name begins with any two characters,
followed by a text “hn” and ends with any sequence of characters.
For this question, we can create an SQL query using the like operator with the ‘_’ and ‘%’ wild card
characters, where ‘_’ matches a single character and ‘%’ matches zero or more characters.
SELECT FullName
FROM EmployeeDetails
WHERE FullName LIKE '__hn%';
11. Write an SQL query to fetch all the EmpIds which are present in either of the tables –
‘EmployeeDetails’ and ‘EmployeeSalary’.
In order to get unique employee ids from both tables, we can use the Union clause which can
combine the results of the two SQL queries and return unique rows.
SELECT EmpId FROM EmployeeDetails
UNION
SELECT EmpId FROM EmployeeSalary;
INTERSECT
MySQL – Versions before 8.0.31 have no INTERSECT operator, so we can use a subquery-
SELECT *
FROM EmployeeSalary
WHERE EmpId IN
13. Write an SQL query to fetch records that are present in one table but not in another table.
Oracle – Using the MINUS operator (SQL Server uses EXCEPT instead)-
SELECT * FROM EmployeeSalary
MINUS
MySQL – MySQL has no MINUS operator (EXCEPT arrived only in 8.0.31), so we can use a LEFT JOIN-
SELECT EmployeeSalary.*
FROM EmployeeSalary
LEFT JOIN
EmployeeDetails
where EmpId IN
15. Write an SQL query to fetch the EmpIds that are present in EmployeeDetails but not in
EmployeeSalary.
Using subquery-
SELECT EmpId FROM
EmployeeDetails
WHERE EmpId NOT IN
(SELECT EmpId FROM EmployeeSalary);
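The set-style questions above can be checked quickly with a small experiment. The sketch below uses Python's sqlite3 module as a stand-in database; the two single-column tables and their sample EmpIds are assumptions for illustration. SQLite spells the set difference EXCEPT, which plays the role of Oracle's MINUS.

```python
import sqlite3


def build_demo_db():
    """Create two tiny tables that share some, but not all, EmpIds."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE EmployeeDetails (EmpId INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE EmployeeSalary (EmpId INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO EmployeeDetails VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany("INSERT INTO EmployeeSalary VALUES (?)", [(2,), (3,), (4,)])
    return conn


def ids(conn, sql):
    """Run a single-column query and return the values as a list."""
    return [row[0] for row in conn.execute(sql)]


conn = build_demo_db()

# Present in EmployeeDetails but not in EmployeeSalary, via a subquery.
only_in_details_subquery = ids(
    conn,
    "SELECT EmpId FROM EmployeeDetails "
    "WHERE EmpId NOT IN (SELECT EmpId FROM EmployeeSalary) ORDER BY EmpId",
)

# The same result with a set operator (EXCEPT here, MINUS in Oracle).
only_in_details_except = ids(
    conn,
    "SELECT EmpId FROM EmployeeDetails "
    "EXCEPT SELECT EmpId FROM EmployeeSalary ORDER BY EmpId",
)

# Present in both tables.
in_both = ids(
    conn,
    "SELECT EmpId FROM EmployeeDetails "
    "INTERSECT SELECT EmpId FROM EmployeeSalary ORDER BY EmpId",
)

print(only_in_details_subquery, only_in_details_except, in_both)
```

One caveat worth remembering in an interview: NOT IN returns no rows at all if the subquery yields a NULL, so the set operator or a NOT EXISTS form is safer when the compared column is nullable.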
16. Write an SQL query to fetch the employee’s full names and replace the space with ‘-’.
Using the ‘Replace’ function-
SELECT REPLACE(FullName, ' ', '-')
FROM EmployeeDetails;
17. Write an SQL query to fetch the position of a given character(s) in a field.
Using the ‘Instr’ function-
SELECT INSTR(FullName, 'Snow')
FROM EmployeeDetails;
18. Write an SQL query to display both the EmpId and ManagerId together.
Here we can use the CONCAT function.
SELECT CONCAT(EmpId, ManagerId) as NewId
FROM EmployeeDetails;
19. Write a query to fetch only the first name(string before space) from the FullName column
of the EmployeeDetails table.
In this question, we are required to first fetch the location of the space character in the FullName
field and then extract the first name out of the FullName field.
For finding the location we will use the LOCATE function in MySQL and CHARINDEX in SQL
Server, and for fetching the string before the space we will use the SUBSTRING or MID function.
FROM EmployeeDetails;
FROM EmployeeDetails;
20. Write an SQL query to uppercase the name of the employee and lowercase the city
values.
We can use SQL Upper and Lower functions to achieve the intended results.
SELECT UPPER(FullName), LOWER(City)
FROM EmployeeDetails;
21. Write an SQL query to find the count of the total occurrences of a particular character –
‘n’ in the FullName field.
Here, we can use the ‘Length’ function. We can subtract the length of the FullName after removing
the character ‘n’ from the total length of the FullName field.
SELECT FullName,
LENGTH(FullName) - LENGTH(REPLACE(FullName, 'n', ''))
FROM EmployeeDetails;
22. Write an SQL query to update the employee names by removing leading and trailing
spaces.
Using the ‘Update’ command with the ‘LTRIM’ and ‘RTRIM’ functions.
UPDATE EmployeeDetails
SET FullName = LTRIM(RTRIM(FullName));
FROM EmployeeSalary
24. Write an SQL query to fetch employee names having a salary greater than or equal to
5000 and less than or equal to 10000.
Here, we will use BETWEEN in the ‘where’ clause to return the EmpId of the employees with salary
satisfying the required criteria and then use it as a subquery to find the fullName of the employee
from the EmployeeDetails table.
SELECT FullName
FROM EmployeeDetails
WHERE EmpId IN
(SELECT EmpId FROM EmployeeSalary
WHERE Salary BETWEEN 5000 AND 10000);
SQL Server-
SELECT getdate();
Oracle-
SELECT SYSDATE FROM DUAL;
26. Write an SQL query to fetch all the Employee details from the EmployeeDetails table who
joined in the Year 2020.
Using BETWEEN for the date range '2020-01-01' AND '2020-12-31'-
SELECT * FROM EmployeeDetails
WHERE DateOfJoining BETWEEN '2020-01-01' AND '2020-12-31';
Also, we can extract the year part from the joining date (using YEAR in MySQL)-
SELECT * FROM EmployeeDetails
WHERE YEAR(DateOfJoining) = 2020;
27. Write an SQL query to fetch all employee records from the EmployeeDetails table who
have a salary record in the EmployeeSalary table.
Using ‘Exists’-
SELECT * FROM EmployeeDetails E
WHERE EXISTS
(SELECT * FROM EmployeeSalary S
WHERE E.EmpId = S.EmpId);
28. Write an SQL query to fetch the project-wise count of employees sorted by project’s
count in descending order.
The query has two requirements – first to fetch the project-wise count and then to sort the result by
that count.
For project-wise count, we will be using the GROUP BY clause and for sorting, we will use the
ORDER BY clause on the alias of the project count.
SELECT Project, count(EmpId) EmpProjectCount
FROM EmployeeSalary
GROUP BY Project
ORDER BY EmpProjectCount DESC;
29. Write a query to fetch employee names and salary records. Display the employee details
even if the salary record is not present for the employee.
This is again one of the very common interview questions in which the interviewer just wants to
check the basic knowledge of SQL JOINS.
Here, we can use the left join with the EmployeeDetails table on the left side of the EmployeeSalary
table.
SELECT E.FullName, S.Salary
FROM EmployeeDetails E
LEFT JOIN
EmployeeSalary S
ON E.EmpId = S.EmpId;
FROM TableA
SQL Query Interview Questions for Experienced
Here is a list of some of the most frequently asked SQL query interview questions for experienced
professionals. These questions cover SQL queries on advanced SQL JOIN concepts, fetching
duplicate rows, odd and even rows, nth highest salary, etc.
31. Write an SQL query to fetch all the Employees who are also managers from the
EmployeeDetails table.
Here, we have to use Self-Join as the requirement wants us to analyze the EmployeeDetails table
as two tables. We will use different aliases ‘E’ and ‘M’ for the same EmployeeDetails table.
SELECT DISTINCT E.FullName
FROM EmployeeDetails E
INNER JOIN EmployeeDetails M
ON E.EmpID = M.ManagerID;
32. Write an SQL query to fetch duplicate records from EmployeeDetails (without considering
the primary key – EmpId).
In order to find duplicate records from the table, we can use GROUP BY on all the fields and then
use the HAVING clause to return only those fields whose count is greater than 1 i.e. the rows having
duplicate records.
SELECT FullName, ManagerId, DateOfJoining, City, COUNT(*)
FROM EmployeeDetails
GROUP BY FullName, ManagerId, DateOfJoining, City
HAVING COUNT(*) > 1;
33. Write an SQL query to remove duplicates from a table without using a temporary table.
Here, we can use delete with alias and inner join. We will check for the equality of all the matching
records and then remove the row with a higher EmpId.
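DELETE E1 FROM EmployeeDetails E1
INNER JOIN EmployeeDetails E2
WHERE E1.EmpId > E2.EmpId
AND E1.FullName = E2.FullName
AND E1.ManagerId = E2.ManagerId
AND E1.DateOfJoining = E2.DateOfJoining
AND E1.City = E2.City;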
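Both duplicate-handling steps can be rehearsed on a throwaway database. The sketch below uses Python's sqlite3 module with hypothetical sample rows. Note one deliberate substitution: SQLite does not support DELETE with a join, so the removal step keeps the lowest EmpId of each group through a NOT IN subquery instead of the aliased join described above.

```python
import sqlite3

# Every column except the primary key takes part in the duplicate check.
GROUP_COLUMNS = "FullName, ManagerId, DateOfJoining, City"


def find_and_remove_duplicates(rows):
    """Return (duplicate groups with counts, EmpIds left after clean-up)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE EmployeeDetails (EmpId INTEGER PRIMARY KEY, "
        "FullName TEXT, ManagerId INTEGER, DateOfJoining TEXT, City TEXT)"
    )
    conn.executemany("INSERT INTO EmployeeDetails VALUES (?, ?, ?, ?, ?)", rows)

    # Step 1: group on all non-key columns and keep groups seen more than once.
    duplicates = conn.execute(
        f"SELECT {GROUP_COLUMNS}, COUNT(*) FROM EmployeeDetails "
        f"GROUP BY {GROUP_COLUMNS} HAVING COUNT(*) > 1"
    ).fetchall()

    # Step 2: delete every row that is not the lowest EmpId of its group.
    conn.execute(
        "DELETE FROM EmployeeDetails WHERE EmpId NOT IN "
        f"(SELECT MIN(EmpId) FROM EmployeeDetails GROUP BY {GROUP_COLUMNS})"
    )
    remaining = [r[0] for r in conn.execute(
        "SELECT EmpId FROM EmployeeDetails ORDER BY EmpId")]
    conn.close()
    return duplicates, remaining


sample_rows = [
    (1, "John Snow", 986, "2019-01-31", "Toronto"),
    (2, "Walter White", 321, "2020-01-30", "California"),
    (3, "John Snow", 986, "2019-01-31", "Toronto"),  # duplicate of EmpId 1
]
duplicates, remaining = find_and_remove_duplicates(sample_rows)
print(duplicates, remaining)
```

Either way the rule is the same: within each group of identical rows, the row with the higher EmpId is the one that goes.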
34. Write an SQL query to fetch only odd rows from the table.
In case we have an auto-increment field e.g. EmpId then we can simply use the below query-
SELECT * FROM EmployeeDetails
WHERE MOD (EmpId, 2) <> 0;
In case we don’t have such a field then we can use the below queries.
Using Row_number in SQL server and checking that the remainder when divided by 2 is 1-
SELECT E.EmpId, E.Project, E.Salary
FROM (
SELECT *, ROW_NUMBER() OVER (ORDER BY EmpId) AS RowNumber
FROM EmployeeSalary
) E
WHERE E.RowNumber % 2 = 1;
FROM (
FROM EmployeeSalary
)t
WHERE rn % 2 = 1;
35. Write an SQL query to fetch only even rows from the table.
In case we have an auto-increment field e.g. EmpId then we can simply use the below query-
SELECT * FROM EmployeeDetails
WHERE MOD (EmpId, 2) = 0;
In case we don’t have such a field then we can use the below queries.
Using Row_number in SQL server and checking that the remainder, when divided by 2, is 0-
SELECT E.EmpId, E.Project, E.Salary
FROM (
SELECT *, ROW_NUMBER() OVER (ORDER BY EmpId) AS RowNumber
FROM EmployeeSalary
) E
WHERE E.RowNumber % 2 = 0;
FROM (
FROM EmployeeSalary
)t
WHERE rn % 2 = 0;
36. Write an SQL query to create a new table with data and structure copied from another
table.
CREATE TABLE NewTable
37. Write an SQL query to create an empty table with the same structure as some other table.
Here, we can use the same query as above with the False ‘WHERE’ condition-
CREATE TABLE NewTable
FROM EmployeeSalary
ORDER BY Salary DESC LIMIT N;
FROM EmployeeSalary
39. Write an SQL query to find the nth highest salary from a table.
Using Top keyword (SQL Server)-
SELECT TOP 1 Salary
FROM (
FROM Employee
FROM Employee
40. Write SQL query to find the 3rd highest salary from a table without using the TOP/limit
keyword.
This is one of the most commonly asked interview questions. For this, we will use a correlated
subquery.
In order to find the 3rd highest salary, we will find the salary value until the inner query returns a
count of 2 rows having a salary greater than other distinct salaries.
SELECT Salary
WHERE N-1 = (
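As a minimal sketch of the correlated-subquery idea described above, the Python function below runs a complete version of such a query against an in-memory SQLite table. The Employee table layout, the aliases Emp1 and Emp2, and the sample salaries are assumptions for illustration; n is passed in as a bind parameter.

```python
import sqlite3


def nth_highest_salary(salaries, n):
    """Return the nth highest distinct salary, or None if there is none."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Employee (EmpId INTEGER PRIMARY KEY, Salary INTEGER)")
    conn.executemany(
        "INSERT INTO Employee (Salary) VALUES (?)", [(s,) for s in salaries])
    # A salary is the nth highest when exactly n-1 distinct salaries exceed it.
    query = """
        SELECT DISTINCT Salary
        FROM Employee Emp1
        WHERE ? - 1 = (
            SELECT COUNT(DISTINCT Emp2.Salary)
            FROM Employee Emp2
            WHERE Emp2.Salary > Emp1.Salary
        )
    """
    row = conn.execute(query, (n,)).fetchone()
    conn.close()
    return row[0] if row else None


print(nth_highest_salary([100, 300, 200, 300, 400], 3))
```

The inner query is re-evaluated for each outer row, which is easy to explain in an interview but slow on large tables; a ranking function such as DENSE_RANK is the usual alternative when it is allowed.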
RDBMS has primary keys and data is stored in tables. DBMS has no concept of primary keys with
data stored in navigational or hierarchical form.
RDBMS defines integrity constraints in order to follow ACID properties, while DBMS doesn’t follow
ACID properties.
Ques.10. Explain DDL commands. What are the different DDL commands in SQL?
Ans. DDL refers to Data Definition Language. The DDL commands are used to define or alter the
structure of the database. The different DDL commands are-
CREATE – Used to create a table in the DB
DROP – Drops the table from the DB
ALTER – Alters the structure of the DB
TRUNCATE – Deletes all the records from a table but keeps the table structure
RENAME – Renames a DB object
Ques.11. Explain DML commands. What are the different DML commands in SQL?
Ans. DML refers to Data Manipulation Language. These commands are used for managing data
present in the database. Some of the DML commands are – select, insert, update, delete, etc.
Ques.12. Explain DCL commands. What are the different DCL commands in SQL?
Ans. DCL refers to Data Control Language. These commands are used to create roles, grant
permission, and control access to the database objects. The DCL commands are-
GRANT – Grants permission to a database user.
REVOKE – Removes access privileges that were given to a user with the GRANT command.
DENY – Explicitly prevents a user from receiving a particular permission (e.g. preventing a
particular user belonging to a group from receiving the access controls). DENY is specific to SQL
Server; Oracle provides only GRANT and REVOKE.
Ques.13. Explain TCL commands. What are the different TCL commands in SQL?
Ans. TCL refers to Transaction Control Language. These commands are used to manage the
changes made by DML statements. These are used to process a group of SQL statements
comprising a logical unit. The three TCL commands are-
COMMIT – Writes the changes made in the transaction permanently to the database.
SAVEPOINT – Savepoints are markers that divide the transaction into smaller logical
units, each of which can be rolled back separately.
ROLLBACK – Undoes the changes made since the last commit (or since a named savepoint).
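The interplay of the three TCL commands is easiest to see by running them. The sketch below uses Python's sqlite3 module in autocommit mode so that every BEGIN, SAVEPOINT, ROLLBACK, and COMMIT is issued explicitly; the Accounts table, the savepoint name, and the values are assumptions for illustration, and SQLite's transaction syntax differs slightly from Oracle's.

```python
import sqlite3


def run_transaction_demo():
    """Return the rows that survive a mix of savepoint and full rollbacks."""
    # isolation_level=None disables the module's implicit transactions,
    # so the statements below control the transaction boundaries themselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE Accounts (Id INTEGER PRIMARY KEY, Balance INTEGER)")

    conn.execute("BEGIN")
    conn.execute("INSERT INTO Accounts VALUES (1, 100)")
    conn.execute("SAVEPOINT after_first_insert")
    conn.execute("INSERT INTO Accounts VALUES (2, 999)")
    # Undo only the work done after the savepoint; row 1 is still pending.
    conn.execute("ROLLBACK TO SAVEPOINT after_first_insert")
    conn.execute("COMMIT")  # row 1 becomes permanent

    conn.execute("BEGIN")
    conn.execute("UPDATE Accounts SET Balance = 0 WHERE Id = 1")
    conn.execute("ROLLBACK")  # discard the update; back to the committed state

    rows = conn.execute("SELECT Id, Balance FROM Accounts ORDER BY Id").fetchall()
    conn.close()
    return rows


print(run_transaction_demo())
```

Only the first insert survives: the second insert was undone by the rollback to the savepoint, and the later update was undone by the full rollback.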
Ques.17. What is the difference between a unique key and a primary key?
Ans. A unique key allows null values (SQL Server permits only one NULL per unique column, while
Oracle permits several) but a primary key doesn’t allow null values. A table can have more than one
unique key while there can be only one primary key. In SQL Server, a unique key creates a
non-clustered index by default whereas the primary key creates a clustered index on the column;
Oracle backs both with an ordinary unique index.
);
Ques.26. What is the difference between delete, truncate and drop command?
Ans. The difference between the Delete, Truncate and Drop command is –
Delete command is a DML command. It removes rows from a table based on the condition
specified in the where clause, being a DML statement we can rollback changes made by the
delete command.
Truncate is a DDL command. It removes all the rows from the table and also frees the space
held, unlike the delete command. It takes a lock on the table while the delete command takes
a lock on rows of the table.
Drop is a DDL command. It removes the complete data along with the table structure (unlike
the truncate command that removes only the rows).
2. Left Join – To fetch all rows from the left table and matching rows of the right table
SELECT * FROM TABLE1 LEFT JOIN TABLE2 ON TABLE1.columnA = TABLE2.columnA;
3. Right Join – To fetch all rows from right table and matching rows of the left table
SELECT * FROM TABLE1 RIGHT JOIN TABLE2 ON TABLE1.columnA = TABLE2.columnA;
4. Full Outer Join – To fetch all rows of the left table and all rows of right table
SELECT * FROM TABLE1 FULL OUTER JOIN TABLE2 ON TABLE1.columnA = TABLE2.columnA;
5. Self Join – Joining a table to itself, for referencing its own data
SELECT * FROM TABLE1 T1, TABLE1 T2 WHERE T1.columnA = T2.columnB;
Ques.28. What is the difference between cross join and full outer join?
Ans. A cross join returns the cartesian product of the two tables, so there is no condition or on
clause: each row of TableA is joined with each row of TableB. A full outer join, in contrast, joins
the two tables on the basis of the condition specified in the on clause, and for the records not
satisfying the condition a null value is placed in the join result.
A ‘where’ clause is used to fetch data from the database that specifies particular criteria (specified
after the where clause). Whereas a ‘having’ clause is used along with ‘GROUP BY’ to fetch data that
meets particular criteria specified by the aggregate function.
For example – for a table with Employee and Project fields. If we want to fetch Employee working on
a particular project P2, we will use ‘where’ clause-
Select Employee
From Emp_Project
Where Project = 'P2';
Now if we want to fetch Employees who are working on more than one project. We will first have to
group the Employee column along with the count of the project and then the ‘having’ clause can be
used to fetch relevant records-
Select Employee
From Emp_Project
GROUP BY Employee
Having count(Project)>1;
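The difference between the two clauses shows up clearly on a few rows of data. The sketch below loads a hypothetical Emp_Project table into an in-memory SQLite database from Python and runs one query of each kind; the employee names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp_Project (Employee TEXT, Project TEXT)")
conn.executemany(
    "INSERT INTO Emp_Project VALUES (?, ?)",
    [("Asha", "P1"), ("Asha", "P2"), ("Ben", "P2"), ("Cara", "P3")],
)

# WHERE filters individual rows before any grouping happens.
on_p2 = [row[0] for row in conn.execute(
    "SELECT Employee FROM Emp_Project WHERE Project = 'P2' ORDER BY Employee")]

# HAVING filters whole groups after the aggregate has been computed.
on_multiple_projects = [row[0] for row in conn.execute(
    "SELECT Employee FROM Emp_Project "
    "GROUP BY Employee HAVING COUNT(Project) > 1 ORDER BY Employee")]

conn.close()
print(on_p2, on_multiple_projects)
```

A condition on COUNT(Project) cannot go in the where clause because the count does not exist until the rows have been grouped.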
Ques.30. What is the difference between Union and Union All command?
Ans. The fundamental difference between Union and Union All command is – Union is by default
distinct i.e. it combines the distinct result set of two or more select statements. Whereas, Union
All combines all the rows including duplicates in the result set of different select statements.
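A quick way to see the difference is to run both operators over two overlapping result sets. The sketch below does this with Python's sqlite3 module; the two single-column tables and their EmpIds are assumptions for illustration.

```python
import sqlite3


def union_demo():
    """Return (UNION result, UNION ALL result) for two overlapping tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE EmployeeDetails (EmpId INTEGER)")
    conn.execute("CREATE TABLE EmployeeSalary (EmpId INTEGER)")
    conn.executemany("INSERT INTO EmployeeDetails VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany("INSERT INTO EmployeeSalary VALUES (?)", [(2,), (3,), (4,)])

    # UNION removes duplicate rows from the combined result.
    distinct_ids = [r[0] for r in conn.execute(
        "SELECT EmpId FROM EmployeeDetails "
        "UNION SELECT EmpId FROM EmployeeSalary ORDER BY EmpId")]

    # UNION ALL keeps every row from both inputs.
    all_ids = [r[0] for r in conn.execute(
        "SELECT EmpId FROM EmployeeDetails "
        "UNION ALL SELECT EmpId FROM EmployeeSalary ORDER BY EmpId")]

    conn.close()
    return distinct_ids, all_ids


distinct_ids, all_ids = union_demo()
print(distinct_ids, all_ids)
```

Because UNION has to detect and drop duplicates, UNION ALL is usually the cheaper choice when duplicates are impossible or acceptable.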
FROM Emp_Project
GROUP BY Project;
procedureName
AS
Begin
End
triggerName
triggerTime{Before or After}
ON tableName
triggerBody
FROM ParentTable PT
ON PT.ID = CT.ID
*Remember: Delete with joins requires name/alias before from clause in order to specify the table of
which data is to be deleted.
Data mining is the process of collecting information in order to find patterns, trends, and usable
data that will help a company to make data-driven decisions from large amounts of data. In other
words, Data Mining is the method of analysing hidden patterns of data from various perspectives
for categorization into useful data, which is gathered and assembled in specific areas such as data
warehouses, efficient analysis, data mining algorithm, assisting decision making, and other data
requirements, ultimately resulting in cost-cutting and revenue generation. Data mining is the
process of automatically examining enormous amounts of data for patterns and trends that go
beyond simple analysis. Data mining estimates the probability of future events by utilising
advanced mathematical algorithms for data segments.
Following are the differences between data warehousing and data mining:-
Data Warehousing | Data Mining
A data warehouse is a database system that is intended for analytical rather than transactional purposes. | The technique of examining data patterns is known as data mining.
Data warehousing is the process of bringing all relevant data together. | Data mining is the process of extracting information from big datasets.
2. What do you mean by OLAP in the context of data warehousing? What guidelines should be followed
while selecting an OLAP system?
3. What do you understand about a fact table in the context of a data warehouse? What are the different
types of fact tables?
In a Data Warehouse system, a Fact table is simply a table that holds all of the facts or business
information that can be exposed to reporting and analysis when needed. Fields that reflect direct
facts, as well as foreign fields that connect the fact table to other dimension tables in the Data
Warehouse system, are stored in these tables. Depending on the model type used to construct the
Data Warehouse, a Data Warehouse system can have one or more fact tables.
A table in a data warehouse's star schema is referred to as a dimension table. Dimensional data
models, which are made up of fact and dimension tables, are used to create data warehouses.
Dimension tables contain dimension keys, values, and attributes and are used to describe
dimensions. It is usually of a tiny size. The number of rows might range from a few to thousands. It
is a description of the objects in the fact table. The term "dimension table" refers to a collection or
group of data pertaining to any quantifiable occurrence. They serve as the foundation for
dimensional modelling. It includes a column that serves as a primary key, allowing each dimension
row or record to be uniquely identified. Through this key, it is linked to the fact tables. When it's
constructed, a system-generated key called the surrogate key is used to uniquely identify the rows
in the dimension.
5. What are the different types of dimension tables in the context of data warehousing?
Following are the different types of dimension tables in the context of data warehousing:-
Slowly Changing Dimensions (SCD): Slowly changing dimensions are dimension attributes that tend
to vary slowly over time rather than at a regular period of time. For example, the address and phone
number may change, but not on a regular basis. Consider the case of a man who travels to several
nations and must change his address according to the place he is visiting. This can be accomplished in
one of three ways:
o Type 1: Replaces the value that was previously entered. This strategy is simple to implement
and aids in the reduction of costs by saving space. However, in this circumstance, history is
lost.
o Type 2: Insert a new row containing the new value. This method saves the history and allows
it to be accessed at any time. However, it takes up a lot of space, which raises the price.
o Type 3: Add a new column to the table that holds the previous value. History is easy to
preserve this way, but only as far back as the extra column allows.
Junk Dimension: A junk dimension is a collection of low-cardinality attributes. It contains a number
of varied or disparate features that are unrelated to one another. These can be used to implement
RCD (rapidly changing dimension) features like flags and weights, among other things.
Conformed Dimension: Multiple subject areas or data marts share this dimension. It can be utilised in
a variety of projects without requiring any changes. This is used to keep things in order. Dimensions
that are exactly the same as or a proper subset of any other dimension are known as conformed
dimensions.
Roleplay Dimension: Role-play dimension refers to the dimension table that has many relationships
with the fact table. In other words, it occurs when the same dimension key and all of its associated
attributes are linked to a large number of foreign keys in the fact table. Within the same database, it
might serve several roles.
Degenerate Dimension: Degenerate dimension attributes are those that are contained in the fact
table itself rather than in a separate dimension table. For instance, a ticket number, an invoice
number, a transaction number, and so on.
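The three slowly changing dimension strategies listed above can be sketched in a few lines of Python. This is only an illustration on plain dictionaries, not a warehouse implementation: the column names (surrogate_key, city, previous_city, start_date, end_date, is_current) and the sample customer are assumptions.

```python
from datetime import date


def scd_type1(row, new_city):
    """Type 1: overwrite the attribute in place; the old value is lost."""
    updated = dict(row)
    updated["city"] = new_city
    return updated


def scd_type2(history, new_city, change_date):
    """Type 2: close the current row and append a new current row."""
    current = next(row for row in history if row["is_current"])
    closed = [
        dict(row, end_date=change_date, is_current=False)
        if row["is_current"] else dict(row)
        for row in history
    ]
    new_row = dict(
        current,
        surrogate_key=max(row["surrogate_key"] for row in history) + 1,
        city=new_city,
        start_date=change_date,
        end_date=None,
        is_current=True,
    )
    return closed + [new_row]


def scd_type3(row, new_city):
    """Type 3: keep only the previous value in an extra column."""
    updated = dict(row)
    updated["previous_city"] = row["city"]
    updated["city"] = new_city
    return updated


customer = {"surrogate_key": 1, "customer_id": "C1", "city": "Paris",
            "start_date": date(2020, 1, 1), "end_date": None, "is_current": True}

print(scd_type1(customer, "Rome")["city"])
print(len(scd_type2([customer], "Rome", date(2023, 6, 1))))
print(scd_type3(customer, "Rome")["previous_city"])
```

Type 2 is the only variant here that grows the table, which matches the space cost mentioned above; Type 3 remembers exactly one earlier value per extra column.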
A record in a fact table can be made up of attributes from various dimension tables. The fact
table helps the user investigate the business aspects that aid decision making in order to improve
the business. Dimension tables, on the other hand, make it easier for the fact table to collect
the dimensions along which measurements must be taken.
The following table enlists the difference between a fact table and a dimension table:-
Fact Table | Dimension Table
It contains the attributes' measurements, facts, or metrics. | It is the companion table that has the attributes that the fact table uses to derive the facts.
It is used for analysis and decision-making and contains measures. | It contains information regarding a company's operations and procedures.
It has foreign keys that reference the primary keys of the dimension tables. | It has a primary key that the fact table references through a foreign key.
It has fewer attributes than a dimension table. | It has more attributes than a fact table.
Here, the table grows vertically. | Here, the table grows horizontally.
Loading time of data resources is undervalued: We frequently underestimate the time it will take to
gather, sanitize, and post data to the warehouse. Although some resources are in place to minimize
the time and effort spent on the process, it may require a significant amount of the overall
production time.
Source system flaws that go unnoticed: After years of non-discovery, hidden flaws linked with the
source networks that provide the data warehouse may be discovered. Some fields, for example, may
accept nulls when entering new property information, resulting in workers inputting incomplete
property data, even if it was available and relevant.
Homogenization of data: Data warehousing also deals with data formats that are comparable across
diverse data sources. It's possible that some important data will be lost as a result.
10. What are the different types of data marts in the context of data warehousing?
Dependent Data Mart: A dependent data mart can be developed using data from operational,
external, or both sources. It enables the data of the source company to be accessed from a single
data warehouse. All data is centralized, which can aid in the development of further data marts.
Independent Data Mart: There is no need for a central data warehouse with this data mart. This is
typically established for smaller groups that exist within a company. It has no connection to
Enterprise Data Warehouse or any other data warehouse. Each piece of information is self-contained
and can be used independently. The analysis can also be carried out independently. It's critical to
maintain a consistent and centralized data repository that numerous users can access.
Hybrid Data Mart: A hybrid data mart is utilized when a data warehouse contains inputs from
multiple sources, as the name implies. When a user requires an ad hoc integration, this feature comes
in handy. This solution can be utilized if an organization requires various database environments and
quick implementation. It necessitates the least amount of data purification, and the data mart may
accommodate huge storage structures. When smaller data-centric applications are employed, a data
mart is most effective.
The following table enlists the difference between data warehouse and database:-
Data Warehouse | Database
A data warehouse is mainly used for analyzing historical data so as to make future decisions based on it. | A database aids in the execution of basic business procedures.
12. What do you mean by a factless fact table in the context of data warehousing?
A fact table with no measures is known as a factless fact table. It's essentially a crossroads of
dimensions (it contains nothing but dimensional keys). One form of factless table is used to capture
an event, while the other is used to describe conditions.
In the first type of factless fact table, there is no measured value for an event, but it develops the
relationship among the dimension members from several dimensions. The existence of the
relationship is itself the fact. This type of fact table can be utilised to create valuable reports on its
own. Various criteria can be used to count the number of occurrences.
The second type of factless fact table is a tool that's used to back up negative analytical reports.
Consider a store that did not sell a product for a period of time. To create such a report, you'll need
a factless fact table that captures all of the conceivable product combinations that were on offer.
By comparing the factless table to the sales table for the list of things that did sell, you can figure
out what's missing.
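The second kind of factless fact table can be tried out on a toy example. In the sketch below, run through Python's sqlite3 module, OfferCoverage is a hypothetical coverage table that holds nothing but dimension keys, and FactSales is a hypothetical sales fact table; comparing the two reveals what was offered but never sold.

```python
import sqlite3


def unsold_products():
    """Return (number of offered combinations, product keys with no sale)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Factless fact table: dimension keys only, no measures.
        CREATE TABLE OfferCoverage (ProductKey INTEGER, StoreKey INTEGER, DateKey INTEGER);
        -- Ordinary fact table: the same keys plus a measure.
        CREATE TABLE FactSales (ProductKey INTEGER, StoreKey INTEGER, DateKey INTEGER, Amount REAL);
        INSERT INTO OfferCoverage VALUES (1, 10, 20240101), (2, 10, 20240101), (3, 10, 20240101);
        INSERT INTO FactSales VALUES (1, 10, 20240101, 5.0), (3, 10, 20240101, 7.5);
    """)
    # Counting rows of the factless table counts occurrences of the event.
    offered = conn.execute("SELECT COUNT(*) FROM OfferCoverage").fetchone()[0]
    # Coverage rows without a matching sale are the "negative" report.
    unsold = [row[0] for row in conn.execute("""
        SELECT c.ProductKey
        FROM OfferCoverage c
        LEFT JOIN FactSales s
          ON s.ProductKey = c.ProductKey
         AND s.StoreKey = c.StoreKey
         AND s.DateKey = c.DateKey
        WHERE s.ProductKey IS NULL
        ORDER BY c.ProductKey
    """)]
    conn.close()
    return offered, unsold


print(unsold_products())
```

Without the coverage table there would be no row anywhere recording that product 2 was on offer, so the sales table alone could never show that it failed to sell.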
A system that reflects the condition of the warehouse in real time is referred to as real-time data
warehousing. If you perform a query on the real-time data warehouse to learn more about a
specific aspect of the company or entity described by the warehouse, the result reflects the status
of that entity at the time the query was run. Most data warehouses contain data that is highly
latent — that is, data that reflects the business at a specific point in time. A real-time data
warehouse provides current (or real-time) data with low latency.
The technical capacity to collect transactions as they change and integrate them into the
warehouse, as well as maintaining batch or planned cycle refreshes, is known as active data
warehousing. Automating routine processes and choices is possible with an active data warehouse.
The active data warehouse sends decisions to the On-Line Transaction Processing (OLTP) systems
automatically. An active data warehouse is designed to capture and distribute data in real time.
They give you a unified view of your customers across all of your business lines. Business
Intelligence Systems are linked to it.
Metadata is defined as information about data. Metadata is the context that provides data a more
complete identity and serves as the foundation for its interactions with other data. It can also be a
useful tool for saving time, staying organised, and getting the most out of the files you're working
with. Structural Metadata describes how an object should be classified in order to fit into a wider
system of things. Structural Metadata makes a link with other files that allows them to be
categorized and used in a variety of ways. Administrative Metadata contains information about an
object's history, who owned it previously, and what it can be used for. Rights, licences, and
permissions are examples. This information is useful for persons who are in charge of managing and
caring for an asset.
When a piece of information is placed in the correct context, it takes on a whole new meaning.
Furthermore, better-organized Metadata will considerably reduce search time.
17. Enlist a few data warehouse solutions that are currently being used in the industry.
Some of the major data warehouse solutions currently being used in the industry are as follows :
Snowflake
Oracle Exadata
Apache Hadoop
SAP BW/4HANA
Micro Focus Vertica
Teradata
AWS Redshift
GCP BigQuery
18. Enlist some of the renowned ETL tools currently used in the industry.
Some of the renowned ETL tools currently used in the industry are as follows :
Informatica
Talend
Pentaho
Ab Initio
Oracle Data Integrator
Xplenty
Skyvia
Microsoft – SQL Server Integration Services (SSIS)
19. Explain what you mean by a star schema in the context of data warehousing.
Star schema is a sort of multidimensional model and is used in a data warehouse. The fact tables
and dimension tables are both contained in the star schema. There are fewer foreign-key joins in
this design. With fact and dimension tables, this schema forms a star.
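A tiny star schema makes the shape concrete. The sketch below builds one fact table and two dimension tables in an in-memory SQLite database from Python and runs a typical report query; the table names (FactSales, DimProduct, DimDate), columns, and sample rows are assumptions for illustration.

```python
import sqlite3


def sales_by_category_and_year():
    """Aggregate the fact table by attributes taken from two dimensions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE DimProduct (ProductKey INTEGER PRIMARY KEY, Name TEXT, Category TEXT);
        CREATE TABLE DimDate (DateKey INTEGER PRIMARY KEY, Year INTEGER);
        -- The fact table sits in the middle and points at each dimension.
        CREATE TABLE FactSales (
            ProductKey INTEGER REFERENCES DimProduct (ProductKey),
            DateKey INTEGER REFERENCES DimDate (DateKey),
            Amount INTEGER
        );
        INSERT INTO DimProduct VALUES (1, 'Pen', 'Stationery'), (2, 'Desk', 'Furniture');
        INSERT INTO DimDate VALUES (20240101, 2024), (20250101, 2025);
        INSERT INTO FactSales VALUES
            (1, 20240101, 10), (2, 20240101, 200), (1, 20250101, 15), (1, 20240101, 5);
    """)
    rows = conn.execute("""
        SELECT p.Category, d.Year, SUM(f.Amount)
        FROM FactSales f
        JOIN DimProduct p ON p.ProductKey = f.ProductKey
        JOIN DimDate d ON d.DateKey = f.DateKey
        GROUP BY p.Category, d.Year
        ORDER BY p.Category, d.Year
    """).fetchall()
    conn.close()
    return rows


print(sales_by_category_and_year())
```

Every dimension is exactly one join away from the fact table, which is the property that keeps the number of foreign-key joins low in this design.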
20. What do you mean by snowflake schema in the context of data warehousing?
Snowflake Schema is a multidimensional model that is also used in data warehouses. The fact
tables, dimension tables, and sub dimension tables are all contained in the snowflake schema. With
fact tables, dimension tables, and sub-dimension tables, this schema forms a snowflake.
21. What do you understand about a data cube in the context of data warehousing?
A data cube is a multidimensional data model that stores optimized, summarized, or aggregated
data for quick and easy analysis using OLAP technologies. The precomputed data is stored in a data
cube, which makes online analytical processing easier. We all think of a cube as a three-
dimensional structure, however in data warehousing, an n-dimensional data cube can be
implemented. A data cube stores information in terms of dimensions and facts.
Data Cubes have two categories. They are as follows :
Multidimensional Data Cube : Data is stored in multidimensional arrays, which allows for a
multidimensional view of the data. A multidimensional data cube aids in the storage of vast amounts
of information. A multidimensional data cube uses indexing to represent each dimension of the data
cube, making it easier to access, retrieve, and store data.
Relational Data Cube : The relational data cube can be thought of as an "expanded version of
relational DBMS." Data is stored in relational tables, and each relational table represents a data
cube's dimension. The relational data cube uses SQL to produce aggregated data, although it is
slower than the multidimensional data cube in terms of performance. The relational data cube, on the
other hand, is scalable for data that grows over time.
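The idea of precomputing aggregates over every combination of dimensions can be sketched in plain Python. This is only an illustration with hypothetical fact rows, not how any particular OLAP engine stores a cube; a rolled-up dimension is represented here by the placeholder value "ALL".

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (region, product, quarter, sales).
facts = [
    ("East", "Pen", "Q1", 10),
    ("East", "Lamp", "Q1", 20),
    ("West", "Pen", "Q1", 5),
    ("West", "Pen", "Q2", 15),
]

def build_cube(rows):
    """Precompute the sum of sales for every subset of the dimensions."""
    cube = defaultdict(int)
    for *dims, sales in rows:
        for size in range(len(dims) + 1):
            for kept in combinations(range(len(dims)), size):
                # Dimensions that are not kept are rolled up to "ALL".
                key = tuple(dims[i] if i in kept else "ALL" for i in range(len(dims)))
                cube[key] += sales
    return cube

cube = build_cube(facts)
print(cube[("ALL", "ALL", "ALL")])   # grand total
print(cube[("West", "Pen", "ALL")])  # West pens across all quarters
```

Once the cube is built, each lookup is a single dictionary access, which reflects why precomputed cubes answer analytical queries quickly.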
A data warehouse is a single schema that organizes a heterogeneous collection of multiple data
sources. There are two approaches to building a data warehouse: top-down and bottom-up. The
top-down approach involves the following components:
External Sources - An external source is a location from which data is collected, regardless of the
data format. Structured, semi-structured, and unstructured data are all possibilities.
Stage Area - Because the data gathered from external sources does not follow a specific format, it
must be validated before being loaded into the data warehouse. An ETL tool is used for this purpose
in the stage area.
Data-warehouse - After data has been cleansed, it is kept as a central repository in the data
warehouse. The metadata is saved here, while the actual data is housed in data marts. In this top-down
approach, the data warehouse stores the data in its purest form.
Data Marts - A data mart is a storage component as well. It holds information about a single
function of the organization and is managed by a single authority. Depending on its functions, an
organization can have as many data marts as it needs.
Data Mining - Data mining is the process of analyzing large amounts of data in a data warehouse.
With the use of a data mining algorithm, it is used to discover hidden patterns in databases and data
warehouses.
In the bottom-up approach, the data is first gathered from external sources (as in the top-down
approach).
After passing through the staging area (as described above), the data is loaded into data marts
rather than into the data warehouse. The data marts are built first, and they provide reporting
capability. Each data mart focuses on a specific business area.
After that, the data marts are integrated into the data warehouse.
23. What are the advantages and disadvantages of the top down approach of data warehouse
architecture?
Because data marts are formed from the data warehouse, they have a consistent dimensional
view.
This approach is also considered the most robust against business changes. As a result, large
corporations tend to choose it.
It is simple to create a data mart from a data warehouse.
The disadvantage of the top down approach is that the cost, time, and effort required to design
and maintain it are all very high.
24. What are the advantages and disadvantages of the bottom up approach of data warehouse
architecture?
Reports are generated quickly because the data marts are created first.
More data marts can be added over time, which allows the data warehouse to be extended.
In addition, the cost and effort required to build this model are quite low.
The disadvantage of the bottom up approach is that the dimensional view of the data marts is not as
consistent as in the top-down approach, so this model is not as strong as the top-down approach.
The following table lists the differences between a data warehouse and a data mart:
Data Warehouse | Data Mart
Data warehousing involves a big portion of the company, which is why it takes so long to process. | Because they only handle small amounts of data, data marts are simple to use, create, and install.
Compared to a data mart, the data kept in the Data Warehouse is always detailed. | Data Marts are designed for certain user groups. As a result, the data is brief and limited.
Data is collected from a variety of sources in a data warehouse. | Data in a Data Mart comes from a limited number of sources.
The time it takes to implement a Data Warehouse might range from months to years. | The Data Mart implementation process is only a few months long.
From the perspective of the end-users, the data stored is read-only. | The transaction data is provided straight from the Data Warehouse.
26. What do you mean by data purging in the context of data warehousing?
Data purging is a term that describes techniques for permanently erasing and removing data from a
storage space. Data purging, which is typically contrasted with data deletion, involves a variety of
procedures and techniques.
Purging removes data permanently and frees up memory or storage space for other purposes,
whereas deletion is commonly thought of as temporary, since deleted data can often still be
recovered. Automatic data purging is one of the methods used to clean up data in database
administration. Some Microsoft products, for example, feature an automatic purge strategy that uses
a circular buffer mechanism, in which older data is purged to make room for fresh data. In other
cases, administrators must remove data from the database manually.
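The circular buffer idea can be illustrated with a bounded deque from Python's standard library. This is a generic sketch of the mechanism, not the implementation used by any specific product; the buffer size and event names are hypothetical.

```python
from collections import deque

# Hypothetical store that keeps only the 3 most recent entries.
recent_events = deque(maxlen=3)

for event_id in range(1, 6):
    # Once the buffer is full, appending purges the oldest entry.
    recent_events.append(f"event-{event_id}")

print(list(recent_events))
```

After five appends only the last three events remain; the first two were purged automatically to make room.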
27. What do you mean by dimensional modelling in the context of data warehousing?
Dimensional Modelling (DM) is a data structure technique that is specifically designed for data
storage in a data warehouse. The goal of dimensional modelling is to optimise the database so that
data can be retrieved more quickly. In a data warehouse, a dimensional model is used to read,
summarise, and analyse numeric data such as values, balances, counts, weights, and so on. Relational
models, on the other hand, are designed for adding, modifying, and deleting data in a real-time
Online Transaction Processing (OLTP) system.
Following are the steps that should be followed while creating a dimensional model:
Identifying the business process : The first step is to identify the specific business processes that a
data warehouse should address. This might be Marketing, Sales, or Human Resources, depending on
the organization's data analytic needs. The quality of data available for that process is also a factor in
deciding which business process to use. It is the most crucial step in the Data Modeling process, and
a failure here would result in a cascade of irreversible flaws.
Identifying the grain : The grain describes the level of detail for the business problem/solution.
It is the process of determining the lowest level of detail stored in any table in your data warehouse.
If a table contains sales data for each day, its granularity is daily. If a table contains total sales
data for each month, its granularity is monthly.
Identifying the dimension : Dimensions are nouns such as date, shop, and inventory. All of
the descriptive data should be stored in these dimensions. The date dimension, for example, could
include information such as the year, month, and weekday.
Identifying the fact : This stage is linked to the system's business users because it is here that they
gain access to data housed in the data warehouse. Most of the columns in the fact table hold
numerical values such as price or cost per unit.
Building the schema : The dimensional model is implemented in this step. A schema is the
database structure (the arrangement of tables).
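To make the notion of grain concrete, the following sketch rolls hypothetical fact rows stored at a daily grain up to a monthly grain. The sample dates and amounts are invented for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical fact rows at daily grain: (sale_date, amount).
daily_sales = [
    (date(2023, 1, 30), 100.0),
    (date(2023, 1, 31), 50.0),
    (date(2023, 2, 1), 70.0),
]

# Roll the daily grain up to a monthly grain keyed by (year, month).
monthly_sales = defaultdict(float)
for sale_date, amount in daily_sales:
    monthly_sales[(sale_date.year, sale_date.month)] += amount

print(dict(monthly_sales))
```

The roll-up only works in one direction: monthly totals can be derived from daily rows, but daily detail cannot be recovered from a table stored at monthly grain, which is why the grain has to be chosen early.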
28. What do you understand by data lake in the context of data warehousing? Differentiate between data
lake and data warehouse.
A Data Lake is a large-scale storage repository for structured, semi-structured, and unstructured
data. It's a location where you can save any type of data in its original format, with no restrictions
on account size or file size. It provides a significant amount of data for improved analytical
performance and native integration.
A data lake is a huge container that works much like a real lake or river. Just as a lake is fed by
various tributaries, a data lake has structured data, unstructured data, machine-to-machine
communication, and logs flowing into it in real-time.
The following table enlists the differences between data lake and data warehouse:
Data Lake | Data Warehouse
All data is stored in the data lake, regardless of its source or structure. The data is stored in its unprocessed state. When it is ready to be used, it is converted. | Data extracted from transactional systems or data consisting of quantitative measures and their properties will be stored in a data warehouse. The information has been cleansed and changed.
Captures semi-structured and unstructured data in their original form from source systems. | Captures structured data and organises it according to defined standards for data warehouse purposes.
The cost of storing data in big data technology is less than that of storing data in a data warehouse. | Data warehouse storage is more expensive and time-consuming.
29. Differentiate between star schema and snowflake schema in the context of data warehousing.
Following table enlists the difference between the star schema and the snowflake schema:
Star Schema | Snowflake Schema
The fact tables and dimension tables are both contained in the star schema. | The fact tables, dimension tables, and sub dimension tables are all contained in the snowflake schema.
It has a lot of redundancy in its data. | It has a low level of data redundancy.
The execution of queries takes less time. | The execution of queries takes longer than star schema.
Divisive Clustering : This approach also eliminates the need to define the number of clusters ahead
of time. It necessitates a method for breaking a cluster that contains all of the data and then
recursively splitting clusters until all of the data has been split into singletons.
When compared to agglomerative clustering, divisive clustering is more complicated since we require
a flat clustering algorithm as a "subroutine" to split each cluster until each data point is in its
own singleton cluster.
If we don't create a complete hierarchy all the way down to individual data leaves, divisive clustering
is more efficient.
A divisive algorithm is also more precise. Without first examining the global distribution of data,
agglomerative clustering makes judgments based on local patterns or neighbour points. These early
decisions are irreversible. When generating top-level dividing decisions, divisive clustering takes into
account the global distribution of data.
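A deliberately simplified sketch of divisive clustering on one-dimensional data. The flat "subroutine" used here, cutting a sorted cluster at its widest gap, is an assumption made to keep the example short; real implementations usually use something like 2-means. The input is assumed to be a non-empty list of numbers.

```python
def split_at_largest_gap(points):
    """Flat clustering subroutine: cut a sorted 1-D cluster at its widest gap."""
    gaps = [points[i + 1] - points[i] for i in range(len(points) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return points[:cut], points[cut:]

def divisive(points):
    """Return a nested tuple hierarchy, splitting until every leaf is a singleton."""
    points = sorted(points)
    if len(points) == 1:
        return points[0]
    left, right = split_at_largest_gap(points)
    return (divisive(left), divisive(right))

hierarchy = divisive([1, 2, 9, 10, 25])
print(hierarchy)
```

The first split looks at all points at once and separates 25 from the rest, which illustrates how top-level divisive decisions use the global distribution of the data.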
Conclusion:
In this article, we have covered the most frequently asked interview questions on data
warehousing. ETL tools are often required in a data warehouse and so one can expect interview
questions on ETL tools as well in a data warehouse interview.
Top 45+ SSRS Interview Questions and Answers for 2023
If you are looking for a career in the field of database development, then SSRS is good to learn. Most
organizations value SSRS as additional expertise along with SQL. In this article, we have collated a
list of the top SSRS interview questions that you can expect in an interview. Go through the following
questions to understand the kind of questions you need to be prepared for while giving an interview
for a job related to anything with data science, data engineering, or data analysis.
SQL Server Reporting Services (SSRS) is a server-based reporting platform that provides detailed
reporting functionality for various data sources. Reporting Services comprises a complete set of tools
to manage, generate, and deliver reports, as well as APIs that enable developers to integrate data
and report processing into a custom application.
Report Designer
Report Manager
Report Server
Data Sources
3. Explain the term data regions and mention the different data regions.
Data regions are report items that display repeated rows of summarized data or
information from datasets. The different data regions include:
Matrix
Gauge
Chart
List
Table
Compile: It analyses the expressions in the report definition and saves the compiled intermediate
format internally on the server.
Process: It executes dataset queries and combines the intermediate format with the layout and data.
Render: It sends the processed report to a rendering extension, which determines how much
information fits on each page and then creates the paged report.
5. Explain what are parameterized reports. What are cascading parameters in SSRS reports? Do you feel that
issues exist when multi-select / multi-value parameters are allowed and utilized?
Reports which accept parameters from users to fetch and report data conditionally are called
parameterized reports. When there are multiple parameters available in a report and the values of
the different parameters are populated dynamically depending on the value of parent parameters,
then these parameters are known as cascading parameters. A tangent to cascading parameters is
the multi-value parameters. In this scenario, multiple values are selected (or all values) within a
parameter selector.
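The dependency between a parent and a child parameter can be sketched in Python. This is an analogy only, not SSRS syntax; in SSRS the child parameter's dataset would normally be a query filtered by the parent parameter. The country and city values are hypothetical.

```python
# Hypothetical parent -> child parameter values.
cities_by_country = {
    "India": ["Mumbai", "Delhi"],
    "USA": ["Boston", "Seattle"],
}

def available_cities(selected_countries):
    """Return the child parameter's value list for the selected parent values."""
    cities = []
    for country in selected_countries:
        cities.extend(cities_by_country.get(country, []))
    return cities

print(available_cities(["India"]))
print(available_cities(["India", "USA"]))  # multi-value parent selection
```

With a multi-value parent, the child list has to be built from every selected parent value, which is one reason multi-value parameters need extra care in cascading setups.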
The typical development methodology for an SSRS report is to begin by developing a data source.
Based on this data source, the report designer creates one or multiple datasets as required for the
parameters as well as the body of the report. Next, the report designer adds required controls from
the toolbox, which acts as a container for the fields available in the dataset. Subsequently, the
formatting of controls needs to take place. Next, the designer should verify and validate the report
and finally deploy the report. It is a good idea to follow specific best practices so that the report can
convey a story and perform optimally.
7. What is a dataset and what are the different types of datasets? How do these relate to a data source?
A dataset is essentially a query definition that is executed when the report is executed. There are
two types of datasets - Embedded and Shared. An embedded dataset is exclusive to the report in
which it is defined and can only be used by that specific report. A shared dataset can be shared
across reports. A shared dataset has to be published to the report server before it can be used
across various reports. Suitable folder permissions need to be set to be able to
access shared datasets.
A dataset is the query definition of the data that is required, and a data source is the "pipe" that
links SSRS to the location of the data - be it Teradata, SQL Server, or any of a variety of other
sources. The data source typically comprises the connection details and credentials used to connect
to the source.
8. Would you rather store your query in a Database server or an SSRS report? State the reasons why or why
not.
SSRS has matured over the years, and the answer to this question has changed with it.
Earlier, storing SQL queries as text directly in the dataset was generally not recommended.
Since shared datasets were introduced, however, SSRS can store a single dataset that is
shared by multiple reports. Even so, the ideal scenario is usually to call a stored procedure on the
database server. The advantage is that the SQL in a stored procedure is available in a compiled
format, which offers all the benefits of a stored procedure (SP) over an ad-hoc query from the
report. However, if you are using multi-select parameters, you must understand that a query
embedded in a report or a shared dataset lets the report designer take direct advantage of the
multi-value parameter.
9. Name the various types of reports that can be created while using the features of SSRS?
Snapshot reports
Parameterized reports
Ad-hoc reports
Cached reports
Subreports
Linked reports
SSRS is used to create a user interface and is available to access various parameters
11. Shed some light on the important highlights of SSRS.
You can build reports on XML, Excel, or multi-dimensional data sources and can retrieve data
through OLE DB and ODBC connection providers.
In SSRS, users can create reports in different forms, such as free-form, tabular, chart,
graphical, and matrix forms.
SSRS supports web-based features; anyone can interact directly with the report server over the
web and view reports in web-based applications.
Any number of ad hoc reports can be created using images, graphics, or external content and can
be stored on a server.
All the reports created in SSRS can be exported in multiple formats such as CSV, XML, PDF,
TIFF, HTML, and Excel.
SSRS can automatically deliver reports to a user's mailbox, a portable device, or a shared
location.
12. Explain what benefits you get after using the SSRS Services.
Following is the list of advantages you get by using Microsoft's SSRS service:
SSRS can easily be deployed on your existing hardware, as reports are housed on one centralized
web server from which users can run a report from a single place.
Business users can access the data without the help of IT experts.
Security can be applied to folders as well as to reports and can be managed in a hierarchical
manner.
The Tablix is a combination of the table and the matrix in SSRS. Each report that we create using
SSRS is based on the Tablix data region. In other words, a Tablix offers the combined
capabilities of a matrix and a table.
The reporting life cycle of SSRS comprises the following phases mainly:
Development of Reports (Developer): It states that we need to develop a report, something that the
report developer primarily does.
Management of Reports (DBA): It states that DBA should ensure that the report is being created.
Security: It states that only an authorized user can access the report.
Execution: It states how the report will be run to improve the performance of the data sources.
Scheduling of reports: This is required so that the report can be executed on the timings as
scheduled.
Report Delivery (DBA + Developer): It states that once the report is created and executed, the report
should reach the final recipients (business users), who then need to understand and analyze the
data of the report. If there are any changes, we need to go back to the development stage again.
RDL (Report Definition Language) files are written in XML (Extensible Markup Language). RDL is an
XML-based format used to define SSRS reports.
16. What are the different stages of Report Processing in SSRS?
Compile: It analyses the expressions in the report definition and stores the compiled intermediate
format internally on the server.
Process: It executes dataset queries and combines the intermediate format with the data and
layout.
Render: It sends the processed report to a rendering extension, which determines how much
information fits on each page and creates the paged report.
17. What is the Reporting Services Configuration file name in SSRS, and where does it exist?
18. What are the three different parts of the RDL file in SSRS?
Data: It includes the dataset on which the query is written; the dataset is linked to the data
source.
Design: You can design reports in the design view and create matrix reports and tables. It also
lets you drag column values from the source.
Preview: This part is used to preview the report once it is executed.
In SSRS, sub-reports are parts of a main report. A sub-report is inserted into the main report, and
queries and parameters can be passed to it just as they are to the main report. A sub-report can be
thought of as an extension of the main report, but it uses a different dataset. For
example, if you prepare a report for students, you can use a sub-report to show the marks
associated with each student.
20. Do you think it is possible to implement data mining within SSRS? If yes, tell us how.
Yes, it is possible to implement data mining in SSRS. You can do this using the DMX designer to
develop the data mining queries needed for SSRS reports. SSRS allows us to create
a custom data mining report that comprises images and text, which can be exported to
HTML.
Cache results are based on the format of the report. SSRS enables cache reports on the reporting
server, and it also offers built-in caching capability. However, the server only caches one instance of
the report in most cases. It also allows users to access and view reports quickly.
Word
XML
HTML
CSV or Text
23. Which tool do the Business Users use to create their reports?
Typically, business users use the Report Builder tool to create reports; it is the best
choice for this purpose.
24. What is the role of a report manager in SSRS?
In SSRS, the report manager is a web application that is accessed through a URL. What the report
manager interface shows depends on the user's permissions. This means that a user must be assigned
a role to access any functionality or perform any task. A user assigned a role with full permissions
can manage all the menus and features. A URL must be defined to configure the report
manager.
25. What are the open-source software that can be used as an alternative to SSRS?
Following are some open-source software that can be used as an alternative to SSRS:
DataVision Reports
JFree Reports
Jasper Reports
OpenReport
A report server component hosts and processes reports in different formats such as HTML, PDF,
Excel, and CSV.
Developers integrating with custom applications or creating custom tools can use APIs to manage
or build reports.
1. Scale up the server, or use the reporting services of another database server.
2. Replicate the data continuously so that report logic and contents can run against the copy.
3. Locking issues can be avoided by using the NOLOCK hint, which can also improve query
performance, at the cost of possibly reading uncommitted data.
28. Explain what are the data types used to develop Radio Button Parameter Type in an SSRS Report.
The Radio Button Parameter Type in SSRS Reports is created with the help of a Boolean data type.
Make sure to set the data type to Boolean when a bit-type column is used in the report's query.
Report Server: Used for services such as the delivery of implementations and reports.
Report Manager: A web-based administration tool used to manage the report server.
Visual Studio
Report servers
31. List out what other servers you can use with SSRS?
SQL Server Reporting Services (SSRS) can be used with the following servers:
1. Microsoft SQL Server
2. SharePoint Server
3. Microsoft Azure SQL Database and Azure Synapse Analytics (formerly SQL Data Warehouse)
4. Oracle Database
Tabular Report: A tabular report is the most basic type of SSRS report that displays data in a
tabular format, similar to a spreadsheet. It is used for displaying data in a simple, organized
manner.
Matrix Report: A matrix report, also known as a crosstab report, displays data in a grid format. It is
useful for comparing data across multiple dimensions and categories.
Chart Report: A chart report displays data in a graphical format, such as a bar chart, line chart, or
pie chart. This type of report is useful for visualizing trends and patterns in data.
33. Name some of the open-source software you can use as an alternative to SSRS?
Some open-source alternatives to SSRS include JasperReports, BIRT (Business Intelligence and
Reporting Tools), and Pentaho Report Designer.
To configure a running aggregate in SSRS, you can use the RunningValue function in a calculated
field or expression. You specify the expression to aggregate, the aggregate function (such as Sum,
Count, or Avg), and the scope over which the running value is calculated.
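What a running aggregate produces can be shown with a small Python analogy. This is not SSRS expression syntax; it only mirrors the row-by-row result that a running Sum or running Count would display. The sample values are hypothetical.

```python
from itertools import accumulate

# Hypothetical detail rows in report order.
monthly_sales = [100, 250, 50, 400]

# Running Sum: each row shows the total of all rows up to and including it.
running_sum = list(accumulate(monthly_sales))

# Running Count: number of rows seen so far.
running_count = list(range(1, len(monthly_sales) + 1))

print(running_sum)
print(running_count)
```

The scope argument in SSRS decides where such a running value restarts, for example at the start of each group; the sketch above corresponds to a single scope covering all rows.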
The main function of a query parameter in SSRS is to allow users to filter the data that is displayed
in a report.
36. Explain how SSRS reports Cache results?
Yes, SSRS reports can cache results. Caching lets the report server store a temporary copy of a
processed report so that later requests can be served quickly without retrieving the data from the
data source again.
37. Mention what are the three command line utilities and what are their primary functions?
SSRS's three command line utilities are RsConfig.exe, RsKeymgmt.exe, and Rs.exe. RsConfig.exe is
used to configure the report server's connection to the report server database. RsKeymgmt.exe is
used to manage encryption keys for SSRS. Rs.exe is used to execute SSRS scripts and manage report
server operations.
38. What method can you use to reduce the overhead of Reporting Services data sources?
One method to reduce the overhead of Reporting Services data sources is to use shared data
sources. A shared data source allows multiple reports to use the same connection information,
reducing the need to duplicate connection information in numerous reports.
39. Explain what is the difference between Tabular and Matrix report?
A tabular report is a simple, organized report that displays data in a tabular format, similar to a
spreadsheet. It is useful for displaying data in a simple, organized manner. A matrix report, also
known as a crosstab report, displays data in a grid format. It is useful for comparing data across
multiple dimensions and categories.
An ad hoc report is designed to meet the user's specific needs and can be customized as needed.
Ad hoc reports are created using a drag-and-drop interface, allowing users to quickly create
reports without needing technical expertise.
41. What are the command prompt utilities for SSRS? List out some and explain.
1. RS.exe: This utility is used for deploying and managing reports on the SSRS server. It can create
and manage report folders, upload and download reports, and set security permissions for
reports.
2. RSCONFIG.exe: This utility configures the SSRS server's connection to the report server database
and stores the connection values in encrypted form.
3. RSKeyMgmt.exe: This utility is used for managing the encryption keys used by the SSRS server.
42. Explain the minimum software requirements for the SSRS framework.
The minimum software requirements for the SSRS framework include the following:
A web browser (such as Internet Explorer, Chrome, or Firefox) to access the report server
Mixed mode database security is a type of security that allows for both Windows and SQL Server
authentication to be used in a SQL Server database. This allows Windows and SQL Server users to
access the database with different permissions and access rights.
44. If you have created a report with a month name as its parameter, explain the easiest way to provide
values for the parameter?
One of the easiest ways to provide values for the parameter is to use the built-in functions in SSRS
to generate the list of month names dynamically. You can use the following steps:
1. In the report design view, navigate to the "Report Data" pane and right-click on the "Parameters"
folder. Select "New Parameter" to create a new parameter for the month name.
2. In the "Parameter Properties" window, set the "Prompt" to "Month Name" and set the "Data type" to
"String."
To create a calendar parameter in an SSRS report, you can follow these steps:
46. How would you generate a Sequence Number for all the Records in the SSRS Report?
To generate a sequence number for all records in an SSRS report, you can use the ROW_NUMBER()
function within the query for the dataset.
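A minimal sketch of numbering rows inside the dataset query. SQLite (through Python's sqlite3 module) stands in for the report's data source here, and the employees table is hypothetical; the ROW_NUMBER() window function itself is standard SQL and needs SQLite 3.25 or newer in this setup.

```python
import sqlite3

# SQLite stands in for the report's data source; the table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (name TEXT, salary INTEGER);
INSERT INTO employees VALUES ('Asha', 900), ('Ben', 1200), ('Chen', 1000);
""")

# ROW_NUMBER() assigns 1, 2, 3, ... in the order given by the OVER clause.
rows = conn.execute("""
    SELECT ROW_NUMBER() OVER (ORDER BY salary DESC) AS seq, name, salary
    FROM employees
    ORDER BY seq
""").fetchall()
print(rows)
```

Numbering in the query ties the sequence to the query's sort order; if the report re-sorts the rows, the numbers will no longer be consecutive on the page.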
47. How will you display data on a single Tablix extracted from two datasets in an SSRS report by joining on
the single column?
To display data on a single Tablix extracted from two datasets in an SSRS report by joining on a
single column, you can use the JOIN clause in the query for the primary dataset.
SSRS Interview Questions And Answers
Ans. In Reporting Services, the Tablix is a combination of the table and the matrix. Every report we
create using SSRS is based on the Tablix data region. In other words, it offers the
combined capabilities of a table and a matrix.
Ans. RDL (Report Definition Language) is an XML-based format used for SSRS reports. RDL files are
written in XML (Extensible Markup Language).
Q3. What is the name of the Reporting Services configuration file and where does it exist?
Ans. RSReportServer.config is the Reporting Services configuration file. It stores the settings used
by Report Manager, the Report Server Web service, and background processing.
Ans. You can pass parameters and queries to a subreport. Put simply, a sub-report is an extension
of your main report. Any report can be used as a subreport; it is inserted into the main report
and behaves just like a main report. However, a subreport uses a
separate dataset. For example, a main report can list customers, and a sub-report can
then show the order list for every customer.
Ans. Companies use SQL Server with SSRS most of the time; however, other data sources can be
used with SSRS as well:
Oracle
Flat XML files
ODBC and OLEDB
Teradata
Hyperion
------ Related Article: What is SSRS ------
Ans. The following six rendering extensions are available in SQL Server Reporting
Services:
Excel
Word
XML
HTML
CSV or Text
PDF
Q7. Which tool can be used by the Business Users to create their reports?
Ans. Report Builder could be the best choice for a tool that can be used by business users or even
by developers to create reports.
Ans. Ad hoc reports enable users with limited technical skills to create new, easy-to-access reports
on their own. Usually, these reports are created from report models. Users can choose, based on
their priorities and requirements, whether to save the reports to a personal server or to share them
with others by posting them to the reporting services center. Ad hoc reports are quick reports
created to meet specific user needs, and users can modify them to analyze the report data
further.
Ans. Yes, data mining can be implemented in SSRS. You do this using the DMX designer to create
the data mining queries required for SSRS reports. In SSRS, you can create a custom data mining
report containing text and images and then export it to HTML, send it by email, or print it for
distribution.
Ans. Cache results are based on the format of the report. SSRS allows reports to be cached on the
report server and has built-in caching capability. However, the server caches only
one instance of the report in most cases. Caching also enables users to access and view reports quickly.
Q11. Explain the different stages of Report Processing?
Compile: In this stage, SSRS analyzes all expressions in the report definition and saves the compiled
intermediate format to the server.
Process: In this stage, SSRS runs the dataset queries and combines the intermediate format
with the data and layout.
Render: In the third stage, SSRS sends the processed report to a rendering extension, which
determines how much data fits on each page and creates the paged report.
Export: Finally in this stage, It exports the reports into a different file format to be shared with
Ans. Developing an SSRS report starts with creating a data source. Based on it, you create one or
more datasets for the parameters and for the body of the report. Then add the necessary controls
from the toolbox; they act as containers for the dataset fields. Next, format all the controls that
you have added to the report body. Finally, verify and validate the report, and then deploy it.
Q13. What data type should be used when creating Radio Button Parameter Type in SSRS
Report?
Ans. When a bit-type column is used in the query for your report, go to Parameter Properties and
set the data type to Boolean so that radio buttons are shown. Otherwise, a text box will appear for
the parameter value.
Q14. If you have created a report that has a month name as its parameter, Explain what
would be the easiest way to provide values for the parameter?
Ans. Because the set of month names is fixed and will not change, we can add the month names to
the parameter as static values. If the values of a parameter change often, it is a better idea to
store them in a table and use a query to retrieve them for the parameter in the SSRS report.
With this practice, when values are added or removed we don't have to change the report every
time. We can simply add or remove values in the table, and the query will pick them
up.
Q15. How to create a Calendar Parameter in an SSRS Report?
Ans. We can create a parameter that shows a calendar icon for choosing a date. This works when
the table has a column of the Date or DateTime type, and we can write
our query as below.
2) Click on Default Values. If you want to set the current date as the default date, use this
expression: =Today
3) Do not specify any values in the Available Values tab; leave it set to None.
After these steps, your date parameter is set up and the user gets a calendar
control when the report is run on the server.
Q16. How would you generate a Sequence Number for all the Records in SSRS Report?
Ans. Use the RowNumber function to generate a sequence number for all the records in your SSRS
report. To do that, add a new blank column to your Tablix, right-click the cell, choose
Expression, and write the expression, for example =RowNumber(Nothing).
Q17. How will you display data on a single Tablix extracted from two datasets in an SSRS
report by joining on the single column?
Ans. To display data from two datasets on a single Tablix, use the Lookup function in the report to find
the corresponding values in a dataset that has unique values; this effectively joins the data from the two datasets.
There are also guidelines to follow when creating SSRS reports from
two or more datasets. For example, there should be at least one matching column on which to join
the datasets.
We can use the Lookup function and write our expressions as shown below.
Q18. Will you store your query in an SSRS Report or a database server? Explain why?
Ans. The SQL queries should be stored in a stored procedure on the database server. Storing
SQL queries as text inside the report is no longer considered good practice and should be
avoided. When queries are stored in a stored procedure, the SQL is kept in one place on the database server and
gains all the benefits of using a stored procedure.
SSRS Advanced Interview Questions
Q19. What are the command prompt utilities for SSRS? List out some and explain.
RS utility: The command ‘rs.exe’ is a command-line utility that supports both SharePoint and Native
modes. It can perform many scripted operations related to SSRS and can also be
used to publish reports on a report server or to move reports from one server to another.
Get-Command
Get-Member
Get-Help
Get-Process
Rsconfig utility: The rsconfig utility is used to configure and manage the report
server's connection to the report server database. The command file is ‘rsconfig.exe’, which only
supports native mode.
Ans. A multi-value parameter lets users select and pass more than one value for the
parameter when running an SSRS report. Parameters are used in a report to filter the data
and extract only the information required for the current scenario. For a multi-value
parameter, you can either enter static values or get the values from the database.
Q21. List out the drawbacks reported in the previous versions of SSRS?
Ans. SSRS is well suited to reporting needs; however, it has drawbacks too, which are listed
below for a better assessment of the reporting tool. (Some of these drawbacks may have been
addressed in versions released after this article was posted.)
Q22. Explain the minimum software requirements for the SSRS framework?
Operating System - Windows Server 2000, Windows XP, Windows Server 2003.
Q23. Why should you use SSRS for your next project?
Ans. SSRS has many advantages compared to other reporting platforms. They are as follows:
Ans. When you install SQL Server, it is better to set up a separate SQL Server user name
and password for signing in to the database server, as Windows authentication is not always considered to be the
most secure database security option. In SSRS you have the option either to let SQL Server
integrate with Windows or to require your users to keep a separate SQL Server user ID and password.
Users will then need that user name and password to run reports from SSRS.
Alibaba Cloud
3. View the execution plan: Select the SQL statement to be analyzed, and then
click the Explain Plan button (that is, the execution plan) on the toolbar, or press
F5 directly. This is mainly used to analyze the execution efficiency of the SQL
statement and the structure of the table, and it provides an
intuitive basis for SQL tuning.
1), create a text file shortcuts.txt, and write the following content:
s=SELECT
Copy the code and save it to the ~/PlugIns directory under the installation path
of PL/SQL Developer
2) Tools–>Preferences–>User Interface–>Editor–>AutoReplace, select the
Enable check box, then browse the file to select the shortcuts.txt created earlier,
and click Apply.
3) Restart PL/SQL Developer, enter s+space in the sql window, and sc+space to
test.
Note: shortcuts.txt cannot be deleted, otherwise the shortcut keys cannot be used
Debug shortcuts
Toggle breakpoint: ctrl+b
Start: f9
Run: ctrl+r
Step into: ctrl+n
Step over: ctrl+o
Step out: ctrl+t
Run to exception: ctrl+y
8. Template shortcut keys
By default, after PL/SQL Developer logs in, All Objects is selected in the Browser.
If the user you log in as is a DBA, expanding the Tables folder normally takes a
few seconds, but after selecting My Objects, the response time is
measured in milliseconds.
Setting method:
Tools menu -> Browser Filters: the order window of Browser Folders will be
opened, and “My Objects” can be set as the default.
In the Tools menu -> Browser Folders, move the folders you click often (for
example Tables, Views, Seq, Functions, Procedures) a little higher and add color
distinctions, so that your average time to find tables is greatly shortened. Try
it out.
Vaishali Goilkar
FEATURES OF PL/SQL
PL/SQL is tightly integrated with SQL.
It offers extensive error checking.
It offers numerous data types.
It offers a variety of programming structures.
It supports structured programming through functions and
procedures.
It supports object-oriented programming.
It supports the development of web applications and server pages.
PL/SQL SYNTAX
SYNTAX OF PL/SQL:
DECLARE
<declaration syntax>
BEGIN
<executable commands>
EXCEPTION
<exception handling>
END;
EXAMPLE OF PL/SQL:
DECLARE
   message VARCHAR2(30) := 'Hello, World!';  -- assumed declaration and text
BEGIN
   dbms_output.put_line(message);
END;
DECLARATIONS:
EXECUTABLE COMMANDS:
This section is enclosed between the keyword BEGIN and END. It
is a mandatory section.
It consists of the executable/SQL statement of the program.
It should have at least one executable line of code, which may be
just a NULL command to indicate that nothing should be executed.
EXCEPTION HANDLING:
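The three sections described above can be seen together in one small block. This is only a minimal sketch: the variable names and values are made up for illustration, and the output is visible only if server output is enabled in your client (for example, SET SERVEROUTPUT ON in SQL*Plus).

```sql
DECLARE
   -- Declaration section (optional): hypothetical variables
   v_total  NUMBER := 10;
   v_parts  NUMBER := 0;
   v_share  NUMBER;
BEGIN
   -- Executable section (mandatory): dividing by zero raises ZERO_DIVIDE
   v_share := v_total / v_parts;
   DBMS_OUTPUT.PUT_LINE('Share: ' || v_share);
EXCEPTION
   -- Exception-handling section (optional): control jumps here on error
   WHEN ZERO_DIVIDE THEN
      DBMS_OUTPUT.PUT_LINE('Cannot divide by zero');
END;
/
```

Because the division fails, the first PUT_LINE never runs and only the handler's message is printed.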
Vaishali Goilkar
May 5, 2020
In this article, we learn about data warehouse architecture.
DATA WAREHOUSE
TOP-DOWN
This model contains consistent data marts and these data marts
can be delivered quickly.
Data is cleansed, transformed, and loaded into this layer using
back-end tools.
MIXED
ETL TOOLS
METADATA
SSRS REPORT
Vaishali Goilkar
An SSIS package extracts, transforms, and loads data into OLAP
databases. The data can be in various formats, such as CSV or Excel.
SSAS is about analysis. Analysis means calculations such as sum and count,
or more complicated formulas used for forecasting and other analytical
calculations. In SSAS we create a cube that precalculates results, which
allows queries to return data quickly.
SSRS is a reporting service that fetches data from a cube and can also
fetch data from SQL Server.
In this article, we create a report and fetch the data from the analysis cube. Here
I have already created a cube.
DATA SOURCES
Here we select Microsoft SQL Server Analysis Services, because we
fetch the data from a cube, and then click on Edit.
In the Edit dialog, we enter the server name as well as the cube name.
CONNECTION
We specify the MDX query that gets data for the report by using Query
Builder.
We query an analysis cube by using MDX. MDX (Multidimensional
Expressions) is a query language for OLAP.
DESIGN THE QUERY
We drag and drop the field from the Available field to the Display field and click
Next.
DESIGN THE TABLE
REPORT
Report Data provides data to a report. Data Source tells where the
server is located. Data Set tells what kind of query we executed.
REPORT DATA
PARAMETER
QUERY DESIGNER
When we click OK, the parameter is created in the report data.
PARAMETER
We create a new Dataset. And we select the data source and create
a query by using a query designer.
DATASET PROPERTIES
Vaishali Goilkar
Jan 6, 2020
EMPLOYEE TABLE
FROM [dbo].[EMPLOYEE] e2
RESULT
SELECT * FROM
FROM Employee E
)A
WHERE Rnk=3;
RESULT
3. Select all records from Employee table where name not in ‘RIA’
and ‘RAJ’
SELECT EMP_NAME
FROM Employee
WHERE EMP_NAME NOT IN ('RIA', 'RAJ')
RESULT
SELECT ID
FROM EMPLOYEE
WHERE ID <> 8
RESULT
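Query 1:-
SELECT E.[EMP_CODE] FROM [dbo].[EMPLOYEE] E
LEFT JOIN [dbo].[EMP] E1   -- assumed join type
ON E.ID = E1.ID
WHERE E1.ID IS NULL        -- assumed filter, to match Query 2 and Query 3
Query 2:-
SELECT E.[EMP_CODE]
FROM [dbo].[EMPLOYEE] E
WHERE E.ID NOT IN
(SELECT [ID]
FROM [dbo].[EMP] E1)
Query 3:-
SELECT E.[EMP_CODE]
FROM [dbo].[EMPLOYEE] E
WHERE NOT EXISTS
(SELECT NULL
FROM [dbo].[EMP] E1
WHERE E1.ID = E.ID)        -- assumed correlation, taken from Query 1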
RESULT
SELECT 16
SELECT $
SELECT count(*)
RESULT
FROM [dbo].[EMPLOYEE]
Select @EMP_NAME
RESULT
Stored procedure in PL/SQL
siva prasad
A procedure is a subprogram unit that contains a group of PL/SQL
statements. A stored procedure in PL/SQL is defined as a series of
declarative SQL statements that are stored in the
database.
It performs one or more specific tasks, much like a procedure
in other programming languages, where it would be called a
function or a method.
Procedures can be called from triggers, other procedures, or
applications written in Java, PHP, etc. A procedure contains two sections:
a header and a body.
The Header section contains the name of the procedure and the
parameters or variables passed to the procedure.
The Body section contains the declaration, execution section and
exception section similar to a PL/SQL block.
When you want to create a procedure or function, you have to define its
parameters. So, first of all, we need to know what a parameter is.
Parameter:
A parameter is a variable or placeholder of any valid PL/SQL data type. Parameters are used to exchange data between
the calling code and stored procedures or functions. They allow us to give input to subprograms
and to get results back from them. Parameters are defined when the subprogram is
created, and they are supplied in the calling statement of that subprogram so that
values can pass in and out. The data type of the parameter in the
subprogram and in the calling statement must be compatible. The size of the
parameter should not be specified in the parameter declaration, because the size
is determined dynamically from the value that is passed.
1. IN Parameter:
This can be referenced by the procedure or function. It is used to give input to the
subprogram. It is a read-only variable inside the subprogram, so its value
cannot be changed there. In the calling statement, an IN
parameter can be a variable, a literal value, or an expression.
2. OUT Parameter:
This is used to get output from the subprogram. It is a read-write
variable inside the subprogram, so its value can be changed there.
In the calling statement, an OUT parameter must always be a variable,
because it holds the value returned from the subprogram.
3. IN OUT Parameter:
This parameter is used both for giving input to the subprogram and for getting output from
it. It is a read-write variable inside the subprogram, so
its value can be changed there. In the calling statement, it must
always be a variable so that it can hold the value returned from the subprogram.
The parameter mode must be specified when the subprogram is created.
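The three parameter modes can be sketched in a single procedure. This is an illustrative example only: apply_discount and all of its parameter names are hypothetical and do not come from any real schema.

```sql
-- Hypothetical procedure; names and logic are for illustration only.
-- Note that the NUMBER parameters are declared without a size.
CREATE OR REPLACE PROCEDURE apply_discount (
   p_rate   IN     NUMBER,   -- read-only input
   p_price  IN OUT NUMBER,   -- passed in, modified, passed back
   p_saved  OUT    NUMBER    -- set by the procedure for the caller
)
IS
BEGIN
   p_saved := p_price * p_rate;
   p_price := p_price - p_saved;
END apply_discount;
/

DECLARE
   v_price NUMBER := 200;
   v_saved NUMBER;
BEGIN
   -- The IN argument may be a literal; IN OUT and OUT arguments must be variables
   apply_discount(0.1, v_price, v_saved);
   DBMS_OUTPUT.PUT_LINE('Price: ' || v_price || ', saved: ' || v_saved);
END;
/
```

With server output enabled, the calling block prints "Price: 180, saved: 20".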
Syntax to create a Stored procedure:
CREATE [OR REPLACE] PROCEDURE procedure_name
[ (parameter [,parameter]) ]
IS
[declaration_section]
BEGIN
executable_section
[EXCEPTION
exception_section]
END [procedure_name];
Declarative section:
In a stored procedure, this section sits between the IS keyword and BEGIN. Unlike an
anonymous block, it does not start with the DECLARE keyword. It can contain types,
cursors, constants, variables, exceptions, and nested subprograms.
Executable section:
It is a mandatory section and it contains statements to perform the allocated
actions.
Exception Section:
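CREATE OR REPLACE PROCEDURE print_contact(
    p_customer_id NUMBER              -- assumed procedure and parameter names
)
IS
    r_contact contacts%ROWTYPE;
BEGIN
    SELECT *
    INTO r_contact
    FROM contacts
    WHERE customer_id = p_customer_id;   -- assumed filter column
    dbms_output.put_line( r_contact.first_name || ' ' ||
        r_contact.last_name );           -- last_name is an assumed column
EXCEPTION
    WHEN OTHERS THEN
        dbms_output.put_line( SQLERRM );
END;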
To compile the procedure, click on the Run Statement button as shown in the
diagram below.
If the procedure is compiled successfully, then you will see the new procedure under
the Procedures node as shown below.
Editing the procedure:
If you want to change the code in an existing procedure, then you follow these
steps.
It looks as below:
Removing a procedure:
If you want to delete a procedure, you can use “DROP PROCEDURE”. The
syntax for the Removing procedure is given below.
DROP PROCEDURE procedure_name;
1. First Right-click on the procedure name, then drop the procedure which you
want.
3. Finally, in the prompt dialog, click the “Apply” button to remove the
procedure.
It is shown below.
Advantages of PL/SQL:
● They decrease the traffic between the database and the application. This
is because the lengthy statements are already stored in the database, so we don’t
need to send them again via the application.
● They offer code reusability, much as functions and methods do in
other languages such as C/C++ and Java.
Disadvantages of PL/SQL:
1. What is Oracle?
Oracle is a relational database management system, which organizes data in the form of tables. Oracle
makes efficient use of all system resources, on all hardware architectures, to deliver unmatched performance,
price performance, and scalability.
A table is the basic unit of data storage in an Oracle database. The tables of a database hold all of the
user-accessible data. Table data is stored in rows and columns. You define a table with a table name, such as
employees, and a set of columns. A row is a collection of column information corresponding to a single record.
In Oracle, a view is a virtual table. Every view has a query attached to it. (The query is a SELECT
statement that identifies the columns and rows of the table(s) the view uses.) The view definition is stored in the
Oracle data dictionary, and the view itself does not store any data. The query is executed when the view is called.
A view is created from a query on one or more tables.
Syntax:
CREATE VIEW view_name AS
SELECT columns
FROM tables
WHERE conditions;
1. Library cache
2. Data dictionary cache.
Library cache: This layer holds information about SQL statements that have been parsed, information about cursors,
and any plan data.
Data dictionary cache: This layer holds information about user accounts, their privileges, and
segment information.
5. What is PL/SQL?
PL/SQL stands for Procedural Language extension of Structured Query Language (SQL). It is a
block-structured language whose logical blocks are made up of 3 sub-blocks, i.e. a declarative section,
an executable section, and an exception-handling section. PL/SQL is integrated with Oracle, and the
functionality of PL/SQL is extended with each release of the Oracle database.
It combines procedural language elements such as conditions and loops and allows the declaration of constants and
variables, procedures, and functions. It also helps users develop complex database applications using
control structures, procedures, modules, etc. PL/SQL is not case-sensitive (except inside string literals), so you are free
to use lowercase or uppercase letters.
The basic structure of PL/SQL is the block. Each PL/SQL program consists of both SQL
statements and PL/SQL statements, which together form a PL/SQL block. Every PL/SQL block has 3
defined sections, of which two are optional (the declaration section and the exception-handling
section) and one is mandatory (the execution section).
Constraints perform a memory-location-to-table comparison, whereas triggers perform a table-to-table comparison. For
this, triggers use the magic tables (inserted, deleted), which is a SQL Server concept; Oracle triggers use :OLD and :NEW
instead. In order of processing, constraints are checked first and triggers next.
Performance-wise, however, triggers are said to perform better because a table-to-table comparison is faster
than a memory-location-to-table comparison.
Triggers: A trigger affects only those rows that are added after it is enabled.
Constraints: A constraint affects all rows, i.e. those that existed before and those that were newly added.
TRUNCATE:
The TRUNCATE statement deletes all rows from a table without logging the individual row deletions. It is a DDL
command that is executed using a table lock: the entire table is locked to remove all records. We cannot use
a WHERE clause with TRUNCATE. To use TRUNCATE on a table, you need at
least ALTER permission on the table.
DELETE:
To execute a DELETE query, DELETE permission is required on the target table. If you need to use a WHERE
clause in a DELETE, SELECT permission is required as well. It is a DML command. It is executed using
row locks: each row in the table is locked for deletion. We can use a WHERE clause with DELETE to filter and delete
particular records. The DELETE statement deletes rows one at a time and records
an entry in the transaction log for each deleted row, so it is slower than TRUNCATE. If the table has an
identity column, DELETE retains the identity value.
DELETE FROM Customers;
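The difference between the two commands can be sketched as follows. This is a minimal illustration, assuming an Oracle database; customers_demo is a hypothetical table created only for the example.

```sql
-- Hypothetical table for illustration
CREATE TABLE customers_demo (id NUMBER PRIMARY KEY, country VARCHAR2(30));
INSERT INTO customers_demo VALUES (1, 'India');
INSERT INTO customers_demo VALUES (2, 'Japan');
COMMIT;

-- DML: removes only the matching rows and can be rolled back
DELETE FROM customers_demo WHERE country = 'Japan';
ROLLBACK;   -- both rows are back

-- DDL: removes every row and takes no WHERE clause;
-- in Oracle it commits implicitly
TRUNCATE TABLE customers_demo;
ROLLBACK;   -- has no effect; the table stays empty
```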
A database management system (DBMS) consists of a collection of interrelated data and a set of programs
to access that data. A Relational Database Management System (RDBMS) is a DBMS that is relational in
nature. This means that its internal workings access data in a relational manner. Oracle is an RDBMS.
Schema objects are abstractions or logical structures that refer to database objects or structures. Schema
objects include such things as clusters, indexes, packages, sequences, stored procedures,
synonyms, tables, views, and so on.
SQL*Plus is the ad hoc user interaction tool for the Oracle RDBMS. With SQL*Plus, you can connect to the
RDBMS and run SQL commands and PL/SQL programs. It is the main non-application interface to the
Oracle RDBMS. SQL*Plus is an interactive query tool with some programming capabilities. It is a
non-GUI, character-based tool that has been around since the dawn of the Oracle age. Using SQL*Plus, you
can enter an SQL statement, such as a SELECT query, and view the results. You can also execute Data
Definition Language (DDL) commands that allow you to maintain and modify your database. You can even
enter and execute PL/SQL code.
The Oracle Enterprise Manager is the advanced graphical administration tool designed to help the DBA
manage one or more Oracle systems, whether in their own data centers or in the Oracle Cloud. Through deep
integration with Oracle’s product stack, Enterprise Manager provides market-leading
management and automation support for Oracle applications, databases, middleware, hardware, and
engineered systems.
Errors such as spelling mistakes fall under syntax errors, which can easily be found by the PL/SQL
compiler. Runtime errors, by contrast, occur while the PL/SQL block is executing. You need to add an
exception-handling section to handle those errors. A typical example is a SELECT INTO statement that
returns no rows.
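A minimal sketch of handling such a runtime error is shown below. It uses Oracle's built-in DUAL table with a deliberately false predicate, so that the SELECT INTO returns no rows; the variable name is made up for illustration.

```sql
DECLARE
   v_dummy VARCHAR2(1);
BEGIN
   -- DUAL has one row; the false predicate makes the query return none,
   -- so SELECT INTO raises NO_DATA_FOUND at run time
   SELECT dummy INTO v_dummy FROM dual WHERE 1 = 0;
   DBMS_OUTPUT.PUT_LINE('Found: ' || v_dummy);
EXCEPTION
   WHEN NO_DATA_FOUND THEN
      DBMS_OUTPUT.PUT_LINE('No rows returned');
END;
/
```

The block compiles without any syntax error; the problem appears only when it runs, which is why it must be caught in the exception section.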
Mirroring is the process of keeping a copy of the redo log files. It is done by grouping log files together. This
ensures that LGWR automatically writes to all the members of the current online redo log group. If the
group suddenly fails, the database automatically switches over to the next group. Mirroring does reduce
performance.
A foreign key is a column or collection of columns whose values are based on the primary key values of another table. A
foreign key constraint is also known as a referential integrity constraint. It identifies the column or collection
of columns in the child table that makes up the foreign key.
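As an illustration, a foreign key between a parent and a child table might be declared like this. The table and constraint names are hypothetical.

```sql
-- Hypothetical parent table
CREATE TABLE departments_demo (
   dept_id   NUMBER PRIMARY KEY,
   dept_name VARCHAR2(50)
);

-- Hypothetical child table: dept_id must match a row in the parent
CREATE TABLE employees_demo (
   emp_id   NUMBER PRIMARY KEY,
   emp_name VARCHAR2(50),
   dept_id  NUMBER,
   CONSTRAINT fk_emp_dept FOREIGN KEY (dept_id)
      REFERENCES departments_demo (dept_id)
);

INSERT INTO departments_demo VALUES (10, 'Sales');
INSERT INTO employees_demo VALUES (1, 'RIA', 10);   -- succeeds
INSERT INTO employees_demo VALUES (2, 'RAJ', 99);   -- fails with ORA-02291 (parent key not found)
```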
The SGA is a shared memory region that Oracle uses to store data and control information for one Oracle
instance. The SGA is allocated when the Oracle instance starts; it is deallocated when the Oracle instance
shuts down. Each Oracle instance that starts has its own SGA. The information in the SGA is made up of
the database buffers, the redo log buffer, and the shared pool; each has a fixed size and is created at
instance startup.
A database is divided into logical storage units called tablespaces. A tablespace is used to group related
logical structures together.
PL/SQL packages are schema objects that group logically related functions, stored procedures, cursors,
and variables in one place. The package is compiled and stored in the database, and its contents can be
shared. Packages have 2 parts: a specification and a body.
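The two parts of a package can be sketched as below. This is an illustrative example only; math_demo and its members are hypothetical names.

```sql
-- Specification: the public interface (hypothetical names)
CREATE OR REPLACE PACKAGE math_demo AS
   c_limit CONSTANT NUMBER := 100;
   FUNCTION add_capped (p_a NUMBER, p_b NUMBER) RETURN NUMBER;
END math_demo;
/

-- Body: the implementation of what the specification declares
CREATE OR REPLACE PACKAGE BODY math_demo AS
   FUNCTION add_capped (p_a NUMBER, p_b NUMBER) RETURN NUMBER IS
   BEGIN
      -- Return the sum, but never more than c_limit
      RETURN LEAST(p_a + p_b, c_limit);
   END add_capped;
END math_demo;
/

-- Callers reference members as package_name.member
SELECT math_demo.add_capped(70, 50) AS capped FROM dual;   -- 100
```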
There is a considerable difference between the ROLLBACK and ROLLBACK TO statements. When you apply
the ROLLBACK command, the transaction is completely undone and all the locks are released. With the
ROLLBACK TO command, the transaction is undone only back to a SAVEPOINT. Thus, the
transaction remains active even after the command is executed.
DUP_VAL_ON_INDEX
NO_DATA_FOUND
CURSOR_ALREADY_OPEN
INVALID_NUMBER
INVALID_CURSOR
TIMEOUT_ON_RESOURCE
LOGON_DENIED
ZERO_DIVIDE
TOO_MANY_ROWS
PROGRAM_ERROR
STORAGE_ERROR
VALUE_ERROR
Triggers are used to define an action that runs when a database-related event occurs. They are used for preventing
invalid transactions, enforcing complex business rules, etc. In short, a trigger activates an action.
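A row-level trigger that prevents an invalid transaction might look like the sketch below. The table, trigger name, rule, and error number are all hypothetical and chosen only for illustration.

```sql
-- Hypothetical table
CREATE TABLE salaries_demo (emp_id NUMBER PRIMARY KEY, salary NUMBER);

-- Fires before each inserted or updated row and rejects negative salaries
CREATE OR REPLACE TRIGGER trg_salary_check
BEFORE INSERT OR UPDATE OF salary ON salaries_demo
FOR EACH ROW
BEGIN
   IF :NEW.salary < 0 THEN
      RAISE_APPLICATION_ERROR(-20001, 'Salary cannot be negative');
   END IF;
END;
/

INSERT INTO salaries_demo VALUES (1, 5000);   -- accepted
INSERT INTO salaries_demo VALUES (2, -10);    -- rejected with ORA-20001
```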
A mutating table is one that is currently being modified by a DML statement, or a table with defined triggers. A
constraining table is one that is being read to enforce a referential integrity constraint.
Normal Queries
Sub Queries
Co-related queries
Nested queries
Compound queries
A join brings together columns and data from two or more tables. It is a query in which data is retrieved
from 2 or more tables. A join matches data from 2 or more tables, based on the values of one or more columns
in each table.
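A small, self-contained sketch of an inner join follows. The two row sets are built inline from DUAL so that no tables are needed; the names and values are invented for illustration.

```sql
WITH emp AS (
   SELECT 1 AS emp_id, 'RIA' AS emp_name, 10 AS dept_id FROM dual UNION ALL
   SELECT 2, 'RAJ', 20 FROM dual
),
dept AS (
   SELECT 10 AS dept_id, 'Sales' AS dept_name FROM dual UNION ALL
   SELECT 30, 'Finance' FROM dual
)
-- Only rows whose dept_id appears in both row sets are returned
SELECT e.emp_name, d.dept_name
FROM emp e
JOIN dept d ON d.dept_id = e.dept_id;
```

Only RIA is returned with Sales, because department 20 has no match in dept and department 30 has no match in emp.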
A stored procedure is a set of SQL statements that are pre-parsed and stored in the database. When the
stored procedure is called, only the input and output data is passed; the SQL statements are not transferred
or parsed.
What is a SAVEPOINT command?
The SAVEPOINT command is used to set a point within a transaction to which you may roll back. This
command helps in cancelling portions of the current transaction. Using ROLLBACK with the TO SAVEPOINT
clause, a transaction can be undone in parts rather than rolled back entirely.
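Rolling back part of a transaction to a savepoint can be sketched as follows. The table and savepoint names are hypothetical.

```sql
-- Hypothetical table
CREATE TABLE orders_demo (id NUMBER PRIMARY KEY, status VARCHAR2(20));

INSERT INTO orders_demo VALUES (1, 'NEW');
SAVEPOINT after_first;

INSERT INTO orders_demo VALUES (2, 'NEW');
ROLLBACK TO SAVEPOINT after_first;   -- undoes only the second insert

COMMIT;                              -- the first insert is made permanent
SELECT COUNT(*) FROM orders_demo;    -- 1
```

A plain ROLLBACK in place of ROLLBACK TO SAVEPOINT would have undone both inserts and ended the transaction.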
Use the DBMS_OUTPUT package. Another possible method is to use the SHOW ERRORS command,
but this only shows errors. The DBMS_OUTPUT package can be used to show intermediate results from
loops and the status of variables as the procedure is executed. The UTL_FILE package can also be
used.
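A minimal sketch of tracing a loop with DBMS_OUTPUT is shown below, assuming a SQL*Plus-style client where SET SERVEROUTPUT ON enables the display. The variable name is made up.

```sql
SET SERVEROUTPUT ON

DECLARE
   v_sum NUMBER := 0;
BEGIN
   FOR i IN 1 .. 3 LOOP
      v_sum := v_sum + i;
      -- Print the intermediate state on every pass through the loop
      DBMS_OUTPUT.PUT_LINE('i=' || i || ' running sum=' || v_sum);
   END LOOP;
END;
/
```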
When this clause is used with the DROP command, a parent table can be dropped even when a child table
exists.
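Assuming the clause meant here is CASCADE CONSTRAINTS (the question itself is not shown), the behaviour can be sketched with two hypothetical tables:

```sql
-- Hypothetical parent and child tables
CREATE TABLE parent_demo (id NUMBER PRIMARY KEY);
CREATE TABLE child_demo (
   id        NUMBER PRIMARY KEY,
   parent_id NUMBER REFERENCES parent_demo (id)
);

DROP TABLE parent_demo;                       -- fails with ORA-02449
DROP TABLE parent_demo CASCADE CONSTRAINTS;   -- succeeds; the foreign key on child_demo is dropped too
```

The child table itself is kept; only the referential constraint that pointed at the parent is removed.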
You get this error when a snapshot is too old within rollback. It can usually be solved by increasing the undo
retention or increasing the size of the rollback segments. You should also look at the logic involved in the application
that is getting the error message.
------Best of Luck-------