DBMS Unit 4
Background
SQL was developed in the 1970s at IBM's San Jose Research Laboratory (now the Almaden
Research Center). SQL is derived from SEQUEL, a database language popular during the 1970s.
SQL has established itself as the standard relational database language. Two standards organizations, the American National Standards Institute (ANSI) and
the International Organization for Standardization (ISO), currently promote SQL standards to industry. In 1986 ANSI and
ISO published an SQL standard called SQL-86. In 1987, IBM published its own corporate SQL standard,
the Systems Application Architecture Database Interface (SAA-SQL). In 1989, ANSI published an extended
standard for SQL called SQL-89. The next version was SQL-92, followed by SQL:1999.
Query language: a language in which a user requests information from the database.
These languages are generally higher-level than general-purpose programming languages.
Procedural language: the user instructs the system to perform a sequence of operations on the database to compute the desired
result.
Example: relational algebra.
Nonprocedural language: the user describes the desired information without giving a specific procedure for obtaining
it.
Examples: tuple relational calculus and domain relational calculus.
What is SQL?
SQL (Structured Query Language) is a computer language for storing, manipulating and
retrieving data stored in a relational database.
SQL is the standard language for relational database systems. Relational database management
systems such as MySQL, MS Access, Oracle, Sybase, Informix, PostgreSQL and SQL Server all use SQL
as their standard database language.
Why SQL?
Characteristics of SQL:
SQL is by its nature extremely flexible. It uses a free-form syntax that lets users
structure SQL statements in whatever way suits them best. Each SQL request is parsed by the RDBMS before
execution, to check for proper syntax and to optimize the request. Unlike certain programming languages,
SQL does not require statements to start in a particular column or to fit on a single line. The
same SQL request can be written in a variety of ways.
Advantages of SQL:
SQL is a high level language that provides a greater degree of abstraction than procedural
languages.
SQL enables the end-users and systems personnel to deal with a number of database
management systems where it is available. Increased acceptance and availability of SQL are also
in its favor.
Applications written in SQL can be easily ported across systems. Such porting could be required
when the underlying DBMS needs to be upgraded or changed.
SQL specifies what is required and not how it should be done.
The language, while simple and easy to learn, can handle complex situations.
The standard SQL commands to interact with relational databases are CREATE, SELECT, INSERT,
UPDATE, DELETE and DROP. These commands can be classified into groups based on their nature.
They are:
DDL Commands
DML Commands
DCL Commands
DRL/DQL Commands
TCL Commands
Data definition language is used to create, alter and delete database objects.
The commands used are create, alter and drop.
The principal data definition statements are :
Create table, create view, create index
Alter table
Drop table, drop view, drop index
Command Description
CREATE Creates a new table, a view of a table, or other object in the
database
An SQL relation is defined using the create table command:
create table r (A1 D1, A2 D2, ..., An Dn,
(integrity-constraint1),
..., (integrity-constraintk))
where r is the relation name, each Ai is the name of an attribute, and Di is the domain of that attribute. The
allowed integrity-constraints include
primary key (Aj1, ..., Ajm)
and
check(P)
CREATE
Example:
assets integer
Example:
create table customers
(
id number (10) not null,
name varchar (20) not null,
age number (5) not null,
address char (25),
salary decimal (8, 2),
primary key (id)
);
Example:
CREATE TABLE dept
(
deptno NUMBER(2) PRIMARY KEY,
dname VARCHAR2(20) NOT NULL
);
(column-level primary key constraint)
ALTER:
The SQL ALTER TABLE command is used to add, delete or modify columns in an existing table.
Syntax:
1. The basic syntax of ALTER TABLE to add a new column in an existing table is as follows:
ALTER TABLE table_name ADD column_name datatype;
EX: ALTER TABLE CUSTOMERS ADD phno number (12);
2. The basic syntax of ALTER TABLE to DROP COLUMN in an existing table is as follows:
ALTER TABLE table_name DROP COLUMN column_name;
EX: ALTER TABLE CUSTOMERS DROP column phno;
3. The basic syntax of ALTER TABLE to change the DATA TYPE of a column in a table is as follows:
ALTER TABLE table_name MODIFY COLUMN column_name datatype;
EX: ALTER TABLE CUSTOMERS MODIFY COLUMN phno number (12);
(MODIFY COLUMN is MySQL syntax; Oracle uses MODIFY without the COLUMN keyword.)
4. The basic syntax of ALTER TABLE to add a NOT NULL constraint to a column in a table is as follows:
ALTER TABLE table_name MODIFY column_name datatype NOT NULL;
EX: ALTER TABLE CUSTOMERS MODIFY phno number (12) NOT NULL;
5. The basic syntax of ALTER TABLE to ADD PRIMARY KEY constraint to a table is as follows:
ALTER TABLE table_name ADD PRIMARY KEY (column1, column2...);
EX: ALTER TABLE CUSTOMERS ADD PRIMARY KEY (id, phno);
TRUNCATE:
SQL TRUNCATE TABLE command is used to delete complete data from an existing table.
Syntax:
The basic syntax of TRUNCATE TABLE is as follows:
TRUNCATE TABLE table_name;
EX:TRUNCATE TABLE student;
DROP:
SQL DROP TABLE statement is used to remove a table definition and all data, indexes, triggers,
constraints, and permission specifications for that table.
Syntax:
Basic syntax of DROP TABLE statement is as follows:
DROP TABLE table_name;
EX: DROP TABLE student;
Renaming a Table:
You can rename a table provided you are the owner of the table. The general syntax is:
Syntax:
RENAME old_table_name TO new_table_name;
EX: RENAME Test To Test_info;
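The DDL commands above can be tried end to end. The following is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. SQLite is an assumption made only for this illustration, and its dialect differs from the Oracle-style syntax in these notes: it has no MODIFY or TRUNCATE TABLE, and it renames a table with ALTER TABLE ... RENAME TO. The new table name clients is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()

# CREATE: define a table with column types and constraints.
cur.execute("""
    CREATE TABLE customers (
        id      INTEGER     NOT NULL,
        name    VARCHAR(20) NOT NULL,
        age     INTEGER     NOT NULL,
        address CHAR(25),
        salary  DECIMAL(8, 2),
        PRIMARY KEY (id)
    )
""")

# ALTER: add a new column to the existing table.
cur.execute("ALTER TABLE customers ADD COLUMN phno INTEGER")
columns = [row[1] for row in cur.execute("PRAGMA table_info(customers)")]

# RENAME: SQLite spells this ALTER TABLE ... RENAME TO (assumed dialect).
cur.execute("ALTER TABLE customers RENAME TO clients")


def table_names():
    """Return the names of all tables currently defined in the database."""
    rows = cur.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return [r[0] for r in rows]


after_rename = table_names()

# DROP: remove the table definition together with all of its data.
cur.execute("DROP TABLE clients")
after_drop = table_names()

print(columns, after_rename, after_drop)
```

After the DROP, the catalog no longer lists the table, which is the difference from TRUNCATE: truncation removes the rows but keeps the definition.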
Basic Structure
Basic structure of an SQL expression consists of select, from and where clauses.
select clause lists attributes to be copied - corresponds to relational algebra project.
from clause corresponds to Cartesian product - lists relations to be used.
where clause corresponds to selection predicate in relational algebra.
The SELECT statement is used to fetch data from a database table; it returns the data in the form of a result
table. These result tables are called result-sets.
Syntax:
The following syntax retrieves specific attributes from a table:
SELECT column1, column2, columnN FROM table_name;
Here, column1, column2... are the fields of the table whose values you want to fetch.
The following syntax retrieves all the attributes from a table:
SELECT * FROM table_name;
Ex: Select * from student;
FROM: The FROM list in the FROM clause is a list of table names. A Table name can be followed by a
range variable. A range variable is particularly useful when the same table name appears more than once in
the from-list.
WHERE: The qualification in the WHERE clause is a Boolean combination (i.e., an expression using the
logical connectives AND, OR, and NOT) of conditions of the form expression op expression, where op is
one of the comparison operators {<, <=, =, <>, >=,>}.
Tuple Variables:
A tuple variable is an alias for a table, declared in the from clause with the keyword as.
Tuple variables are mainly used to save typing, but there are other reasons to use them:
If you join a table to itself you must give it two different names otherwise referencing
the table would be ambiguous.
It can be useful to give names to derived tables, and in some database systems it is
required... even if you never refer to the name.
Example1
select customer_name, T.loan_number,S.amount
from borrower as T, loan as S
where T.loan_number = S.loan_number
Example2
SELECT CUSTOMER_NAME, T.LOAN_NUMBER, S.AMOUNT
FROM BORROWER AS T, LOAN AS S
WHERE T.LOAN_NUMBER = S.LOAN_NUMBER
String Operations:
The string operators in SQL perform operations such as pattern
matching and concatenation. Patterns are described using
two special characters (wildcard characters) together with the LIKE operator:
(1) percent (%), which matches any substring, and (2) underscore (_), which matches any single character.
The concatenation operator combines two or more strings or table columns
into one.
For example:
i) 'MUM%': matches any string beginning with MUM.
ii) '%abc%': matches any string containing abc as a substring.
iii) '___': matches any string of exactly 3 characters.
iv) '___%': matches any string having at least 3 characters.
1. Concatenation Operator
SELECT 'Hello' + 'World!' AS StringConcatenated;
->HelloWorld!
(The + operator concatenates strings in SQL Server; standard SQL and Oracle use || instead.)
2. Like Operator
i) SELECT * FROM STUDENTS WHERE FIRSTNAME='Preety';
This query uses no wildcard, so it returns only the exact match.
Sid FirstName
1 Preety
ii) The pattern '%j' places the wildcard character % before 'j', so it finds the values
which end with 'j'.
iii) The pattern '%a%' finds any value which has 'a' in any position of the first name.
iv) The pattern '_a%' finds any value which has 'a' in the second position.
Sid FirstName
1 Raj
2 Harry
This displays the FIRSTNAME of the students 'Raj' and 'Harry', as they have 'a' at the
second position in their first names.
v) The pattern 'p_%' finds any value which starts with 'p' and has a length of at least two characters.
Sid FirstName
1 Preety
In the result, the first name 'Preety' starts with 'p' and has a length of at least two characters.
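The wildcard patterns described above can be checked against a small table. This is a sketch only: it assumes SQLite through Python's sqlite3 module, a hypothetical students table holding the three first names used in these notes, and SQLite's default behaviour that LIKE ignores case for ASCII letters (which is why 'p_%' matches 'Preety'). Concatenation uses the standard || operator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (sid INTEGER PRIMARY KEY, firstname TEXT)")
conn.executemany(
    "INSERT INTO students (firstname) VALUES (?)",
    [("Preety",), ("Raj",), ("Harry",)],
)


def names_like(pattern):
    """Return the first names matching a LIKE pattern, in insertion order."""
    rows = conn.execute(
        "SELECT firstname FROM students WHERE firstname LIKE ? ORDER BY sid",
        (pattern,),
    )
    return [r[0] for r in rows]


ends_with_j = names_like("%j")        # % stands for any substring
contains_a = names_like("%a%")        # 'a' anywhere in the name
a_in_second = names_like("_a%")       # _ stands for exactly one character
exactly_three = names_like("___")     # three underscores: length exactly 3
p_at_least_two = names_like("p_%")    # starts with p, length at least 2

# Standard SQL string concatenation.
joined = conn.execute("SELECT 'Hello' || 'World!'").fetchone()[0]

print(ends_with_j, contains_a, a_in_second, exactly_three, p_at_least_two, joined)
```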
Ordering the Display of Tuples:
SQL allows the user to control the order in which tuples are displayed.
order by makes tuples appear in sorted order (ascending order by default).
desc specifies descending order.
asc specifies ascending order.
Syntax:
SELECT column-list|* FROM table-name ORDER BY column-name ASC | DESC;
select *
from loan
Duplicate Tuples:
Formal query languages are based on mathematical relations. Thus no duplicates appear in relations.
As duplicate removal is expensive, SQL allows duplicates.
To remove duplicates, we use the distinct keyword.
To ensure that duplicates are not removed, we use the all keyword.
To find duplicate tuples, use the GROUP BY and HAVING clauses.
Finding duplicate values in SQL comprises two key
steps:
1. Using the GROUP BY clause to group all rows by the target column(s) – i.e. the column(s) you want
to check for duplicate values on.
2. Using the COUNT function in the HAVING clause to check if any of the groups have more than 1
entry; those would be the duplicate values.
select * from StudentTable;
Id Name
100 Aahan
101 Ram
101 Hari
100 Binod
100 Aahan
100 Aahan
select Id,Name,count(*)
from StudentTable
group by Id,Name
having count(*) > 1;
This query here will find the count of records that have the same values. The having clause ensures that only
rows with a count greater than one are shown. The columns in the select clause and the group by clause are
those that indicate a row as a duplicate.
In this example, if a row has the same Id and Name as another row, it's a duplicate in the table.
Id Name Count(*)
100 Aahan 3
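The two-step approach above can be reproduced with the same six rows. The sketch below assumes SQLite via Python's sqlite3 module; the table and column names follow the example, and the extra DISTINCT query is added only to contrast duplicate removal with duplicate detection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE StudentTable (Id INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO StudentTable VALUES (?, ?)",
    [(100, "Aahan"), (101, "Ram"), (101, "Hari"),
     (100, "Binod"), (100, "Aahan"), (100, "Aahan")],
)

# Step 1: group by the columns that define a duplicate.
# Step 2: keep only the groups that contain more than one row.
duplicates = conn.execute("""
    SELECT Id, Name, COUNT(*)
    FROM StudentTable
    GROUP BY Id, Name
    HAVING COUNT(*) > 1
""").fetchall()

# DISTINCT removes duplicates instead of reporting them.
distinct_rows = conn.execute(
    "SELECT DISTINCT Id, Name FROM StudentTable"
).fetchall()

print(duplicates, len(distinct_rows))
```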
Set Operations:
SQL has the set operations union, intersect and except.
1) union:
In SQL the UNION clause combines the results of two SQL queries into a single table of all
matching rows. The two queries must result in the same number of columns and compatible data
types in order to unite. Any duplicate records are automatically removed unless UNION ALL is
used.
Sales1
person amount
shyam 1000
ram 2000
hari 3000
Sales2
Person amount
shyam 2000
ram 2000
kiran 4000
mohan 5000
OUTPUT of UNION (the duplicate row is removed)
Person amount
shyam 1000
shyam 2000
ram 2000
hari 3000
kiran 4000
mohan 5000
OUTPUT of UNION ALL (the duplicate row is kept)
Person amount
shyam 1000
shyam 2000
ram 2000
ram 2000
hari 3000
kiran 4000
mohan 5000
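The difference between UNION and UNION ALL on the Sales1 and Sales2 tables can be reproduced as follows. This is a sketch assuming SQLite via Python's sqlite3 module; the ORDER BY is added only to make the row order predictable and is not part of the set operation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales1 (person TEXT, amount INTEGER);
    CREATE TABLE sales2 (person TEXT, amount INTEGER);
    INSERT INTO sales1 VALUES ('shyam', 1000), ('ram', 2000), ('hari', 3000);
    INSERT INTO sales2 VALUES ('shyam', 2000), ('ram', 2000),
                              ('kiran', 4000), ('mohan', 5000);
""")

# UNION removes the duplicate ('ram', 2000) row.
union_rows = conn.execute(
    "SELECT person, amount FROM sales1 UNION "
    "SELECT person, amount FROM sales2 ORDER BY amount, person"
).fetchall()

# UNION ALL keeps every row from both inputs.
union_all_rows = conn.execute(
    "SELECT person, amount FROM sales1 UNION ALL "
    "SELECT person, amount FROM sales2 ORDER BY amount, person"
).fetchall()

print(len(union_rows), len(union_all_rows))
```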
2) intersect:
The SQL INTERSECT operator takes the results of two queries and returns only rows that appear in both
result sets. For purposes of duplicate removal the INTERSECT operator does not distinguish between
NULLs. The INTERSECT operator removes duplicate rows from the final result set. The INTERSECT ALL
operator does not remove duplicate rows from the final result set, but if a row appears X times in the first
query and Y times in the second, it will appear min(X, Y) times in the result set.
One
Id Name
1 Ram
2 Adam
Two
Id Name
2 Adam
3 Kamal
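Applied to the tables One and Two, INTERSECT keeps only the row that occurs in both. The sketch below assumes SQLite via Python's sqlite3 module; note that SQLite supports INTERSECT but not INTERSECT ALL, so only the duplicate-removing form is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE one (id INTEGER, name TEXT);
    CREATE TABLE two (id INTEGER, name TEXT);
    INSERT INTO one VALUES (1, 'Ram'), (2, 'Adam');
    INSERT INTO two VALUES (2, 'Adam'), (3, 'Kamal');
""")

# Only rows present in both result sets survive.
common = conn.execute(
    "SELECT id, name FROM one INTERSECT SELECT id, name FROM two"
).fetchall()

print(common)
```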
3) except:
The SQL EXCEPT operator takes the distinct rows of one query and returns the rows that do not appear
in a second result set. For purposes of row elimination and duplicate removal, the EXCEPT operator
does not distinguish between NULLs. The EXCEPT ALL operator does not remove duplicates, but if a
row appears X times in the first query and Y times in the second, it will appear max(X - Y, 0) times in
the result set.
Notably, the Oracle platform provides a MINUS operator which is functionally equivalent to the SQL
standard EXCEPT DISTINCT operator.
Example: find all customers who have an account but no loan.
(select cname
from depositor)
except
(select cname
from borrower)
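The depositor/borrower query can be run on a few sample rows. This sketch assumes SQLite via Python's sqlite3 module; the customer names are hypothetical sample data, and the tables are reduced to the single cname column the query needs. SQLite has no EXCEPT ALL, so only the duplicate-removing form is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE depositor (cname TEXT);
    CREATE TABLE borrower  (cname TEXT);
    INSERT INTO depositor VALUES ('Hayes'), ('Jones'), ('Smith'), ('Smith');
    INSERT INTO borrower  VALUES ('Jones'), ('Adams');
""")

# Customers with an account but no loan; 'Smith' appears once because
# EXCEPT works on distinct rows.
account_only = [
    row[0]
    for row in conn.execute(
        "SELECT cname FROM depositor EXCEPT "
        "SELECT cname FROM borrower ORDER BY cname"
    )
]

print(account_only)
```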
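In SQL we can compute functions on groups of tuples using the group by clause.
The attributes given in the group by clause are used to form groups of tuples with the same values. SQL can then compute
an aggregate function (avg, min, max, sum or count) over each group.
Examples: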
We use distinct so that a person having more than one account will not be counted more than
once.
3) Find branches and their average balances where the average balance is more than $1200.
Predicates in the having clause are applied after the formation of groups.
4) Find the average balance of each customer who lives in Vancouver and has at least three
accounts:
If a where clause and a having clause appear in the same query, the where clause predicate is
applied first.
Tuples satisfying where clause are placed into groups by the group by clause.
The having clause is applied to each group.
Groups satisfying the having clause are used by the select clause to generate the
result tuples.
If no having clause is present, the tuples satisfying the where clause are treated as a
single group.
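The evaluation order described above (where, then group by, then having, then select) can be seen in the Vancouver query from example 4. The sketch below assumes SQLite via Python's sqlite3 module and a simplified, hypothetical schema (customer, account, depositor, with account_no in place of account#) and made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer  (cname TEXT PRIMARY KEY, ccity TEXT);
    CREATE TABLE account   (account_no TEXT PRIMARY KEY, bname TEXT, balance REAL);
    CREATE TABLE depositor (cname TEXT, account_no TEXT);

    INSERT INTO customer VALUES
        ('Alice', 'Vancouver'), ('Bob', 'Vancouver'), ('Carol', 'Burnaby');
    INSERT INTO account VALUES
        ('A-1', 'SFU', 100), ('A-2', 'SFU', 200), ('A-3', 'Downtown', 600),
        ('A-4', 'SFU', 900), ('A-5', 'Downtown', 50),
        ('A-6', 'SFU', 10), ('A-7', 'SFU', 20), ('A-8', 'SFU', 30);
    INSERT INTO depositor VALUES
        ('Alice', 'A-1'), ('Alice', 'A-2'), ('Alice', 'A-3'),
        ('Bob', 'A-4'), ('Bob', 'A-5'),
        ('Carol', 'A-6'), ('Carol', 'A-7'), ('Carol', 'A-8');
""")

result = conn.execute("""
    SELECT depositor.cname, AVG(balance)
    FROM depositor, account, customer
    WHERE depositor.account_no = account.account_no  -- 1. where filters tuples
      AND depositor.cname = customer.cname
      AND customer.ccity = 'Vancouver'
    GROUP BY depositor.cname                         -- 2. survivors are grouped
    HAVING COUNT(DISTINCT account.account_no) >= 3   -- 3. having filters groups
""").fetchall()

print(result)
```

Carol has three accounts but is removed by the where clause before grouping; Bob lives in Vancouver but his group fails the having test.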
Null Values:
1. With insertions, we saw how null values might be needed if values were unknown. Queries
involving nulls pose problems.
2. If a value is not known, it cannot be compared or be used as part of an aggregate function.
3. A comparison involving null evaluates to unknown, which a where clause treats as false. However, we can use the predicate is null
to test for null values:
select loan#
from loan
where amount is null
4. All aggregate functions except count ignore tuples with null values on the argument attributes.
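The points above about nulls can be observed on a three-row table. This sketch assumes SQLite via Python's sqlite3 module and a hypothetical loan table (loan_no stands in for loan#). It also shows that count(*) counts every row, while count applied to an attribute skips nulls, like the other aggregates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loan (loan_no TEXT PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO loan VALUES (?, ?)",
    [("L-1", 1000), ("L-2", None), ("L-3", 3000)],  # None is stored as NULL
)

# A comparison with NULL is never true, so this selects nothing.
equals_null = conn.execute(
    "SELECT loan_no FROM loan WHERE amount = NULL"
).fetchall()

# IS NULL is the correct test for a missing value.
is_null = conn.execute(
    "SELECT loan_no FROM loan WHERE amount IS NULL"
).fetchall()

# COUNT(*) counts rows; COUNT(amount) and AVG(amount) ignore the NULL.
count_rows, count_amounts, avg_amount = conn.execute(
    "SELECT COUNT(*), COUNT(amount), AVG(amount) FROM loan"
).fetchone()

print(equals_null, is_null, count_rows, count_amounts, avg_amount)
```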
Nested Subqueries:
Subquery:
A subquery, also known as a nested query or subselect, is a SELECT query embedded within the
WHERE or HAVING clause of another SQL query.
Subqueries provide an easy and efficient way to handle the queries that depend on the results from
another query. They are almost identical to the normal SELECT statements, but there are few
restrictions. The most important ones are listed below:
Subqueries are most frequently used with the SELECT statement; however, you can use them
within an INSERT, UPDATE, or DELETE statement as well, or inside another subquery.
A subquery can be nested inside other subqueries. SQL has an ability to nest queries within one
another. A subquery is a SELECT statement that is nested within another SELECT statement and
which return intermediate results. SQL executes innermost subquery first, then next level.
1) Set Membership
select distinct cname
from borrower
where cname in
(select cname
from account
where bname='SFU')
and bname='SFU'
This finds all customers who have both a loan and an account at the SFU branch.
To find all customers who have a loan but not an account, we can use the not in operation.
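Both membership tests can be tried on sample data. The sketch below assumes SQLite via Python's sqlite3 module; the two-column borrower and account tables and their rows are hypothetical simplifications chosen so that each query returns a different customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE borrower (cname TEXT, bname TEXT);
    CREATE TABLE account  (cname TEXT, bname TEXT);
    INSERT INTO borrower VALUES
        ('Smith', 'SFU'), ('Jones', 'SFU'), ('Hayes', 'Downtown');
    INSERT INTO account VALUES ('Smith', 'SFU'), ('Hayes', 'SFU');
""")

# in: customers with a loan and an account at the SFU branch.
loan_and_account = [
    row[0]
    for row in conn.execute("""
        SELECT DISTINCT cname
        FROM borrower
        WHERE bname = 'SFU'
          AND cname IN (SELECT cname FROM account WHERE bname = 'SFU')
        ORDER BY cname
    """)
]

# not in: customers with a loan but no account at all.
loan_without_account = [
    row[0]
    for row in conn.execute("""
        SELECT DISTINCT cname
        FROM borrower
        WHERE cname NOT IN (SELECT cname FROM account)
        ORDER BY cname
    """)
]

print(loan_and_account, loan_without_account)
```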
2) Set Comparison
select distinct T.bname
from branch as T, branch as S
where T.assets > S.assets
and S.bcity='Burnaby'
or we can write
select bname
from branch
where assets > some
(select assets
from branch
where bcity='Burnaby')
to find branches whose assets are greater than some branch in Burnaby.
We can use any of the equality or inequality operators with some. If we change > some to > all,
we find branches whose assets are greater than all branches in Burnaby.
Example. Find branches with the highest average balance. We cannot compose aggregate
functions in SQL, e.g. we cannot do max (avg (...)).
Instead, we find the branches for which average balance is greater than or equal to all average
balances:
select bname
from account
group by bname
having avg(balance) >= all
(select avg(balance)
from account
group by bname)
3) Test for Empty Relations
The exists construct returns true if the argument subquery is nonempty.
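Find all customers who have a loan and an account at the bank.
select cname
from borrower
where exists (select *
from depositor
where depositor.cname = borrower.cname)
The unique construct returns true if the argument subquery contains no duplicate tuples.
For example, to find all customers who have only one account at the SFU branch:
select T.cname
from depositor as T
where unique (select R.cname
from account, depositor as R
where T.cname = R.cname and
R.account# = account.account# and
account.bname = 'SFU')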
Derived Relations
SQL-92 allows a subquery expression to be used in the from clause.
If such an expression is used, the result relation must be given a name, and the attributes
can be renamed.
Find the average account balance of those branches where the average account balance is
greater than $1,000.
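A derived relation for this query can be sketched as follows, assuming SQLite via Python's sqlite3 module. The account rows are made up, and the names branch_avg and avg_balance are hypothetical labels given to the derived relation and its renamed attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (account_no TEXT PRIMARY KEY, bname TEXT, balance REAL);
    INSERT INTO account VALUES
        ('A-1', 'SFU', 500), ('A-2', 'SFU', 2500),
        ('A-3', 'Downtown', 400), ('A-4', 'Downtown', 800),
        ('A-5', 'Surrey', 1000);
""")

# The subquery in the from clause is named branch_avg, and its computed
# column is renamed avg_balance so the outer where clause can refer to it.
result = conn.execute("""
    SELECT bname, avg_balance
    FROM (SELECT bname, AVG(balance) AS avg_balance
          FROM account
          GROUP BY bname) AS branch_avg
    WHERE avg_balance > 1000
    ORDER BY bname
""").fetchall()

print(result)
```

Because the average is an ordinary attribute of the derived relation, the filter goes in a where clause rather than a having clause.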
Views:
A view in SQL is defined using the create view command:
Having defined a view, we can now use it to refer to the virtual relation it creates. View
names can appear anywhere a relation name can.
We can now find all customers of the SFU branch by writing
select cname
from all-customer
where bname='SFU'
Deletion:
delete from r
where P
Tuples in r for which P is true are deleted. If the where clause is omitted, all tuples are deleted.
The request delete from loan deletes all tuples from the relation loan.
examples:
2) Delete all loans with loan numbers between 1300 and 1500.
delete from loan
where loan# between 1300 and 1500
3) Delete all accounts at branches located in Surrey.
delete from account
where bname in
(select bname
from branch
where bcity='Surrey')
We may only delete tuples from one relation at a time, but we may reference any number of
relations in a select-from-where clause embedded in the where clause of a delete.
However, if the delete request contains an embedded select that references the relation from
which tuples are to be deleted, ambiguities may result.
For example, to delete the records of all accounts with balances below the average, we might
write
delete from account
where balance < (select avg(balance)
from account)
As tuples are deleted from account, the average balance changes!
Solution: the delete statement first tests each tuple in the relation account to check whether the
account has a balance less than the bank's average, marking the tuples to be deleted. Only after all
the tests have been performed are the marked tuples deleted en masse.
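The difference between testing first and deleting while testing can be made concrete. The sketch below assumes SQLite via Python's sqlite3 module and four made-up balances; naive_delete is a hypothetical helper that simulates the incorrect strategy of recomputing the average after every deletion.

```python
import sqlite3

balances = [100, 500, 600, 800]  # average is 500

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?)", [(b,) for b in balances])

# The average is computed once, over the original relation, so only the
# tuple with balance 100 qualifies for deletion.
conn.execute(
    "DELETE FROM account WHERE balance < (SELECT AVG(balance) FROM account)"
)
remaining = [
    row[0] for row in conn.execute("SELECT balance FROM account ORDER BY balance")
]


def naive_delete(values):
    """Simulate deleting tuple by tuple, recomputing the average each time."""
    kept = list(values)
    for value in values:
        if value < sum(kept) / len(kept):
            kept.remove(value)
    return kept


naive_remaining = naive_delete(balances)

print(remaining, naive_remaining)
```

With incremental deletion the rising average drags further accounts below it, so the two strategies disagree.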
Insertion:
To insert data into a relation, we either specify a tuple, or write a query whose result is the set of
tuples to be inserted. Attribute values for inserted tuples must be members of the attribute's
domain.
examples:
To insert a tuple for Smith who has $1200 in account A-9372 at the SFU branch.
from loan
where bname='SFU'
It is important that we evaluate the select statement fully before carrying out any insertion. If some
insertions were carried out even as the select statement was being evaluated, the insertion
insert into account
select *
from account
might insert an infinite number of tuples. Evaluating the select statement completely before
performing insertions avoids such problems.
It is possible for inserted tuples to be given values on only some attributes of the schema. The
remaining attributes are assigned a null value denoted by null.
We can prohibit the insertion of null values using the SQL DDL.
Updates:
Updating allows us to change some values in a tuple without necessarily changing all.
examples:
update account
update account
Note: in this example the order of the two operations is important. (Why?)
In general, where clause of update statement may contain any construct legal in a where clause of
a select statement (including nesting).
A nested select within an update may reference the relation that is being updated. As before, all
tuples in the relation are first tested to see whether they should be updated, and the updates are
carried out afterwards.
For example, to pay 5% interest on accounts whose balance is greater than average, we have
update account
set balance = balance * 1.05
where balance > (select avg(balance)
from account)
Update of a view:
The view update anomaly previously mentioned in Chapter 3 exists also in SQL.
An example will illustrate: consider a clerk who needs to see all information in the loan relation
except amount.
from loan
Since SQL allows a view name to appear anywhere a relation name may appear, the clerk can
write:
This insertion is represented by an insertion into the actual relation loan, from which the view is
constructed. However, we have no value for amount.
This insertion results in ('SFU', 'L-307', null) being inserted into the loan relation.
As we saw, when a view is defined in terms of several relations, serious problems can result. As a
result, many SQL-based systems impose the constraint that a modification is permitted through a
view only if the view in question is defined in terms of one relation in the database.
Joined Relations
The SQL JOIN clause is used to combine records from two or more tables in a database. A JOIN
combines fields from two tables by using values common to each.
Cartesian-product
natural joins
inner join
natural inner join
left outer join
right outer join
full outer join
cross join
union join
These operations are typically used as subquery expressions in the from clause.
To show the results of some of the joins, the following tables will be used:
left outer join
The left outer join is computed starting from the result of the inner join. Then, for every tuple t in the
left-hand-side relation loan that did not match any tuple in the right-hand-side relation borrower in the inner join, a
tuple r is added to the result of the join, where the left side is filled in from t and the remainder is filled in
with null values for each attribute that appears on the right-hand side. Any tuple in borrower that does not
have a match in loan is ignored.
Note: the attribute loan-number appears twice, because it is on both the left and the right sides.
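The construction of the left outer join described above can be observed on small loan and borrower tables. This is a sketch assuming SQLite via Python's sqlite3 module; the rows are illustrative sample data and the column names use underscores in place of hyphens.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE loan (loan_number TEXT, branch_name TEXT, amount INTEGER);
    CREATE TABLE borrower (customer_name TEXT, loan_number TEXT);
    INSERT INTO loan VALUES
        ('L-170', 'Downtown', 3000),
        ('L-230', 'Redwood', 4000),
        ('L-260', 'Perryridge', 1700);
    INSERT INTO borrower VALUES
        ('Jones', 'L-170'), ('Smith', 'L-230'), ('Hayes', 'L-155');
""")

# Inner join: only loans that have a matching borrower.
inner_rows = conn.execute("""
    SELECT loan.loan_number, customer_name
    FROM loan INNER JOIN borrower
         ON loan.loan_number = borrower.loan_number
    ORDER BY loan.loan_number
""").fetchall()

# Left outer join: every loan is kept; unmatched loans get NULL on the right.
left_rows = conn.execute("""
    SELECT loan.loan_number, amount, customer_name
    FROM loan LEFT OUTER JOIN borrower
         ON loan.loan_number = borrower.loan_number
    ORDER BY loan.loan_number
""").fetchall()

print(inner_rows)
print(left_rows)
```

Loan L-260 has no borrower and is padded with a null, while the borrower row for L-155 has no loan and does not appear in either result.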
CROSS JOIN returns the Cartesian product of rows from tables in the join. In other words, it will produce
rows which combine each row from the first table with each row from the second table.
Eg:
SELECT *
FROM employee CROSS JOIN department;
Natural join
The natural join is a special case of equi-join. Natural join (⋈) is a binary operator that is written as (R ⋈
S) where R and S are relations. The result of the natural join is the set of all combinations of tuples in R and
S that are equal on their common attribute names. For an example consider the tables Employee and Dept
and their natural join:
A natural join is a type of equi-join where the join predicate arises implicitly by comparing all columns in
both tables that have the same column-names in the joined tables. The resulting joined table contains only
one column for each pair of equally named columns. In the case that no columns with the same names are
found, the result is a cross join.
SELECT *
FROM
Employee NATURAL JOIN Dept;
Eg: self
SQL is known as the Structured Query Language. It is the language that we use to perform operations and
transactions on the databases.
Static SQL statements do not change from execution to execution. The full text of a static SQL statement
is known at compilation.
When we talk about industry-level applications we need properly connected systems which could draw
data from the database and present to the user. In such cases, the embedded SQL comes to our rescue.
The SQL standard defines embeddings of SQL in a variety of programming languages.
We embed SQL queries in high-level languages so that the host program can easily perform the logic part of our
analysis.
Some of the prominent examples of languages in which we embed SQL are as follows:
C++
Java
Python etc.
A language in which SQL queries are embedded is referred to as a host language, and the SQL structures permitted
in the host language constitute embedded SQL.
An embedded SQL statement starts with an identifier, usually EXEC SQL,
and ends with a terminator that depends on the host language: in Ada and C the terminator is a semicolon (;), in COBOL it
is END-EXEC, etc.
comment VARCHAR2(40));
With static SQL, all of the data definition information, such as table definitions, referenced by the SQL
statements in your program must be known at compilation. If the data definition changes, you must change
and recompile the program. Dynamic SQL programs can handle changes in data definition information,
because the SQL statements can change "on the fly" at runtime. Therefore, dynamic SQL is much more
flexible than static SQL. Dynamic SQL enables you to write application code that is reusable because the code
defines a process that is independent of the specific SQL statements used.
As the name indicates, dynamic SQL is a technique that allows programmers to build SQL statements that can change
at runtime. A dynamic query is a statement that is constructed at execution time;
for example, an application may allow users to run their own queries.
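One way to picture a dynamic query is a statement whose where clause is assembled at runtime from whatever filters the user supplies. The sketch below is an illustration only, using Python's sqlite3 module as the host environment rather than EXEC SQL; the students table, the ALLOWED_COLUMNS whitelist and the build_query helper are all hypothetical. Values are passed as ? parameters and column names are checked against the whitelist, so user input never becomes part of the SQL text unchecked.

```python
import sqlite3

ALLOWED_COLUMNS = {"firstname", "age"}  # hypothetical whitelist of filter columns


def build_query(filters):
    """Build a SELECT statement and its parameter list from a filter dict."""
    clauses, params = [], []
    for column, value in filters.items():
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {column!r}")
        clauses.append(f"{column} = ?")  # value is bound later, not inlined
        params.append(value)
    sql = "SELECT sid, firstname, age FROM students"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY sid", params


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (sid INTEGER PRIMARY KEY, firstname TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(1, "Preety", 20), (2, "Raj", 21), (3, "Harry", 20)],
)

# The statement text differs from call to call: this is the dynamic part.
sql_by_age, params_by_age = build_query({"age": 20})
aged_twenty = conn.execute(sql_by_age, params_by_age).fetchall()

sql_all, params_all = build_query({})
everyone = conn.execute(sql_all, params_all).fetchall()

print(sql_by_age, aged_twenty, len(everyone))
```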
Transactions group a set of tasks into a single execution unit. Each transaction begins with a specific task and
ends when all the tasks in the group successfully complete. If any of the tasks fail, the transaction fails.
Therefore, a transaction has only two results: success or failure.
Incomplete steps result in the failure of the transaction. A database transaction, by definition, must be atomic,
consistent, isolated and durable. These are popularly known as ACID properties.
COMMIT
ROLLBACK
SAVEPOINT
These commands are only used with the DML Commands such as – INSERT, UPDATE and DELETE.
COMMIT
The COMMIT command in SQL is used to save all the transaction-related changes permanently to the disk.
Whenever DML commands such as INSERT, UPDATE and DELETE are used, the changes made by these
commands become permanent only after the current session is closed. So before closing the session, one can easily
roll back the changes made by the DML commands. Hence, if we want the changes to be saved permanently to
the disk without closing the session, we use the COMMIT command.
Syntax:
COMMIT;
Example:
Following is an example which would delete those records from the table which have age = 20 and then
COMMIT the changes in the database.
Queries:
COMMIT;
SAVEPOINT:
We can divide the database operations into parts. For example, we can consider all the insert related
queries that we will execute consecutively as one part of the transaction and the delete command as the
other part of the transaction. Using the SAVEPOINT command in SQL, we can save these different parts
of the same transaction using different names. For example, we can save all the delete related queries with
the savepoint named SP1. To save all the delete related queries in one savepoint, we have to execute the
SAVEPOINT query followed by the savepoint name after finishing the delete command execution.
Syntax:
SAVEPOINT SAVEPOINT_NAME;
SAVEPOINT SP1;
This command temporarily saves the state of the data at a particular point, so that the transaction can be
rolled back to that point whenever needed.
ROLLBACK:
While carrying out a transaction, we can create savepoints to mark different parts of the transaction.
As requirements change, the user can roll back the transaction to different savepoints.
Consider a scenario: we have initiated a transaction, created a table and inserted records
into it. After inserting the records, we created a savepoint named student (the same name as the student table above).
Then we executed a delete query, but later realized that we had mistakenly removed a useful record.
In such situations we can roll back the transaction: we use the ROLLBACK command to return to the savepoint student, which we created
before executing the DELETE query.
ROLLBACK TO SAVEPOINT_NAME;
ROLLBACK TO student;
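The scenario above can be replayed step by step. This is a sketch assuming SQLite via Python's sqlite3 module, opened with isolation_level=None so that BEGIN, SAVEPOINT, ROLLBACK TO and COMMIT are issued explicitly; the student rows and the savepoint name sp1 are hypothetical.

```python
import sqlite3

# isolation_level=None: the module issues no implicit BEGIN or COMMIT,
# so the transaction boundaries below are exactly the ones we write.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")


def row_count():
    """Return the number of rows currently visible in student."""
    return conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]


conn.execute("BEGIN")
conn.execute("INSERT INTO student VALUES (1, 'Raj')")
conn.execute("INSERT INTO student VALUES (2, 'Harry')")

conn.execute("SAVEPOINT sp1")                    # mark the state after the inserts
conn.execute("DELETE FROM student WHERE id = 2")  # the mistaken delete
after_delete = row_count()

conn.execute("ROLLBACK TO sp1")                   # undo only the delete
after_rollback = row_count()

conn.execute("COMMIT")                            # make the inserts permanent
after_commit = row_count()

print(after_delete, after_rollback, after_commit)
```

Rolling back to the savepoint undoes the delete but keeps the transaction open, so the two inserts are still committed afterwards.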