Dbms Unit II
Dbms Unit II
Data Models
• Definition: It is a collection of conceptual tools for describing data,
relationships among data, semantics (meaning) of data and constraints.
• There are various data models used in database systems and these are as
follows –
Relational Model:
o Relation model consists of collection of tables, each of which is assigned a
unique name, stores data and also represents the relationship among the
data.
o Table is also known as relation. The table contains one or more columns
o Each table contains record of particular type, and each record type defines
a fixed number of fields or attributes.
o For example – Following figure shows the relational model by showing the
relationship between Student and Result database. For example – Student
Ram lives in city Chennai and his marks are 78. Thus the relationship
between these two relations is maintained by the SeatNo. Column
Mark
SeatNo Name City SeatNo s
Advantages :
(i) Structural Independence: Structural independence is an ability that allows us
to make changes in one database structure without affecting other. The relational
model have structural independence. Hence making required changes in the
database is convenient in relational database model.
(ii)Conceptual Simplicity:The relational model allows the designer to simply focus
on logical design and not on physical design. Hence relational models are
conceptually simple to understand.
(iii) Query Capability: Using simple query language (such as SQL) user can get
information from the database or designer can manipulate the database
structure.
(iv) Easy design, maintenance and usage: The relational models can be
designed logically hence they are easy to maintain and use.
Disadvantages :
(i) Relational model requires powerful hardware and large data storage
devices.
(ii) May lead to slower processing time.
(iii) Poorly designed systems lead to poor implementation of database systems.
Relational Databases:
• Relational database is a collection of tables having unique names.
• For example – Consider the example of Student table in which the information
about the student is stored.
RollN Nam Phone
o e
001 AAA 111111111
1
002 BBB 222222222
2
003 CC 333333333
C 3
Student table
The above table consists of three column headers RollNo, Name and Phone.
Each row of the table indicates the information of each student by means of his
Roll Number, Name and Phone number.
Similarly consider another table named Course as follows –
103 Electrical 5
104 Civil 3
Course table
RollNo CourseID
001 102
002 104
003 101
Admission
From this third table we can easily find out that the course to which the RollNo 001
is admitted is computer Science.
There are some commonly used terms in Relational Model and those are -
Tuple or record or row : The single entry in the table is called tuple. The tuple
represents a set of related data. In above Student table there are four tuples. One of
the tuple can be represented as
00 AA 8 111111111
1 A 8 1
Attribute or columns: It is a part of table that contains several records. Each record
can be broken down into several small parts of data known as attributes. For example
the above table consists of four attributes such as RollNo,Name,Marks and Phone.
Relation schema: A relation schema describes the structure of the relation, with
the name of the relation (i.e. name of table), its attributes and their names and type.
Domain : For each attribute of relation, there is a set of permitted values called
domain. For example – in above table, the domain of attribute Marks is set of all
possible permitted marks of the students. Similarly the domain of Name attribute is all
possible names of students. That means Domain of Marks attribute is (88,83,98)
Atomic : The domain is atomic if elements of the domain are considered to be
indivisible units. For example in above Student table, the attribute Phone is non-
atomic.
NULL attribute : A null is a special symbol, independent of data type, which means
either unknown or inapplicable. It does not mean zero or blank. For example - Consider
a salary table that contains NULL.
Super Key(SK): It is a set of one or more attributes within a table that can uniquely
identify each record within a table. For example – Consider the Student table as follows –
Thus every candidate key is a super key but every super key is not a candidate key.
2) Primary Key (PK): The primary key is a candidate key chosen by the database
designer to identify the tuple in the relation uniquely. For example – Consider the
following representation of primary key in the student table.
Other than the above mentioned primary key, various possible primary keys can be
(RollNo), (RollNo,Name), (RollNo, Phone)
The relation among super key, candidate key and primary can be denoted by
Candidate Key=Super Key – Primary Key
Rules for Primary Key
(i) The primary key may have one or more attributes.
(ii) There is only one primary key in the relation.
(iii) The value of primary key attribute cannot be NULL.
(iv) The value of primary key attribute does not get changed.
4) Alternate key: The alternate key is a candidate key which is not chosen by the
database designer to uniquely identify the tuples. For example –
• Example -
From above example, we can see that two tables are linked. For instance we
could easily find out that the ‘Student C:
Database integrity means correctness or accuracy of data in the database. A
database may have number of integrity constraints. For example –
(i) The Employee ID and Department ID must consists of two digits.
(ii) Every Employee ID must start with letter.
The integrity constraints are classified based on the concept of primary key and
foreign key. Let us discuss the classification of constraints based on primary key and
foreign key as follows –
• The referential integrity rule states that “whenever a foreign key value
is used it must reference a valid, existing primary key in the parent table”.
• Example:Consider the situation where you have two tables : Employees and
Managers. The Employees table has a foreign key attribute entitled
ManagedBy, which points to the record for each employee’s manager in the
Managers table.
Referential integrity enforces the following three rules :
i) You cannot add a record to the Employees table unless the ManagedBy
attribute points to a valid record in the Managers table. Referential integrity
prevents the insertion of incorrect details into a table. Any operation that doesn't
satisfy referential integrity rule fails.
ii) If the primary key for a record in the Managers table changes, all corresponding
records in the Employees table are modified.
iii) If a record in the Managers table is deleted, all corresponding records in the
Employees table are deleted.
Advantages of Referential Integrity
Referential integrity offers following
advantages:
i) Prevents the entry of duplicate data.
ii) Prevents one table from pointing to a non-existent field in another table.
iii) Guaranteed consistency between "partnered" tables.
iv) Prevents the deletion of a record that contains a value referred to by a foreign
key in another table.
v) Prevents the addition of a record to a table that contains a foreign key unless
there is a primary key in the linked table.
Database integrity
• The foreign key is a key in one table that refers to the primary key of another
table.
• The foreign key is basically used to link two tables. For example – Consider
Customer table as follows –
Customer
Order
OrderI Descripti CustI
D on D
• The "CustID" column in the "Customer" table is the PRIMARY KEY in the
"Customer" table.
• The "CustID" column in the "Order" table is a FOREIGN KEY in the "Order" table.
• The table containing the foreign key is called the child table, and the table
containing the primary key is called the referenced or parent table.
• The FOREIGN KEY constraint is used to prevent actions that would destroy links
between tables.
• The FOREIGN KEY constraint also prevents invalid data from being inserted into
the foreign key column, because it has to be one of the values contained in the
table it points to.
Relational Algebra
• There are two formal query languages associated with relational model
andmodern SQL and other relational query languages are based on these
languages. Those are relational algebra and relational calculus.
• The queries present in the relational algebra are denoted using operators.
Relational Operations :
The above set operators are divided into Fundamental, Additional and Extended
Relational operations.
(1) Selection :
• This operation is used to fetch the rows or tuples from the
table(relation) that satisfy given predicate. We use lowercase Greek
letter sigma (σ) to denote selection.
Syntax : The syntax is σ predicate (relation)
σ age >18(Student)
The output will be -
sna
me
Ram
Gee
ta
We can also specify conditions using and, or operators.
σage >18 and gender = ‘Male’(Student)
snam
e
Ram
(2) Projection :
• Thus to display particular column from the relation, the projection operator
is used.
• It will only project or show the columns or attributes asked for, and will also
remove duplicate data from the columns.
• It is a Unary operator i.e uses single Table. It is denoted by Greekm letter pi(∏)
• Syntax:
∏C1, C2… (r) where C1, C2 etc. are attribute
names(column names).
• For example - Consider the Student table given in Fig.
Query : Display the name and age all the students
This can be written in relational algebra as
(∏)sname, age(Student)
Above statement will show us only the Name and Age columns for all the rows of
data in Student table.
sname age
Ram 21
Shyam 18
Seeta 16
Geeta 23
Fig
We can compose or combine Relational operators to retrieve
more complex information. Below we combined Projection and
Selection operations as follows:
∏name(σage=18(student)
Binary Operation are those operation which uses two tables to
produce Result.
(3) Cartesian product :
• This is used to combine data from two different relations(tables) into one and
fetch data from the combined relation.
• Syntax :A × B
• For example : Suppose there are two tables named Student and Reserve as
follows
Student Reserve
ag sid isbn day
sid sname
e
1 005 07-07-
1 Ram 21 18
2 Shyam 18 2 005 03-03-
17
3 Seeta 16
3 007 08-11-
4 Geeta 23 16
• Query : Find the names of all the students who have reserved isbn = 005. To
satisfy this query we need to extract data from two table. Hence the cartesian
product operator is used as
(σStudent.sid = Reserve.sid ^ Reserve.Isbn =005(Student × Reserve)
(4)Set operations : Various set operations are - union, intersection and set-
difference.
Let us understand each of these operations with the help of examples.
(i) Union:
o For this operation to work, the relations(tables) specified should have same
number of attributes(columns) and same attribute domain. Also the
duplicate tuples are automatically eliminated from the result.
Student Book
snam isb bnam Autho
sid age
e n e r
1 Ram 21 005 DBMS XYZ
4 Geeta 23
o Query : We want to display both the student name and book names from
both the tables then
∏Sname(Student) ∪∏bname (Book)
(ii) Intersection :
o This operation is used to fetch data from both tables which is common in both
the tables.
o Syntax : A ∩ B where A and B are relations.
(5) Join : The join operation is used to combine information from two or more
relations. Formally join can be defined as a cross-product followed by selections and
A⋈c∏B = ∏c(A × B)
Thus ⋈ is defined to be a cross-product followed by a selection. Note that the
condition c can refer to attributes of both A and B. The condition C can be
specified using <,<=,>,<= or = operators.
For example consider two table student and reserve as follows -
Student Reserve
si snam Ag si
isbn day
d e e d
1 Ram 21 1 005 07-07-
18
2 Shya 18
m 2 005 03-03-
17
3 Seeta 16
3 007 08-11-
4 Geeta 23 16
If we want the names of students with sid(Student) = sid(Reserve) and isbn = 005,
then we can write it using Cartesian product as -
ii) Equijoin : This is a kind of join in which there is equality condition between
two attributes(columns) of relations(tables). For example - If there are two
table Book and Reserve table and we want to find the book which is reserved by
the student having isbn 005 and name of the book is ‘DBMS’, then :
Book Reserve
isb bnam Autho si isb
day
n e r d n
005 DBMS XYZ 1 005 07-07-18
iii) Natural Join: When there are common columns and we have to equate these
common columns then we use natural join. The symbol for natural join is
simply without any condition. For example, consider two tables -
Book Reserve
isb bnam isb
Author sid day
n e n
005 DBMS XYZ 07-07-
1 005
18
006 OS PQR
2 005 03-03-
007 DAA ABC 17
3 007 08-11-
16
STUDENT
EMPLOYEE
Left Outer Join(⟕): When applying join on two relations R and S, some tuples of R or S
does not appear in result set which does not satisfy the join conditions. But Left Outer
Joins gives all tuples of R in the result set. The tuples of R which do not satisfy join
condition will have values as NULL for attributes of S.
945512 NUL NU
1 RAM DELHI 3451 18 NULL L NULL NULL LL
Right Outer Join(⟖): When applying join on two relations R and S, some tuples of R or S
does not appear in result set which does not satisfy the join conditions. But Right Outer
Joins gives all tuples of S in the result set. The tuples of S which do not satisfy join
condition will have values as NULL for attributes of R.
Example: Select students whose ROLL_NO is greater than EMP_NO of employees and
details of other Employees as well
STUDENT⟖STUDENT.ROLL_NO>EMPLOYEE.EMP_NOEMPLOYEE
RESULT:
NU SURE 915676
NULL NULL NULL NULL LL 4 SH DELHI 8971 18
Full Outer Join(⟗): When applying join on two relations R and S, some tuples of R or S
does not appear in result set which does not satisfy the join conditions. But Full Outer
Joins gives all tuples of S and all tuples of R in the result set. The tuples of S which do not
satisfy join condition will have values as NULL for attributes of R and vice versa.
Example:Select students whose ROLL_NO is greater than EMP_NO of employees and
details of other Employees as well and other Students as well
STUDENT⟗STUDENT.ROLL_NO>EMPLOYEE.EMP_NOEMPLOYEE
RESULT:
NU NARE 978291
NULL NULL NULL NULL LL 5 SH HISAR 8192 22
NU SURE 915676
NULL NULL NULL NULL LL 4 SH DELHI 8971 18
(6)Rename operation : This operation is used to rename the output relation for any
query operation which returns result like Select, Project etc. Or to simply rename a
relation(table). The operator ρrho) is used for renaming.
Syntax : (RelationNew, RelationOld)
For example : If you want to create a relation Student_names with sid and sname
from Student, it can be done using rename operator as :
ρ(Student_names, (sid.sname(Student))
(7)Divide operation
The division operator is used when we have to evaluate queries which contain the
keyword ALL.
It is denoted by A/B where A and B are instances of relation.
For example - Find all the customers having accounts in all the branches. For that
consider two tables - Customer and Account as
Customer Account
Branch
Name Branch
Pun
A Pun
e
e
Mumbai
B Mumbai
A Mumbai
C Pun
e
Now A/B will give us
Nam
e
A
Here We check all the branches from Account table against all the names from
Customer table. We can then find that only customer A has all the accounts in all the
branches. RXS
Formal Definition of Division Operation : The operation A/B is define as the set of
all x values (in the form of unary tuples) such that for every y value in (a tuple of) B,
there is a tuple <x,y> in A.
Ex: temp1← RX S