Database Management System [85403]
Module 2
The Relational Data Model
Introduction
‘The relational data model was first introduced by Ted Codd of IBM Research in 1970 in a classic
paper (Codd 1970), and it attracted immediate attention due to its simplicity and mathematical
foundation. The model uses the concept of a mathematical relation—which looks somewhat like a
table of values~as its basic building block, and has its theoretical basis in set theory and first-order
predicate logic.
‘The first commercial implementations of the relational model became available in the early 1980s,
such as the SQL/DS system on the MVS operating system by IBM and the Oracle DBMS. Since
then, the model has been implemented in a large munber of commercial systems, Current popular
relational DBMSs (RDBMS) include DB2 and Informix Dynamic Server (from IBM), Oracle and
Rab (fiom Oracle), Sybase DBMS (from Sybase) and SQLServer and Access (fiom Microsoft), In
addition, several open source systems, such as MySQL and PostereSQL, are available,
2.1 Relational Model Concepts
‘The relational model represents the database as a collection of relations. Informally, each relation
resembles a table of values or, to some extent, a flat file of records. It is called # flat file because
each record has a simple linear or flat structure
When a relation is thought of as a table of values, each row in the table represents a collection of
related data values. A row represents a fact that typically corresponds to a real-world entity or
relationship, The table name and column names are used to help to interpret the meaning of the
values in each row.
For example, in STUDENT relation because each row represents facts about a particular student
entity, The column names—Name, Student_number, Class, and Major—specify how to interpret the
data values in each row, based on the column each value is in, All values in a column are of the same
data type,
In the formal relational model terminology, a row is called a mple, a columa header is called an
attribute, and the table is called a relation. The data type describing the types of values that cam
appear in each colnmn is represented by a domain of possible values.
Dr.Giyamala Kushbu S,AP/ISE,CMRIT, BengaluruDatabase Management System [8CS403]
2.1.1. Domains, Attributes, Tuples, and Relations
Domain
A domain D is a set of atomic values. By atomic we mean that each value in the domain is,
indivisible as far as the formal relational model is concerned. A common method of specifying,
a domain is to specify a dats type from which the data values forming the domain are drawn. It
is also useful to specify a name for the domain, to help in interpreting its values.
Some examples of domains follo
* Usa_phone_numbers: The set of ten-digit phone numbers valid in the United States.
+ Social_security_numbers: The set of valid nine digit Social Security numbers.
= Names; The set of character strings that represent names of persons.
= Employee_ages. Possible ages of employees in a company; each must be an integer
value between 15 and 80.
‘The preceding are called logical definitions of domains. A data type or format is also specified
for each domain, For example, the data type for the domain Uss_pbone_mimbers can be
declared as a character string of the form (ddd)ddddddd, where cach d is a mnmeric (deci!)
digit and the first three digits form a valid telephone area code, The data type for
Employee_ages is an integer number between 15 and 80.
Attribute
An attribute A, is the name of a role played by some domain D in the relation scheme R. D is
called the domain of A; and is denoted by dom(..),
Tuple
Mapping from attributes to values drawn from the respective domains of those attributes. Tuples
are intended to describe some entity (or relationship between entities) in the miniworid
Example: a tuple for a PERSON entity might be
{Name
", Gender > Male, Age—> 25 }
Relation
A nnnmed set of tuples all of the same form ie., having the same set of attributes
Dr.Ciyamala Kushbu S,AP/ISE,CMRIT, Bengaluru,Database Management System [BCS403]_
Reason Name Abus
1 —"
swudent 4g
a ra ‘ica phone | Age Goa
Benjamin Bayer | 905-*1-2496] (617978-1616 [2018 BurbornaLane| NULL 19 [321
ane eas Wineeene [asketeer [ont —[ a
Teles CE Dek Dveon [47211-2020 NULL 2482 Eign Rend | r7}7a0-1958 | 20 [9
Roten Pacha [409-22-1100| (e17a7@9601 [205 Lwk Lane | ernras-oa9a] 20 [aes
Bavtara Benson | 530-0.1208| (a17699-2461| 734 Fontaralare [NUL | 10 [aas
Relation schema
A relation schema R, denoted by R(A1, Aa, ...As), iS made up of a relation name R and a list
of attributes As, Ao, ..As. Each attribute A, is the name of a role played by some domain D in
the relation schema R. D is called the domain of A and is denoted by dom(A). A relation
schema is used to describe a relation; R is called the name of this relation,
‘The degree (or arity) of a relation is the nmber of attributes n of its relation schema. A relation
of degree seven, which stores information about miversity students.wonld contain seven
attributes describing each student. as follows:
STUDENT(Name, Ssa, Home_phone, Address, Oice_phone, Age, Gpa)
Using the data type of each attribute, the definition is sometimes written as:
STUDENT(Name: string, Ssn: string, Home_phone: string, Address: string,
Office_phone: string, Age: integer, Gpa: reel)
Domains for some of the attributes of the STUDENT relation:
dom(Name) = Names: dom(Ssn) = Social_security_mumbers,
dom(HomePhone)=USA_phone_mumbersdom(Office_phone)= USA_phone_mimbers,
Relation (or relation state)
A relation (or relation state) r of the relation schema by R(As, A2, .../Aa), also denoted by 1(),
is a set of n-tuples r= {1 t2, .. ta}. Each n-tuple t is an ordered list of 1 values t =
. ...An). Set of attributes and a relation state r(R) is a finite set of mappings
r= (01. 12,..., um}, where each tuple tiis a mapping from R to D.
According to this definition of tuple as a mapping, a tuple can be considered as a set of
() pairs, where each pair gives the value of the mapping from an attribute
Ato a value vi fiom dom(A) “The ordering of attributes is not important, because the
attribute name appears with its value.
2123 Values and NULLs in the Tuples
Each value in a tuple is atomic. NULL values are used to represent the values of attributes
that may be waknown or may not apply to a tuple. For example some STUDENT tuples have
NULL for their office phones because they do not have an office .Another student has a
NULL for home phone In general, we can have several meanings for NULL values, such as
value unknown, value exists but is not available, or attribute does not apply to this tuple
(also known as valne undefined)
2.1.24 Interpretation (Meaning) of a Relation
The relation schema can be interpreted as a declaration or a type of assertion, For example,
the schema of the STUDENT relation of asserts that, in general, a student entity has a Name,
Son, Home_phone, Address. Office_phone. Age. and Gpa. Fach ple in the relation can then
be interpreted as a particular instance of the assertion For example, the first tuple asserts the
fact that there is a STUDENT whose Name is Benjamin Bayer, Ssu is 305-61-2435, Age is
19, and so on.
An alternative interpretation of a relation schema is as a predicate; in this case, the values in
cach tuple are interpreted as values that satisfy the predicate.Database Management System [BCS403]_
2.1.3 Relational Model Notation
= Relation schema R of degree n is denoted by by R(A1. Ad, ...As)
= Uppercase letters Q, R$ denote relation names
= Lowercase letters q, r,s denote relation states
"Letters, u,v denote tuples
= In general, the name of a relation schema such as STUDENT also indicates the current set of
tuples in that relation
"= Anattribute A can be qualified with the relation name R to which it belongs by using the dot
notation R.A—for example, STUDENT Name or STUDENT Age
= An ntuple ¢ in a relation r(R) is denoted by ¢ = ftom? coresponding to the attributes specified
in the list
2.2 Relational Model
Constraints are restrictions on the actual values in a database state. These constraints are
straints and Relational Database Schemas
Result: This insertion violates the entity integrity constraint (NULL. forthe primary key
Ssn), so itis rejected
2. Operation:
Insert
licia’, “J, “Zelaya’, 999887777", *1960-04-05", °6357 Windy Lane, Katy, TX",
F, 28000, 987654321", >
Result: This insertion violates the key constraint because snother tuple with the same Ssn
value already exists in the EMPLOYEE relation, aud so its rejected,
3. Operation:
Insert
Dr.Ciyamala Kushbu S,AP/ISE,CMRIT, Bengaluru,
6357 Windswept, Katy.Database Management System [B€S403}
Result: This insertion violates the referential integrity constraint specified on Dao in.
EMPLOYEE because no corresponding referenced tuple exists in DEPARTMENT
with Dnumber= 7.
4, Operation:
Insert <-Cecilia’ “F*, “Kolonsky’, “677678989, °1 960-0405", +6357 Windy Lane. Katy,
v
F, 28000, NULL, 4
‘Result: This insertion satisfies all constraints, so it is acceptable
If an insertion violates one or mote constraints, the default option is to reject the insertion.It would
be usefill if the DBMS could provide a reason to the user as to why the insertion was rejected.
Another option is to an attempt to comect the reason for rejecting the insertion
2.3.2 The Delete Operation
‘The Delete operation can violate only referential integrity. This occurs if the tuple being deleted is
referenced by foreign keys from other tuples in the database. To specify deletion, a condition on the
attributes of the relation selects the tuple (or tuples) to be deleted.
Examples:
1 Operation:
Delete the WORKS_ON tuple with Ess = °999887777" and Pho =10.
Result: This deletion is acceptable and deletes exactly one tuple.
2.Operation:
Delete the EMPLOYEE tuple with S
n= "999887777"
Result: This deletion is not acceptable, because there are tuples in WORKS _ON that refer
to this tuple. Hence, ifthe tuple in EMPLOYEE is deleted, referential integrity
violations will result
3.Operetion:
Delete the EMPLOYED. tuple with Ss =°333445555°
Result: This deletion will result in even worse referential integrity violations, because the
‘tuple involved is referenced by tuples from the EMPLOYEE, DEPARTMENT,
WORKS _ON, and DEPENDENT relations.
Several options are available if a deletion operation causes a violation
1. restrict - isto reject the deletion
2. cascade, isto attempt to cascade (or propagete) the deletion by deleting tuples that reference
the tuple that is being deleted
Dr.Ciyamala Kushbu S,AP/ISE,CMRIT, Bengaluru,Database Management System [6CS403]
3, Set null or set default - is to modify the referencing attribute values that cause the violation:
each such value is either set to NULL or changed to reference another default valid tuple
2.3.3 The Update Operation
‘The Update (or Modify) operation is used to change the values of one or more atiributes in a tuple
(or tuples) of some relation R. It is necessary to specify a condition on the attributes of the relation
to select the tuple (or tuples) to be modified
Examples:
1.Operetion:
Update the salary of the EMPLOYEE tuple with Ssu = *999887777" to 28000.
Result: Acceptable
2.Operetion:
Update the Dno ofthe EMPLOYEE tuple with Ssn="999887777" 1 7.
Result: Unacceptable, becanse it violates referential integrity.
3.Operetion:
Update the $sn of the EMPLOYEE tuple with Ssn = °999887777" to *987654321".
Result: Unacceptable, because it violates primary key constraint by repeating a value
that already exists as a primary key in enother tuple; it violates referential integrity
constraints because there are other relations that refer to the existing value of Ssn
Updating an attribute that is neither part of a primary key nor of a foreign key usually causes no
problems; the DEMS need only check to confirm that the new value is of the comect data type and
domain,
2.3.4 The Transaction Concept
A transaction is an executing program that incindes some database operations, such as reading from
the database, or applying insertions, deletions, or updates to the database. At the end of the
transaction, it must leave the database in a valid or consistent state that satisfies all the constraints
specified on the database schema A single transaction may involve any number of retrieval
operations and any number of update operations, These retrievals and updates will together form an
atomic unit of work against the database For example, a transaction to apply a bank withdrawal will
typically read the user account record, check if there is a sufficient balance, and then update the
record by the withdrawal amount,
Dr.Ciyamala Kushbu S,AP/ISE,CMRIT, Bengaluru,Database Management System [6CS403]
Relational Algebra
Introduction
Relational algebra is the basic set of operations for the relational model. These operations enable a
user to specif’ basic retrieval requests as relational algebra expressions. The result of an operation is
a newselation, which may have been formed from one or more input relations.
‘The relational algebra is very important for several reasons
‘¢ Firs, it provides a formal foundation for relational model operations.
# Second, and perhaps more important, itis used as a basis for implementing and optimizing
queries in the query processing and optimization modules that are iniegral parts of
relational database management systems (RDBMSs)
‘© Third, some of its concepts are incorporated nto the SQL standard query language for
RDBMSs
2.4 Unary Relational Operations: SELECT and PROJECT
1 The SELECT Operation
The SELECT operation denoted by 6 (sigma) is used to select a subset of the tuples from a relation
based on a selection conaition. The selection condition acts as a filter that keeps only those tuples
that satisfy a qualifying condition. Alternatively, we can consider the SELECT operation to restrict
the tuples in a relation to only those tuples that satisfy the condition.
‘The SELECT operation can also be visualized es a hortzontal partition of the relation into two sets
of tuples—those tuples that satisfy the condition and are selected, and those tuples that do not satisfy
the condition and are discarded,
In general, the select operation is denoted by
‘© (R)
where
~ the symbol ¢ is used to denote the select operator
~ the selection condition is a Boolean (conditional) expression specified on the attributes of
relation R
tuples that make the condition true are selected
+ appearin the result of the operation
tuples that make the condition false are filtered out
+ discarded from the result of the operation
Dr.Ciyamala Kushbu S,AP/ISE,CMRIT, Bengaluru,Database Management System [6CS403]_
@
Frame [Minit [ name | Sen Bate, ‘hadiess ‘Sex | Salary | Super_sen [Ono
Frankin | T | Wong | 239445555 | 1055-12-08 | 638 Voss, Houston, TX | M | 40000 [888665555 | 5
Jennifer |S _| Wallece | 087654321 | 1941-06-20 | 201 Beny. Belaire. Tx__| F | 49000 | aaase555 | 4
Ramesh | K | Narayan | 666884444 | 1962-00-15 | 975 Fire Oak Humble, Tx] M_| 38000 | 333445565 | 5
) ©.
Lname | Frame | Salay ‘Sex| Salary
‘Smith | John | 30000 ‘™ | 30000)
Wong | Frankin | 40000 ‘™ | 40000)
Zelaya | Aicia [25000 F_| 25000
Wallace | Jennifer | 43000 F_| 43000
Narayan | Ramesh | 98000 m_| 88000
English | Joyce [25000 ™ | 25000
Jabbar_| Abmad_| 25000 ™ | 85000)
Borg | James | $5000
‘The Boolean expression specified in is made up of aumber of clauses of the
fom:
where
is the name of an attribute of R,
“comparison op> is one of the operators,
AL, and
is a constant value from the attribute domain
Clauses can be connected by the standard Boolean operators and, or, and not to form a general
selection condition
Dr.Ciyamala Kushbu S,AP/ISE,CMRIT, Bengaluru,Database Management Sysiem [BCS403]_
Examples:
1 Select the EMPLOYEE tuples whose department number is 4
6 pxo-s (EMPLOYEE)
2.Select the employee ples whose salary is greater than $30,000.
4 sat any> p00 (EMPLOYEE)
3.Select the tuples for all employees who either work in department 4 and make over $25.000
pet year, or work in department 5 and make over $30,000
‘GiDn0=4 AND Salary>25000) OR (Dno=5 AND Salory~-30000(EMPLOYEE)
‘The result of a SELECT operation can be determined as follows:
‘The is applied independently to each individual tuple tin R
If the condition evaluates to TRUE, then tuple ¢ is selected All the selected tuples appear in
the result of the SELECT operation
‘The Boolean conditions AND, OR, and NOT have their normal interpretation, as follows:
~ (cond1 AND cond2) is TRUE
fboth (cond!) and (cond2) are TRUE; otherwise,it is
FALSE.
= (cond1 OR cond?) is TRUE if either (cond1) or (cond?) or both are TRUE; othenwise, it is
FALSE.
- (NOT cond) is TRUE if cond is FALSE; otherwise, it is FALSE.
Dr.Ciyamala Kushbu S,AP/ISE,CMRIT, Bengaluru,Database Management System [CS403]
‘The SELECT operator is unary; that is, it is applied to a single relation. The degree of the relation
resulting from a SELECT operation is the same as the degree of R-The number of tuples in the
resulting telation is always less than or equal to the number of tuples in R. That is,
love (R)
‘The fraction of tuples selected by a selection condition is refered to as the selectivity of the
R| for any condition C
condition.
‘The SELECT operation is commutative: that is,
econ AGcond AR) ~ Gecont2ACccna-AB))
Hence, a sequence of SELECTs can be applied in any ordewe can always combine a cascade (or
sequence) of SELECT operations into a single SELECT operation with a conjunctive (AND)
condition; that is,
conti AO(.u Geconde>(R)) -~)) = Gecondl> ANDcond2> AND AND 25000;
24.2 The PROJECT Operation
‘The PROJECT operation denoted by 7 (pi) selects certain columns from the table and discards the
other columns Used when we are interested in only certain attributes of a relation, The result of the
PROJECT operation can be visualized as a vertical partition of the relation into two relations:
one has the needed columns (attributes) and contains the result of the operation
- the other contains the discarded columns
‘The gencral form of the PROJECT operation is
Meatesibutetise-(R)
where
(pi) - symbol used to represent the PROJECT operation,
- desired sublist of attributes fiom the attributes of relation R.
‘The result of the PROJECT operation has only the attributes specified in in the same
‘onder as they appear in the list. Hence, its degree is equal to the umber of ettributes in