Entity Relationship Model
Entity Relationship Diagram(ERD) represents the conceptual database as viewed
by the end user
The main components of ERD are entities, attributes and relationships.
Entity:- entity can be defined as place ,thing or a person whose data is important
for the organization.
Entity Set:- A group of entities sharing common characteristics is called entity
set.
Entity set is represented using a rectangle symbol.
Entities are classified into two categories
o Strong entity:- An entity that has primary key of its own
o Weak entity:-: an entity that depends for primary key on another table
Ex:-
E-name
E-add D-Name
E-id
D-age
EMPLOYEE HAS DEPENDENT
In the above example the entity EMPLOYEE has an attribute E-id that can qualify
as primary key therefore it is a strong entity.
The entity DEPENDENT is not possessing any attribute that can qualify as a
primary key therefore it is a weak entity
As per the relational Database rules every entity should possess a primary key
therefore primary key for DEPENDENT entity was build using the primary key
of EMPLOYEE entity.
1
Attributes
Attribute describes the characteristics or properties of an entity.
An attribute is represented using an oval shape symbol.
Each table can have one or more attributes
Attributes are classified into different categories
o Required and optional attributes
o Primary key attribute
o Composite and simple attribute
o Single-valued or Multi-valued attributes
o Derived attributes
o Primary key(Identifier)
Required and Optional Attribute:-
o Required attribute:-An attribute that must have a value
These attributes does not allow NULL values
o Optional attribute:- An attribute that may or may not have a value
These attributes allows NULL values
S-Addr S-DOB
S-Name
S-Phone
S-No
STUDENT
Figure 1
In the above figure the attributes S-No, S-Name, S-Addr, S-DOB are required
attributes and the attribute S-Phone is an optional attribute.
Primary Key Attribute(Identifier):-
o One or more attributes that uniquely identify an entity instance
o Primary key attributes of an entity are underlined by solid line in ER
diagrams.
o Each entity has only one primary key.
o It does not allow null values
o In the above example S-No is a primary key for STUDENT entity
Composite and Simple Attribute:-
o Composite attribute:-An attribute that can be further subdivided to yield
additional attributes
2
o Simple attributes:-An attribute that cannot be further subdivided
In the figure 1 S-Name attribute is a composite attribute so it is subdivided into
FName MName,LName and the transformed entity is shown
MName
LName
S-DOB
FName S-Addr
S-Name S-Phone
S-No
STUDENT
Figure 2
Single-valued and Multi-valued attributes
o Single-valued attribute is an attribute that can take single value only.
o It need not be a simple attribute it can also be composite.
o Multi-valued attributes are attributes that can have many values
o They are shown by a double line connecting the attribute to the entity.
Ex:-
Job
E-Name E-Addr Skills DOB
E-No Age
EMPLOYEE
Figure 3
In the above example Skills is multi-valued attribute and the remaining attributes
are single valued attributes
Derived attributes:-
o An attribute whose value is derived from the other attributes value.
o They are referred as computed attributes
o Saves storage space but requires CPU cycles
o In the figure 3 the value of the age is a derived from DOB therefore age is
derived attribute.
Relationships:-
It is an association between entities.
Each relation is identified by a name that describes the relationship
A relationship is represented by diamond symbol.
3
The relationship name is active or passive verb
Relationship between entities always operate in both the directions
Connectivity and Cardinality of a relationship:-
The ER Model uses the term connectivity to label the type of relationship.
There are three types of relationships based on cardinality
One-One Relationship(1:1):
Implies one instance of entity set can be related to only one instance of another
entity set
Ex:-
HOD
EMPLOYEE DEPARTMENT
Figure 4
An employee can be head of only one DEPARTMENT
A DEPARTMENT will have one and only one HOD
One-Many Relationship(1:M):
Implies one instance of entity set can be related to many instances of another
entity set
Ex:-
Offers
DEPARTMENT COURSE
Figure 5
The department offers zero or more courses
Each course is offered by one and only one department.
Many- Many Relationship(M:N):
Implies many instances of entity set can be related to many instances of another
entity set.
Ex:-
Purchases
CUSTOMER PRODUCT
Figure 6
A customer may purchase one or more products.
4
A product can be purchased by more than one customer.
Existence Dependence:
An entity is said to be existence dependent if it can exist in the database only
when it is associated with another entity occurrence.
In implementation terms, an entity is existence dependent if it has a mandatory
foreign key.
If an entity can exist apart from one or more related entities then it is referred as
existence-independent.
Ex:-
Has
EMPLOYEE DEPENDENT
Figure 7
In the above example the entity DEPENDENT is existence dependent on the
entity EMPLOYEE.
Relationship Strength :-
Relationship strength is based on the primary key of the participating entities.
Weak(Non-identifying) Relationships
o A weak relationship also known as a non-identifying relationship, exists if
the participating entities have attributes that qualify as Primary Key.
o Ex:- Figure 8
Crs-name Class-id
Crs-Id
Class-name
Generates
COURSE CLASS
In the above example both the entities are having attributes that qualify
themselves as primary key, therefore the relationship GENERATES is weak
relationship.
Strong(Identtifying ) Relationships
A strong relationship also known as a identifying relationship, exists if one of
the participating entity does not have any attributes that qualify themselves as
Primary Key.
So the primary key of such an entity is formed using the primary key of other
participating entities.
5
Ex:-
E-name
E-add D-Name
E-id
D-age
Has
EMPLOYEE DEPENDENT
Figure 9
In the above example the entity DEPENDENT does not have any attribute that
can become primary key.So the attribute E-id of EMPLOYEE becomes the
primary key for the DEPENDENT entity.
Relationship participation:-
Participation in an entity relationship is either optional or mandatory.
Optional participation means that one entity occurrence does not require a
corresponding entity occurrence in a particular relationship.
Mandatory participation means that one entity occurrence requires a
corresponding entity occurrence in a particular relationship.
Ex:- In the Figure 9
An employee may not have a dependent, so relationship participation is optional.
Each dependent is associated compulsory with one employee, therefore the
relationship participation is mandatory.
Relationship Degree:-
A relationship degree indicates the number of entities or participants associated
with a relationship..
There are three types of relationships based on the degree, they are
o Unary relationship
o Binary relationship
o Ternary relationship
Unary Relationship:-
In unary relationship the entity has relationship with itself.
One instance of the entity is related with the another instance of the same entity.
It is also called recursive relationship.
6
Manages
EMPLOYEE
Figure 10
Employee manages zero or more employees.
Each employee is managed by only one manager
Binary Relationship:-
A binary relationship exists when two entities are associated in a relationship.
A binary relationship can be weak or strong based on the participating entities.
Ex:-
Purchases
CUSTOMER
PRODUCT
Figure 11
Each customer can purchase one or more products.
Each product can be purchased by more than one
Ternary relationship:-
A simultaneous relationship that exists between instances of three entities is
called ternary relationship.
Case 1:
Pres_Date
Prescription
DOCTOR PATIENT
DRUG
7
A Doctor writes one or more prescriptions.
A Patient may receive one or more prescriptions.
A Drug may appear in one or more prescriptions.
Prescription is an associative entity since many to many relationships exist
between participating entities.
Case:2
ShippingMode Unit_cost
Supplies
VENDOR WAREHOUSE
PART
A Vendor may supply one or more Parts to a Warehouse.
A Warehouse may receive one or more Parts from Vendor.
A Part may be supplied by one or more Vendor to the Warehouse.
Supplies is an associative entity since many to many relationships exist between
participating entities.
Recursive Relationships:-
A relationship that is exists between the instances of the same entity is called
recursive relationship.
All unary relationships are recursive.Ex:-
Married
PERSON
Figure 13
A person is married to zero or one person.
Each person can marry only one person.
8
Associative Entity:-
o The associative entity is used to implement M:N relationships between two or
more entities.
o This entity is composed of the primary key of the participating entities and any
attribute of the relationship.
o Ex:-
Enroll
ENROLLS
STUDENT COURSE
Figure 14
o A STUDENT may enroll to one or more courses.
o Each course may be enrolled by one or more students.
o The relationship between student and course is M:N relationship. So the relation
ENROLLS is converted into associative entity. The new entity ENROLL takes
the primary key of the participating entities as its attributes.
o The modified ER diagram is as follows
Has ENROLL Found
STUDENT CLASS
o The details of the student who have enrolled into a course are present in the entity
ENROLL.
o A student may enroll into more than one courses, then the student will occur
more than once in the ENROLL entity. Each course may be enrolled by more than
one students, so class will occur more than once in the ENROLL entity.
9
THE EXTENDED ENTITY RELATIONSHIP MODEL
As the complexity of the data structures being modeled has increased and as
application software requirements havebecome more stringent, there has been an
increasing need to capture more information in the data model.
The extended entity relationship model (EERM), sometimes referred to as the
enhanced entity relationship model, is the result of adding more semantic
constructs to the original entity relationship (ER) model.
A diagram using this model is called an EER diagram (EERD).
Entity Supertypes and Subtypes
Entity Supertype:
An entity supertypeis a generic entity type that is related to one or more entity
subtypes.
An entity supertype contains the common characteristics.
A supertype entity can have relationship with another entity.
Entity Subtype:
An entity subtype is a specific entity type that is related to one and only one
entity supertype.
An entity subtypes contain the unique characteristics of each entity subtype.
A subtype entity can also have relationship with another entity.
Example:
The aviation business employs pilots, mechanics, secretaries, accountants,
database managers, and many other types of employees.
Because most employees possess a wide range of skills and special qualifications,
data modelers must find a variety of ways to group employees based on employee
characteristics.
The grouping of employees to create various types of employees provides two
important benefits:
o It avoids unnecessary nulls in the employee attributes when some
employees have characteristics that are not shared by other employees.
o It enables a particular employee type to participate in relationships that are
unique to that employee type.
Figure 6.1
Specialization Hierarchy
10
Entity supertypes and subtypes are organized in a specialization hierarchy, which
depicts the arrangement of higher-level entity supertypes (parent entities) and lower-level
entity subtypes (child entities).
Figure 6.2 shows the specialization hierarchy formed by an EMPLOYEE supertype and
three entity subtypes—PILOT, MECHANIC, and ACCOUNTANT.
The specialization hierarchy reflects the 1:1 relationship between EMPLOYEE and its
subtypes. Forexample, a PILOT subtype occurrence is related to one instance of the
EMPLOYEE supertype, and a MECHANIC subtype occurrence is related to one instance
of the EMPLOYEE supertype.
Figure 6.2
A specialization hierarchy provides the means to:
o Support attribute inheritance.
o Define a special supertype attribute known as the subtype discriminator.
o Define disjoint/overlapping constraints and complete/partial constraints.
Inheritance
The property of inheritance enables an entity subtype to inherit the attributes and
relationships of the supertype.
One important inheritance characteristic is that all entity subtypes inherit
theirprimary key attribute from their supertype.
At the implementation level, the supertype and its subtype(s) depicted in the
specialization hierarchy maintain a 1:1 relationship.
Subtype Discriminator
11
A subtype discriminator is the attribute in the supertype entity that determines to
which subtype the supertype occurrence is related.
It is common practice to show the subtype discriminator and its value for each
subtype in the ER diagram,
It’s important to note that the default comparison condition for the subtype
discriminator attribute is the equality comparison.
Disjoint and Overlapping Constraints
An entity supertype can have disjoint or overlapping entity subtypes.
Disjoint subtypes, also known as non-overlapping subtypes, are subtypes that
contain a
uniquesubset of the supertype entity set; in other words, each entity instance of
the supertype can appear in only one of the subtypes.
Overlapping subtypes are subtypes that contain nonunique subsets of the
supertype entity set; that is, each entity instance of the supertype may appear in
more than one subtype.
For example, in a university environment, a person may be an employee or a
student or both. In turn, an employee may be a professor as well as an
administrator.
Completeness Constraint
The completeness constraint specifies whether each entity supertype occurrence
must also be a member of at least one subtype. The completeness constraint can
be partial or total.
Partial completeness (symbolized by a circle over a single line) means that not
every supertype occurrence is a member of a subtype; that is, there may be some
supertype occurrences that are not members of any subtype.
Total completeness (symbolized by a circle over a double line) means that every
supertype occurrence must be a member of at least one subtype.
12
EXTENDED ENTITY RELATIO NORMALIZATION OF DATABASE
TABLES
Normalization:- Normalization is a process for evaluating and correcting table’s
structures to minimize data redundancy thereby avoiding the occurring of data anomalies.
Need for Normalization
To minimize data redundancy
To avoid data anomalies resulting during insert, update or delete operations.
The Normalization Process
The objective of normalization is to ensure that each table conforms to the
concept of well defined relations that satisfy the following characteristics
Each table represents a single subject
No data item will be unnecessarily stored in more than one table.
All nonprime attributes in a table are dependent on the primary key.
Each table should not exhibit insert , update and delete anomalies.
Normalization process takes us through the steps that lead us through normal
form to accomplish the above objective.
Functional Dependency:- the attribute B is fully
Normal form:- Normalization works through a series of stages called normal form.
Following are the different normal forms:-
First normal form(1NF)
Second normal form(2NF)
Third normal form(3NF)
Boyce Codd normal form(BCNF)
Fourth normal form(4NF)
First Normal Form(1NF)
It is performed by the three steps
1. Eliminate the repeating groups.
2. Identifying the primary key
3. Identify all the functional dependencies
If the non-key attributes depending fully on entire primary key then it is
fully functional dependent.
If the non-key attributes are partially depending on primary key then it is
partial functional dependent.
When attribute A is depending on attribute B and B is depending on C
then there exists transitive dependency between A,B & C.
13
Ex: -
PROJ_NUM PROJ_NAME EMP-ID ENAME JOB CHG-HRS NO_HOUR
The primary key for the above table was made up of attributes
(PROJ_NUM, EMP-ID )
Functional dependencies:-
1. (PROJ_NUM, EMP-ID ) NO_HOUR
2. EMP-ID ENAME,JOB,CHG_HRS
3. PROJ_NUM PROJ_NAME
4. JOB CHG_HRS
The attribute CHG-HRS is depending functionally on JOB and JOB is
depending on EMP-ID then we can say transitive dependence exist between
EMP-ID, JOB & CHG-HOUR
Second Normal Form(2NF):-
A relation or a table is in second normal form if it is satisfies the following
conditions
o In should be in first normal from(1NF)
o Every non-key attribute should fully functionally dependent on the
primary key.
If any relation or table is not satisfying the above conditions it is said not to be
2NFand the following steps are to convert the relation into the second normal
form:-
o Step:-1 write each key component on a separate line.
o Step:-2 Assign corresponding dependent attributes
o Step3:create a separate table for each determinant ad its dependencies.
Ex:-
The above table is not 2NF as it is not satisfying rule-2 i.e. the attributes
ENAME,JOB,CHG_HRS, PROJ_NAME are exhibiting partial dependencies.
14
The table is restructured by following the above three steps.
The resultant tables are
PROJECT
PROJ_NUM PROJ_NAME
ASSIGNMENT
PROJ_NUM EMP-ID NO_HOUR
EMPLOYEE
EMP-ID ENAME JOB CHG-HRS
Third Normal Form(3NF):-
A relation is in third normal form (3NF) if it satisfies the following conditions.
1. The relation should be in 2 NF
2. Transitive dependency between the attributes should not exist.
If a table is in 2 NF and it exhibits transitive dependency it can be corrected by
the following steps
1. Step1:- Identify each new determinant
2. Step2:- Identify the dependent attribute
3. Step3:- Remove the dependent attribute from the table and construct new
table with determinant and the dependents.
Ex:-
EMPLOYEE
EMP-ID ENAME JOB CHG-HRS
The above table is in 2NF
15
In the EMPLOYEE table The attribute CHG-HRS is depending functionally on
JOB and JOB is depending on EMP-ID then we can say transitive dependence
exist between EMP-ID, JOB & CHG-HRS
So the above table is corrected by following the above steps .the resultant tables
are
EMPLOYEE
EMP-ID ENAME JOB
JOB-DETAILS
JOB CHG-HRS
THE BOYCE-CODD NORMAL FORM (BCNF)
A table is in Boyce-Codd Normal Form (BCNF) when every determinant in the
table is a candidate key.
When a table contains only one candidate key then 3NF and BCNF are
equivalent.
When a table contains more than one candidate key then the table need to be
corrected so that it contains only one candidate key.
Ex:-
The attributes A & B together form a primary key. But it is identified that
the non
key attribute C is determining the attribute B which is the part of primary
key. The
entity has two candidate keys hence not in BCNF.
The above entity is corrected using the following steps
Create a table that contains attributes (A,C,D) i.e the part of primary key, new
candidate key and other non key attribute.
Create the second table that contains (C,B) i.e. the new candidate key and its
dependent.
16
NL Denormalization
During Normalization process the tables are split into smaller tables in order to
minimize redundant data.
Sometimes to solve our query we need to access data from more than one table.
We would do a join operation between these multiple tables that consumes more
CPU time. Therefore overall time required to get the response increases.
Denormalization is a database optimization technique in which two are more
tables which are very frequently accessed together are combined into single
table .
It is a strategy that database managers use to increase the performance of a
database structure.
It is an optimization technique that is applied after doing normalization.
This can help us avoid costly joins in a relational database.
Queries to retrieve can be simpler
Gives scope for Insert, Update and Delete Anomalies
Not suitable for tables that have frequent Updates
17