KEMBAR78
Unit 1 Notes | PDF | Databases | Relational Database
0% found this document useful (0 votes)
26 views48 pages

Unit 1 Notes

Uploaded by

preethi.m
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
26 views48 pages

Unit 1 Notes

Uploaded by

preethi.m
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 48

VEL TECH HIGH TECH

Dr. RANGARAJAN Dr. SAKUNTHALA ENGINEERING COLLEGE


An Autonomous Institution
Approved by AICTE-New Delhi, Affiliated to Anna University, Chennai
Accredited by NBA, New Delhi & Accredited by NAAC with ―A‖ Grade & CGPA of 3.27

Course code 21HC34T Semester III


Category PROFESSIONAL CORE COURSE (PCC) L T P C
DATABASE MANAGEMENT SYSTEMS
3 0 0 3
Course Title (Common to CSE, AI&DS)

SYLLABUS No. of Credits: 3

UNIT I INTRODUCTION TO RELATIONAL DATABASES 9

Introduction, History and Purpose of Database System – Introduction to Big data– Database Schema
and Instances- Views of data – Data Models – Database System Architecture – Introduction to
relational databases Relational Model – Keys – Relational Algebra.

COURSE OBJECTIVES:

The objective of this course is to prepare the students

● To understand the fundamentals of data models and to represent a database system using
ER diagrams and relational database design..

COURSE OUTCOMES:

CO. Course Blooms


No. Outcome level
On successful completion of this Course, students will be able to
Understand the applications of database models on various
C304. K2
1 parameters.
UNIT-1
UNIT I INTRODUCTION TO RELATIONAL DATABASES
Introduction, History and Purpose of Database System – Introduction to Big
data– Database Schema and Instances- Views of data – Data Models –
Database System Architecture – Introduction to relational databases
Relational Model – Keys – Relational Algebra.

INTRODUCTION TO DATABASE SYSTEM

Database is a collection of related data. Database management system is software designed to


assist the maintenance and utilization of large scale collection of data.

Data
• Data is raw fact or figures or entity.
• When activities in the organization takes place, the effect of these activities need to be
recorded which is known as Data.

Information
• Processed data is called information
• The purpose of data processing is to generate the information required for carrying out the
business activities.

In general data management consists of following tasks

• Data capture: Which is the task associated with gathering the data as and when they
originate.

• Data classification: Captured data has to be classified based on the nature and intended usage.

• Data storage: The segregated data has to be stored properly.

• Data arranging: It is very important to arrange the data properly

• Data retrieval: Data will be required frequently for further processing, Hence it is very
important to create some indexes so that data can be retrieved easily.

• Data maintenance: Maintenance is the task concerned with keeping the data upto-date.

• Data Verification: Before storing the data it must be verified for any error.

• Data Coding: Data will be coded for easy reference. • Data Editing: Editing means re-
arranging the data or modifying the data for presentation.

• Data transcription: This is the activity where the data is converted from one form into
another.
• Data transmission: This is a function where data is forwarded to the place where it would be
used further.
Database

• Database may be defined in simple terms as a collection of data


• A database is a collection of related data.
• The database can be of any size and of varying complexity.
• A database may be generated and maintained manually or it may be computerized.

Database Management System

• A Database Management System (DBMS) is a collection of program that enables user to


create and maintain a database.
• The DBMS is hence a general purpose software system that facilitates the process of defining
constructing and manipulating database for various applications.

Characteristics of DBMS
• To incorporate the requirements of the organization, system should be designed for easy
maintenance.
• Information systems should allow interactive access to data to obtain new information without
writing fresh programs.
• System should be designed to co-relate different data to meet new requirements.
• An independent central repository, which gives information and meaning of available data is
required.
• Integrated database will help in understanding the inter-relationships between data stored in
different applications.
• The stored data should be made available for access by different users simultaneously.
• Automatic recovery feature has to be provided to overcome the problems with processing
system failure.

DBMS Utilities
• A data loading utility: Which allows easy loading of data from the external format without
writing programs.
• A backup utility: Which allows to make copies of the database periodically to help in cases
of crashes and disasters.
• Recovery utility: Which allows to reconstruct the correct state of the database from the
backup and history of transactions.
• Monitoring tools: Which monitors the performance so that internal schema can be changed
and database access can be optimized.
• File organization: Which allows restructuring the data from one type to another?

Database Management System (DBMS) and Its Applications:

A Database management system is a computerized record-keeping system. It is a repository or


a container for collection of computerized data files.
The overall purpose of DBMS is to allow the users to define, store, retrieve and update
the information contained in the database on demand. Information can be anything that is of
significance to an individual or organization.
Databases touch all aspects of our lives. Some of the major areas of application are as
follows:

● Banking
● Airlines
● Universities
● Manufacturing and selling
● Human resources

PURPOSE OF DATABASE SYSTEM


Database systems arose in response to early methods of computerized management of
commercial data. As an example of such methods, typical of the 1960s, consider part of a
university organization that, among other data, keeps information about all instructors, students,
departments, and course offerings.
One way to keep the information on a computer is to store it in operating system files.
To allow users to manipulate the information, the system has a number of application
programs that manipulate the files, including programs to:\
1.Add new students, instructors, and courses 
2.Register students for courses and generate class rosters 
3.Assign grades to students, compute grade point averages (GPA), and generate transcripts

System programmers wrote these application programs to meet the needs of the
university. New application programs are added to the system as the need arises. For example,
suppose that a university decides to create a new major (say, computer science).
As a result, the university creates a new department and creates new permanent files (or
adds information to existing files) to record information about all the instructors in the
department, students in that major, course offerings, degree requirements, etc.
The university may have to write new application programs to deal with rules specific to
the new major. New application programs may also have to be written to handle new rules in
the university. Thus, as time goes by, the system acquires more files and more application
programs.
This typical file-processing system is supported by a conventional operating system. The
system stores permanent records in various files, and it needs different application programs to
extract records from, and add records to, the appropriate files. Before database management
systems (DBMSs) were introduced, organizations usually stored information in such systems.

Keeping organizational information in a file processing v system has a number of major


disadvantages:

Data redundancy and inconsistency:


Since different programmers create the files and application programs over a long period,
the various files are likely to have different structures and the programs may be written in
several programming languages. Moreover, the same information may be duplicated in several
places (files).
For example, if a student has a double major (say, music and mathematics) the address
and telephone number of that student may appear in a file that consists of student records of
students in the Music department and in a file that consists of student records of students in the
Mathematics department.
This redundancy leads to higher storage and access cost. In addition, it may lead to data
inconsistency; that is, the various copies of the same data may no longer agree. For example, a
changed student address may be reflected in the Music department records but not elsewhere in
the system.

Difficulty in accessing data:


Suppose that one of the university clerks needs to find out the names of all students who
live within a particular postal-code area.
The clerk asks the data-processing department to generate such a list. Because the
designers of the original system did not anticipate this request, there is no application program
on hand to meet it. There is, however, an application program to generate the list of all students.
The university clerk has now two choices: either obtain the list of all students and
extract the needed information manually or ask a programmer to write the necessary application
program. Both alternatives are obviously unsatisfactory.
Suppose that such a program is written, and that, several days later, the same clerk needs
to trim that list to include only those students who have taken at least 60 credit hours. As
expected, a program to generate such a list does not exist.
Again, the clerk has the preceding two options, neither of which is satisfactory. The point
here is that conventional file-processing environments do not allow needed data to be retrieved
in a convenient and efficient manner. More responsive data-retrieval systems are required for
general use.

Data isolation:
Because data are scattered in various files, and files may be in different formats, writing
new application programs to retrieve the appropriate data is difficult.

Integrity problems:
The data values stored in the database must satisfy certain types of consistency
constraints. Suppose the university maintains an account for each department, and records the
balance amount in each account.
Suppose also that the university requires that the account balance of a department may
never fall below zero. Developers enforce these constraints in the system by adding appropriate
code in the various application programs. However, when new constraints are added, it is
difficult to change the programs to enforce them. The problem is compounded when constraints
involve several data items from different files.

Atomicity problems:
A computer system, like any other device, is subject to failure. In many applications, it is
crucial that, if a failure occurs, the data be restored to the consistent state that existed prior to
the failure.

Consider a program to transfer $500 from the account balance of department A to the
account balance of department B. If a system failure occurs during the execution of the
program, it is possible that the $500 was removed from the balance of department A but was not
credited to the balance of department B, resulting in an inconsistent database state. Clearly, it is
essential to database consistency that either both the credit and debit occur, or that neither occur.

That is, the funds transfer must be atomic—it must happen in its entirety or not at all. It is
difficult to ensure atomicity in a conventional file-processing system.

Concurrent-access anomalies:
For the sake of overall performance of the system and faster response, many systems
allow multiple users to update the data simultaneously. Indeed, today, the largest Internet
retailers may have millions of accesses per day to their data by shoppers. In such an
environment, interaction of concurrent updates is possible and may result in inconsistent data.
Consider department A, with an account balance of $10,000. If two department clerks debit the
account balance (by say $500 and $100, respectively) of department A at almost exactly the
same time, the result of the concurrent executions may leave the budget in an incorrect (or
inconsistent) state.
Suppose that the programs executing on behalf of each withdrawal read the old balance,
reduce that value by the amount being withdrawn, and write the result back. If the two programs
run concurrently, they may both read the value $10,000, and write back $9500 and $9900,
respectively. Depending on which one writes the value last, the account balance of department
A may contain either $9500 or $9900, rather than the correct value of $9400. To guard against
this possibility, the system must maintain some form of supervision.

But supervision is difficult to provide because data may be accessed by many different
application programs that have not been coordinated previously.

As another example:
suppose a registration program maintains a count of students registered for a course, in
order to enforce limits on the number of students registered. When a student registers, the
program reads the current count for the courses, verifies that the count is not already at the limit,
adds one to the count, and stores the count back in the database. Suppose two students register
concurrently, with the count at (say) 39. The two program executions may both read the value
39, and both would then write back 40, leading to an incorrect increase of only 1, even though
two students successfully registered for the course and the count should be 41.

Furthermore, suppose the course registration limit was 40; in the above case both students
would be able to register, leading to a violation of the limit of 40 students.

Security problems:
Not every user of the database system should be able to access all the data. For example,
in a university, payroll personnel need to see only that part of the database that has financial
information. They do not need access to information about academic records. But, since
application programs are added to the file-processing system in an ad hoc manner, enforcing
such security constraints is difficult.

These difficulties, among others, prompted the development of database systems. In what
follows, we shall see the concepts and algorithms that enable database systems to solve the
problems with file-processing systems.

Advantages of DBMS:

Controlling of Redundancy: Data redundancy refers to the duplication of data (i.e storing
same data multiple times). In a database system, by having a centralized database and
centralized control of data by the DBA the unnecessary duplication of data is avoided. It also
eliminates the extra time for processing the large volume of data. It results in saving the storage
space.
INTRODUCTION TO BIG DATA

Big Data is also data but with a huge size. Big Data is a term used to describe a collection of
data that is huge in volume and yet growing exponentially with time. In short such data is so
large and complex that none of the traditional data management tools are able to store it or
process it efficiently.

➤ ―Extremely large data sets that may be analyzed computationally to reveal patterns, trends
and association, especially relating to human behavior and interaction are known as Big Data.‖

Examples Of Big Data


Following are some the examples of Big Data-
 The New York Stock Exchange generates about one terabyte of new trade data per day.

 Social Media:
The statistic shows that 500+terabytes of new data get ingested into the databases of social
media site Facebook, every day. This data is mainly generated in terms of photo and video
uploads, message exchanges, putting comments etc.
 A single Jet engine can generate 10+terabytes of data in 30 minutes of flight time. With many
thousand flights per day, generation of data reaches up to many Petabytes.
Unlike structured data, which is stored in data warehouses, unstructured data is stored in data
lakes. Data lakes preserve the raw format of data as well as all of its information. Data lakes
make data more malleable, unlike data warehouses where data is limited to its defined schema.

Benefits or advantages of Big Data


Following are the benefits or advantages of Big Data:

Big data analysis derives innovative solutions. Big data analysis helps in understanding
and targeting customers. It helps in optimizing business processes.

It helps in improving science and research.

It improves healthcare and public health with availability of record of patients.

It helps in financial tradings, sports, polling, security/law enforcement etc.

Any one can access vast information via surveys and deliver anaswer of any query.

Every second additions are made.

One platform carry unlimited information.

Drawbacks or disadvantages of Big Data


● Following are the drawbacks or disadvantages of Big Data:
Traditional storage can cost lot of money to store big data.

Lots of big data is unstructured.

Big data analysis violates principles of privacy.

It can be used for manipulation of customer records.

It may increase social stratification.

Big data analysis is not useful in short run. It needs to be analyzed for longer duration to
leverage its benefits.

Big data analysis results are misleading sometimes.
Speedy updates in big data can mismatch
.
DATABASE SCHEMA AND INSTANCES

DATABASE SCHEMA
Schema is the overall description of the database.
The basic structure of how the data will be stored in the database is called schema.
A database schema is the skeleton structure of the database. It represents the logical view of the
entire database.
A schema contains schema objects like table, foreign key, primary key, views, columns, data
types, stored procedure, etc.
A database schema can be represented by using the visual diagram. That diagram shows the
database objects and relationship with each other.
TYPES OF DB SCHEMA
1.Physical schema
2.Logical schema
3.View schema

Physical schema:
Physical schema is a term used in data management to describe how data is to be represented
and stored (files, indices, et al.) in secondary storage using a particular database management
system (DBMS) (e.g., Oracle RDBMS, Sybase SQL Server, etc.).

Logical schema:
A logical schema can be defined as the design of the database at its logical level. In this level,
the programmers, as well as the database administrator (DBA), work. At this level, data can be
described as certain types of data records that can be stored in the form of data structures.

View schema:
View schema can be defined as the design of the database at the view level, which generally
describes end-user interaction with database systems.
Example:
Let suppose you are storing students' information on a student's table. At the physical level,
these records are described as chunks of storage (in bytes, gigabytes, terabytes, or higher) in
memory, and these elements often remain hidden from the programmers.
Then comes the logical level; here at a logical level, these records can be illustrated as
fields and attributes along with their data type(s); their relationship with each other can be
logically implemented. Programmers generally work at this level because they are aware of
such things about database systems.
At view level, a user can able to interact with the system, with the help of GUI, and enter
the details on the screen. The users are not aware of the fact of how the data is stored and
what data is stored; such features are hidden from them.

DATABASE INSTANCES
The data which is stored in the database at a particular moment of time is called an instance of
the database.
One can easily change these instances using certainoperations, such as deletion and addition of
data and information.
search queries will not make any changes in any instances.
An instance is also called a current state or database state. The database schema that defines
variables in tables which belong to a specific database, the records of these variables at a
particular moment are called the instance of the database.
Example –
Let’s say a table teacher in our database whose name is School, suppose the table has 50 records
so the instance of the database has 50 records for now and tomorrow we are going to add
another fifty records so tomorrow the instance have total 100 records. This is called an
instance.

VIEWS OF DATA
Views of data refer to the different ways to show the database management system to the user
hiding its complexity.

Data Abstraction
Data Abstraction refers to the process of hiding irrelevant details from the user.
Example: If we want to access any mail from our Gmail then we don't know where that data is
physically stored i.e is the data present in India or USA or what data model has been used to
store that data? We are not concerned about these things. We are only concerned with our email.
So, information like these i.e. location of data and data models are irrelevant to us and in data
abstraction, we do this only. Apart from the location of data and data models, there are other
factors that we don't care of. We hide the unnecessary data from the user and this process of
hiding unwanted data is called Data Abstraction.

Three levels of data abstraction:


1.View Level
2.Conceptual Level
3.Physical Level
Physical Level:
The physical or the internal level schema describes how the data is stored in the
hardware. It also describes how the data can be accessed. The physical level shows the data
abstraction at the lowest level and it has complex data structures. Only the database
administrator operates at this level.
It tells the actual location of the data that is being stored by the user. The Database
Administrators(DBA) decide that which data should be kept at which particular disk drive, how
the data has to be fragmented, where it has to be stored etc. They decide if the data has to be
centralized or distributed. Though we see the data in the form of tables at view level the data
here is actually stored in the form of files only. It totally depends on the DBA, how he/she
manages the database at the physical level.
Logical Level:
It is a level above the physical level. Here, the data is stored in the form of the entity
set, entities, their data types, the relationship among the entity sets, user operations performed to
retrieve or modify the data and certain constraints on the data. Well adding constraints to the
view of data adds the security. As users are restricted to access some particular parts of the
database.
Example: Let us take an example where we use the relational model for storing the data. We
have to store the data of a student, the columns in the student table will be student_name, age,
mail_id, roll_no etc. We have to define all these at this level while we are creating the database.
Though the data is stored in the database but the structure of the tables like the student table,
teacher table, books table, etc are defined here in the conceptual level or logical level. Also,
how the tables are related to each other are defined here. Overall, we can say that we are
creating a blueprint of the data at the conceptual level.
View Level:
It is called the highest level of abstraction or external level. It describes the user
interaction with database systems via application programs, that hide details of data types. It
shows only a part of the entire database.
Example:
If we have a login-id and password in a university system, then as a student, we can view our
marks, attendance, fee structure, etc. But the faculty of the university will have a different view.
He will have options like salary, edit marks of a student, enter attendance of the students, etc.
So, both the student and the faculty have a different view. By doing so, the security of the
system also increases. In this example, the student can't edit his marks but the faculty who is
authorized to edit the marks can edit the student's marks. Similarly, the dean of the college or
university will have some more authorization and accordingly, he will has his view. So,
different users will have a different view according to the authorization they have.

Data Independence

Data independence defines the extent to which the data schema can be changed at one
level without modifying the data schema at the next level.

Data independence can be classified as shown below:

1.Logical Data Independence:

Logical data independence describes the degree up to which the logical or conceptual
schema can be changed without modifying the external schema. Now, a question arises what is
the need to change the data schema at a logical or conceptual level?
Well, the changes to data schema at the logical level are made either to enlarge or reduce the
database by adding or deleting more entities, entity sets, or changing the constraints on data.

2.Physical Data Independence:


Physical data independence defines the extent up to which the data schema can be
changed at the physical or internal level without modifying the data schema at logical and view
level.
Well, the physical schema is changed if we add additional storage to the system or we
reorganize some files to enhance the retrieval speed of the records.

Mappings
Process of transforming request and results between three level it's called mapping.
There are the two types of mappings:
1.Conceptual/Internal Mapping
2.External/Conceptual Mapping
1. Conceptual/Internal Mapping:
The conceptual/internal mapping defines the correspondence between the conceptual
view and the store database.
It specifies how conceptual record and fields are represented at the internal level.
It relates conceptual schema with internal schema.
If structure of the store database is changed.
If changed is made to the storage structure definition-then the conceptual/internal mapping must
be changed accordingly, so that the conceptual schema can remain invariant.
There could be one mapping between conceptual and internal levels.

2. External/Conceptual Mapping:
The external/conceptual mapping defines the correspondence between a particular
external view and conceptual view.
It relates each external schema with conceptual schema.
The differences that can exist between these two levels are analogous to those that can exist
between the conceptual view and the stored database.
Example: fields can have different data types; fields and record name can be changed; several
conceptual fields can be combined into a single external field.
Any number of external views can exist at the same time; any number of users can share a given
external view: different external views can overlap.
There could be several mapping between external and conceptual levels.

DATA MODELS
Data Models are used to show how data is stored, connected, accessed and updated in the
database management system.

Types of data Models

1. Hierarchical Model

2. Network Model

3. Entity-Relationship Model

4. Relational Model

5. Object-Oriented Data Model

6. Object-Relational Data Model

7. Flat Data Model

8. Semi-Structured Data Model

9. Associative Data Model

10. Context Data Model


Hierarchical Model

Hierarchical Model was the first DBMS model. This model organises the data in the
hierarchical tree structure. The hierarchy starts from the root which has root data and then it
expands in the form of a tree adding child node to the parent node. This model easily represents
some of the real-world relationships like food recipes, sitemap of a website etc.

Example: We can represent the relationship between the shoes present on a shopping website
in the following way:

Features of a Hierarchical Model

1. One-to-many relationship: The data here is organised in a tree-like structure where the
one-to-many relationship is between the datatypes. Also, there can be only one path from
parent to any node. Example: In the above example, if we want to go to the
node sneakers we only have one path to reach there i.e through men's shoes node.
2. Parent-Child Relationship: Each child node has a parent node but a parent node can
have more than one child node. Multiple parents are not allowed.
3. Deletion Problem: If a parent node is deleted then the child node is automatically
deleted.
4. Pointers: Pointers are used to link the parent node with the child node and are used to
navigate between the stored data. Example: In the above example the 'shoes' node points
to the two other nodes 'women shoes' node and 'men's shoes' node.
Advantages of Hierarchical Model
● It is very simple and fast to traverse through a tree-like structure.

● Any change in the parent node is automatically reflected in the child node so, the
integrity of data is maintained.
Disadvantages of Hierarchical Model

● Complex relationships are not supported.

● As it does not support more than one parent of the child node so if we have some
complex relationship where a child node needs to have two parent node then that can't be
represented using this model.

● If a parent node is deleted then the child node is automatically deleted.

Network Model

This model is an extension of the hierarchical model. It was the most popular model
before the relational model. This model is the same as the hierarchical model, the only
difference is that a record can have more than one parent. It replaces the hierarchical tree with a
graph. Example: In the example below we can see that node student has two parents i.e. CSE
Department and Library. This was earlier not possible in the hierarchical model.

Features of a Network Model

1. Ability to Merge more Relationships: In this model, as there are more relationships so
data is more related. This model has the ability to manage one-to-one relationships as
well as many-to-many relationships.
2. Many paths: As there are more relationships so there can be more than one path to the
same record. This makes data access

fast and simple.


3. Circular Linked List: The operations on the network model are done with the help of
the circular linked list. The current position is maintained with the help of a program and
this position navigates through the records according to the relationship.
Advantages of Network Model

● The data can be accessed faster as compared to the hierarchical model. This is because
the data is more related in the network model and there can be more than one path to
reach a particular node. So the data can be accessed in many ways.

● As there is a parent-child relationship so data integrity is present. Any change in parent


record is reflected in the child record.
Disadvantages of Network Model

● As more and more relationships need to be handled the system might get complex. So, a
user must be having detailed knowledge of the model to work with the model.

● Any change like updation, deletion, insertion is very complex.

Entity-Relationship Model

Entity-Relationship Model or simply ER Model is a high-level data model diagram. In this


model, we represent the real-world problem in the pictorial form to make it easy for the
stakeholders to understand. It is also very easy for the developers to understand the system by
just looking at the ER diagram. We use the ER diagram as a visual tool to represent an ER
Model. ER diagram has the following three components:
● Entities: Entity is a real-world thing. It can be a person, place, or even a
concept. Example: Teachers, Students, Course, Building, Department, etc are some of the
entities of a School Management System.

● Attributes: An entity contains a real-world property called attribute. This is the


characteristics of that attribute. Example: The entity teacher has the property like teacher
id, salary, age, etc.

● Relationship: Relationship tells how two attributes are related. Example: Teacher works
for a department.
Example:

In the above diagram, the entities are Teacher and Department. The attributes of Teacher entity
are Teacher_Name, Teacher_id, Age, Salary, Mobile_Number. The attributes of
entity Department entity are Dept_id, Dept_name. The two entities are connected using the
relationship. Here, each teacher works for a department.

Features of ER Model

● Graphical Representation for Better Understanding: It is very easy and simple to


understand so it can be used by the developers to communicate with the stakeholders.

● ER Diagram: ER diagram is used as a visual tool for representing the model.

● Database Design: This model helps the database designers to build the database and is
widely used in database design.
Advantages of ER Model
● Simple: Conceptually ER Model is very easy to build. If we know the relationship
between the attributes and the entities we can easily build the ER Diagram for the model.

● Effective Communication Tool: This model is used widely by the database designers
for communicating their ideas.

● Easy Conversion to any Model: This model maps well to the relational model and can
be easily converted relational model by converting the ER model to the table. This model
can also be converted to any other model like network model, hierarchical model etc.
Disadvatages of ER Model

● No industry standard for notation: There is no industry standard for developing an ER


model. So one developer might use notations which are not understood by other
developers.

● Hidden information: Some information might be lost or hidden in the ER model. As it


is a high-level view so there are chances that some details of information might be
hidden.

Relational Model

Relational Model is the most widely used model. In this model, the data is maintained in the
form of a two-dimensional table. All the information is stored in the form of row and columns.
The basic structure of a relational model is tables. So, the tables are also called relations in the
relational model. Example: In this example, we have an Employee table.

Features of Relational Model

● Tuples: Each row in the table is called tuple. A row contains all the information about
any instance of the object. In the above example, each row has all the information about
any specific individual like the first row has information about John.
● Attribute or field: Attributes are the property which defines the table or relation. The
values of the attribute should be from the same domain. In the above example, we have
different attributes of the employee like Salary, Mobile_no, etc.
Advantages of Relational Model

● Simple: This model is more simple as compared to the network and hierarchical model.

● Scalable: This model can be easily scaled as we can add as many rows and columns we
want.

● Structural Independence: We can make changes in database structure without changing


the way to access the data. When we can make changes to the database structure without
affecting the capability to DBMS to access the data we can say that structural
independence has been achieved.
Disadvantages of Relatinal Model

● Hardware Overheads: For hiding the complexities and making things easier for the
user this model requires more powerful hardware computers and data storage devices.

● Bad Design: As the relational model is very easy to design and use. So the users don't
need to know how the data is stored in order to access it. This ease of design can lead to
the development of a poor database which would slow down if the database grows.
But all these disadvantages are minor as compared to the advantages of the relational model.
These problems can be avoided with the help of proper implementation and organisation.

Object-Oriented Data Model

The real-world problems are more closely represented through the object-oriented data model.
In this model, both the data and relationship are present in a single structure known as an object.
We can store audio, video, images, etc in the database which was not possible in the relational
model(although you can store audio and video in relational database, it is adviced not to store in
the relational database). In this model, two are more objects are connected through links. We
use this link to relate one object to other objects. This can be understood by the example given
below.
In the above example, we have two objects Employee and Department. All the data and
relationships of each object are contained as a single unit. The attributes like Name, Job_title of
the employee and the methods which will be performed by that object are stored as a single
object. The two objects are connected through a common attribute i.e the Department_id and the
communication between these two will be done with the help of this common id.

Object-Relational Model

As the name suggests it is a combination of both the relational model and the object-
oriented model. This model was built to fill the gap between object-oriented model and the
relational model. We can have many advanced features like we can make complex data types
according to our requirements using the existing data types. The problem with this model is that
this can get complex and difficult to handle. So, proper understanding of this model is required.

Flat Data Model

It is a simple model in which the database is represented as a table consisting of rows and
columns. To access any data, the computer has to read the entire table. This makes the modes
slow and inefficient.

Semi-Structured Model

Semi-structured model is an evolved form of the relational model. We cannot


differentiate between data and schema in this model.

Example: Web-Based data sources which we can't differentiate between the schema and
data of the website. In this model, some entities may have missing attributes while others may
have an extra attribute. This model gives flexibility in storing the data. It also gives flexibility to
the attributes. Example: If we are storing any value in any attribute then that value can be either
atomic value or a collection of values.
Associative Data Model

Associative Data Model is a model in which the data is divided into two parts. Everything
which has independent existence is called as an entity and the relationship among these entities
are called association. The data divided into two parts are called items and links.

● Item: Items contain the name and the identifier(some numeric value).

● Links: Links contain the identifier, source, verb and subject.


Example: Let us say we have a statement "The world cup is being hosted by London from 30
May 2020". In this data two links need to be stored:

1. The world cup is being hosted by London. The source here is 'the world cup', the verb 'is
being' and the target is 'London'.
2. ...from 30 May 2020. The source here is the previous link, the verb is 'from' and the
target is '30 May 2020'.
This is represented using the table as follows:

Context Data Model

Context Data Model is a collection of several models. This consists of models like
network model, relational models etc. Using this model we can do various types of tasks which
are not possible using any model alone.
DATABASE ARCHITECTURE

1. Naive / Parametric End Users :


Parametric End Users are the unsophisticated who don’t have any DBMS knowledge but
they frequently use the data base applications in their daily life to get the desired results.
For examples, Railway’s ticket booking users are naive users. Clerks in any bank is a naive user
because they don’t have any DBMS knowledge but they still use the database and perform their
given task.
Sophisticated Users :
Sophisticated users can be engineers, scientists, business analyst, who are familiar with the
database. They can develop their own data base applications according to their requirement.
They don’t write the program code but they interact the data base by writing SQL queries
directly through the query processor.
Application Programmer
The application programmer users who are responsible for developing the application programs
or user interface. The application programs could be written in high level language. For
example − Java, .net, phpetc,

DML query

A data manipulation language (DML) is a computer programming language used for adding
(inserting), deleting, and modifying (updating) data in a database.

DDL Interpreter
The DDL interpreter interprets DDL statements and records the definition in the data
dictionary.
DML Compiler
The DML compiler translates DML statements in a query language into an evaluation
plan consisting of low-level instructions that the query evaluation engine understands.

Query Evaluation Plan

Query evaluation engine executes low level instructions generated by the DML compiler.

Compiler & Linker


A Compiler translates lines of code from the programming language into machine
language.

A Linker creates a link between two programs.

Object code

The object code file contains a sequence of machine-readable instructions that is


processed by the CPU in a computer.

Storage Manager:

The storage manager is the component of a database system that provides the interface between
the low-level data stored in the database and the application programs and queries submitted to
the system.

Thus, the storage manager is responsible for storing, retrieving, and updating data in the
database. The storage manager components include:

• Authorization and integrity manager, which tests for the satisfaction of integrity constraints
and checks the authority of users to access data.

• Transaction manager, which ensures that the database remains in a consistent (correct) state
despite system failures, and that concurrent transaction executions proceed without conflicting.
• File manager, which manages the allocation of space on disk storage and the data structures
used to represent information stored on disk.

• Buffer manager, which is responsible for fetching data from disk storage into main memory,
and deciding what data to cache in main memory. The buffer manager is a critical part of the
database system, since it enables the database to handle data sizes that are much larger than the
size of main memory. The storage manager implements several data structures as part of the
physical system implementation:

• Data files, which store the database itself.

• Data dictionary, which stores metadata about the structure of the database, in particular the
schema of the database.

• Indices, which can provide fast access to data items. Like the index in this textbook, a
database index provides pointers to those data items that hold a particular value

INTRODUCTION TO RELATIONAL DATABASES:

A database is a means of storing information in such a way that information can be retrieved
from it. In simplest terms, a relational database is one that presents information in tables with
rows and columns. A table is referred to as a relation in the sense that it is a collection of objects
of the same type (rows). Data in a table can be related according to common keys or concepts,
and the ability to retrieve related data from a table is the basis for the term relational database.
RDBMS stands for Relational Database Management System. RDBMS is the basis for SQL,
and for all modern database systems like MS SQL Server, IBM DB2, Oracle, MySQL, and
Microsoft Access.
A Relational database management system (RDBMS) is a database management system
(DBMS) that is based on the relational model as introduced by E. F. Codd.

Table/Relation

Everything in a relational database is stored in the form of relations. The RDBMS


database uses tables to store data. A table is a collection of related data entries and contains
rows and columns to store data. Each table represents some real-world objects such as person,
place, or event about which information is collected. The organized collection of data into a
relational table is known as the logical view of the database.

Properties of a Relation:
o Each relation has a unique name by which it is identified in the database.
o Relation does not contain duplicate tuples.
o The tuples of a relation have no specific order.
o All attributes in a relation are atomic, i.e., each cell of a relation contains exactly one
value.
A table is the simplest example of data stored in RDBMS.
Let's see the example of the student table.
ID Name AGE COURSE

1 Ajeet 24 B.Tech

2 aryan 20 C.A

3 Mahesh 21 BCA

4 Ratan 22 MCA

5 Vimal 26 BSC
Row or record

A row of a table is also called a record or tuple. It contains the specific information of each
entry in the table. It is a horizontal entity in the table. For example, The above table contains 5
records.
Properties of a row:
o No two tuples are identical to each other in all their entries.
o All tuples of the relation have the same format and the same number of entries.
o The order of the tuple is irrelevant. They are identified by their content, not by their
position.

Let's see one record/row in the table.

ID Name AGE COURSE

1 Ajeet 24 B.Tech
Column/attribute
A column is a vertical entity in the table which contains all information associated with a
specific field in a table. For example, "name" is a column in the above table which contains all
information about a student's name.
Properties of an Attribute:
o Every attribute of a relation must have a name.
o Null values are permitted for the attributes.
o Default values can be specified for an attribute automatically inserted if no other value is
specified for an attribute.
o Attributes that uniquely identify each tuple of a relation are the primary key.

Name

Ajeet

Aryan

Mahesh

Ratan

Vimal
Data item/Cells
The smallest unit of data in the table is the individual data item. It is stored at the intersection of
tuples and attributes.
Properties of data items:
o Data items are atomic.
o The data items for an attribute should be drawn from the same domain.
In the below example, the data item in the student table consists of Ajeet, 24 and Btech, etc.

ID Name AGE COURSE

1 Ajeet 24 B.Tech

Degree:

The total number of attributes that comprise a relation is known as the degree of the table.
For example, the student table has 4 attributes, and its degree is 4.

ID Name AGE COURSE

1 Ajeet 24 B.Tech

2 aryan 20 C.A

3 Mahesh 21 BCA

4 Ratan 22 MCA

5 Vimal 26 BSC
Cardinality:
The total number of tuples at any one time in a relation is known as the table's cardinality. The
relation whose cardinality is 0 is called an empty table.
For example, the student table has 5 rows, and its cardinality is 5.

Domain:

The domain refers to the possible values each attribute can contain. It can be specified
using standard data types such as integers, floating numbers, etc. For example, An attribute
entitled Marital_Status may be limited to married or unmarried values.

NULL Values

The NULL value of the table specifies that the field has been left blank during record
creation. It is different from the value filled with zero or a field that contains space.

Relational Data Model

Relational Model

Relational Model (RM) represents the database as a collection of relations. A relation is


nothing but a table of values. Every row in the table represents a collection of related data
values. These rows in the table denote a real-world entity or relationship.
The table name and column names are helpful to interpret the meaning of values in each row.
The data are represented as a set of relations. In the relational model, data are stored as tables.
However, the physical storage of the data is independent of the way the data are logically
organized.

Some popular Relational Database management systems are:

● DB2 and Informix Dynamic Server – IBM


● Oracle and RDB – Oracle
● SQL Server and Access – Microsoft

Relational Model Concepts in DBMS

1. Attribute: Each column in a Table. Attributes are the properties which define a relation.
e.g., Student_Rollno, NAME,etc.
2. Tables – In the Relational model the, relations are saved in the table format. It is stored
along with its entities. A table has two properties rows and columns. Rows represent
records and columns represent attributes.
3. Tuple – It is nothing but a single row of a table, which contains a single record.
4. Relation Schema: A relation schema represents the name of the relation with its
attributes.
5. Degree: The total number of attributes which in the relation is called the degree of the
relation.
6. Cardinality: Total number of rows present in the Table.
7. Column: The column represents the set of values for a specific attribute.
8. Relation instance – Relation instance is a finite set of tuples in the RDBMS system.
Relation instances never have duplicate tuples.
9. Relation key – Every row has one, two or multiple attributes, which is called relation
key.
10. Attribute domain – Every attribute has some pre-defined value and scope which is
known as attribute domain
Properties of Relations

o Name of the relation is distinct from all other relations.


o Each relation cell contains exactly one atomic (single) value
o Each attribute contains a distinct name
o Attribute domain has no significance
o tuple has no duplicate value
o Order of tuple can have a different sequence

Integrity Constraints

o Integrity constraints are a set of rules. It is used to maintain the quality of information.
o Integrity constraints ensure that the data insertion, updating, and other processes have to
be performed in such a way that data integrity is not affected.
o Thus, integrity constraint is used to guard against accidental damage to the database.

Types of Integrity Constraint

1. Domain constraints

o Domain constraints can be defined as the definition of a valid set of values for an
attribute.
o The data type of domain includes string, character, integer, time, date, currency, etc. The
value of the attribute must be available in the corresponding domain.

Example:

2. Entity integrity constraints

o The entity integrity constraint states that primary key value can't be null.
o This is because the primary key value is used to identify individual rows in relation and
if the primary key has a null value, then we can't identify those rows.
o A table can contain a null value other than the primary key field.

Example:

3. Referential Integrity Constraints

o A referential integrity constraint is specified between two tables.


o In the Referential integrity constraints, if a foreign key in Table 1 refers to the Primary
Key of Table 2, then every value of the Foreign Key in Table 1 must be null or be
available in Table 2.

Example:
4. Key constraints

o Keys are the entity set that is used to identify an entity within its entity set uniquely.
o An entity set can have multiple keys, but out of which one key will be the primary key. A
primary key can contain a unique and null value in the relational table.

Example:

KEYS

o Keys play an important role in the relational database.


o It is used to uniquely identify any record or row of data from the table. It is also used to
establish and identify relationships between tables.

For example, ID is used as a key in the Student table because it is unique for each student. In
the PERSON table, passport_number, license_number, SSN are keys since they are unique for
each person.
Types of keys:

Primary key

o It is the first key used to identify one and only one instance of an entity uniquely. An
entity can contain multiple keys, as we saw in the PERSON table. The key which is most
suitable from those lists becomes a primary key.
o In the EMPLOYEE table, ID can be the primary key since it is unique for each
employee. In the EMPLOYEE table, we can even select License_Number and
Passport_Number as primary keys since they are also unique.
o For each entity, the primary key selection is based on requirements and developers.

2. Candidate key

o A candidate key is an attribute or set of attributes that can uniquely identify a tuple.
o Except for the primary key, the remaining attributes are considered a candidate key. The
candidate keys are as strong as the primary key.

For example: In the EMPLOYEE table, id is best suited for the primary key. The rest of the
attributes, like SSN, Passport_Number, License_Number, etc., are considered a candidate key.

3. Super Key

Super key is an attribute set that can uniquely identify a tuple. A super key is a superset of a
candidate key.

For example: In the above EMPLOYEE table, for(EMPLOEE_ID, EMPLOYEE_NAME), the


name of two employees can be the same, but their EMPLYEE_ID can't be the same. Hence, this
combination can also be a key.
The super key would be EMPLOYEE-ID (EMPLOYEE_ID, EMPLOYEE-NAME), etc.

4. Foreign key

o Foreign keys are the column of the table used to point to the primary key of another
table.
o Every employee works in a specific department in a company, and employee and
department are two different entities. So we can't store the department's information in
the employee table. That's why we link these two tables through the primary key of one
table.
o We add the primary key of the DEPARTMENT table, Department_Id, as a new attribute
in the EMPLOYEE table.
o In the EMPLOYEE table, Department_Id is the foreign key, and both the tables are
related.

5. Alternate key

There may be one or more attributes or a combination of attributes that uniquely identify each
tuple in a relation. These attributes or combinations of the attributes are called the candidate
keys. One key is chosen as the primary key from these candidate keys, and the remaining
candidate key, if it exists, is termed the alternate key. In other words, the total number of the
alternate keys is the total number of candidate keys minus the primary key. The alternate key
may or may not exist. If there is only one candidate key in a relation, it does not have an
alternate key.
For example, employee relation has two attributes, Employee_Id and PAN_No, that act as
candidate keys. In this relation, Employee_Id is chosen as the primary key, so the other
candidate key, PAN_No, acts as the Alternate key.

6. Composite key

Whenever a primary key consists of more than one attribute, it is known as a composite key.
This key is also known as Concatenated Key.
For example, in employee relations, we assume that an employee may be assigned multiple
roles, and an employee may work on multiple projects simultaneously. So the primary key will
be composed of all three attributes, namely Emp_ID, Emp_role, and Proj_ID in combination. So
these attributes act as a composite key since the primary key comprises more than one attribute.

Relational Algebra

Relational algebra is a procedural query language, which takes instances of relations as input
and yields instances of relations as output. It uses operators to perform queries. An operator can
be either unary or binary. They accept relations as their input and yield relations as their
output. Relational algebra is performed recursively on a relation and intermediate results are
also considered relations.
The fundamental operations of relational algebra are as follows −

● Select
● Project
● Union
● Set different
● Cartesian product
● Rename

Select Operation (σ)

It selects tuples that satisfy the given predicate from a relation.


Notation − σp(r)
Where σ stands for selection predicate and r stands for relation. p is prepositional logic formula
which may use connectors like and, or, and not. These terms may use relational operators like
− =, ≠, ≥, < , >, ≤.
For example −
σsubject = "database"(Books)
Output − Selects tuples from books where subject is 'database'.
σsubject = "database" and price = "450"(Books)
Output − Selects tuples from books where subject is 'database' and 'price' is 450.
σsubject = "database" and price = "450" or year >"2010"(Books)
Output − Selects tuples from books where subject is 'database' and 'price' is 450 or those books
published after 2010.

Project Operation (∏)

It projects column(s) that satisfy a given predicate.


Notation − ∏A1, A2, An (r)
Where A1, A2 ,An are attribute names of relation r.
Duplicate rows are automatically eliminated, as relation is a set.
For example −
∏subject, author (Books)
Selects and projects columns named as subject and author from the relation Books.

Union Operation (∪)

It performs binary union between two given relations and is defined as −


r ∪ s = { t | t ∈ r or t ∈ s}
Notation − r U s
Where r and s are either database relations or relation result set (temporary relation).
For a union operation to be valid, the following conditions must hold −

● r, and s must have the same number of attributes.


● Attribute domains must be compatible.
● Duplicate tuples are automatically eliminated.
∏ author (Books) ∪ ∏ author (Articles)
Output − Projects the names of the authors who have either written a book or an article or both.

Set Difference (−)

The result of set difference operation is tuples, which are present in one relation but are not in
the second relation.
Notation − r − s
Finds all the tuples that are present in r but not in s.
∏ author (Books) − ∏ author (Articles)
Output − Provides the name of authors who have written books but not articles.

Cartesian Product (Χ)

Combines information of two different relations into one.


Notation − r Χ s
σauthor = 'tutorialspoint'(Books Χ Articles)
Output − Yields a relation, which shows all the books and articles written by tutorialspoint.

Rename Operation (ρ)

The results of relational algebra are also relations but without any name. The rename operation
allows us to rename the output relation. 'rename' operation is denoted with small Greek
letter rho ρ.
Notation − ρ x (E)
Where the result of expression E is saved with name of x.

Additional Operations in Relational Algebra-

Additional operations in relational algebra are as below –

▪ Set Intersection operation


▪ Division operation
▪ Assignment operation
▪ Natural Join operation
1. Left Outer Join Operation
2. Right Outer Join Operation
3. Full Outer Join Operation

Tables used for Examples

Note – We are going to use below relation tables to show different dbms relational algebra
examples. These tables will be used for different examples shown in this post.

R-Schema(id, name)
R – Relation
Id Name

101 Raj

102 Rahul

103 Sachin

104 Anil
105 Prasad

S-Schema(id, name)
S – Relation

Id Name

101 Raj

104 Anil

106 Kapil

107 Sumit
The first additional relational algebra operations in dbms is Set intersection operation –

(1) Set Intersection Operation (∩)


Let’s say you have two relation tables A and B. Now, what if you are asked to find out the common
tuples in tables A and B. How can you achieve it?

To make this task easier, there is an operation defined in relational algebra. This is called set
intersection operation.

Set Intersection operation selects common tuples from the two relations. It is denoted by
(∩)

Necessary condition to apply set intersection operation –

For set intersection operation, the two-argument relation must be compatible relation.

So, if R and S are two relations, P = R ∩ S has tuples drawn from R and S, such that each tuple in P is in
R and S.

Notation of Set Intersection Operation

P=R∩S

Where, P = result of set intersection operation,


R and S are relations on which set intersection operation has been applied.

Consider R and S relation shown at start of the post,

Result after applying set intersection operation in R and S relations –

Id Name

101 Raj

104 Anil
It means only row (101, Raj) and (104, Anil) are common in two relations (R and S).

(2) Division Operation (÷)


Division operation is denoted by ÷ sign. It is useful in queries, which involve the phrase ―for all objects
having all the specified properties‖.
Let R (R-Schema) and S(S-Schema) are relations and any attribute of S – Schema is also in R –
Schema. The relation
R / S is a relation on schema R-Schema – S-Schema i.e. on the schema containing all the attributes of
Schema R that are not in Schema S.

A tuple t is in r ÷ s if and only if both the conditions hold.

▪ T is in πR – S (r)
▪ For every tuple ts in S, there is a tuple tr in R satisfying both of the following:

1. tr[s] = ts[s]
2. tr[R-S] = t
Or,
In other words, we can say
Division operation R ÷ S can only be applied if and only if –

▪ Attributes of S is proper subset of Attributes of R.


▪ The relation returned by division operator will have attributes = (All attributes of R – All
Attributes of S)
▪ The relation returned by division operator will return those tuples from relation R which are
associated to every S’s tuple.

Notation of Division Operation

P=R÷S

Where,
Pis result we get after applying division operator,
R and S stands for relation (name of the tables) on which division operation is applied.

Exercises on Division Operation


Let say Relation P is
A B

A1 B1

A1 B2

A2 B1

A3 B1

A4 B2

A5 B1
A5 B2
Now, answer below questions related to join operation –

Question A. Find P ÷ Q if Q is

B1

B2

Question B. Find P ÷ Q if Q is

B1

Question C. Find P ÷ Q if Q is

(3) Assignment operator (←)


Now we will see what is assignment operator in relation algebra.
Suppose you wish to assign result of an expression into a relation R. How would you denote it?
For such work, we use assignment operator (←).

Notation of Assignment Operator

Relational Variable ← Expression.

or,

R ← E.

Where,
Ris relation,
rstands for relation variable.
Eis Expression whose result we wish to assign to relation variable R.

The result of the expression to the right hand side of ← is assigned to relation variable on the left
side of ←. The
relation variable may be used in subsequent expressions.
R1 ← πname(Customer)

R2 ← πname(Employee)

R = R1 – R2

The fourth additional relational algebra operations in dbms is Natural Join operation
(4) Natural Join Operation (⋈)
It is denoted by the join symbol ⋈. The natural join operation forms a Cartesian product of its two
arguments,
performs selection forcing equality on those attributes that appear in both relation schema, and finally
removes
duplicate attributes. Natural join can be defined as:

Notation of Natural Join Operation

P=R⋈S

Where,
Pis resultant relation after applying natural join operation on R and S,
R and S stands for relation (name of the table).
Employee relation
Id Name

101 Sachin

103 Rahul

104 Kapil

107 Ajay

Salary relation

Id Salary

101 65000

103 35000

104 22000

107 21910
When we perform join operation on Relation Employee and Salary we get

Join Operation Result (Employee ⋈ Salary) is


Id Name Salary

101 Sachin 65000

103 Rahul 35000


104 Kapil 22000

107 Ajay 21910


The fifth additional relational algebra operations in dbms is Outer Join operation –

(5) Outer Join Operation


In this section, we will deal with outer join operations, i.e. why do we need outer join operation, when
to use it etc.

The outer join operation is an extension of the join operation to deal with missing information.
Suppose we have the relations with the following schemas which contain data on permanent doctors.

Doctors (doc-id, name)

Permanent-doc (doc-id, address, birthdate, sal)

Doctor relation
It contains id and name of the doctor.

Doc-id Name

1 Anil

2 Ganesh

3 Sunil

4 Reena

Permanent Document relation


It contains id, address, birth date and salary of doctors.

Doc-id Address Birthdate Sal

1 Pune 12/12/1970 20000

2 Mumbai 12/1/1970 20000

3 Nagpur 5/1/1970 30000

4 Nashik 5/1/1979 20000


Suppose we wish to generate a single relation with all the information (Name, address, birthdate, sal)
about Permanent doctors.

One approach would be to use Natural-Join operation as follows:

= (Doctors ⋈ Permanent-Doc)
The result of above is shown below :

Doc-id Name Address Birthdate Sal

1 Anil Pune 12/12/1970 20000

2 Ganesh Mumbai 12/1/1970 20000

4 Reena Nashik 5/1/1979 20000

It is seen that, we have information about Doctor with docid=3 and his address, birthdate, sal. Since the
tuple describing docid=3, is absent in Permanent-doc. Similarly we have lost the name of docid=5, since
the tuple describing docid=5, is absent from Doctors relation.

The outer join operation can be used to avoid this loss of information. There are 3 forms of the
operation.

1. Left outer join


2. Right outer join
3. Full outer join
All these 3 forms of join compute the join and add extra tuples to the result of join

Now, we will study about types of outer join operation –

Left Outer Join Operation (⟕)

The left outer join takes all tuples in the left relation that did not match with any tuple in the right
relation pads the tuples with NULL values for all other attributes from the right relation, and adds them
to the result of natural join.

Notation of Left Outer Join Operation

P = R ⟕S

Where,
Pis resultant relation after applying natural join operation on R and S,
R and S stands for relation (name of the table).

Result of the left outer join operation when applied on Doctors and Permanent-Doc relation –

Doc-id Name Address Birthdate Sal

1 Anil Pune 12/12/1970 20000


2 Ganesh Mumbai 12/1/1970 20000

4 Reena Nashik 5/1/1979 20000

3 Sunil NULL NULL NULL

Right Outer Join Operation (⟖)

The right outer join is symmetric with the left outer join. Tuples from the right relation that did not
match any from the left relation are padded with Nulls and added to the result of Natural join.

Notation of Right Outer Join Operation


P = R ⟖S
Where,
Pis resultant relation after applying natural join operation on R and S,
R and S stands for relation (name of the table).
Result of the right outer join operation when applied on Doctors and Permanent-Doc relation –

Doc-id Name Address Birthdate Sal

1 Anil Pune 12/12/1970 20000

2 Ganesh Mumbai 12/1/1970 20000

4 Reena Nashik 5/1/1979 20000

5 NULL Nagpur 5/1/1970 30000

Full Outer Join Operation


The full outer join does both of those operations, padding tuples from left relation that did not match
any right relation, as well as tuples from the right relation that did not match any from the left relation
and adding them to the result of join.
Notation of Right Outer Join Operation (⟗ )
P = R ⟗S
Where,
Pis resultant relation after applying natural join operation on R and S,
R and S stands for relation (name of the table).
Result of the full outer join operation when applied on Doctors and Permanent-Doc relation –

Doc-id Name Address Birthdate Sal

1 Anil Pune 12/12/1970 20000

2 Ganesh Mumbai 12/1/1970 20000

4 Reena Nashik 5/1/1979 20000


3 Sunil NULL NULL NULL

5 NULL Nagpur 5/1/1970 30000

You might also like