Introduction of DBMS (Database Management System) | Set 1
Important Terminology
Database: A database is a collection of inter-related data, organized in the form
of tables, views, schemas, reports etc., that supports efficient retrieval,
insertion and deletion of data. For example, a university database organizes
the data about students, faculty, admin staff etc. so that it can be retrieved,
inserted and deleted efficiently.
DDL is short for Data Definition Language, which deals with database
schemas and descriptions of how the data should reside in the database.
CREATE: create a database and its objects (tables, indexes, views, stored
procedures, functions and triggers)
ALTER: alter the structure of the existing database
DROP: delete objects from the database
TRUNCATE: remove all records from a table and release the space allocated
for those records
COMMENT: add comments to the data dictionary
RENAME: rename an object
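As a rough illustration of these DDL commands, the sketch below runs a few of them through Python's built-in sqlite3 module. The table and column names are invented for the example, and SQLite is only a stand-in DBMS: it expresses RENAME as ALTER TABLE ... RENAME TO and has no TRUNCATE or COMMENT statement, so those two are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# CREATE: define a new table (table and column names are illustrative)
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

# ALTER: change the structure of an existing table
conn.execute("ALTER TABLE student ADD COLUMN age INTEGER")

# RENAME: SQLite spells this as ALTER TABLE ... RENAME TO
conn.execute("ALTER TABLE student RENAME TO learner")

# Column names of the renamed table, read from SQLite's table_info pragma
columns = [row[1] for row in conn.execute("PRAGMA table_info(learner)")]
print(columns)  # ['roll_no', 'name', 'age']

# DROP: remove the table definition together with its data
conn.execute("DROP TABLE learner")
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(tables)  # []
```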
DML is short for Data Manipulation Language, which deals with data
manipulation and includes the most common SQL statements such as SELECT,
INSERT, UPDATE, DELETE etc. It is used to store, modify, retrieve, delete
and update data in a database.
SELECT: retrieve data from a database
INSERT: insert data into a table
UPDATE: updates existing data within a table
DELETE: delete records from a database table (all of them, or only those matching a condition)
MERGE: UPSERT operation (insert or update)
CALL: call a PL/SQL or Java subprogram
EXPLAIN PLAN: interpretation of the data access path
LOCK TABLE: concurrency control
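The four most common DML statements can be tried out with a small sketch like the one below, again using Python's built-in sqlite3 module as an assumed stand-in DBMS. The table layout and sample rows are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)

# INSERT: add rows to the table
conn.executemany(
    "INSERT INTO student (roll_no, name, age) VALUES (?, ?, ?)",
    [(1, "RAM", 18), (2, "RAMESH", 18), (3, "SUJIT", 20)],
)

# UPDATE: modify the rows that match the WHERE condition
conn.execute("UPDATE student SET age = 19 WHERE roll_no = 2")

# DELETE: remove only the rows that match the WHERE condition
conn.execute("DELETE FROM student WHERE roll_no = 3")

# SELECT: read back what is left
remaining = conn.execute(
    "SELECT roll_no, name, age FROM student ORDER BY roll_no"
).fetchall()
print(remaining)  # [(1, 'RAM', 18), (2, 'RAMESH', 19)]
```

Without the WHERE clause, the UPDATE and DELETE statements would affect every row in the table.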
Database Management System: The software used to manage databases is
called a Database Management System (DBMS). For example, MySQL and
Oracle are popular DBMSs used in different applications.
A DBMS allows users to perform the following tasks:
Data Definition: It helps in the creation, modification and removal of the
definitions that describe the organization of data in the database.
Data Update: It helps in the insertion, modification and deletion of the actual
data in the database.
Data Retrieval: It helps in the retrieval of data from the database, which can be
used by applications for various purposes.
User Administration: It helps in registering and monitoring users, enforcing data
security, monitoring performance, maintaining data integrity, dealing with
concurrency control and recovering information corrupted by unexpected failure.
Paradigm Shift from File System to DBMS
A file system manages data using files on a hard disk. Users are allowed to create,
delete, and update the files according to their requirements. Let us consider the
example of a file-based University Management System. Data about students is
available to their respective Departments, Academics Section, Result Section,
Accounts Section, Hostel Office etc. Some of the data is common to all sections,
like Roll No, Name, Father Name, Address and Phone number of students, but
some data is available to a particular section only, like the hostel allotment
number, which belongs to the Hostel Office. Let us discuss the issues with this system:
● Redundancy of Data: Data is said to be redundant if the same data is
   copied at many places. If a student wants to change a phone number, he
   has to get it updated in various sections. Similarly, old records must be
   deleted from all sections representing that student.
● Inconsistency of Data: Data is said to be inconsistent if multiple
   copies of the same data do not match each other. If the phone number
   is different in the Accounts Section and the Academics Section, it is
   inconsistent. Inconsistency may be caused by typing errors or by not
   updating all copies of the same data.
● Difficult Data Access: A user should know the exact location of a file to
   access data, so the process is very cumbersome and tedious. Consider
   how difficult it is to find the hostel allotment number of one student
   among 10000 unsorted student records.
● Unauthorized Access: A file system may lead to unauthorized
   access to data. If a student gets access to the file containing his marks,
   he can change them in an unauthorized way.
● No Concurrent Access: The access of the same data by multiple users at
   the same time is known as concurrency. A file system provides no
   concurrency control of its own, so data can safely be accessed by only
   one user at a time.
● No Backup and Recovery: A file system does not incorporate any
   backup and recovery of data if a file is lost or corrupted.
Introduction of ER Model
ER Model is used to model the logical view of the system from data perspective
which consists of these components:
Entity, Entity Type, Entity Set –
Entity: An Entity may be an object with a physical existence – a particular
person, car, house, or employee – or it may be an object with a conceptual
existence – a company, a job, or a university course.
An entity is an object of an entity type, and the set of all entities of that type is
called an entity set. For example, E1 is an entity having entity type Student, and
the set of all students is called an entity set. In an ER diagram, an entity type is
represented by a rectangle.
Attribute(s):
Attributes are the properties which define the entity type. For example,
Roll_No, Name, DOB, Age, Address, Mobile_No are the attributes which define
the entity type Student. In an ER diagram, an attribute is represented by an oval.
1. Key Attribute –
The attribute which uniquely identifies each entity in the entity set is called the key
attribute. For example, Roll_No will be unique for each student. In an ER diagram,
a key attribute is represented by an oval with the attribute name underlined.
2. Composite Attribute –
An attribute composed of several other attributes is called a composite attribute.
For example, the Address attribute of the Student entity type consists of Street, City,
State, and Country. In an ER diagram, a composite attribute is represented by an oval
connected to the ovals of its component attributes.
3. Multivalued Attribute –
An attribute which can hold more than one value for a given entity. For example,
Phone_No (can be more than one for a given student). In an ER diagram, a
multivalued attribute is represented by a double oval.
4. Derived Attribute –
An attribute which can be derived from other attributes of the entity type is
known as a derived attribute, e.g. Age (can be derived from DOB). In an ER diagram,
a derived attribute is represented by a dashed oval.
The complete entity type Student with its attributes can be represented as:
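As a rough code analogue of the four attribute kinds, the Student entity type could be sketched as a Python dataclass. This is only an illustration, not an ER notation: the field names, the sample date of birth, the street and the second phone number are all made up for the example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Address:
    """Composite attribute: built from several simpler attributes."""
    street: str
    city: str
    state: str
    country: str


@dataclass
class Student:
    roll_no: int       # key attribute: unique for each student
    name: str
    dob: date
    address: Address   # composite attribute
    phone_nos: List[str] = field(default_factory=list)  # multivalued attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from dob instead of being stored."""
        had_birthday = (today.month, today.day) >= (self.dob.month, self.dob.day)
        return today.year - self.dob.year - (0 if had_birthday else 1)


# Hypothetical sample entity
s = Student(
    roll_no=1,
    name="RAM",
    dob=date(2003, 5, 17),
    address=Address("12 Example Street", "Delhi", "Delhi", "India"),
    phone_nos=["9455123451", "9000000000"],
)
print(s.age(date(2021, 6, 28)))  # 18
```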
Relationship Type and Relationship Set:
A relationship type represents the association between entity types. For
example, ‘Enrolled in’ is a relationship type that exists between the entity types
Student and Course. In an ER diagram, a relationship type is represented by a
diamond connected to the participating entity types with lines.
A set of relationships of the same type is known as a relationship set. For
example, a relationship set may record that S1 is enrolled in C2, S2 is enrolled
in C1 and S3 is enrolled in C3.
Degree of a relationship set:
The number of different entity sets participating in a relationship set is called
the degree of the relationship set.
1. Unary Relationship –
When there is only ONE entity set participating in a relationship, the relationship
is called a unary relationship. For example, one person is married to only one
person.
2. Binary Relationship –
When there are TWO entity sets participating in a relationship, the relationship is
called a binary relationship. For example, Student is enrolled in Course.
3. n-ary Relationship –
When there are n entity sets participating in a relationship, the relationship is
called an n-ary relationship.
Cardinality:
The number of times an entity of an entity set participates in a relationship
set is known as cardinality. Cardinality can be of different types:
1. One to one – When each entity in each entity set can take part only once in
the relationship, the cardinality is one to one. Let us assume that a male can
marry one female and a female can marry one male. So the relationship will
be one to one.
2. Many to one – When entities in one entity set can take part only once in the
relationship set and entities in the other entity set can take part more than
once in the relationship set, the cardinality is many to one. Let us assume that a
student can take only one course but one course can be taken by many students.
So the cardinality will be n to 1. It means that for one course there can be n
students but for one student, there will be only one course.
Represented using sets, each student is mapped to only 1 course, but 1 course
may be mapped to by many students.
3. Many to many – When entities in all entity sets can take part more than
once in the relationship, the cardinality is many to many. Let us assume that a
student can take more than one course and one course can be taken by many
students. So the relationship will be many to many.
For example, if student S1 is enrolled in C1 and C3, and course C3 has S1, S3
and S4 enrolled, the relationship is many to many.
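One way to make these definitions concrete is to treat a binary relationship set as a set of (student, course) pairs and count how often each entity appears. The helper below is a hypothetical sketch written for this purpose. It classifies only the pairs it is given, whereas a cardinality constraint in a real design describes what is allowed rather than what happens to be stored. The many-to-one sample pairs are invented.

```python
from collections import Counter


def cardinality(pairs):
    """Classify a binary relationship set given as (left, right) pairs."""
    left_counts = Counter(left for left, _ in pairs)
    right_counts = Counter(right for _, right in pairs)
    left_once = all(n == 1 for n in left_counts.values())
    right_once = all(n == 1 for n in right_counts.values())
    if left_once and right_once:
        return "one to one"
    if left_once:
        return "many to one"   # many left entities share one right entity
    if right_once:
        return "one to many"
    return "many to many"


# S1 in C2, S2 in C1, S3 in C3: every entity takes part exactly once
one_to_one = {("S1", "C2"), ("S2", "C1"), ("S3", "C3")}
# Each student takes a single course, but C1 is taken by two students
many_to_one = {("S1", "C1"), ("S2", "C1"), ("S3", "C2")}
# S1 takes C1 and C3; C3 has S1, S3 and S4 enrolled
many_to_many = {("S1", "C1"), ("S1", "C3"), ("S3", "C3"), ("S4", "C3")}

print(cardinality(one_to_one))    # one to one
print(cardinality(many_to_one))   # many to one
print(cardinality(many_to_many))  # many to many
```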
Participation Constraint:
A participation constraint is applied to the entity set participating in the
relationship set.
1. Total Participation – Each entity in the entity set must participate in the
relationship. If each student must enroll in a course, the participation of Student
will be total. Total participation is shown by a double line in an ER diagram.
2. Partial Participation – An entity in the entity set may or may NOT
participate in the relationship. If some courses have no students enrolled,
the participation of Course will be partial.
In the ‘Enrolled in’ relationship set, the Student entity set has total
participation and the Course entity set has partial participation: every student
in the Student entity set participates in the relationship, but there exists
a course C4 which does not take part in the relationship.
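The difference between total and partial participation can be checked mechanically with sets. In the sketch below the entity names follow the text (students S1 to S3, courses C1 to C4), but the specific enrolment pairs and the helper function are assumptions made for the example.

```python
def participation(entity_set, participating):
    """Return 'total' if every entity takes part in the relationship, else 'partial'."""
    return "total" if set(entity_set) <= set(participating) else "partial"


students = {"S1", "S2", "S3"}
courses = {"C1", "C2", "C3", "C4"}
# Hypothetical 'Enrolled in' relationship set; no pair mentions C4
enrolled_in = {("S1", "C1"), ("S2", "C2"), ("S3", "C3")}

enrolled_students = {student for student, _ in enrolled_in}
taken_courses = {course for _, course in enrolled_in}

student_participation = participation(students, enrolled_students)
course_participation = participation(courses, taken_courses)
print(student_participation)    # total
print(course_participation)     # partial
print(courses - taken_courses)  # {'C4'}
```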
Weak Entity Type and Identifying Relationship:
As discussed before, an entity type has a key attribute which uniquely identifies
each entity in the entity set. But there exist some entity types for which a key
attribute can’t be defined. These are called weak entity types.
For example, a company may store information about the dependents (parents,
children, spouse) of an employee. But the dependents have no existence
without the employee. So Dependent will be a weak entity type and Employee will
be the identifying entity type for Dependent.
A weak entity type is represented by a double rectangle. The participation of a
weak entity type is always total. The relationship between a weak entity type and
its identifying strong entity type is called the identifying relationship and it is
represented by a double diamond.
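When such a design is turned into tables, a weak entity type is commonly given a primary key that combines the key of its identifying entity with a partial key of its own. The sketch below shows one possible mapping using Python's sqlite3 module. The column names, the choice of dep_name as the partial key, and the sample rows are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL
    );
    -- Weak entity: identified by the owner's key plus its own partial key
    CREATE TABLE dependent (
        emp_id   INTEGER NOT NULL REFERENCES employee(emp_id) ON DELETE CASCADE,
        dep_name TEXT NOT NULL,
        relation TEXT,
        PRIMARY KEY (emp_id, dep_name)
    );
    INSERT INTO employee VALUES (1, 'Asha');
    INSERT INTO dependent VALUES (1, 'Ravi', 'Child');
""")

# A dependent cannot exist without its employee (total participation)
try:
    conn.execute("INSERT INTO dependent VALUES (99, 'Nobody', 'Child')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the employee removes the dependents that relied on it
conn.execute("DELETE FROM employee WHERE emp_id = 1")
dependents_left = conn.execute("SELECT COUNT(*) FROM dependent").fetchone()[0]
print(orphan_rejected, dependents_left)  # True 0
```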
Structured Query Language (SQL)
   ● Last Updated : 28 Jun, 2021
Structured Query Language (SQL) is a standard database language which is used to
create, maintain and retrieve data from relational databases. Following are some
interesting facts about SQL.
● SQL keywords are case insensitive. But it is a recommended practice to write
   keywords (like SELECT, UPDATE, CREATE, etc.) in capital letters and
   user-defined names (like table names, column names, etc.) in small
   letters.
● We can write comments in SQL using “--” (double hyphen) at the
   beginning of any line.
● SQL is the programming language for relational databases (explained
   below) like MySQL, Oracle, Sybase, SQL Server, PostgreSQL, etc.
   Non-relational databases (also called NoSQL databases) like
   MongoDB, DynamoDB, etc. do not use SQL as their native query language.
● Although there is an ISO standard for SQL, most implementations
   vary slightly in syntax. So we may encounter queries that work in SQL
   Server but do not work in MySQL.
What is a Relational Database?
A relational database is one in which the data is stored and retrieved in the
form of relations (tables). Table 1 shows a relational database with only
one relation called STUDENT, which stores the ROLL_NO, NAME,
ADDRESS, PHONE and AGE of students.
STUDENT
ROLL_NO   NAME     ADDRESS   PHONE        AGE
1         RAM      DELHI     9455123451   18
2         RAMESH   GURGAON   9652431543   18
3         SUJIT    ROHTAK    9156253131   20
4         SURESH   DELHI     9156768971   18
TABLE 1
The following terms are important when working with relations.
Attribute: Attributes are the properties that define a relation, e.g.
ROLL_NO, NAME etc.
Tuple: Each row in the relation is known as a tuple. The above relation
contains 4 tuples, one of which is shown as:
1         RAM      DELHI     9455123451   18
Degree: The number of attributes in the relation is known as the degree of the
relation. The STUDENT relation defined above has degree 5.
Cardinality: The number of tuples in a relation is known as its cardinality.
The STUDENT relation defined above has cardinality 4.
Column: A column represents the set of values for a particular attribute. The
column ROLL_NO is extracted from the relation STUDENT.
ROLL_NO
1
2
3
4
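The terms above can be checked against the STUDENT relation with a short sketch. It loads Table 1 into an in-memory SQLite database through Python's sqlite3 module. The lower-case names and the column types are assumptions, since the text does not specify them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER, name TEXT, address TEXT, "
    "phone TEXT, age INTEGER)"  # column types are assumed
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?, ?)",
    [
        (1, "RAM", "DELHI", "9455123451", 18),
        (2, "RAMESH", "GURGAON", "9652431543", 18),
        (3, "SUJIT", "ROHTAK", "9156253131", 20),
        (4, "SURESH", "DELHI", "9156768971", 18),
    ],
)

cur = conn.execute("SELECT * FROM student ORDER BY roll_no")
degree = len(cur.description)                # number of attributes
tuples = cur.fetchall()
cardinality = len(tuples)                    # number of tuples
roll_no_column = [row[0] for row in tuples]  # values of one attribute

print(degree, cardinality)  # 5 4
print(tuples[0])            # (1, 'RAM', 'DELHI', '9455123451', 18)
print(roll_no_column)       # [1, 2, 3, 4]
```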
The queries used to work with a relational database can be categorized as:
Data Definition Language: It is used to define the structure of the
database, e.g. CREATE TABLE, ADD COLUMN, DROP COLUMN and
so on.
Data Manipulation Language: It is used to manipulate data in the
relations, e.g. INSERT, DELETE, UPDATE and so on.
Data Query Language: It is used to extract data from the relations,
e.g. SELECT.
So first we will consider the Data Query Language. A generic query to
retrieve data from a relational database is:
SELECT [DISTINCT] Attribute_List FROM R1, R2, ..., RM
[WHERE condition]
[GROUP BY (Attributes) [HAVING condition]]
[ORDER BY (Attributes) [DESC]];
The first line of the query (SELECT ... FROM ...) is compulsory if you want to
retrieve data from a relational database. The parts written inside [] are
optional. We will look at the possible query combinations on the relation shown
in Table 1.
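To see how the optional clauses fit together, here is a sketch that uses all of them in one query against the Table 1 data, run through Python's sqlite3 module. The particular conditions are arbitrary choices for the example, picked so that every clause has a visible effect. The lower-case names and column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER, name TEXT, address TEXT, "
    "phone TEXT, age INTEGER)"  # column types are assumed
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?, ?)",
    [
        (1, "RAM", "DELHI", "9455123451", 18),
        (2, "RAMESH", "GURGAON", "9652431543", 18),
        (3, "SUJIT", "ROHTAK", "9156253131", 20),
        (4, "SURESH", "DELHI", "9156768971", 18),
    ],
)

query = """
    SELECT address, SUM(age) AS total_age
    FROM student
    WHERE roll_no > 1          -- filter rows before grouping (drops RAM)
    GROUP BY address           -- one result row per address
    HAVING SUM(age) < 20       -- filter groups after aggregation (drops ROHTAK)
    ORDER BY address DESC      -- sort the remaining groups
"""
result = conn.execute(query).fetchall()
print(result)  # [('GURGAON', 18), ('DELHI', 18)]
```

WHERE filters individual tuples before they are grouped, while HAVING filters whole groups after the aggregate has been computed.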
Case 1: If we want to retrieve attributes ROLL_NO and NAME of all
students, the query will be:
SELECT ROLL_NO, NAME FROM STUDENT;
ROLL_NO   NAME
1         RAM
2         RAMESH
3         SUJIT
4         SURESH
Case 2: If we want to retrieve ROLL_NO and NAME of the students whose
ROLL_NO is greater than 2, the query will be:
SELECT ROLL_NO, NAME FROM STUDENT
WHERE ROLL_NO>2;
ROLL_NO   NAME
3         SUJIT
4         SURESH
Case 3: If we want to retrieve all attributes of students, we can write * in place of
writing all attributes as:
SELECT * FROM STUDENT
WHERE ROLL_NO>2;
ROLL_NO   NAME     ADDRESS   PHONE        AGE
3         SUJIT    ROHTAK    9156253131   20
4         SURESH   DELHI     9156768971   18
Case 4: If we want to represent the relation in ascending order by AGE,
we can use the ORDER BY clause as:
SELECT * FROM STUDENT ORDER BY AGE;
ROLL_NO   NAME     ADDRESS   PHONE        AGE
1         RAM      DELHI     9455123451   18
2         RAMESH   GURGAON   9652431543   18
4         SURESH   DELHI     9156768971   18
3         SUJIT    ROHTAK    9156253131   20
Note: ORDER BY AGE is equivalent to ORDER BY AGE ASC. If we want
to retrieve the results in descending order of AGE, we can use ORDER BY
AGE DESC.
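The descending variant can be tried with the sketch below, which runs against the Table 1 data in SQLite via Python's sqlite3 module. Three students share AGE 18, and SQL does not promise any particular order among rows that tie on the sort key, so the example adds ROLL_NO as a second sort key. That tie-breaker is an addition made here to keep the output predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER, name TEXT, address TEXT, "
    "phone TEXT, age INTEGER)"  # column types are assumed
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?, ?)",
    [
        (1, "RAM", "DELHI", "9455123451", 18),
        (2, "RAMESH", "GURGAON", "9652431543", 18),
        (3, "SUJIT", "ROHTAK", "9156253131", 20),
        (4, "SURESH", "DELHI", "9156768971", 18),
    ],
)

# Oldest first; ties on age are broken by ascending roll_no
descending = conn.execute(
    "SELECT roll_no, age FROM student ORDER BY age DESC, roll_no ASC"
).fetchall()
print(descending)  # [(3, 20), (1, 18), (2, 18), (4, 18)]
```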
Case 5: If we want to retrieve distinct values of an attribute or group of
attributes, DISTINCT is used as in:
SELECT DISTINCT ADDRESS FROM STUDENT;
ADDRESS
DELHI
GURGAON
ROHTAK
If DISTINCT is not used, DELHI will appear twice in the result set.
Before understanding GROUP BY and HAVING, we need to
understand aggregation functions in SQL.
AGGREGATION FUNCTIONS: Aggregation functions are used to perform
mathematical operations on the data values of a relation. Some of the
common aggregation functions used in SQL are:
      ○ COUNT: The COUNT function is used to count the number of rows
         in a relation. COUNT(attribute) counts the rows in which that
         attribute is not NULL. e.g.
SELECT COUNT(PHONE) FROM STUDENT;
COUNT(PHONE)
4
      ○ SUM: The SUM function is used to add the values of an attribute in
         a relation. e.g.
SELECT SUM(AGE) FROM STUDENT;
SUM(AGE)
74
In the same way, MIN, MAX and AVG can be used. As we have seen above, an
aggregation function applied to the whole relation returns only 1 row.
AVERAGE: It gives the average of the values of an attribute over the tuples. It is
also defined as the sum divided by the count of the values.
Syntax: AVG(attributename)
OR
Syntax: SUM(attributename)/COUNT(attributename)
The second form also retrieves the average value, although in DBMSs that
perform integer division on integer operands its result is truncated.
MAXIMUM: It extracts the maximum value among the set of tuples.
Syntax: MAX(attributename)
MINIMUM: It extracts the minimum value among the set of all the tuples.
Syntax: MIN(attributename)
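The aggregation functions can be compared side by side on the Table 1 data with a sketch like the following, which uses SQLite through Python's sqlite3 module. The lower-case names and column types are assumptions. SQLite is one of the systems that divides two integers as integers, so the example also shows how SUM/COUNT can differ from AVG there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER, name TEXT, address TEXT, "
    "phone TEXT, age INTEGER)"  # column types are assumed
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?, ?)",
    [
        (1, "RAM", "DELHI", "9455123451", 18),
        (2, "RAMESH", "GURGAON", "9652431543", 18),
        (3, "SUJIT", "ROHTAK", "9156253131", 20),
        (4, "SURESH", "DELHI", "9156768971", 18),
    ],
)

# Without GROUP BY, every aggregate collapses the relation into one row
summary = conn.execute(
    "SELECT COUNT(phone), SUM(age), AVG(age), MIN(age), MAX(age) FROM student"
).fetchone()
print(summary)  # (4, 74, 18.5, 18, 20)

# In SQLite, integer / integer truncates, so this is not the same as AVG
truncated = conn.execute("SELECT SUM(age) / COUNT(age) FROM student").fetchone()[0]
# Multiplying by 1.0 forces floating-point division
exact = conn.execute("SELECT SUM(age) * 1.0 / COUNT(age) FROM student").fetchone()[0]
print(truncated, exact)  # 18 18.5
```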
GROUP BY: GROUP BY is used to group the tuples of a relation based on an
attribute or group of attributes. It is usually combined with an aggregation
function, which is computed on each group. e.g.
SELECT ADDRESS, SUM(AGE) FROM STUDENT
GROUP BY (ADDRESS);
In this query, SUM(AGE) is computed not for the entire table but for
each address, i.e. the sum of AGE for address DELHI (18+18=36) and
similarly for the other addresses. The output is:
ADDRESS   SUM(AGE)
DELHI     36
GURGAON   18
ROHTAK    20
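The grouped sums can be reproduced with the sketch below, which runs the same query on the Table 1 data in SQLite via Python's sqlite3 module. The ORDER BY clause is an addition made here so that the groups come back in a predictable order. The lower-case names and column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER, name TEXT, address TEXT, "
    "phone TEXT, age INTEGER)"  # column types are assumed
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?, ?)",
    [
        (1, "RAM", "DELHI", "9455123451", 18),
        (2, "RAMESH", "GURGAON", "9652431543", 18),
        (3, "SUJIT", "ROHTAK", "9156253131", 20),
        (4, "SURESH", "DELHI", "9156768971", 18),
    ],
)

# One output row per distinct address, with the ages summed inside each group
age_by_address = conn.execute(
    "SELECT address, SUM(age) FROM student GROUP BY address ORDER BY address"
).fetchall()
print(age_by_address)  # [('DELHI', 36), ('GURGAON', 18), ('ROHTAK', 20)]
```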
If we try to execute the query given below, standard SQL and most DBMSs will
report an error because, although we have computed SUM(AGE) for each address,
there can be more than 1 ROLL_NO for each address we have grouped by, so no
single ROLL_NO can be displayed in the result set. Whenever we use GROUP BY,
the columns after SELECT that are not grouping attributes need to be wrapped in
aggregate functions for the result set to make sense.
SELECT ROLL_NO, ADDRESS, SUM(AGE) FROM STUDENT
GROUP BY (ADDRESS);
NOTE: An attribute which is not part of the GROUP BY clause can’t be
selected directly. Any attribute which is part of the GROUP BY clause
can be selected, but it is not mandatory. However, attributes which are
not part of the GROUP BY clause can be used inside an aggregate
function.
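A valid counterpart wraps ROLL_NO in aggregate functions, as in the sketch below, which runs on the Table 1 data in SQLite via Python's sqlite3 module. The choice of COUNT and MIN is arbitrary, and the lower-case names and column types are assumptions. SQLite itself is lenient and would accept the bare ROLL_NO column, returning a value from an arbitrary row of each group, so it is not used here to demonstrate the error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER, name TEXT, address TEXT, "
    "phone TEXT, age INTEGER)"  # column types are assumed
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?, ?)",
    [
        (1, "RAM", "DELHI", "9455123451", 18),
        (2, "RAMESH", "GURGAON", "9652431543", 18),
        (3, "SUJIT", "ROHTAK", "9156253131", 20),
        (4, "SURESH", "DELHI", "9156768971", 18),
    ],
)

# roll_no is not a grouping attribute, so it appears only inside aggregates
per_address = conn.execute(
    """
    SELECT address, COUNT(roll_no), MIN(roll_no), SUM(age)
    FROM student
    GROUP BY address
    ORDER BY address
    """
).fetchall()
print(per_address)
# [('DELHI', 2, 1, 36), ('GURGAON', 1, 2, 18), ('ROHTAK', 1, 3, 20)]
```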