Dbms Mod1 | PDF | Databases | Relational Model

Dbms Mod1

DBMS Module 3

Uploaded by

kavya.jagtap04
Copyright
© © All Rights Reserved

DBMS

MODULE 1

● What is the purpose of DBMS

Purpose of Database System

1. A database system provides a safe and effective platform for managing large amounts of data. It offers services for organizing, storing, and manipulating data while guaranteeing data integrity. Its primary goal is to make data retrieval easy and to provide dependable storage for essential data.
2. Structured organization of data, based on predefined schemas and data models, allows efficient storage and retrieval.
3. The DBMS keeps the stored information reliable and accurate by enforcing the constraints and rules defined in the database schema, which reduces data redundancy and prevents anomalies.
4. Database systems protect confidential data. Their security mechanisms guard against unauthorized access and preserve the confidentiality and privacy of sensitive data.
5. A DBMS makes collaboration easier. Because it provides one platform for accessing and manipulating data, multiple users can work together while the data stays consistent across applications.
6. Data backups and transaction management are mechanisms provided by database systems to ensure data durability. They safeguard data against system crashes and failures.

● Introduction of DBMS
1. A Database Management System (DBMS) is a software system that is designed to
manage and organize data in a structured manner. It allows users to create, modify, and
query a database, as well as manage the security and access controls for that database.
Key Features of DBMS
2. Data modeling: A DBMS provides tools for creating and modifying data models, which
define the structure and relationships of the data in a database.
3. Data storage and retrieval: A DBMS is responsible for storing and retrieving data from
the database, and can provide various methods for searching and querying the data.
4. Concurrency control: A DBMS provides mechanisms for controlling concurrent access to
the database, to ensure that multiple users can access the data without conflicting with
each other.
5. Data integrity and security: A DBMS provides tools for enforcing data integrity and
security constraints, such as constraints on the values of data and access controls that
restrict who can access the data.
6. Backup and recovery: A DBMS provides mechanisms for backing up and recovering the
data in the event of a system failure.
7. A DBMS can be broadly classified into two types: Relational Database Management System (RDBMS) and Non-Relational Database Management System (NoSQL or Non-SQL).
8. RDBMS: Data is organized in tables, and each table has a set of rows and columns. Tables are related to each other through primary and foreign keys.
9. NoSQL: Data is organized as key-value pairs, documents, graphs, or column families. These systems are designed to handle large-scale, high-performance scenarios.
10. A database is a collection of interrelated data, organized in the form of tables, views, schemas, reports, etc., so that data can be retrieved, inserted, and deleted efficiently. For example, a university database organizes the data about students, faculty, admin staff, etc.
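The difference between the relational and the non-relational style can be sketched in a few lines. The example below is a minimal illustration using Python's built-in sqlite3 module; the table names, column names, and sample rows are hypothetical and chosen only to echo the university example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Relational style: two tables related through a primary key and a foreign key.
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT,"
    " dept_id INTEGER REFERENCES department (dept_id))"
)
conn.executemany("INSERT INTO department VALUES (?, ?)",
                 [(10, "Computer"), (20, "Electronics")])
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(1, "Asha", 10), (2, "Ravi", 20), (3, "Meena", 10)])

# The key columns let a query combine rows from both tables.
joined = conn.execute(
    "SELECT s.name, d.dept_name"
    " FROM student AS s JOIN department AS d ON s.dept_id = d.dept_id"
    " ORDER BY s.roll_no"
).fetchall()
print(joined)

# Document style (as in many NoSQL stores): related data is nested in one record.
student_document = {
    "roll_no": 1,
    "name": "Asha",
    "department": {"dept_id": 10, "dept_name": "Computer"},
}
print(student_document["department"]["dept_name"])
```

In the relational form the department name is stored once and reached through the key; in the document form it travels with each student record.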

● Need of DBMS
A Database Management System is system software for easy, efficient and reliable data processing and management. It can be used for:
● Creation of a database.
● Retrieval of information from the database.
● Updating the database.
● Managing a database.
● Multiple user interfaces.
● Data scalability, expandability and flexibility: the schema of the database can be changed, and the change is applied throughout the database.
● Reduced application development time overall.
● Security: it is possible to assign security permissions that restrict access to data.
Data organization: DBMS allow users to organize large amounts of data in a structured and systematic way. Data is organized into tables, fields, and records, making it easy to manage, store, and retrieve information.
Data scalability: DBMS are designed to handle large amounts of data and are scalable to meet the growing needs of organizations. As organizations grow, DBMS can scale up to handle increasing amounts of data and user traffic.
The needs that a DBMS addresses are:
1. Data Organization and Management
2. Data Security and Privacy
3. Data Integrity and Consistency
4. Concurrent Data Access
5. Data Analysis and Reporting
6. Scalability and Flexibility
7. Cost-Effectiveness
1. Data Organization and Management:
One of the primary needs for a DBMS is data organization and management. DBMSs allow data
to be stored in a structured manner, which helps in easier retrieval and analysis. A
well-designed database schema enables faster access to information, reducing the time
required to find relevant data. A DBMS also provides features like indexing and searching,
which make it easier to locate specific data within the database. This allows organizations to
manage their data more efficiently and effectively.
2. Data Security and Privacy:
DBMSs provide a robust security framework that ensures the confidentiality, integrity, and
availability of data. They offer authentication and authorization features that control access to
the database. DBMSs also provide encryption capabilities to protect sensitive data from
unauthorized access. Moreover, these authentication, authorization, and encryption features help
organizations store and manage their data in compliance with data privacy regulations such as
the GDPR, HIPAA, and CCPA.
3. Data Integrity and Consistency:
Data integrity and consistency are crucial for any database. DBMSs provide mechanisms that
ensure the accuracy and consistency of data. These mechanisms include constraints, triggers,
and stored procedures that enforce data integrity rules. DBMSs also provide features like
transactions that ensure that data changes are atomic, consistent, isolated, and durable (ACID).
4. Concurrent Data Access:
A DBMS provides a concurrent access mechanism that allows multiple users to access the
same data simultaneously. This is especially important for organizations that require real-time
data access. DBMSs use locking mechanisms to ensure that multiple users can access the
same data without causing conflicts or data corruption.
5. Data Analysis and Reporting:
DBMSs provide tools that enable data analysis and reporting. These tools allow organizations to
extract useful insights from their data, enabling better decision-making. Many DBMSs support, or
integrate with tools for, data analysis techniques such as OLAP, data mining, and machine learning.
Reporting and data visualization features then let organizations present their data in a
visually appealing and understandable way.
6. Scalability and Flexibility:
DBMSs provide scalability and flexibility, enabling organizations to handle increasing amounts of
data. DBMSs can be scaled horizontally by adding more servers or vertically by increasing the
capacity of existing servers. This makes it easier for organizations to handle large amounts of
data without compromising performance. Moreover, DBMSs provide flexibility in terms of data
modeling, enabling organizations to adapt their databases to changing business requirements.
7. Cost-Effectiveness:
DBMSs are cost-effective compared to traditional file-based systems. They reduce storage
costs by eliminating redundancy and optimizing data storage. They also reduce development
costs by providing tools for database design, maintenance, and administration. Moreover,
DBMSs reduce operational costs by automating routine tasks and providing self-tuning
capabilities.

● Database system v/s file system

● DBMS v/s file management system

| Basis | DBMS Approach | File System Approach |
|---|---|---|
| Meaning | A DBMS is a collection of interrelated data and the programs to access it. The user is not required to write the procedures for managing the data. | The file system is a collection of data. The user has to write the procedures for managing the data. |
| Sharing of data | Due to the centralized approach, data sharing is easy. | Data is distributed over many files, possibly in different formats, so it isn't easy to share data. |
| Data Abstraction | DBMS gives an abstract view of data that hides the details. | The file system exposes the details of data representation and storage. |
| Security and Protection | DBMS provides a good protection mechanism. | It isn't easy to protect a file under the file system. |
| Recovery Mechanism | DBMS provides a crash recovery mechanism, i.e., DBMS protects the user from system failure. | The file system doesn't have a crash recovery mechanism, i.e., if the system crashes while some data is being entered, the content of the file may be lost. |
| Manipulation Techniques | DBMS contains a wide variety of sophisticated techniques to store and retrieve the data. | The file system can't store and retrieve the data efficiently. |
| Concurrency Problems | DBMS takes care of concurrent access to data using some form of locking. | Concurrent access causes many problems, such as one user reading a file while another is deleting or updating information in it. |
| Where to use | The database approach is used in large systems which interrelate many files. | The file system approach is used in small systems with few, loosely related files. |
| Cost | The database system is expensive to design. | The file system approach is cheaper to design. |
| Data Redundancy and Inconsistency | Due to the centralization of the database, the problems of data redundancy and inconsistency are controlled. | The files and application programs are created by different programmers, so there is a lot of duplicated data, which may lead to inconsistency. |
| Structure | The database structure is complex to design. | The file system approach has a simple structure. |
| Data Independence | Data independence exists, and it is of two types: logical data independence and physical data independence. | There is no data independence. |
| Integrity Constraints | Integrity constraints are easy to apply. | Integrity constraints are difficult to implement. |
| Data Models | Three types of data models exist: hierarchical, network, and relational. | There is no concept of data models. |
| Flexibility | Changes to the content of the stored data are often necessary, and they are made more easily with the database approach. | The system is less flexible than the DBMS approach. |
| Examples | Oracle, SQL Server, Sybase, etc. | COBOL, C++, etc. |

● Database Administrator
A database administrator (DBA) is a person or group in charge of implementing
the DBMS in an organization. The DBA job requires a high degree of technical
expertise. In practice the DBA role is often filled by a team of people rather than just one person.

The primary roles of the database administrator are as follows:

● Database design
● Performance issues
● Database accessibility
● Capacity issues
● Data replication
● Table maintenance

Responsibilities of DBA
The responsibilities of the DBA are as follows:

● Makes decisions concerning the content of the database.
● Plans the storage structure and access strategy.
● Provides support to the users.
● Defines the security and integrity checks.
● Defines backup and recovery strategies.
● Monitors performance and responds to changes in the requirements.

Skills required for DBA

The skills required to be a successful DBA are as follows:

● Database design.
● Knowledge of Structured Query Language (SQL).
● Knowledge of distributed architecture.
● Knowledge of different operating systems and servers.
● Understanding of Relational Database Management Systems (RDBMS).
● Readiness to face challenges and solve problems quickly.

● Function of DBA
See the roles and responsibilities of the DBA listed above.
● Database Architecture
❖ Level of abstraction or 3 tier architecture
Physical/internal level
Conceptual/logical level
External/view level
DBMS Architecture
○ The design of a DBMS depends on its architecture. The basic client/server
architecture is used to deal with a large number of PCs, web servers, database
servers and other components that are connected by networks.

○ The client/server architecture consists of many PCs and workstations which are
connected via the network.

○ DBMS architecture depends on how users are connected to the database to
get their requests done.

Types of DBMS Architecture

Database architecture can be seen as single-tier or multi-tier. Logically, multi-tier
database architecture is of two types: 2-tier architecture and 3-tier architecture.

1-Tier Architecture

○ In this architecture, the database is directly available to the user. It means the
user works directly on the DBMS.

○ Any changes done here are made directly on the database itself. It doesn't
provide a handy tool for end users.

○ The 1-Tier architecture is used for development of local applications, where
programmers can communicate directly with the database for a quick
response.

2-Tier Architecture

○ The 2-Tier architecture is the same as the basic client-server architecture. In the two-tier
architecture, applications on the client end can communicate directly with the database at the
server side. For this interaction, APIs such as ODBC and JDBC are used.

○ The user interfaces and application programs run on the client side.

○ The server side is responsible for providing functionality such as query processing
and transaction management.

○ To communicate with the DBMS, the client-side application establishes a connection
with the server side.
Fig: 2-tier Architecture

3-Tier Architecture

○ The 3-Tier architecture contains another layer between the client and the server. In
this architecture, the client can't communicate directly with the server.

○ The application on the client end interacts with an application server, which
in turn communicates with the database system.

○ The end user has no idea about the existence of the database beyond the application
server. The database also has no idea about any user beyond the
application.

○ The 3-Tier architecture is used for large web applications.
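The separation between the three tiers can be sketched in a single process. In the example below, all class and function names are hypothetical, and an in-memory SQLite database stands in for the database server; in a real deployment the tiers would run on separate machines and talk over the network.

```python
import sqlite3

class DatabaseTier:
    """Data tier: the only layer that issues SQL."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
        self._conn.execute("INSERT INTO account VALUES (1, 500)")

    def fetch_balance(self, account_id):
        row = self._conn.execute(
            "SELECT balance FROM account WHERE id = ?", (account_id,)
        ).fetchone()
        return None if row is None else row[0]

class ApplicationServer:
    """Middle tier: validates requests and applies business rules."""

    def __init__(self, database):
        self._database = database

    def handle_balance_request(self, account_id):
        if not isinstance(account_id, int) or account_id <= 0:
            return {"status": "error", "message": "invalid account id"}
        balance = self._database.fetch_balance(account_id)
        if balance is None:
            return {"status": "error", "message": "no such account"}
        return {"status": "ok", "balance": balance}

def client_request(server, account_id):
    """Client tier: knows only the application server, never the database."""
    return server.handle_balance_request(account_id)

server = ApplicationServer(DatabaseTier())
print(client_request(server, 1))
print(client_request(server, 99))
```

The client code never sees a connection object or an SQL statement, which mirrors the point that the end user has no idea about the database behind the application server.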


● Data Independence
Data Independence
○ Data independence can be explained using the three-schema architecture.

○ Data independence refers to the ability to modify the schema at
one level of the database system without altering the schema at the next higher
level.

There are two types of data independence:

1. Logical Data Independence

○ Logical data independence refers to the ability to change the
conceptual schema without having to change the external schema.

○ Logical data independence is used to separate the external level from the
conceptual view.

○ If we make any changes in the conceptual view of the data, the user view of the
data is not affected.

○ Logical data independence occurs at the user interface level.

2. Physical Data Independence

○ Physical data independence can be defined as the capacity to change the internal
schema without having to change the conceptual schema.

○ If we make any changes in the storage of the database system server (for example, its
storage size), the conceptual structure of the database is not affected.

○ Physical data independence is used to separate the conceptual level from the
internal level.

○ Physical data independence occurs at the logical interface level.
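Both kinds of independence can be observed on a small scale. The sketch below uses Python's sqlite3 module; the employee table, the view name employee_directory, and the index name are assumptions made for illustration. A view plays the role of the external schema, the base table the conceptual schema, and an index stands in for a change at the internal level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, emp_contact TEXT)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Kiran', '555-0101')")

# External level: the application reads only through this view.
conn.execute("CREATE VIEW employee_directory AS SELECT emp_id, emp_name FROM employee")

def read_directory():
    return conn.execute(
        "SELECT emp_id, emp_name FROM employee_directory ORDER BY emp_id"
    ).fetchall()

before = read_directory()

# Conceptual-level change: a new attribute is added to the base table.
conn.execute("ALTER TABLE employee ADD COLUMN emp_address TEXT")
after_logical_change = read_directory()

# Internal-level change: an index alters how rows are located, not what they mean.
conn.execute("CREATE INDEX idx_employee_name ON employee (emp_name)")
after_physical_change = read_directory()

print(before == after_logical_change == after_physical_change)
```

The query against the view is written once and keeps returning the same rows after both changes, which is what the two definitions above promise.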


● Database Schema and instances

What is DBMS Schema?


A DBMS schema is the design (structure) of the database. For example, take an employee
table with the attributes EMP_ID, EMP_ADDRESS, EMP_NAME, and EMP_CONTACT. The table
name together with these attributes forms the schema of the employee table.

Schema is further divided into three types. These three are as follows.

1. Logical schema.

2. View schema.

3. Physical schema.

The schema defines the logical view of the database. It provides some knowledge about
the database and what data needs to go where.

In DBMS, the schema is usually shown in diagram format.

From the schema we can understand the relationships between the data present in the
database. With the help of the schema, we can implement DBMS operations such as delete,
insert, search, update, etc.

For example, a schema diagram with three tables, section, course, and student, shows the
attributes of each table and the relationship between the section and course tables.
The schema is only a structural view of the database; it contains no data.
1. Physical schema:

In the physical schema, the database is designed at the physical level. At this level, the
schema describes how the data blocks are stored and how the storage is managed.

2. Logical schema:

In the logical schema, the database is designed at a logical level. At this level, the
programmer and data administrator perform their work, and the data is described in a
structured way. The internal implementation details are hidden in the physical layer
for security purposes.

3. View schema:

In view schema, the database is designed at the view level. This schema describes the
user interaction with the database system.

Moreover, Data Definition Language (DDL) statements are used to specify the schema of a
database. The schema covers the names of the tables, the names of their attributes and
their types, and the constraints on the tables. Therefore, if users want
to modify the schema, they can write DDL statements.

What is DBMS Instance?

The data stored in the database at a particular moment in time is called an instance of
the database. The database schema defines the attributes of the database in the
particular DBMS; the values of those attributes at a particular moment in time form
an instance.

For example, take the schema of the employee table described above. If the table
currently holds two rows (records), those two records make up the instance of the
employee table at this moment.

Let's take another example: say we have a single table student in the database.
Today the table has 100 records, so today the instance of the database has 100
records. If we add another 100 records to this table by tomorrow, the
instance of the database tomorrow will have 200 records in the table. In short, the
data stored in the database at a particular moment is called the instance; it changes
over time as we add, delete or update data in the database.
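The student example can be reproduced directly. The following is a small sketch with Python's sqlite3 module; the helper names schema_of and instance_size are made up for this illustration, and the generated student names are placeholder data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

def schema_of(table):
    # PRAGMA table_info yields one row per column: index 1 is the name, index 2 the type.
    return [(col[1], col[2]) for col in conn.execute(f"PRAGMA table_info({table})")]

def instance_size(table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

# "Today": 100 records.
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(i, f"student{i}") for i in range(1, 101)])
schema_today, size_today = schema_of("student"), instance_size("student")

# "Tomorrow": another 100 records are added.
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(i, f"student{i}") for i in range(101, 201)])
schema_tomorrow, size_tomorrow = schema_of("student"), instance_size("student")

print(size_today, size_tomorrow)        # the instance changed
print(schema_today == schema_tomorrow)  # the schema did not
```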

Differences between Database Schema and Instance

Both of these help in describing the data available in a database, but there is a
fundamental difference between Schema and Instance in DBMS. Schema refers to the
overall description of any given database. Instance refers to the collection of
data and information that the database stores at any particular moment.

The major differences between schema and instance are as follows:

| Database Schema | Database Instance |
|---|---|
| It is the definition, or description, of the database. | It is a snapshot of a database at a specific moment. |
| It rarely changes. | It changes frequently. |
| It corresponds to a variable declaration in a programming language. | It corresponds to the value of the variable at a point in time. |
| It defines the basic structure of the database, i.e., how the data will be stored in the database. | It is the set of information stored at a particular time. |
| The schema is the same for the whole database. | Data in instances can be changed using addition, deletion, and update. |

● View of Data
See data abstraction below.
● Data abstraction
A database system contains intricate data structures and relationships. Developers
hide this complexity from users so that they can comfortably access the data in the
database, and only the data they need. This is done with the help of data abstraction.

The main purpose of data abstraction is to hide irrelevant data and provide an abstract
view of the data. With the help of data abstraction, developers hide irrelevant data from
the user and provide them the relevant data. By doing this, users can access the data
without any hassle, and the system will also work efficiently.

In DBMS, data abstraction is performed in layers, called the levels of data
abstraction. The database management system is designed around these levels.

Levels of Data Abstractions in DBMS


In DBMS, there are three levels of data abstraction, which are as follows:

1. Physical or Internal Level:

The physical or internal level is the lowest level of data abstraction in the database
management system. It defines how data is actually stored in the database and the
methods used to access it. It describes complex low-level data structures in detail,
so it is hard to understand, which is why it is kept hidden from the end user.

Database Administrators (DBAs) decide how to arrange data and where to store it. The
DBA's role is to manage the data in the database at the physical or internal level.
At this level the raw data is kept on storage devices, such as the hard drives of a
data center.

2. Logical or Conceptual Level:

The logical or conceptual level is the intermediate level of data abstraction. It
explains what data is going to be stored in the database and what the relationships
between the data are.

It describes the structure of the entire data in the form of tables. The logical or
conceptual level is less complex than the physical level. With the help of the logical
level, DBAs abstract the raw data present at the physical level.

3. View or External Level:

The view or external level is the highest level of data abstraction. There are different views
at this level, each defining a part of the overall data of the database. This level is for
end-user interaction; at this level, end users can access the data based on their queries.

Advantages of data abstraction in DBMS

● Users can easily access the data based on their queries.

● It provides security to the data stored in the database.

● Database systems work efficiently because of data abstraction.

● Database Languages
❖ DML
❖ DDL
❖ DCL(not in ppt)
Database Languages in DBMS
○ A DBMS has appropriate languages and interfaces to express database queries
and updates.

○ Database languages can be used to read, store and update the data in the
database.

Types of Database Languages

1. Data Definition Language (DDL)

○ DDL stands for Data Definition Language. It is used to define database structure
or pattern.
○ It is used to create schema, tables, indexes, constraints, etc. in the database.

○ Using the DDL statements, you can create the skeleton of the database.

○ Data definition language is used to store the information of metadata like the
number of tables and schemas, their names, indexes, columns in each table,
constraints, etc.

Here are some tasks that come under DDL:

○ Create: It is used to create objects in the database.

○ Alter: It is used to alter the structure of the database.

○ Drop: It is used to delete objects from the database.

○ Truncate: It is used to remove all records from a table.

○ Rename: It is used to rename an object.

○ Comment: It is used to add comments to the data dictionary.

These commands change the database schema, which is why they come under
Data Definition Language.
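The effect of the main DDL commands on the schema can be traced step by step. The sketch below uses Python's sqlite3 module with hypothetical table names (course, subject). SQLite is used only because it ships with Python; it has no TRUNCATE or COMMENT statement, so those two commands are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def table_names():
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [r[0] for r in rows]

def column_names(table):
    return [col[1] for col in conn.execute(f"PRAGMA table_info({table})")]

# CREATE: define a new object in the database.
conn.execute("CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL)")

# ALTER: change the structure of an existing table.
conn.execute("ALTER TABLE course ADD COLUMN credits INTEGER")
columns_after_alter = column_names("course")

# RENAME: give the object a new name (in SQLite this is a form of ALTER TABLE).
conn.execute("ALTER TABLE course RENAME TO subject")
tables_after_rename = table_names()

# DROP: remove the object and its definition.
conn.execute("DROP TABLE subject")
tables_after_drop = table_names()

print(columns_after_alter, tables_after_rename, tables_after_drop)
```

None of these statements touches individual rows; each one changes the metadata that describes the database.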

2. Data Manipulation Language (DML)


DML stands for Data Manipulation Language. It is used for accessing and manipulating
data in a database. It handles user requests.

Here are some tasks that come under DML:

○ Select: It is used to retrieve data from a database.

○ Insert: It is used to insert data into a table.

○ Update: It is used to update existing data within a table.

○ Delete: It is used to delete records from a table (all of them, or only those matching a condition).

○ Merge: It performs an UPSERT operation, i.e., an insert or an update.

○ Call: It is used to call a PL/SQL or Java subprogram.

○ Explain Plan: It describes the access path the DBMS will use to retrieve the data.

○ Lock Table: It controls concurrency.
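The four core DML commands can be tried on a small table. This is a minimal sketch with Python's sqlite3 module; the product table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price INTEGER)")

# INSERT: add rows to the table.
conn.executemany("INSERT INTO product VALUES (?, ?, ?)",
                 [(1, "pen", 10), (2, "notebook", 40), (3, "bag", 600)])

# UPDATE: change existing rows; the WHERE clause selects which ones.
updated = conn.execute("UPDATE product SET price = price + 5 WHERE price < 100").rowcount

# DELETE: remove only the rows that match the condition.
deleted = conn.execute("DELETE FROM product WHERE name = ?", ("bag",)).rowcount

# SELECT: retrieve what is left.
rows = conn.execute("SELECT id, name, price FROM product ORDER BY id").fetchall()
print(updated, deleted, rows)
```

Leaving out the WHERE clause in the UPDATE or DELETE would apply the change to every row of the table.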

3. Data Control Language (DCL)

○ DCL stands for Data Control Language. It is used to control access to the data
stored in the database by granting and revoking privileges.

○ Depending on the DBMS, DCL execution may be transactional, in which case it can be rolled back.

(In Oracle Database, however, the execution of data control language does not have the
feature of rolling back.)

Here are some tasks that come under DCL:

○ Grant: It is used to give a user access privileges to a database.

○ Revoke: It is used to take back permissions from the user.

The following privileges can be granted or revoked:

CONNECT, INSERT, USAGE, EXECUTE, DELETE, UPDATE and SELECT.
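The bookkeeping behind Grant and Revoke can be pictured with a toy model. The class below is not how any real DBMS stores privileges; it is a hypothetical in-memory sketch (the names PrivilegeTable, clerk, and student are made up) that shows the idea of a privilege set per user and object. The comments give the corresponding standard SQL statements.

```python
class PrivilegeTable:
    """Toy model of GRANT and REVOKE: maps (user, object) to a set of privileges."""

    def __init__(self):
        self._grants = {}

    def grant(self, user, obj, *privileges):
        held = self._grants.setdefault((user, obj), set())
        held.update(p.upper() for p in privileges)

    def revoke(self, user, obj, *privileges):
        held = self._grants.get((user, obj), set())
        held.difference_update(p.upper() for p in privileges)

    def is_allowed(self, user, obj, privilege):
        return privilege.upper() in self._grants.get((user, obj), set())

acl = PrivilegeTable()

# SQL equivalent: GRANT SELECT, INSERT ON student TO clerk;
acl.grant("clerk", "student", "SELECT", "INSERT")

# SQL equivalent: REVOKE INSERT ON student FROM clerk;
acl.revoke("clerk", "student", "INSERT")

print(acl.is_allowed("clerk", "student", "SELECT"))  # still granted
print(acl.is_allowed("clerk", "student", "INSERT"))  # taken back
```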

4. Transaction Control Language (TCL)


TCL is used to manage the changes made by DML statements. DML statements can be grouped
into a logical transaction.

Here are some tasks that come under TCL:

● Commit: It is used to save the transaction's changes permanently in the database.

● Rollback: It is used to restore the database to its state as of the last Commit.
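Commit and Rollback can be seen side by side in a short sketch. The example uses Python's sqlite3 module with a hypothetical account table. The connection is opened with isolation_level=None so that the BEGIN, COMMIT, and ROLLBACK statements are issued explicitly rather than by the driver.

```python
import sqlite3

# Autocommit mode: transactions start and end only where the code says so.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 500), (2, 100)")

def balances():
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))

# Transaction 1: two updates grouped into one logical unit, then made permanent.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 200 WHERE id = 1")
conn.execute("UPDATE account SET balance = balance + 200 WHERE id = 2")
conn.execute("COMMIT")
after_commit = balances()

# Transaction 2: a mistaken update is undone, restoring the last committed state.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = 0")
conn.execute("ROLLBACK")
after_rollback = balances()

print(after_commit, after_rollback)
```

After the rollback the balances are exactly those saved by the earlier commit.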
● Domain Constraints
Integrity Constraints
○ Integrity constraints are a set of rules. They are used to maintain the quality of
information.

○ Integrity constraints ensure that data insertion, updating, and other processes
are performed in such a way that data integrity is not affected.

○ Thus, integrity constraints are used to guard against accidental damage to the
database.

Types of Integrity Constraint

1. Domain constraints

○ Domain constraints can be defined as the definition of a valid set of values for an
attribute.

○ The data type of a domain includes string, character, integer, time, date, currency,
etc. The value of the attribute must be available in the corresponding domain.

Example: if AGE is declared as an integer attribute, the value 'A' is not allowed for it.

2. Entity integrity constraints

○ The entity integrity constraint states that a primary key value can't be null.

○ This is because the primary key value is used to identify individual rows in a
relation, and if the primary key has a null value, then we can't identify those rows.

○ A table can contain null values in fields other than the primary key field.

Example: a row whose primary key (say, EMP_ID) is null is not allowed.

3. Referential Integrity Constraints

○ A referential integrity constraint is specified between two tables.

○ Under a referential integrity constraint, if a foreign key in Table 1 refers to the
Primary Key of Table 2, then every value of the Foreign Key in Table 1 must be
null or be available in Table 2.

Example: if the foreign key D_No in Table 1 holds the value 18, a row with primary key 18
must exist in Table 2; otherwise the value is not allowed.

4. Key constraints

○ Keys are attributes that are used to identify an entity uniquely within its entity
set.

○ An entity set can have multiple keys, out of which one key is the primary
key. A primary key must contain unique values and cannot contain a null value.

Example: two rows with the same primary key value are not allowed.
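The four kinds of constraint can be checked against a live database. The sketch below uses Python's sqlite3 module; the department and employee tables, their columns, and the helper rejected are assumptions made for this illustration. Two SQLite specifics are worth noting: foreign keys are enforced only after PRAGMA foreign_keys = ON, and column types are not strictly enforced, so the domain rule is written as an explicit CHECK.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE department (d_no INTEGER PRIMARY KEY, d_name TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id TEXT PRIMARY KEY NOT NULL,"            # key + entity integrity
    " name TEXT NOT NULL,"
    " age INTEGER CHECK (typeof(age) = 'integer')," # domain constraint
    " d_no INTEGER REFERENCES department (d_no))"   # referential integrity
)
conn.execute("INSERT INTO department VALUES (11, 'Sales')")
conn.execute("INSERT INTO employee VALUES ('E1', 'Jack', 30, 11)")

def rejected(sql):
    """Return True if the DBMS refuses the statement with an integrity error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

results = {
    "domain": rejected("INSERT INTO employee VALUES ('E2', 'Harry', 'A', 11)"),
    "entity": rejected("INSERT INTO employee VALUES (NULL, 'Harry', 30, 11)"),
    "referential": rejected("INSERT INTO employee VALUES ('E3', 'Harry', 30, 18)"),
    "key": rejected("INSERT INTO employee VALUES ('E1', 'Harry', 30, 11)"),
    # A null foreign key is allowed, so this row is accepted.
    "valid_row": rejected("INSERT INTO employee VALUES ('E4', 'Harry', 30, NULL)"),
}
print(results)
```

Each of the first four statements violates exactly one rule, while the last one shows that a null foreign key is still acceptable.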
● Referential Integrity
See Referential Integrity Constraints above.
● Assertions
When a constraint involves two or more tables, the table constraint mechanism is sometimes
hard to use and the results may not come out as expected. To cover such situations, SQL supports
the creation of assertions, which are constraints not associated with only one table. An assertion
statement ensures that a certain condition always holds in the database. The DBMS checks the
assertion whenever modifications are made to the corresponding tables.

| S.No | Assertions | Triggers |
|---|---|---|
| 1. | We can use assertions when we know that a given condition must always be true. | We can use triggers even when a particular condition may or may not be true. |
| 2. | When the SQL condition is not met, there is a chance that an entire table or even the database gets locked up. | Triggers can catch errors if the condition of the query is not true. |
| 3. | Assertions are not linked to a specific table or event. They perform a task specified or defined by the user. | Triggers help in maintaining integrity constraints in the database tables, especially when primary key and foreign key constraints are not defined. |
| 4. | Assertions do not keep track of changes made in a table. | Triggers keep track of all changes that occur in a table. |
| 5. | Assertions have a small syntax compared to triggers. | Triggers have a large syntax that spells out every detail of the created trigger. |
| 6. | Modern databases generally do not use assertions. | Triggers are widely used in modern databases. |
| 7. | Purpose: enforce business rules and constraints. | Purpose: execute actions in response to data changes. |
| 8. | Activation: checked after a transaction completes. | Activation: fired by data changes during a transaction. |
| 9. | Granularity: applies to the entire database. | Granularity: applies to a specific table or view. |
| 10. | Syntax: uses SQL statements. | Syntax: uses procedural code (e.g. PL/SQL, T-SQL). |
| 11. | Error handling: causes the transaction to be rolled back. | Error handling: can ignore errors or handle them explicitly. |
| 12. | Assertions may slow down the performance of queries. | Triggers can impact the performance of data changes. |
| 13. | Assertions are easy to debug with SQL statements. | Triggers are more difficult to debug because of procedural code. |
| 14. | Examples: CHECK constraints, FOREIGN KEY constraints. | Examples: AFTER INSERT triggers, INSTEAD OF triggers. |
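Because most database products do not implement the standard CREATE ASSERTION statement, a condition that spans two tables is often approximated with a trigger. The sketch below does this in SQLite through Python's sqlite3 module; the product and order_line tables, the trigger name, and the rule itself (an order may not exceed the stock on hand) are hypothetical. The comment shows what the assertion might look like in standard SQL.

```python
import sqlite3

# Standard SQL form of the rule (not accepted by SQLite and most other systems):
#   CREATE ASSERTION order_within_stock CHECK (NOT EXISTS (
#       SELECT * FROM order_line o JOIN product p ON o.product_id = p.id
#       WHERE o.quantity > p.stock));

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, stock INTEGER)")
conn.execute(
    "CREATE TABLE order_line (id INTEGER PRIMARY KEY, product_id INTEGER, quantity INTEGER)"
)
conn.execute("INSERT INTO product VALUES (1, 5)")

# Trigger-based approximation: checked for each row inserted into order_line.
conn.execute("""
    CREATE TRIGGER order_within_stock
    BEFORE INSERT ON order_line
    WHEN NEW.quantity > (SELECT stock FROM product WHERE id = NEW.product_id)
    BEGIN
        SELECT RAISE(ABORT, 'quantity exceeds stock');
    END
""")

def place_order(product_id, quantity):
    try:
        conn.execute(
            "INSERT INTO order_line (product_id, quantity) VALUES (?, ?)",
            (product_id, quantity),
        )
        return "accepted"
    except sqlite3.DatabaseError as exc:
        return f"rejected: {exc}"

small_order = place_order(1, 3)
large_order = place_order(1, 8)
order_count = conn.execute("SELECT COUNT(*) FROM order_line").fetchone()[0]
print(small_order, "|", large_order, "|", order_count)
```

The trigger fires only on inserts into order_line. A true assertion would also be rechecked when the stock in product is lowered, which illustrates the granularity difference listed in row 9 of the table.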

● Authorization

Authentication
User authentication makes sure that the person accessing the database is
who they claim to be. Authentication can be done at the operating system level
or at the database level itself. Authentication systems such as retina
scanners and other biometric devices are used to make sure unauthorized people
cannot access the database.

Authorization
Authorization is a privilege provided by the Database Administrator. Users of the
database can only view the contents they are authorized to view. The rest of the
database is out of bounds to them.

The different permissions for authorization available are:

● Primary Permission - This is granted to users publicly and directly.
● Secondary Permission - This is granted to groups and automatically
awarded to a user if they are a member of the group.
● Public Permission - This is publicly granted to all the users.
● Context-sensitive Permission - This is related to sensitive content and
only granted to a select group of users.

The categories of authorization that can be given to users are:

● System Administrator - This is the highest administrative authorization
for a user. Users with this authorization can also execute some database
administrator commands such as restoring or upgrading a database.
● System Control - This is the highest control authorization for a user. This
allows maintenance operations on the database but not direct access to
data.
● System Maintenance - This is the lower level of system control authority.
It also allows users to maintain the database but within a database
manager instance.
● System Monitor - Using this authority, the user can monitor the database
and take snapshots of it.

● Data Models
❖ Relational model
The relational model represents how data is stored in Relational Databases. A relational
database consists of a collection of tables, each of which is assigned a unique name. Consider
a relation STUDENT with attributes ROLL_NO, NAME, ADDRESS, PHONE, and AGE shown in
the table.
Table Student

ROLL_NO  NAME    ADDRESS  PHONE       AGE
1        RAM     DELHI    9455123451  18
2        RAMESH  GURGAON  9652431543  18
3        SUJIT   ROHTAK   9156253131  20
4        SURESH  DELHI                18

Important Terminologies
● Attribute: Attributes are the properties that define an entity, e.g., ROLL_NO, NAME,
ADDRESS.
● Relation Schema: A relation schema defines the structure of the relation and represents
the name of the relation with its attributes, e.g., STUDENT (ROLL_NO, NAME,
ADDRESS, PHONE, AGE) is the relation schema for STUDENT. If a schema has
more than one relation, it is called a Relational Schema.
● Tuple: Each row in the relation is known as a tuple. The above relation contains 4 tuples,
one of which is shown as:

1 RAM DELHI 9455123451 18

● Relation Instance: The set of tuples of a relation at a particular instant of time is called
a relation instance. The STUDENT table above shows the relation instance of STUDENT at a
particular time. It can change whenever there is an insertion, deletion, or update in the
database.
● Degree: The number of attributes in the relation is known as the degree of the relation.
The STUDENT relation defined above has degree 5.
● Cardinality: The number of tuples in a relation is known as cardinality. The STUDENT
relation defined above has cardinality 4.
● Column: The column represents the set of values for a particular attribute. The column
ROLL_NO is extracted from the relation STUDENT.

ROLL_NO

1
2
3
4

● NULL Values: A value which is not known or is unavailable is called a NULL value. It is
represented by blank space, e.g., PHONE of the STUDENT having ROLL_NO 4 is NULL.
● Relation Key: These are the keys that are used to identify rows uniquely and to relate
tables to each other. They are of the following types.
● Primary Key
● Candidate Key
● Super Key
● Foreign Key
● Alternate Key
● Composite Key
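The terminology above can be checked against the STUDENT relation with a small sketch using Python's built-in sqlite3 module. The column types are assumptions (the notes do not state them), and Python's None stands for the NULL phone number of ROLL_NO 4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT "
    "(ROLL_NO INTEGER, NAME TEXT, ADDRESS TEXT, PHONE TEXT, AGE INTEGER)"
)
rows = [
    (1, "RAM", "DELHI", "9455123451", 18),
    (2, "RAMESH", "GURGAON", "9652431543", 18),
    (3, "SUJIT", "ROHTAK", "9156253131", 20),
    (4, "SURESH", "DELHI", None, 18),      # None is stored as NULL
]
conn.executemany("INSERT INTO STUDENT VALUES (?, ?, ?, ?, ?)", rows)

cursor = conn.execute("SELECT * FROM STUDENT")
degree = len(cursor.description)           # number of attributes
cardinality = len(cursor.fetchall())       # number of tuples

# One column: the set of values of a single attribute.
roll_no_column = [r[0] for r in conn.execute(
    "SELECT ROLL_NO FROM STUDENT ORDER BY ROLL_NO")]

# NULL is tested with IS NULL, not with '='.
null_phone = [r[0] for r in conn.execute(
    "SELECT ROLL_NO FROM STUDENT WHERE PHONE IS NULL")]
```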

Constraints in Relational Model


While designing the Relational Model, we define some conditions which must hold for the data
present in the database. These conditions are called constraints. They are checked before
performing any operation (insertion, deletion, or update) in the database. If any constraint is
violated, the operation fails.

Domain Constraints
These are attribute-level constraints. An attribute can only take values that lie inside its
domain. e.g., if a constraint AGE>0 is applied to the STUDENT relation, inserting a negative
value of AGE will fail.
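A minimal sketch of the AGE>0 domain constraint, assuming SQLite through Python's sqlite3 module and a cut-down STUDENT table. The helper name try_insert is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT (ROLL_NO INTEGER, NAME TEXT, AGE INTEGER CHECK (AGE > 0))"
)

def try_insert(roll_no, name, age):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO STUDENT VALUES (?, ?, ?)", (roll_no, name, age))
        return True
    except sqlite3.IntegrityError:
        return False

ok_positive = try_insert(1, "RAM", 18)     # inside the domain
ok_negative = try_insert(2, "RAMESH", -5)  # outside the domain, rejected
stored = conn.execute("SELECT COUNT(*) FROM STUDENT").fetchone()[0]
```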

Key Integrity
Every relation in the database should have at least one set of attributes that identifies a
tuple uniquely. Such a set of attributes is called a key. e.g., ROLL_NO in STUDENT is a key:
no two students can have the same roll number. So a key has two properties:
● It should be unique for all tuples.
● It can’t have NULL values.
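Both properties can be observed with a short sqlite3 sketch. It assumes SQLite, where NOT NULL has to be written explicitly because SQLite does not imply it for every PRIMARY KEY; the helper name insert_error is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (ROLL_NO INT NOT NULL PRIMARY KEY, NAME TEXT)")
conn.execute("INSERT INTO STUDENT VALUES (1, 'RAM')")

def insert_error(roll_no, name):
    """Return None if the insert succeeds, otherwise the constraint error text."""
    try:
        conn.execute("INSERT INTO STUDENT VALUES (?, ?)", (roll_no, name))
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

duplicate_error = insert_error(1, "RAMESH")   # ROLL_NO 1 already exists
null_error = insert_error(None, "SUJIT")      # key value is NULL
no_error = insert_error(2, "RAMESH")          # unique, non-NULL key
```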

Referential Integrity
When one attribute of a relation can only take values from another attribute of the same relation
or any other relation, it is called referential integrity. Suppose we have the following two
relations:
Table Student

ROLL_NO  NAME    ADDRESS  PHONE       AGE  BRANCH_CODE
1        RAM     DELHI    9455123451  18   CS
2        RAMESH  GURGAON  9652431543  18   CS
3        SUJIT   ROHTAK   9156253131  20   ECE
4        SURESH  DELHI                18   IT

Table Branch

BRANCH_CODE  BRANCH_NAME
CS           COMPUTER SCIENCE
IT           INFORMATION TECHNOLOGY
ECE          ELECTRONICS AND COMMUNICATION ENGINEERING
CV           CIVIL ENGINEERING

BRANCH_CODE of STUDENT can only take the values which are present in BRANCH_CODE
of BRANCH; this is called the referential integrity constraint. The relation which references
another relation is called the REFERENCING RELATION (STUDENT in this case) and the relation
to which other relations refer is called the REFERENCED RELATION (BRANCH in this case).

Anomalies in the Relational Model


An anomaly is an irregularity or something which deviates from the expected or normal state.
When designing databases, we identify three types of anomalies: Insert, Update, and Delete.

Insertion Anomaly in Referencing Relation

We can’t insert a row in the REFERENCING RELATION if the referencing attribute’s value is not
present among the referenced attribute’s values. e.g., inserting a student with BRANCH_CODE ‘ME’
in the STUDENT relation will result in an error because ‘ME’ is not present in BRANCH_CODE of
BRANCH.
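The rejected insert can be reproduced with a small sqlite3 sketch. It assumes SQLite, which enforces foreign keys only after PRAGMA foreign_keys = ON; the student name AMIT and the helper is_rejected are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE BRANCH (BRANCH_CODE TEXT PRIMARY KEY, BRANCH_NAME TEXT);
    CREATE TABLE STUDENT (
        ROLL_NO     INTEGER PRIMARY KEY,
        NAME        TEXT,
        BRANCH_CODE TEXT REFERENCES BRANCH (BRANCH_CODE)
    );
    INSERT INTO BRANCH VALUES ('CS', 'COMPUTER SCIENCE'), ('IT', 'INFORMATION TECHNOLOGY');
""")

def is_rejected(roll_no, name, branch_code):
    """Return True if the insert violates referential integrity."""
    try:
        conn.execute("INSERT INTO STUDENT VALUES (?, ?, ?)", (roll_no, name, branch_code))
        return False
    except sqlite3.IntegrityError:
        return True

cs_rejected = is_rejected(1, "RAM", "CS")    # 'CS' exists in BRANCH
me_rejected = is_rejected(5, "AMIT", "ME")   # 'ME' does not exist in BRANCH
```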

Deletion/ Updation Anomaly in Referenced Relation:

We can’t delete or update a row of the REFERENCED RELATION if the value of its REFERENCED
ATTRIBUTE is used as a value of the REFERENCING ATTRIBUTE. e.g., if we try to delete the tuple
from BRANCH having BRANCH_CODE ‘CS’, it will result in an error because ‘CS’ is referenced
by BRANCH_CODE of STUDENT. If we try to delete the row from BRANCH with
BRANCH_CODE ‘CV’, it will be deleted, as the value is not used by the referencing relation.
This situation can be handled by the following methods:

On Delete Cascade
It deletes the tuples from the REFERENCING RELATION if the value used by the REFERENCING
ATTRIBUTE is deleted from the REFERENCED RELATION. e.g., if we delete the row from
BRANCH with BRANCH_CODE ‘CS’, the rows in the STUDENT relation with BRANCH_CODE ‘CS’
(ROLL_NO 1 and 2 in this case) will be deleted.

On Update Cascade
It updates the REFERENCING ATTRIBUTE in the REFERENCING RELATION if the attribute
value used by the REFERENCING ATTRIBUTE is updated in the REFERENCED RELATION. e.g., if
we update the row of BRANCH with BRANCH_CODE ‘CS’ to ‘CSE’, the rows in the STUDENT
relation with BRANCH_CODE ‘CS’ (ROLL_NO 1 and 2 in this case) will be updated to
BRANCH_CODE ‘CSE’.
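Both cascade options can be sketched on the STUDENT and BRANCH data with sqlite3. This assumes SQLite with foreign keys enabled and uses a cut-down STUDENT table; the helper student_branches is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE BRANCH (BRANCH_CODE TEXT PRIMARY KEY, BRANCH_NAME TEXT);
    CREATE TABLE STUDENT (
        ROLL_NO     INTEGER PRIMARY KEY,
        NAME        TEXT,
        BRANCH_CODE TEXT REFERENCES BRANCH (BRANCH_CODE)
                    ON DELETE CASCADE ON UPDATE CASCADE
    );
    INSERT INTO BRANCH VALUES
        ('CS', 'COMPUTER SCIENCE'), ('IT', 'INFORMATION TECHNOLOGY'),
        ('ECE', 'ELECTRONICS AND COMMUNICATION ENGINEERING');
    INSERT INTO STUDENT VALUES
        (1, 'RAM', 'CS'), (2, 'RAMESH', 'CS'), (3, 'SUJIT', 'ECE'), (4, 'SURESH', 'IT');
""")

def student_branches():
    return conn.execute(
        "SELECT ROLL_NO, BRANCH_CODE FROM STUDENT ORDER BY ROLL_NO").fetchall()

# ON UPDATE CASCADE: renaming the referenced value renames it in STUDENT too.
conn.execute("UPDATE BRANCH SET BRANCH_CODE = 'CSE' WHERE BRANCH_CODE = 'CS'")
after_update = student_branches()

# ON DELETE CASCADE: deleting the branch deletes the students that reference it.
conn.execute("DELETE FROM BRANCH WHERE BRANCH_CODE = 'CSE'")
after_delete = student_branches()
```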

Super Keys
Any set of attributes that allows us to identify unique rows (tuples) in a given relation is
known as a super key. Among these super keys, the minimal ones (those from which no attribute
can be removed without losing uniqueness) are known as candidate keys, and one of them is
chosen as the primary key. If a combination of two or more attributes is used as the primary
key, we call it a composite key.
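As a rough illustration, super keys and candidate keys can be enumerated for one instance of a relation. The sketch below is an assumption-laden check, not a definition: a key is a property of the schema, and uniqueness in a single instance can only rule a key out. The fifth row is hypothetical and was added so that only ROLL_NO stays unique.

```python
from itertools import combinations

ATTRIBUTES = ("ROLL_NO", "NAME", "ADDRESS", "AGE")
ROWS = [
    (1, "RAM", "DELHI", 18),
    (2, "RAMESH", "GURGAON", 18),
    (3, "SUJIT", "ROHTAK", 20),
    (4, "SURESH", "DELHI", 18),
    (5, "RAM", "DELHI", 18),   # hypothetical row: same as ROLL_NO 1 except the key
]

def is_super_key(attrs):
    """True if no two rows agree on every attribute in attrs."""
    idx = [ATTRIBUTES.index(a) for a in attrs]
    projected = [tuple(row[i] for i in idx) for row in ROWS]
    return len(set(projected)) == len(ROWS)

super_keys = [
    set(combo)
    for size in range(1, len(ATTRIBUTES) + 1)
    for combo in combinations(ATTRIBUTES, size)
    if is_super_key(combo)
]

# A candidate key is a super key that contains no smaller super key.
candidate_keys = [k for k in super_keys if not any(other < k for other in super_keys)]
```

Every attribute set that contains ROLL_NO is a super key here, but only {ROLL_NO} itself is minimal.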

Codd Rules in Relational Model


Edgar F. Codd proposed the relational database model and stated a set of rules for it, now
known as Codd’s Rules. For a database system to be considered fully relational, it has to follow
these rules.

Advantages of the Relational Model


● Simple model: The Relational Model is simple and easy to use in comparison to other
data models.
● Flexible: The Relational Model is more flexible than other data models.
● Secure: The Relational Model is more secure than other data models.
● Data Accuracy: Data is more accurate in the relational data model.
● Data Integrity: The integrity of the data is maintained in the relational model.
● Operations can be Applied Easily: It is easier to perform operations in the relational
model.
Disadvantages of the Relational Model
● The Relational Database Model is not very good for large databases.
● Sometimes, it becomes difficult to find the relation between tables.
● Because of the complex structure, the response time for queries can be high.

Characteristics of the Relational Model


❖ Data is represented in rows and columns, in tables called relations.
❖ Data is stored in tables that have relationships between them; this is why the
model is called the Relational model.
❖ The relational model supports operations like data definition, data
manipulation, and transaction management.
❖ Each column has a distinct name and represents an attribute.
❖ Each row represents a single entity.

❖ ER model
❖ ER model stands for Entity-Relationship model. It is a high-level data
model. This model is used to define the data elements and relationships for
a specified system.

❖ It develops a conceptual design for the database. It also develops a very
simple and easy to design view of data.

❖ In ER modeling, the database structure is portrayed as a diagram called an
entity-relationship diagram.

For example, Suppose we design a school database. In this database, the student will
be an entity with attributes like address, name, id, age, etc. The address can be another
entity with attributes like city, street name, pin code, etc and there will be a relationship
between them.
Components of ER Diagram
1. Entity:

An entity may be any object, class, person or place. In the ER diagram, an entity is
represented as a rectangle.

Consider an organization as an example: manager, product, employee, department, etc.
can be taken as entities.
a. Weak Entity

An entity that depends on another entity is called a weak entity. A weak entity doesn't
contain any key attribute of its own. It is represented by a double rectangle.

2. Attribute

An attribute is used to describe a property of an entity. An ellipse is used to represent
an attribute.

For example, id, age, contact number, name, etc. can be attributes of a student.

a. Key Attribute

The key attribute is used to represent the main characteristics of an entity. It represents
a primary key. The key attribute is represented by an ellipse with the text underlined.

b. Composite Attribute

An attribute that is composed of many other attributes is known as a composite attribute.
The composite attribute is represented by an ellipse, and its component attributes are drawn
as further ellipses connected to that ellipse.

c. Multivalued Attribute

An attribute can have more than one value. Such an attribute is known as a multivalued
attribute. A double oval is used to represent a multivalued attribute.
For example, a student can have more than one phone number.

d. Derived Attribute

An attribute that can be derived from other attributes is known as a derived attribute. It
is represented by a dashed ellipse.

For example, a person's age changes over time and can be derived from another
attribute like date of birth.
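A derived attribute is computed rather than stored. As a minimal sketch in Python (the function name age_on is hypothetical), age can be derived from date of birth for any given day:

```python
from datetime import date

def age_on(date_of_birth, today):
    """Derive the age in completed years from the stored date of birth."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not been reached yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years

day_before_birthday = age_on(date(2000, 6, 15), date(2024, 6, 14))
on_birthday = age_on(date(2000, 6, 15), date(2024, 6, 15))
```

Storing only the date of birth avoids having to update a stored age every year.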

3. Relationship

A relationship is used to describe the relation between entities. A diamond or rhombus is
used to represent the relationship.

Types of relationship are as follows:

a. One-to-One Relationship

When each instance of one entity is associated with only one instance of the other entity,
the relationship is known as a one-to-one relationship.

For example, a female can marry one male, and a male can marry one female.

b. One-to-many relationship

When only one instance of the entity on the left and more than one instance of the entity
on the right are associated with the relationship, it is known as a one-to-many
relationship.

For example, a scientist can make many inventions, but each invention is made by one
specific scientist.

c. Many-to-one relationship

When more than one instance of the entity on the left and only one instance of the entity
on the right are associated with the relationship, it is known as a many-to-one
relationship.

For example, a student enrolls for only one course, but a course can have many students.

d. Many-to-many relationship

When more than one instance of the entity on the left and more than one instance of the
entity on the right are associated with the relationship, it is known as a many-to-many
relationship.

For example, an employee can be assigned to many projects, and a project can have many
employees.
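In a relational database, a many-to-many relationship such as the employee and project example is usually stored in a separate table that holds one row per pairing. The sketch below assumes SQLite; the table name WORKS_ON and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (EMP_ID INTEGER PRIMARY KEY, NAME TEXT);
    CREATE TABLE PROJECT  (PROJ_ID INTEGER PRIMARY KEY, TITLE TEXT);
    -- One row per (employee, project) pairing.
    CREATE TABLE WORKS_ON (
        EMP_ID  INTEGER REFERENCES EMPLOYEE (EMP_ID),
        PROJ_ID INTEGER REFERENCES PROJECT (PROJ_ID),
        PRIMARY KEY (EMP_ID, PROJ_ID)
    );
    INSERT INTO EMPLOYEE VALUES (1, 'A'), (2, 'B');
    INSERT INTO PROJECT  VALUES (10, 'P'), (20, 'Q');
    INSERT INTO WORKS_ON VALUES (1, 10), (1, 20), (2, 10);
""")

# One employee on many projects ...
projects_of_1 = [r[0] for r in conn.execute(
    "SELECT PROJ_ID FROM WORKS_ON WHERE EMP_ID = 1 ORDER BY PROJ_ID")]
# ... and one project with many employees.
employees_on_10 = [r[0] for r in conn.execute(
    "SELECT EMP_ID FROM WORKS_ON WHERE PROJ_ID = 10 ORDER BY EMP_ID")]
```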

Notation of ER diagram
Database can be represented using the notations. In ER diagram, many notations are
used to express the cardinality. These notations are as follows:
Cardinality
Cardinality describes how the entities in a relationship set are related to each other. In a
Database Management System, cardinality is a number that denotes how many times an entity
participates with another entity in a relationship set. Cardinality is a very important
attribute in representing the structure of a database. For a table, cardinality means the
number of rows or tuples.

Cardinality Ratio
Cardinality ratio is also called Cardinality Mapping, which represents the mapping of
one entity set to another entity set in a relationship set. We generally take the example
of a binary relationship set where two entities are mapped to each other.

Cardinality is very important in the databases of various businesses. For example, if we
want to track the purchase history of each customer, then we can use the one-to-many
cardinality to find the data of a specific customer. The cardinality model can be used in
databases by database managers for a variety of purposes, but corporations often use
it to evaluate customer or inventory data.

There are four types of Cardinality Mapping in Database Management Systems:

1. One to one

2. Many to one

3. One to many

4. Many to many

One to One

One to one cardinality is represented by a 1:1 symbol. In this, there is at most one
relationship from one entity to another entity. There are a lot of examples of one-to-one
cardinality in real life databases.

For example, one student can have only one student id, and one student id can belong
to only one student. So, the relationship mapping between student and student id will be
one to one cardinality mapping.

Another example is the relationship between the director of the school and the school
because one school can have a maximum of one director, and one director can belong
to only one school.

Note: in one-to-one cardinality it is not necessary that every entity in an entity set takes
part in the mapping. Some entities may not participate.

Many to One Cardinality:

In many-to-one cardinality mapping, multiple entities of set 1 can be related to a single
entity of set 2. Seen from set 2, one entity can be related to more than one entity of
set 1.

One-to-one cardinality is a special case (subset) of many-to-one cardinality. Many-to-one
is represented by M:1.

For example, there are multiple patients in a hospital who are served by a single doctor,
so the relationship between patients and doctors can be represented by many-to-one
cardinality.

One to Many Cardinalities:

In one-to-many cardinality mapping, a single entity of set 1 can be related to one or more
entities of set 2. Seen from set 2, more than one entity can be related to the same single
entity of set 1.

One-to-one cardinality is a special case (subset) of one-to-many cardinality. One-to-many
is represented by 1:M.

For example, in a hospital, there can be various compounders, so the relationship
between the hospital and compounders can be mapped through one-to-many
cardinality.

Many to Many Cardinalities:

In many-to-many cardinality mapping, one or more entities of set 1 can be associated with
one or more entities of set 2. In the same way, seen from set 2, one or more entities can
be related to one or more entities of set 1.

It is represented by M:N or N:M.

One-to-one, one-to-many, and many-to-one cardinalities are special cases (subsets) of
many-to-many cardinality.

For example, in a college, multiple students can work on a single project, and a single
student can also work on multiple projects. So, the relationship between the project and
the student can be represented by many-to-many cardinality.
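The four mappings can be told apart mechanically from a list of related pairs. The following sketch (the function name cardinality_ratio and the sample identifiers are hypothetical) classifies what a given set of pairs exhibits. It describes the observed data only; the cardinality that a design allows is a modeling decision.

```python
from collections import defaultdict

def cardinality_ratio(pairs):
    """Classify (set 1 entity, set 2 entity) pairs as '1:1', '1:M', 'M:1' or 'M:N'."""
    rights_per_left = defaultdict(set)
    lefts_per_right = defaultdict(set)
    for left, right in pairs:
        rights_per_left[left].add(right)
        lefts_per_right[right].add(left)
    # Does some set 1 entity relate to several set 2 entities, and vice versa?
    left_has_many = any(len(r) > 1 for r in rights_per_left.values())
    right_has_many = any(len(l) > 1 for l in lefts_per_right.values())
    if left_has_many and right_has_many:
        return "M:N"
    if left_has_many:
        return "1:M"
    if right_has_many:
        return "M:1"
    return "1:1"

student_ids = cardinality_ratio([("student1", "id1"), ("student2", "id2")])
patients = cardinality_ratio([("patient1", "doctor1"), ("patient2", "doctor1")])
compounders = cardinality_ratio([("hospital", "comp1"), ("hospital", "comp2")])
projects = cardinality_ratio([("s1", "p1"), ("s2", "p1"), ("s1", "p2")])
```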

Appropriate Mapping Cardinality
The appropriate mapping cardinality for a specific relationship set is determined by the
real-world context that the relationship set models.

❖ If the cardinality is one-to-many or many-to-one, the relationship table can be
combined with the table of the entity on the "many" side.

❖ If the relationship is one-to-one, one entity can be combined with the relationship
table if that entity has total participation, and both entities can be combined with
their relationship to form a single table if both of them have total participation.

❖ If the cardinality is many-to-many, no two tables can be combined.

❖ Keys play an important role in the relational database.

❖ A key is used to uniquely identify any record or row of data in a table. It is
also used to establish and identify relationships between tables.

For example, ID is used as a key in the Student table because it is unique for each
student. In the PERSON table, passport_number, license_number, and SSN are keys since
they are unique for each person.

Types of keys:

1. Primary key
○ It is the key chosen to identify one and only one instance of an entity uniquely.
An entity can contain multiple keys, as we saw in the PERSON table. The key
which is most suitable from that list becomes the primary key.

○ In the EMPLOYEE table, ID can be the primary key since it is unique for each
employee. In the EMPLOYEE table, we could even select License_Number or
Passport_Number as the primary key since they are also unique.

○ For each entity, the choice of primary key depends on the requirements and on the
developers.

2. Candidate key

○ A candidate key is a minimal attribute or set of attributes that can uniquely identify a
tuple.

○ The primary key is chosen from among the candidate keys. The candidate keys that are
not chosen are as strong as the primary key.

For example: In the EMPLOYEE table, ID is best suited for the primary key. The other
unique attributes, like SSN, Passport_Number, License_Number, etc., are also
candidate keys.

3. Super Key

A super key is an attribute set that can uniquely identify a tuple. A super key is a superset
of a candidate key.

For example: In the above EMPLOYEE table, for (EMPLOYEE_ID, EMPLOYEE_NAME), the
names of two employees can be the same, but their EMPLOYEE_ID can't be the same.
Hence, this combination can also be a key.

Super keys would be EMPLOYEE_ID, (EMPLOYEE_ID, EMPLOYEE_NAME), etc.

4. Foreign key

○ A foreign key is a column of a table that points to the primary key of
another table.

○ Every employee works in a specific department in a company, and employee and
department are two different entities. So we should not store the department's
information in the employee table. Instead, we link the two tables through
the primary key of one table.

○ We add the primary key of the DEPARTMENT table, Department_Id, as a new
attribute in the EMPLOYEE table.

○ In the EMPLOYEE table, Department_Id is the foreign key, and the two tables are
related through it.

5. Alternate key

There may be one or more attributes or combinations of attributes that uniquely
identify each tuple in a relation. These attributes or combinations of attributes are
called the candidate keys. One key is chosen as the primary key from these candidate
keys, and each remaining candidate key, if any exists, is termed an alternate key. In other
words, the total number of alternate keys is the total number of candidate keys
minus one (the primary key). Alternate keys may or may not exist. If there is only one
candidate key in a relation, it does not have an alternate key.

For example, an employee relation has two attributes, Employee_Id and PAN_No, that act
as candidate keys. In this relation, Employee_Id is chosen as the primary key, so the
other candidate key, PAN_No, acts as the alternate key.

6. Composite key

Whenever a primary key consists of more than one attribute, it is known as a composite
key. This key is also known as a concatenated key.

For example, in an employee relation, we assume that an employee may be assigned
multiple roles, and an employee may work on multiple projects simultaneously. So the
primary key will be composed of all three attributes, namely Emp_ID, Emp_role, and
Proj_ID, in combination. These attributes act as a composite key since the primary key
comprises more than one attribute.
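A composite key makes the combination of values unique, not each attribute on its own. The sketch below assumes SQLite; the table name ASSIGNMENT and the role names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ASSIGNMENT (
        Emp_ID   INTEGER NOT NULL,
        Emp_role TEXT    NOT NULL,
        Proj_ID  INTEGER NOT NULL,
        PRIMARY KEY (Emp_ID, Emp_role, Proj_ID)
    )
""")
# Individual values may repeat as long as the full combination differs.
conn.execute("INSERT INTO ASSIGNMENT VALUES (1, 'developer', 10)")
conn.execute("INSERT INTO ASSIGNMENT VALUES (1, 'tester', 10)")
conn.execute("INSERT INTO ASSIGNMENT VALUES (1, 'developer', 20)")

try:
    # The same (Emp_ID, Emp_role, Proj_ID) combination again.
    conn.execute("INSERT INTO ASSIGNMENT VALUES (1, 'developer', 10)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM ASSIGNMENT").fetchone()[0]
```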

7. Artificial key

Keys created using arbitrarily assigned data are known as artificial keys. These keys
are created when a primary key is large and complex and has no relationship with many
other relations. The data values of artificial keys are usually numbered in serial
order.
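Many systems can generate such serial values automatically. As a sketch, assuming SQLite and its AUTOINCREMENT keyword, with a hypothetical ENROLMENT table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ENROLMENT (
        ENROLMENT_ID INTEGER PRIMARY KEY AUTOINCREMENT,  -- artificial key
        STUDENT_NAME TEXT,
        COURSE       TEXT
    )
""")

generated_ids = []
for name, course in [("RAM", "DBMS"), ("RAMESH", "DBMS"), ("RAM", "OS")]:
    cursor = conn.execute(
        "INSERT INTO ENROLMENT (STUDENT_NAME, COURSE) VALUES (?, ?)", (name, course))
    generated_ids.append(cursor.lastrowid)   # key value assigned by the database
```

The key values carry no meaning of their own; they only number the rows in the order they were inserted.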

❖ Object based data Model


Need of Object Oriented Data Model:
To represent complex real world problems, there was a need for a data model that is closely
related to the real world. The Object Oriented Data Model represents real world problems easily.
Object Oriented Data Model:
In the Object Oriented Data Model, data and their relationships are contained in a single
structure, which is referred to as an object. Real world problems are represented as
objects with different attributes. Objects can have multiple relationships between them.
Basically, it is a combination of Object Oriented programming and the Relational Database
Model, as is clear from the following figure:
Object Oriented Data Model
= Combination of Object Oriented Programming + Relational database model
Components of Object Oriented Data Model :
Basic Object Oriented Data Model

● Objects –
An object is an abstraction of a real world entity, or we can say it is an instance of a class.
Objects encapsulate data and code into a single unit, which provides data abstraction by
hiding the implementation details from the user. For example: instances of student,
doctor, engineer in the above figure.

● Attribute –
An attribute describes a property of an object. For example: the object is STUDENT and its
attributes are Roll no and Branch in the Student class.

● Methods –
A method represents the behavior of an object. Basically, it represents a real-world
action. For example: setting a STUDENT's marks, shown in the above figure as Setmarks().

● Class –
A class is a collection of similar objects with shared structure, i.e. attributes, and
behavior, i.e. methods. An object is an instance of a class. For example: Person, Student,
Doctor, Engineer in the above figure.

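class student
{
    char Name[20];   // members are private by default
    int roll_no;
    // ... (further members elided in the original)
public:
    void search();   // member functions are declared here and defined elsewhere
    void update();
};
In this example, student refers to the class, and S1, S2 are objects of the class, which can be
created in the main function.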
● Inheritance –
By using inheritance, a new class can inherit the attributes and methods of an old class,
i.e. the base class. For example: the classes Student, Doctor and Engineer are inherited from
the base class Person.
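The components above can be put together in a small sketch. It is written in Python rather than the C++-style notation of the class example, and the attribute and method names (describe, set_marks) are assumptions chosen to mirror the Person and Student classes described in the notes.

```python
class Person:
    """Base class: attributes and methods shared by every kind of person."""

    def __init__(self, name):
        self.name = name                      # attribute

    def describe(self):                       # method
        return f"{type(self).__name__}: {self.name}"


class Student(Person):
    """Inherits name and describe() from Person and adds its own members."""

    def __init__(self, name, roll_no, branch):
        super().__init__(name)
        self.roll_no = roll_no
        self.branch = branch
        self.marks = None

    def set_marks(self, marks):
        self.marks = marks


class Doctor(Person):
    """Reuses everything from Person unchanged."""


s1 = Student("RAM", 1, "CS")                  # s1 is an object (instance) of Student
s1.set_marks(85)
d1 = Doctor("SUJIT")
```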

Advantages of Object Oriented Data Model :


● Code can be reused due to inheritance.
● Easily understandable.
● The cost of maintenance can be reduced due to the reusability of attributes and functions
through inheritance.
Disadvantages of Object Oriented Data Model :
❖ It is not yet fully developed, so it is not easily accepted by users.

❖ Semi structured data model


The semi-structured data model is designed as an evolution of the
relational data model that allows the representation of data with a
flexible structure. Some items may have missing attributes, others
may have extra attributes, and some items may have two or more
occurrences of the same attribute. The type of an attribute is also
flexible: it may be an atomic value, or it may be another record or
collection. Moreover, collections may be heterogeneous, i.e., they
may contain items with different structures. The semi-structured
data model is a self-describing data model, in which the data values
and the schema components co-exist.
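As an informal illustration of this flexibility, the sketch below holds a heterogeneous collection as Python dictionaries and serializes it as JSON, one common semi-structured format. The second phone number and the helper attribute_names are hypothetical.

```python
import json

students = [
    {"roll_no": 1, "name": "RAM", "phone": "9455123451"},
    {"roll_no": 4, "name": "SURESH"},                       # missing attribute
    {
        "roll_no": 2,
        "name": "RAMESH",
        "phone": ["9652431543", "9000000000"],              # repeated attribute
        "address": {"city": "GURGAON"},                     # nested record
    },
]

def attribute_names(items):
    """Recover the schema from the data itself: the union of attribute names."""
    names = set()
    for item in items:
        names.update(item)
    return sorted(names)

schema = attribute_names(students)
round_tripped = json.loads(json.dumps(students))   # names travel with the values
```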
● Database Design
❖ Database application
Nowadays, any business that has small or large amounts of data needs a database to
store and manage the information. A database is an easy, reliable, secure, and
efficient way to maintain business information. There are many applications where
databases are used. Some of them are discussed below:

1. Universities:

Universities are an obvious application of databases. They have a great deal of data that
can be stored in a database, such as student information, teacher information,
non-teaching staff information, course information, section information, grade report
information, and much more. University information is kept safe and secure in the
database.

Anyone who needs information about the student, teacher, or course can easily retrieve
it from the database. Everything needs to be maintained because even after ten years,
information may be required, and the information may be useful, so maintaining
complete information is the primary responsibility of any university or educational
institution.

2. Banking:
It is one of the major applications of databases. Banks have a huge amount of data as
millions of people have accounts that need to be maintained properly. The database
keeps the record of each user in a systematic manner. Banking databases store a lot of
information about account holders. It stores customer details, asset details, banking
transactions, balance sheets, credit card and debit card details, loans, fixed deposits,
and much more. Everything is maintained with the help of a database.

3. Railway Reservation System:

Railway reservation is another essential area of application of databases. They store
information such as passenger name, mobile number, booking status, reservation details,
train schedule, employee information, account details, seating arrangement, route &
alternate route details, etc. All the information needs to be maintained, so railways use a
database management system for efficient storage and retrieval.

4. Social Media Sites:

Nowadays, everyone has a smartphone and accounts on various social media sites like
Facebook, LinkedIn, Pinterest, Twitter, Instagram, etc. People can chat with their friends
and family and make new friends from all over the world. Social media has millions of
accounts, which means they have a huge amount of data that needs to be stored and
maintained. Social media sites use databases to store information about users, images,
videos, chats, etc.

5. Library Management System:

There are hundreds and thousands of books in the library, so it is not easy to maintain
the records of the books in a register or diary, so a database management system is
used which maintains the information of the library efficiently. The library database
stores information like book name, issue date, author name, book availability, book
issuer name, book return details, etc.

6. E-commerce Websites:

E-commerce websites are one of the prominent applications of the database. Websites
such as Flipkart, Myntra, Amazon, Nykaa, Snapdeal, Shopify, and many more, are online
shopping websites where people buy items online. These websites have so much data.
These websites use databases to securely store and maintain customer details, product
details, dealer details, purchase details, bank & card details, transactions details, invoice
details, etc. You can analyze the sales and maintain the inventory with the help of a
database.

7. Medical:

There is a lot of important data collection in the medical field, so it is necessary to use
the database to store data related to the medical field, such as patient details, medicine
details, practitioner details, surgeon details, appointment details, doctor schedule,
patient discharge details, payment detail, invoices, and other medical records. The
database management system is a boon for the medical field because it helps doctors
to monitor their patients and provide better care.

8. Accounting and Finance:

When there is big data regarding accounting and finance, there is a need to maintain a
large amount of data, which is done with the help of a database. The database stores
data such as accounting details, bank details, purchases of stocks, invoice details, sales
records, asset details, etc. Accounting and finance database helps in maintaining and
analyzing historical data.

9. Industries:

The database management system is the main priority of industries because they need
to store huge amounts of data. The industry database stores customer details, sales
records, product lists, transactions, etc. All the information is kept secure and
maintained by the database.

10. Airline Reservation System:

It is one of the applications of database management systems that contain data such
as passenger name, passenger check-in, passenger departure, flight schedule, number
of flights, distance from source to destination, reservation information, pilot details,
accounting detail, route detail, etc. The database provides maintenance and security to
airline data.

11. Telecommunication:

We cannot deny that telecommunication has brought a remarkable revolution
worldwide. The telecom field has huge data, and it is very difficult to manage big data
without a database; that is why a telecom database is required, which stores data such
as customer names, phone numbers, calling details, prepaid & post-paid connection
records, network usage, bill details, balance details, etc.

12. Manufacturing:

In the manufacturing field, a lot of data needs to be maintained regarding supply chain
management, so the database maintains the data such as product details, customer
information, order details, purchase details, payment info, worker's details, invoice, etc.
Manufacturing companies produce and supply products every day, so it is important to
use a database.

13. Human Resource Management:

Any organization will definitely have employees, and if there are a large number of
employees, then it becomes essential to store data in a database as it maintains and
securely saves the data, which can be retrieved and accessed when required. The
human resource database stores data such as employee name, joining details,
designation, salary details, tax information, benefits & goodies details, etc.

14. Broadcasting:

Broadcasting is distributing video and audio content to a dispersed audience by
television, radio, or other means. A broadcasting database stores data such as subscriber
information, event recordings, event schedules, etc., so it becomes important to store
broadcasting data in the database.

15. Insurance:

An insurance company needs a database to store large amounts of data. An insurance
database stores data such as policy details, user details, buyer details, payment details,
nominee details, address details, etc.

❖ Phases of database design

1. Requirements Analysis

Requirements analysis is the first phase that focuses on understanding the goals
and requirements for the database. It involves:
● Identifying the purpose and scope of the database – what data needs to be
stored and why.
● Determining what applications will use the database – this helps anticipate
future data access needs.
● Interviewing stakeholders and users to understand reporting, analysis, and
other requirements.
● Identifying crucial entities, attributes, relationships, and constraints for the
data model.
Thorough requirements gathering helps design a database that contains the right
data elements to meet business objectives.

2. Conceptual Data Modeling

The conceptual data model phase focuses on identifying the highest-level
relationships between the main entities or objects in the application domain. The
conceptual model is independent of any implementation concerns.
The following steps are involved in conceptual data modeling:
● Using business requirements to identify core entities.
● Determining the attributes for each entity – these are characteristics that
describe or qualify the entity.
● Defining relationships between entities, such as one-to-one, one-to-many,
or many-to-many.
● Representing the entities and relationships using modeling methodologies
like entity-relationship diagrams.
Its end goal is to create a high-level abstract model representing the overall
structure of the data.

3. Logical Database Design


Logical design converts the conceptual model into a more technical map of the
database, focused on structures. Steps include:
● Mapping conceptual model entities and attributes to database tables and
columns.
● Establishing primary keys and foreign keys to represent relationships and
enforce referential integrity.
● Normalizing the table structure through techniques like removing
redundant attributes.
● Defining indexes, partitions, and other constructs to optimize performance.
The logical design introduces database-specific concepts while maintaining a
platform-independent perspective.
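
The role of primary and foreign keys can be seen in a small runnable sketch. It uses Python's built-in sqlite3 module only because it needs no setup; the Department and Employee tables and their columns are hypothetical names chosen for illustration, not part of these notes.

```python
import sqlite3

# Hypothetical schema for illustration: Department and Employee are assumed names.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("""
    CREATE TABLE Department (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE Employee (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        dept_id  INTEGER NOT NULL REFERENCES Department(dept_id)
    )
""")

conn.execute("INSERT INTO Department VALUES (1, 'Sales')")
conn.execute("INSERT INTO Employee VALUES (100, 'Asha', 1)")


def insert_orphan_employee():
    """Try to insert an employee whose department does not exist."""
    try:
        conn.execute("INSERT INTO Employee VALUES (101, 'Ravi', 99)")
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


result = insert_orphan_employee()
print(result)
```

The second insert refers to a department that is not in the Department table, so the foreign key stops it. This is what "enforce referential integrity" means in practice.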

4. Physical Database Design


Physical design maps the logical model directly to a specific database
management system (DBMS) and its features. Following are the tasks in physical
database design:
● Selecting the DBMS technology like Oracle, MySQL, MongoDB etc.
● Defining the database, tablespaces, files, and physical storage parameters.
● Implementing the table and column definitions using DBMS syntax.
● Specifying data types, keys, constraints, triggers, and other constructs.
● Determining indexing, partitioning, and security settings based on DBMS
capabilities.
The physical design creates a database model customized to the target
environment.

5. Database Implementation
Database implementation is the actual creation of the database and all its
objects on the chosen DBMS platform. It involves these steps:
● Use of Data Definition Language (DDL) statements to create the database,
tables, indexes, keys, triggers, procedures, and other elements designed in
the physical model.
● Loading initial master data or reference data needed by the applications.
● Granting access permissions and roles to users and groups.
● Testing the database operations and performance using dummy data.
● Finalizing documentation for the database schema, processes, security
model etc.
Successful implementation brings the database design to life on production
servers.

6. Testing and Quality Assurance


Thorough testing is also crucial to ensure the database operates efficiently.
These are the testing steps:
● Checking that all objects are implemented accurately based on the
physical design.
● Validating that the correct data is stored and retrieved as expected.
● Using test data and dummy records to simulate production scenarios.
● Performing SQL queries, procedures, and transactions to verify
functionality.
● Checking performance using volume and load testing.
● Testing disaster recovery and failover capabilities.
● Fixing any bugs or issues before final deployment.
Testing verifies the database is ready for release.

7. Maintenance and Monitoring


Once the database is deployed, it requires ongoing maintenance and monitoring
activities. These activities include:
● Monitoring usage metrics, load, throughput, uptime, etc.
● Tuning and optimizing queries and database performance.
● Evolving the schema by altering existing objects or adding new ones.
● Performing backups, patches, upgrades, and refresh tasks.
● Enforcing security and access controls.
● Investigating issues and troubleshooting as and when needed.
● Retiring objects or data no longer needed.
Proper maintenance keeps the system running smoothly while monitoring helps
in capacity planning.
❖ Data Modelling process(8 process from ppt)

❖ Database Users
❖ Advantages of DBMS
Advantages of Database Management System (DBMS):
Some of the advantages are given below.
1. Better Data Transferring: Database management creates an environment in which users have
access to more and better-managed data. This makes it possible for end-users to
get a quick look at the data and respond fast to any changes in their environment.
2. Better Data Security: The more accessible and usable the database, the more it is prone
to security issues. As the number of users increases, the data transferring or data
sharing rate also increases thus increasing the risk of data security. It is widely used in
the corporate world where companies invest money, time, and effort in large amounts to
ensure data is secure and is used properly. A Database Management System (DBMS)
provides a better platform for data privacy and security policies thus, helping companies
to improve Data Security.
3. Better data integration: With a Database Management System we have access to
well-managed and synchronized data. This makes data handling very easy,
gives an integrated view of how a particular organization is working, and helps to
keep track of how one segment of the company affects another segment.
4. Minimized Data Inconsistency: Data inconsistency occurs between files when different
versions of the same data appear in different places. For example, data inconsistency
occurs when a student’s name is saved as “John Wayne” on the main computer of the
school but as “William J. Wayne” on the teacher registration system,
or when the price of a product is $86.95 in the local system of the company while its
national sales office system shows the same product price as $84.95. If a database
is properly designed, data inconsistency can be greatly reduced.
5. Faster data Access: The Database management system (DBMS) helps to produce quick
answers to database queries, making data access faster and more accurate, whether
the data is being read or updated. For example, end-users, when dealing with large
amounts of sales data, will have enhanced access to the data, enabling a faster sales
cycle. Some queries may be like:
● What is the increase in sales in the last three months?
● What is the bonus given to each of the salespeople in the last five months?
● How many customers have a credit score of 850 or more?
6. Better decision making: With a DBMS we have better-managed data and improved
data access, from which we can generate better-quality information, and on this
basis better decisions can be made. Better data quality improves accuracy, validity, and
the time it takes to read data. A DBMS does not guarantee data quality, but it provides a
framework that makes it easier to improve data quality. A DBMS also provides powerful data
analysis and reporting tools that allow users to make informed decisions based on data
insights. This helps organizations to improve their decision-making processes and
achieve better business outcomes.
7. Increased end-user productivity: The available data, together with a
combination of tools that transform data into useful information, helps end-users to make
quick, informed, and better decisions that can make the difference between success and
failure in the global economy.
8. Simple: Database management system (DBMS) gives a simple and clear logical view of
data. Many operations like insertion, deletion, or creation of files or data are easy to
implement.
9. Data abstraction: The major purpose of a database system is to provide users with an
abstract view of the data. Developers use many complex algorithms to
increase the efficiency of databases, and these are hidden from the users through various
data abstraction levels so that users can easily interact with the system.
10. Reduction in data Redundancy: When working with a structured database, a DBMS
provides features that prevent duplicate items from being entered in the database. For example, if
a key or uniqueness constraint is defined and the same student is entered a second time,
the duplicate row is rejected.
11. Application development: A DBMS provides a foundation for developing applications that
require access to large amounts of data, reducing development time and costs.
12. Data sharing: A DBMS provides a platform for sharing data across multiple applications
and users, which can increase productivity and collaboration.
13. Data organization: A DBMS provides a systematic approach to organizing data in a
structured way, which makes it easier to retrieve and manage data efficiently.
14. The atomicity of data can be maintained: That means that a transaction is treated as a
single unit; either all of its operations are applied to the database or none of them
are.
15. The DBMS allows concurrent access by multiple users by using concurrency-control
(synchronization) techniques.
16. Data consistency and accuracy: DBMS ensures that data is consistent and accurate by
enforcing data integrity constraints and preventing data duplication. This helps to
eliminate data discrepancies and errors that can occur when data is stored and
managed manually.
17. Improved data security: DBMS provides a high level of data security by offering user
authentication and authorization, data encryption, and access control mechanisms. This
helps to protect sensitive data from unauthorized access, modification, or theft.
18. Efficient data access and retrieval: DBMS allows for efficient data access and retrieval
by providing indexing and query optimization techniques that speed up data retrieval.
This reduces the time required to process large volumes of data and increases the
overall performance of the system.
19. Scalability and flexibility: DBMS is highly scalable and can easily accommodate changes
in data volumes and user requirements. DBMS can easily handle large volumes of data,
and can scale up or down depending on the needs of the organization. It provides
flexibility in data storage, retrieval, and manipulation, allowing users to easily modify the
structure and content of the database as needed.
20. Improved productivity: DBMS reduces the time and effort required to manage data,
which increases productivity and efficiency. It also provides a user-friendly interface for
data entry and retrieval, which reduces the learning curve for new users.
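
The points on reduced redundancy and data consistency depend on constraints declared in the schema. The sketch below is only an illustration under assumptions: the Student table, its columns, and the sample rows are hypothetical, and SQLite (through Python's built-in sqlite3 module) stands in for any relational DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the key, UNIQUE and CHECK constraints are the point here.
conn.execute("""
    CREATE TABLE Student (
        roll_no INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        email   TEXT NOT NULL UNIQUE,
        age     INTEGER CHECK (age >= 0)
    )
""")
conn.execute("INSERT INTO Student VALUES (1, 'John Wayne', 'john@example.com', 19)")


def try_insert(row):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Student VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


outcomes = [
    try_insert((1, 'John Wayne', 'jw@example.com', 19)),    # duplicate primary key
    try_insert((2, 'J. Wayne', 'john@example.com', 19)),    # duplicate email
    try_insert((3, 'Mary Smith', 'mary@example.com', -5)),  # violates the CHECK rule
    try_insert((4, 'Mary Smith', 'mary@example.com', 20)),  # valid row
]
print(outcomes)
```

Only the last row is accepted. The DBMS does not delete duplicates after the fact; it refuses them at insertion time because the schema says they are not allowed.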

Database Management System (DBMS) offers several advantages over
traditional file-based systems. Some of the key advantages of DBMS are:
❖ Data Integrity and Security: DBMS provides a centralized approach to data
management that ensures data integrity and security. DBMS allows defining
constraints and rules to ensure that data is consistent and accurate.
❖ Reduced Data Redundancy: DBMS eliminates data redundancy by storing data
in a structured way. It allows sharing data across different applications and users,
reducing the need for duplicating data.
❖ Improved Data Consistency: DBMS ensures data consistency by enforcing data
validation rules and constraints. This ensures that data is accurate and consistent
across different applications and users.
❖ Improved Data Access and Availability: DBMS provides efficient data access and
retrieval mechanisms that enable quick and easy data access. It allows multiple
users to access the data simultaneously, ensuring data availability.
❖ Improved Data Sharing: DBMS provides a platform for sharing data across
different applications and users. It allows sharing data between different
departments and systems within an organization, improving collaboration and
decision-making.
❖ Improved Data Integration: DBMS allows integrating data from different sources,
providing a comprehensive view of the data. It enables data integration from
different systems and platforms, improving the quality of data analysis.
❖ Improved Data Backup and Recovery: DBMS provides backup and recovery
mechanisms that ensure data is not lost in case of a system failure. It allows
restoring data to a specific point in time, ensuring data consistency.
❖ Data sharing: DBMS allows multiple users to access and modify the same data
simultaneously, without conflicts or data loss. This facilitates collaborative work
and improves data consistency across the organization.
❖ Data independence: DBMS separates the logical and physical view of data,
which allows users to manipulate data without having to know its physical
location or structure. This provides flexibility and reduces the risk of data
corruption due to changes in the underlying hardware or software.
❖ Data integrity: DBMS enforces data integrity constraints such as referential
integrity, entity integrity, and domain integrity, which prevent data inconsistencies
and errors. This ensures that data is accurate, complete, and consistent.
❖ Data security: DBMS provides various security mechanisms such as
authentication, authorization, and encryption, which protect data from
unauthorized access, modification, or theft. This ensures that sensitive data is
protected from internal and external threats.
❖ Data backup and recovery: DBMS provides backup and recovery mechanisms,
which enable organizations to recover lost or damaged data quickly and
efficiently. This reduces the risk of data loss and ensures business continuity.
❖ Reduced data redundancy: DBMS eliminates data redundancy by storing data in
a centralized location and providing mechanisms for data sharing and reuse. This
reduces data storage requirements and improves data consistency.
❖ Overall, DBMS offers several advantages over traditional file-based systems. It
ensures data integrity, security, and consistency, reduces data redundancy, and
improves data access, sharing, and integration. These benefits make DBMS an
essential tool for managing and processing data in modern organizations.

❖ Disadvantages of DBMS
There are many advantages and disadvantages of DBMS (Database Management System).
The disadvantages of DBMS are explained below.
1. Increased Cost:
These are different types of costs:

1. Cost of Hardware and Software –


This is the first disadvantage of the database management system. A
DBMS needs a high-speed processor and a large amount of memory,
because nowadays there is a large amount of data in every field which needs to be stored
safely and securely.
This large amount of space and the high-speed processor require
expensive hardware and expensive software too. That is, there is a requirement for
sophisticated hardware and software, which means that we need to upgrade the
hardware that was used for the file-based system. Hardware and software both require
maintenance, which is costly. Operating, training (at all levels, including
programming, application development, and database administration), licensing, and
regulatory compliance costs are also high.

2. Cost of Staff Training –


Educated staff (database administrators, application programmers, data entry operators)
who maintain the database management system also cost a good amount. We
need to hire database system designers along with application programmers.
Alternatively, the services of a software house need to be taken. So a lot of
money needs to be spent on developing software.

3. Cost of Data Conversion –


Converting our existing data into a database management system
requires a lot of money, which adds to the cost of the database management
system. This is because for this conversion we need to hire database system designers,
whom we have to pay a lot of money, and the services of a software house may also be
required. All this shows that a DBMS requires a high initial investment in hardware, software, and trained
staff. So, altogether, a Database Management System results in a
costlier system.

2. Complexity:
Nowadays many companies use a database management system because it
fulfills many requirements and solves many problems. But a problem arises: all this
functionality has made the database management system extremely complex software. For
the proper use of a DBMS, it is very important that the
developers, DBA, designers, and also the end-users have a good knowledge of it. If any one of them does
not acquire proper and complete skills, this may lead to bad design decisions,
which in turn may lead to data loss or database failure and serious
consequences for the organization. So this complex system needs to be understood by
everyone using it, as it cannot be managed very easily. It requires a lot
of management. Good staff are needed to manage the database, particularly when it becomes
complicated to decide where to pick data from and where to save it.
3. Currency Maintenance:
It is necessary to keep your system current, because efficiency, which is one of the
biggest factors and must not be overlooked, has to be maximized. That is, we need to keep
the database system current to maximize its efficiency. For this, frequent updates
must be performed on all the components, as new threats appear daily. The DBMS should be updated
according to the current scenario. Also, security measures must be implemented. Due to
advancements in database technology, training costs tend to be significant.
4. Performance:
A traditional file system is typically written for specific applications, often in small
organizations, so its performance is generally very good. A DBMS, in contrast, is
general-purpose software, and for small-scale firms it may not give good performance
because of its overhead. As a result, some applications will not run as
fast as they could. Hence a DBMS may not be a good choice for small firms. Performance is a
factor that should not be overlooked. If performance is good then everyone (developers,
designers, end-users) can use the system easily and it will be user-friendly too.
5. Frequent Upgrade/Replacement Cycles:
We need to stay up-to-date with the latest technologies and developments
arriving in the market. DBMS vendors frequently upgrade their products to add
new functionality to the systems. New upgrade versions of the software often come bundled.
Sometimes these updates also need hardware upgrades. Sometimes these changes and
updates come so fast that users find it difficult to work with the system, because it is not easy
to learn and understand new commands each time an upgrade is done. All
these upgrades also cost money, since users, designers, etc. must be trained to use the new features.
6. Complex design:
Database design is complex, difficult and time consuming.
7. Damaged part: If one part of the database is corrupted or damaged, then the entire
database may be affected.
8. Compatibility: DBMS software may not be compatible with other software systems or
platforms, making it difficult to integrate with other applications.
9. Security: A DBMS can be vulnerable to security breaches if not properly configured and
managed. This can lead to data loss or theft.

❖ Storage management
Storage System in DBMS
A database system provides an abstract view of the stored data. However, the data itself
is stored in the form of bits and bytes on different storage devices.

In this section, we will take an overview of various types of storage devices that are
used for accessing and storing data.

Types of Data Storage


For storing the data, there are different types of storage options available. These
storage types differ from one another as per the speed and accessibility. There are the
following types of storage devices used for storing the data:

○ Primary Storage

○ Secondary Storage

○ Tertiary Storage
Primary Storage

It is the primary area that offers quick access to the stored data. Primary storage is also
known as volatile storage, because this type of memory does not
permanently store the data. As soon as there is a power cut or a system crash, the
data is lost. Main memory and cache are the types of primary storage.


○ Main Memory: It is the storage that holds the data the processor is currently
operating on. The machine instructions of a computer operate on data in main
memory. This type of memory can store gigabytes of data on a
system but is usually too small to hold the entire database. Finally, the main
memory loses its whole content if the system shuts down because of a power
failure or other reasons.

○ Cache: It is one of the costliest storage media. On the other hand, it is the fastest
one. A cache is a small storage area that is usually managed by the computer
hardware. While designing the algorithms and query processors for the
data structures, designers take cache effects into account.

Secondary Storage

Secondary storage is also called Online storage. It is the storage area that allows the
user to save and store data permanently. This type of memory does not lose the data
due to any power failure or system crash. That's why we also call it non-volatile storage.

There are some commonly described secondary storage media which are available in
almost every type of computer system:

○ Flash Memory: Flash memory is used, for example, in USB (Universal Serial Bus) keys
which are plugged into the USB slots of a computer system. These USB
keys help transfer data to a computer system, and they vary in capacity. Unlike
the main memory, flash memory keeps the stored data when there is
a power cut or other failure. This type of memory is also
commonly used in server systems for caching frequently used data. This
leads the systems towards high performance, and flash is capable of storing larger
amounts of data than the main memory.

○ Magnetic Disk Storage: This type of storage media is also known as online
storage media. A magnetic disk is used for storing data for a long time. It is
capable of storing an entire database. It is the responsibility of the computer
system to move the data from a disk to the main memory so that it can be
accessed. Also, if the system performs any operation over the data, the
modified data should be written back to the disk. The great advantage of a
magnetic disk is that its data is not affected by a system crash or power failure,
but a disk failure can easily destroy the stored data.

Tertiary Storage
It is the storage type that is external to the computer system. It has the slowest
speed, but it is capable of storing a large amount of data. It is also known as Offline
storage. Tertiary storage is generally used for data backup. The following tertiary
storage devices are available:

○ Optical Storage: Optical storage can store megabytes or gigabytes of data. A
Compact Disk (CD) can store 700 megabytes of data with a playtime of around
80 minutes. On the other hand, a Digital Video Disk or a DVD can store 4.7 or 8.5
gigabytes of data on each side of the disk.

○ Tape Storage: It is a cheaper storage medium than disks. Generally, tapes are
used for archiving or backing up data. Tape provides slow access to data as it
accesses data sequentially from the start. Thus, tape storage is also known as
sequential-access storage. Disk storage is known as direct-access storage as we
can directly access the data from any location on disk.
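
The difference between sequential access and direct access can be sketched with an in-memory byte buffer standing in for the storage device. This is a simplified model, not how a real tape or disk driver works: the fixed RECORD_SIZE, the record naming, and the helper functions are assumptions made for the example.

```python
import io

RECORD_SIZE = 16  # assumed fixed record length in bytes


def build_store(n):
    """Return an in-memory 'device' holding n fixed-size records."""
    data = b"".join(f"record-{i:05d}".encode().ljust(RECORD_SIZE) for i in range(n))
    return io.BytesIO(data)


def read_sequential(store, index):
    """Tape-like access: read every record from the start up to the target."""
    store.seek(0)
    reads = 0
    record = b""
    for _ in range(index + 1):
        record = store.read(RECORD_SIZE)
        reads += 1
    return record.strip().decode(), reads


def read_direct(store, index):
    """Disk-like access: jump straight to the record's offset."""
    store.seek(index * RECORD_SIZE)
    return store.read(RECORD_SIZE).strip().decode(), 1


store = build_store(1000)
seq_record, seq_reads = read_sequential(store, 750)
dir_record, dir_reads = read_direct(store, 750)
print(seq_record, seq_reads)
print(dir_record, dir_reads)
```

Both functions return the same record, but the sequential version has to pass over all 750 earlier records first, which is why tape suits backups and archives rather than everyday queries.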

Storage Hierarchy
Besides the above, various other storage devices reside in the computer system. These
storage media are organized on the basis of data access speed, the cost per unit of data
to buy the medium, and the medium's reliability. Thus, we can create a hierarchy of
storage media on the basis of cost and speed.

Arranging the storage media described above in a hierarchy according to
speed and cost gives the arrangement shown in the image below:
In the image, the higher levels are expensive but fast. On moving down, the cost per bit
decreases and the access time increases. Also, the storage media from the main
memory upward are volatile, and all the media below the main memory are
non-volatile.

❖ Query Processor
Query Processing in DBMS
Query Processing is the activity of extracting data from
the database. Query processing takes several steps to
fetch the data from the database. The steps involved are:

1. Parsing and translation

2. Optimization

3. Evaluation
The query processing works in the following way:

Parsing and Translation


Query processing involves several activities for data retrieval.
Users write their queries in a high-level database language such
as SQL. Before a query can be processed, the system must translate
it into expressions that can be used at the physical level of the
file system. After this, a variety of query-optimizing
transformations and the actual evaluation of the query take place.
SQL, or Structured Query Language, is well suited to humans because
it is readable and easy to understand. However, it is not well
suited to the internal representation of a query within the system.
Relational algebra is well suited for the internal representation
of a query. The translation step in query processing works much
like the parser of a compiler. When a user executes a query, to
generate the internal form of the query, the parser in the system
checks the syntax of the query and verifies that the relation and
attribute names used in it exist in the database.
The parser creates a tree of the query, known as a 'parse tree',
and then translates it into relational algebra. In doing so, it
also replaces every use of a view in the query with the
expression that defines the view.

Thus, we can understand the working of query processing from the
below-described diagram:
Suppose a user executes a query. As we have learned, there
are various methods of extracting data from the database. Suppose that, in
SQL, a user wants to fetch the names of the employees whose
salary is greater than 10000. For doing this, the
following query is used:

select emp_name from Employee where salary>10000;

Thus, to make the system understand the user query, it needs to
be translated into the form of relational algebra. We can write this
query in relational algebra in more than one equivalent way, for example:

○ πemp_name (σsalary>10000 (Employee))

○ πemp_name (σsalary>10000 (πemp_name, salary (Employee)))


After translating the given query, we can execute each relational
algebra operation by using different algorithms. In this way,
query processing begins its work.
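
As a rough illustration of how one query can be written as more than one relational algebra expression, the sketch below implements selection (σ) and projection (π) as plain Python functions over a list of rows. The sample Employee rows and the function names are assumptions made for the example. Unlike true relational algebra, this projection keeps duplicate rows, as SQL does.

```python
def select(rows, predicate):
    """Selection (sigma): keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, attributes):
    """Projection (pi): keep only the named attributes of every row."""
    return [{a: row[a] for a in attributes} for row in rows]


# Hypothetical contents of the Employee relation.
employee = [
    {"emp_id": 1, "emp_name": "Asha", "salary": 12000},
    {"emp_id": 2, "emp_name": "Ravi", "salary": 9000},
    {"emp_id": 3, "emp_name": "Meera", "salary": 15000},
]


def high_paid(row):
    return row["salary"] > 10000


# Plan A: select first, then project the name.
plan_a = project(select(employee, high_paid), ["emp_name"])

# Plan B: drop unneeded columns early, then select, then project the name.
plan_b = project(
    select(project(employee, ["emp_name", "salary"]), high_paid),
    ["emp_name"],
)

print(plan_a)
print(plan_b)
```

Both plans give the same answer for select emp_name from Employee where salary>10000. The selection can only run while the salary attribute is still present, so projecting on emp_name alone before selecting would not work.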

Evaluation
Translating the query into relational algebra is not enough to
evaluate it. The translated relational algebra expression must also
be annotated with instructions that specify how to evaluate each
operation. Thus, after translating the user query, the system
executes a query evaluation plan.

Query Evaluation Plan

○ In order to fully evaluate a query, the system needs to construct a query
evaluation plan.

○ The annotations in the evaluation plan may specify the algorithm to be used for
a specific operation or the particular index to use.

○ Such relational algebra with annotations is referred to as Evaluation Primitives.
The evaluation primitives carry the instructions needed for the evaluation of the
operation.

○ Thus, a query evaluation plan defines a sequence of primitive operations used for
evaluating a query. The query evaluation plan is also referred to as the query
execution plan.

○ A query execution engine is responsible for generating the output of the given
query. It takes the query execution plan, executes it, and finally makes the output
for the user query.
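
Most database systems let you look at the plan they have chosen. As one example, SQLite (used here through Python's built-in sqlite3 module) supports EXPLAIN QUERY PLAN. The table contents below are made-up sample data, and the exact wording of the plan differs between SQLite versions, so the sketch prints it without assuming a particular output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, salary INTEGER)"
)
# Made-up sample rows for illustration.
conn.executemany(
    "INSERT INTO Employee (emp_name, salary) VALUES (?, ?)",
    [("Asha", 12000), ("Ravi", 9000), ("Meera", 15000)],
)

query = "SELECT emp_name FROM Employee WHERE salary > 10000"

# Ask the engine how it intends to evaluate the query; the last column of
# each returned row is a human-readable description of one plan step.
plan_steps = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
for step in plan_steps:
    print(step)

# The execution engine then runs the plan and produces the result.
names = sorted(row[0] for row in conn.execute(query))
print(names)
```

The plan lines describe how each table will be accessed (for example, by scanning it), and the final list is the output the execution engine produces by following that plan.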

Optimization
○ The cost of evaluating a query can vary widely between different evaluation plans.
Because the system is responsible for constructing the evaluation plan, the user does
not need to write the query in its most efficient form.

○ Usually, a database system generates an efficient query evaluation plan, which
minimizes the cost. This task is performed by the database system and is
known as Query Optimization.

○ For optimizing a query, the query optimizer should have an estimated cost
analysis of each operation. It is because the overall operation cost depends on
the memory allocations to several operations, execution costs, and so on.

Finally, after selecting an evaluation plan, the system evaluates the query and produces
the output of the query.
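
One way to watch the optimizer at work is to compare the plan for the same query before and after an index is created. The sketch below again uses SQLite as a stand-in, with a hypothetical Employee table and generated rows. The optimizer is free to choose differently in other systems or versions, so treat this as an illustration rather than a guarantee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, salary INTEGER)"
)
# Generated sample data: salaries run from 5000 to 14999.
conn.executemany(
    "INSERT INTO Employee (emp_name, salary) VALUES (?, ?)",
    [(f"emp{i}", 5000 + i) for i in range(10000)],
)

query = "SELECT emp_name FROM Employee WHERE salary > 14990"


def plan_for(sql):
    """Return the optimizer's chosen plan as one readable string."""
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_without_index = plan_for(query)

# Give the optimizer a cheaper access path for conditions on salary.
conn.execute("CREATE INDEX idx_employee_salary ON Employee (salary)")
plan_with_index = plan_for(query)

print("before:", plan_without_index)
print("after: ", plan_with_index)

match_count = len(conn.execute(query).fetchall())
print(match_count)
```

The query text is the same in both cases. Only the estimated cost of the available access paths changes, and the optimizer picks a different plan without the user rewriting anything.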

❖ Transaction Management
Transaction in Database Management Systems (DBMS) can be defined as a set of logically
related operations. It is the result of a request made by the user to access the contents of the
database and perform operations on it. It consists of various operations and has various states
in its completion journey. It also has some specific properties that must be followed to keep the
database consistent.
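
A small money-transfer example shows what it means for a set of logically related operations to succeed or fail as one unit. It is a sketch under assumptions: the Account table, the balances, and the transfer function are hypothetical, and SQLite through Python's built-in sqlite3 module is used only because it runs without setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account ("
    " acc_id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO Account VALUES (?, ?)", [(1, 500), (2, 100)])
conn.commit()


def transfer(src, dst, amount):
    """Move amount from src to dst as a single transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE Account SET balance = balance + ? WHERE acc_id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE Account SET balance = balance - ? WHERE acc_id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    return dict(conn.execute("SELECT acc_id, balance FROM Account ORDER BY acc_id"))


first = transfer(1, 2, 200)    # both updates succeed and are committed
after_first = balances()
second = transfer(1, 2, 1000)  # the debit would make the balance negative
after_second = balances()

print(first, after_first)
print(second, after_second)
```

In the second transfer the credit to account 2 is applied first, but the debit then violates the CHECK rule, so the whole transaction is rolled back and the balances are exactly as they were. That all-or-nothing behaviour keeps the database consistent.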
❖ System Structure
From ppt
