UNIT 1:
History of Database Management Systems (DBMS):
The history of Database Management Systems (DBMS) stretches back to the
early 1960s, evolving from simpler file-based systems to the sophisticated
relational and NoSQL databases we use today. Early DBMSs such as Charles
Bachman's Integrated Data Store (IDS) and IBM's Information Management
System (IMS) used the network and hierarchical models, respectively. The
relational model, proposed by Edgar F. Codd in 1970, revolutionized the
field by organizing data into tables (relations). SQL was later developed
at IBM as a query language for this model, leading to relational systems
such as Oracle and MySQL.
File system:
In computing, a file system -- sometimes written filesystem -- is a logical
and physical system for organizing, managing and accessing the files
and directories on a device's solid-state drive (SSD), hard-disk drive
(HDD) or other media.
Without a file system, the operating system (OS) would see only large
chunks of data without any way to distinguish one file from the next. As
data capacities increase, the efficient organization and accessibility of
individual files becomes even more important in data storage.
Digital file systems are named for and modeled after the paper-based
filing systems used to store and retrieve documents.
Despite their shared roots, however, file systems can differ significantly
between operating systems such as Microsoft
Windows, macOS or Linux.
On the other hand, an OS can support multiple file systems despite their
differences. In some cases, a file system can also be used across
multiple platforms.
File systems are also designed for different purposes. Major types
include disk-based file systems, distributed file systems and
special-purpose file systems.
File systems use metadata to store and retrieve files and to provide users
with information about their files. File systems do not necessarily track the
same types of metadata, but they usually maintain the following
information:
Date and time created.
Date and time modified.
Date and time last accessed.
File owner.
Access permissions.
File size, including size on disk.
Attributes such as read-only or hidden.
Location in the directory hierarchy.
Problems with File Processing System:
a) Data redundancy and inconsistency
A file processing system often keeps many copies of the same data. This
is data redundancy. If any of the data changes, the change must be made in
every copy; otherwise the copies become inconsistent.
For example, consider a file that stores student addresses. If we make
three copies of the address file and store them on three different
computers, the data is redundant. If a student's address changes, the
change must be made on all three computers; failing to do so leaves the
data inconsistent.
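The address example can be sketched in a few lines of Python. This is only
an illustration: the three dictionaries stand in for the three copies of the
address file, and the student ID and addresses are made up.

```python
# Three independent copies of the same (hypothetical) student address file,
# as they might sit on three different computers.
copies = [
    {"S1": "12 Park Road"},
    {"S1": "12 Park Road"},
    {"S1": "12 Park Road"},
]

def is_consistent(all_copies, student_id):
    """Return True when every copy holds the same address for the student."""
    return len({copy[student_id] for copy in all_copies}) == 1

# The address is updated on the first computer only.
copies[0]["S1"] = "7 Lake View"
inconsistent_after_partial_update = not is_consistent(copies, "S1")

# Applying the change to every copy restores consistency.
for copy in copies:
    copy["S1"] = "7 Lake View"
consistent_after_full_update = is_consistent(copies, "S1")
```

Nothing in the files themselves forces the second step to happen, which is
why redundant copies drift apart in practice.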
b) Difficulty in accessing data
In a file processing system, each different way of accessing the data
needs its own program.
For example, to list student names from a file, we need a program that
does that job. To view only the addresses of students from a specific
city, we need a different program. The list of such requests is endless,
and each new one needs new code. Hence, it is difficult to access data.
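A small sketch of the two programs mentioned above, assuming the student
file is a comma-separated text file. The file contents, column names and
function names are hypothetical.

```python
import csv
import io

# Hypothetical contents of the student file: name, city, address.
STUDENT_FILE = """name,city,address
Asha,Pune,12 Park Road
Ravi,Delhi,7 Lake View
Meena,Pune,3 Hill Street
"""

def read_rows():
    """Parse the file into a list of dictionaries, one per student."""
    return list(csv.DictReader(io.StringIO(STUDENT_FILE)))

# Program 1: written only to list student names.
def student_names():
    return [row["name"] for row in read_rows()]

# Program 2: a separate program for addresses of students in one city.
def addresses_in_city(city):
    return [row["address"] for row in read_rows() if row["city"] == city]

names = student_names()
pune_addresses = addresses_in_city("Pune")
```

Every new question about the data (students by marks, by year, and so on)
would need yet another hand-written function like these.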
c) Data isolation
Files are stored in different locations and in different formats. Thus the
data is isolated.
For example, in one location the student data may be stored in .txt
format, while in another location the same data may be stored in .doc
format.
d) Integrity problems
Integrity problems arise when the stored data fails to satisfy certain
integrity conditions (constraints).
For example, a phone number cannot be longer than 10 digits, a bank
balance should not go below 1000, etc. In a file processing system these
conditions are written into the application programs, so adding a new
condition or changing an existing one means changing every program that
touches the data. It is hard to make those changes.
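For contrast, a DBMS lets such conditions be declared once, next to the
data. The sketch below uses Python's built-in sqlite3 module and assumes a
hypothetical account table; the two CHECK constraints mirror the phone
number and minimum balance rules from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE account (
        id      INTEGER PRIMARY KEY,
        phone   TEXT    NOT NULL CHECK (length(phone) <= 10),
        balance INTEGER NOT NULL CHECK (balance >= 1000)
    )
    """
)
conn.execute("INSERT INTO account VALUES (1, '9876543210', 5000)")

def try_insert(phone, balance):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO account (phone, balance) VALUES (?, ?)",
            (phone, balance),
        )
        return True
    except sqlite3.IntegrityError:
        return False

long_phone_accepted = try_insert("98765432101", 5000)   # 11 digits
low_balance_accepted = try_insert("9876543210", 500)    # below 1000
row_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
```

Both bad rows are rejected by the database itself, so no application
program has to repeat the checks.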
e) Atomicity problems
The data must remain in a consistent state in spite of failures.
For example, suppose that you have a savings account with a balance of
5000 and a loan account with an outstanding amount of 3000. This is the
old consistent state. Now you would like to transfer 500 from savings to
your loan account. If this transaction is successful, your savings
balance should be 4500 and the loan outstanding should be 2500. This is
the new consistent state. If a failure occurs during this transaction,
the data must end up in one of these two consistent states, never in
between.
It is hard to maintain atomicity in a file processing system because of
data redundancy, data isolation, etc.
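The savings and loan example can be sketched with a DBMS transaction. This
is a minimal illustration using Python's built-in sqlite3 module; the table
layout, the transfer function and the simulated failure are assumptions made
for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, amount INTEGER NOT NULL)")
conn.execute("INSERT INTO account VALUES ('savings', 5000), ('loan', 3000)")
conn.commit()

def transfer(amount, fail_midway=False):
    """Pay `amount` from savings towards the loan as one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET amount = amount - ? WHERE name = 'savings'",
                (amount,),
            )
            if fail_midway:
                raise RuntimeError("simulated failure between the two updates")
            conn.execute(
                "UPDATE account SET amount = amount - ? WHERE name = 'loan'",
                (amount,),
            )
    except RuntimeError:
        pass  # the rollback has already restored the old state

def balances():
    return dict(conn.execute("SELECT name, amount FROM account"))

transfer(500, fail_midway=True)
after_failure = balances()    # old consistent state
transfer(500)
after_success = balances()    # new consistent state
```

After the failed attempt the balances are still 5000 and 3000; after the
successful one they are 4500 and 2500. The half-finished state is never
visible.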
f) Concurrent access anomalies
Simultaneous access to a data item must be handled carefully.
For example, if only one ticket is left and two customers try to book it
simultaneously, the ticket should be allotted to exactly one customer.
This is difficult to handle in a file processing system because of data
isolation, redundancy, etc.
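The ticket example can be sketched with two threads and a lock. This only
illustrates the idea of serializing access to a shared item; a real DBMS
uses its own concurrency control, and the names below are hypothetical.

```python
import threading

state = {"tickets_left": 1}   # only one ticket is available
lock = threading.Lock()
results = {}

def book(customer):
    """Try to book a ticket; the check and the update happen under the lock."""
    with lock:
        if state["tickets_left"] > 0:
            state["tickets_left"] -= 1
            results[customer] = True
        else:
            results[customer] = False

threads = [threading.Thread(target=book, args=(name,)) for name in ("A", "B")]
for t in threads:
    t.start()
for t in threads:
    t.join()

successful = [customer for customer, ok in results.items() if ok]
```

Whichever customer gets the lock first receives the ticket; the other is
refused. Without the lock, both could pass the check before either update
is made.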
Data Models in DBMS
A data model in a Database Management System (DBMS) is a set of
conceptual tools for describing a database: its data, the relationships
among the data, and the constraints on it. Data models give us a clear
picture of the data, which helps us in creating an actual database. A data
model takes us from the design of the data through to its proper
implementation.
Types of Data Models
Data models are basically classified into 3 types:
1. Conceptual Data Model
2. Representational Data Model
3. Physical Data Model
1. Conceptual Data Model
The conceptual data model describes the database at a very high level and
is useful for understanding the needs or requirements of the database.
It is the model used in the requirement-gathering process, i.e. before
the database designers start building a particular database. One such
popular model is the entity-relationship model (ER model).
The ER model describes data in terms of the entities, relationships and
attributes that database designers work with.
Entity-Relationship Model (ER Model): It is a high-level data model which
is used to define the data and the relationships between them. It is
basically a conceptual design of a database that makes the view of the
data easy to design.
Components of ER Model:
1. Entity: An entity is a real-world object. It can be a person, place,
object, class, etc. Entities are represented by rectangles in an ER
Diagram.
2. Attributes: An attribute is a property that describes an entity, such as
Age, Roll Number, or Marks for a Student. Attributes are represented by
ellipses in an ER Diagram.
3. Relationship: Relationships define the associations among different
entities. A diamond (rhombus) is used to show a relationship.
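The three components can be sketched as Python data classes. The Student
attributes come from the text above; the Course entity and the Enrolls
relationship are hypothetical additions used only to show a relationship
between two entities.

```python
from dataclasses import dataclass

@dataclass
class Student:            # entity (rectangle in an ER diagram)
    roll_number: int      # attributes (ellipses in an ER diagram)
    age: int
    marks: int

@dataclass
class Course:             # hypothetical second entity
    code: str
    title: str

@dataclass
class Enrolls:            # relationship (diamond) linking two entities
    student: Student
    course: Course

asha = Student(roll_number=101, age=19, marks=88)
dbms = Course(code="CS201", title="Database Management Systems")
enrollment = Enrolls(student=asha, course=dbms)
```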
Characteristics of a conceptual data model
Offers organization-wide coverage of the business concepts.
This type of data model is designed and developed for a business
audience.
The conceptual model is developed independently of hardware
specifications like data storage capacity and location, and of software
specifications like DBMS vendor and technology.
2. Representational Data Model
This type of data model is used to represent only the logical part of the
database and does not represent the physical structure of the database.
The representational data model allows us to focus primarily on the design
part of the database.
A popular representational model is the relational model. Relational
algebra and relational calculus are the formal query languages defined on
the relational model.
Characteristics of Representational Data Model
Represents the logical structure of the database.
The relational model, with its formal languages Relational Algebra and
Relational Calculus, is commonly used.
Uses tables to represent data and relationships.
Provides a foundation for building the physical data model.
3. Physical Data Model
The physical data model is used to practically implement the relational
data model. Ultimately, all data in a database is stored physically on a
secondary storage device such as disks and tapes.
The data is stored in the form of files, records, and certain other data
structures. The physical model has all the information on the format in
which the files are present, the structure of the databases, the presence
of external data structures, and their relation to each other.
Here, we basically decide how tables are laid out in storage so that they
can be accessed efficiently. To come up with a good physical model, we
first have to get the relational model right.
Structured Query Language (SQL), which is based on relational algebra and
relational calculus, is used to implement the model in practice.
This data model describes HOW the system will be implemented using a
specific DBMS. This model is typically created by DBAs and developers.
The purpose is the actual implementation of the database.
Characteristics of a physical data model:
The physical data model describes the data needed for a single project or
application, though it may be integrated with other physical data models
based on project scope.
The data model contains relationships between tables, which address the
cardinality and nullability of the relationships.
Developed for a specific version of a DBMS, location, data storage or
technology to be used in the project.
Columns should have exact datatypes, lengths and default values
assigned.
Primary and foreign keys, views, indexes, access profiles,
authorizations, etc. are defined.
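Several of these characteristics can be seen in a short sketch that uses
Python's built-in sqlite3 module. The department and student tables, their
columns and the index name are hypothetical; the point is only that
datatypes, defaults, nullability, keys and indexes are all fixed at this
level for one specific DBMS (here SQLite).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,
        name    VARCHAR(50) NOT NULL
    );
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        name    VARCHAR(100) NOT NULL,
        marks   INTEGER NOT NULL DEFAULT 0,
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)
    );
    CREATE INDEX idx_student_dept ON student(dept_id);
    """
)
conn.execute("INSERT INTO department VALUES (1, 'Physics')")
conn.execute("INSERT INTO student (roll_no, name, dept_id) VALUES (101, 'Asha', 1)")

# The default value fills in the column that was not supplied.
default_marks = conn.execute(
    "SELECT marks FROM student WHERE roll_no = 101"
).fetchone()[0]

# A student that points at a missing department violates the foreign key.
try:
    conn.execute("INSERT INTO student (roll_no, name, dept_id) VALUES (102, 'Ravi', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```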
Advantages of Data Models
1. Data models help us in representing data accurately.
2. They help us in finding missing data and in minimizing data
redundancy.
3. A data model helps in providing better data security.
4. The data model should be detailed enough to be used for building the
physical database.
5. The information in the data model can be used for defining the
relationship between tables, primary and foreign keys, and stored
procedures.
Disadvantages of Data Models
1. In the case of a vast database, it sometimes becomes difficult to
understand the data model.
2. You must have proper knowledge of SQL to use physical models.
3. Even a small change made in the structure may require modifications
across the entire application.
4. There is no set data manipulation language in DBMS.
5. To develop a data model, one should know the characteristics of how the
data is physically stored.
What is DBMS?
A database management system (DBMS) is system software
for creating and managing databases. A DBMS makes it
possible for end users to create, protect, read, update and
delete data in a database.
Evolution of Database
Data modeling and databases evolved together, and their
history dates back to the 1960s. The database evolution
happened in five “waves”. The first wave consisted of
network, hierarchical, inverted list, and (in the 1990s)
object-oriented DBMSs; it took place from roughly 1960 to
1999.
What is Evaluation of DBMS
A DBMS is a structured collection of programs that enables
users to create and maintain a database, and that interfaces
with its various users, such as database administrators,
online users and application programmers.
There are various database management systems available in
the market. Each type has its own features and can be used for
varied purposes. The large number of DBMSs makes it
difficult to choose the one that should be implemented to
solve our problem. To choose the most suitable DBMS, we
need to evaluate the various systems, and we follow a
structured approach to do so.
Evaluation Methodology
This step involves two analyses; both should be performed
to determine the most suitable DBMS.
These two analyses are:
1. Feature analysis: In this phase, we determine whether
each DBMS provides all the features required for the
operations that are to be performed on the data, and
shortlist those that do.
2. Performance analysis: In this phase, we analyse only
the shortlisted DBMSs, evaluate each system’s efficiency,
and choose the one with maximum efficiency.
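The two phases can be sketched as a filter followed by a maximum. The
candidate names, feature sets and efficiency scores below are invented
purely for illustration; in practice the scores would come from benchmarks.

```python
# Hypothetical candidates: the features each one offers and a measured
# efficiency score (higher is better).
candidates = {
    "DBMS-A": {"features": {"transactions", "replication", "full_text"}, "efficiency": 0.72},
    "DBMS-B": {"features": {"transactions", "full_text"}, "efficiency": 0.95},
    "DBMS-C": {"features": {"transactions", "replication"}, "efficiency": 0.81},
}
required_features = {"transactions", "replication"}

# 1. Feature analysis: keep only systems that offer every required feature.
shortlist = [
    name for name, info in candidates.items()
    if required_features <= info["features"]
]

# 2. Performance analysis: among the shortlist, pick the most efficient.
chosen = max(shortlist, key=lambda name: candidates[name]["efficiency"])
```

Note that DBMS-B has the best score but never reaches the second phase,
because it fails the feature analysis.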
Evolution of Data Models
Managing data has always been essential. Data models
originated to solve the problems of file systems. Here
are the data models used in DBMS.
Hierarchical Model
In the Hierarchical Model, records are organized into a
tree-like structure.
The relationships are of the parent-child type: each child
record has exactly one parent.
One of the first and most popular hierarchical DBMSs is
Information Management System (IMS), developed by IBM.
Network Model
The Hierarchical Model creates a hierarchical tree with
parent/child relationships, whereas the Network Model
organizes records as a graph connected by links.
Relationships are defined in the form of links, so the model
can handle many-to-many relations. This means that a
record can have more than one parent.
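The difference between the two models can be sketched with plain Python
dictionaries. The record names are hypothetical. In the hierarchical
sketch each record maps to a single parent; in the network sketch a record
maps to a set of owners, so it can have more than one.

```python
# Hierarchical model: every child record names exactly one parent.
hier_parent = {
    "Physics": "College",
    "Maths": "College",
    "Asha": "Physics",
}

# Network model: a record may be linked to several owner records.
net_owners = {
    "Asha": {"Physics", "Chess Club"},
    "Ravi": {"Maths", "Chess Club"},
}

def path_to_root(record):
    """Follow the single parent link upwards until the root is reached."""
    path = [record]
    while path[-1] in hier_parent:
        path.append(hier_parent[path[-1]])
    return path

def members(owner):
    """List every record linked to the given owner in the network model."""
    return sorted(rec for rec, owners in net_owners.items() if owner in owners)

asha_path = path_to_root("Asha")
chess_members = members("Chess Club")
```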
Relational Model
A relational model groups data into one or more tables.
These tables are related to each other through common
columns (keys).
The data is represented in the form of rows and columns.
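A minimal sketch of two related tables, using Python's built-in sqlite3
module. The table and column names are hypothetical; dept_id is the
column the two tables share, and the join uses it to combine their rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
    INSERT INTO department VALUES (1, 'Physics'), (2, 'Maths');
    INSERT INTO student VALUES (101, 'Asha', 1), (102, 'Ravi', 2), (103, 'Meena', 1);
    """
)

# The shared dept_id column relates each student row to a department row.
rows = conn.execute(
    """
    SELECT s.name, d.dept_name
    FROM student AS s
    JOIN department AS d ON s.dept_id = d.dept_id
    ORDER BY s.roll_no
    """
).fetchall()
```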
What is Data Abstraction in DBMS?
Data abstraction in DBMS refers to hiding unnecessary details from the
end user. Database systems have complex data structures and relationships.
This complexity is masked so that users can readily access the data, and only
the relevant section of the database is made accessible to them through data
abstraction.
Degrees (or Levels) of Data Abstraction in DBMS
Database Management Systems (DBMS) are essential for managing and
organizing data effectively.
Three interrelated levels of abstraction, each performing a different function,
make up the hierarchical framework through which these systems operate.
The three levels of abstraction in a DBMS are:
Physical or Internal Level
Logical or Conceptual Level
View or External Level
1. Physical or Internal Level
It is the lowest level of abstraction in a DBMS. It defines how data is
stored, the data structures used for storing it, and the database access
mechanisms.
Developers or database administrators decide how to store data in the
database. This level is complex to understand.
Example
The physical level, being the lowest level of abstraction, can be understood with
an example: information about a customer appears to users as rows in a table,
while physically the data is stored in the form of blocks on the storage device.
2. Logical or Conceptual Level
The logical level is the next higher, or intermediate, level. It explains what
data is stored in the database and how those data are related.
It describes the entire data by specifying what tables should
be constructed. It is less complex than the physical level.
Example
The logical level in DBMS is used for representing entities and relationships
among the data stored.
For example, defining tables and their attributes and specifying relationships
between them. A table named ‘class’ may have different attributes like
student_name, Roll_no, student ID, and Marks.
A table named ‘IDs’ may contain a teacher's ID (foreign key) and a
student ID (foreign key), linking the two.
3. View or External Level
This is the top level. There are various views at the view level, with each view
defining only a portion of the total data.
It also simplifies user interaction by providing multiple views of a
single database.
End users work at the view level. This is the simplest level.
Example
The external level in DBMS defines a part of the entire data and simplifies
interaction with the user by providing multiple views of the same database.
For example, interacting with a system using a graphical user interface (GUI) to
access an application's features.
Here the GUI is the view level, and the user does not know how and what data is
exactly stored, i.e. the details are hidden from the user.
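In SQL terms, the view level is often realized with database views. The
sketch below uses Python's built-in sqlite3 module; the ‘class’ table
follows the attributes named at the logical level, while the data and the
view name class_roster are hypothetical. The view exposes only roll numbers
and names and hides the marks column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Logical level: the full table with all attributes.
    CREATE TABLE class (
        student_id   INTEGER PRIMARY KEY,
        student_name TEXT,
        roll_no      INTEGER,
        marks        INTEGER
    );
    INSERT INTO class VALUES (1, 'Asha', 101, 88), (2, 'Ravi', 102, 74);

    -- View level: one user-facing view showing only part of the data.
    CREATE VIEW class_roster AS
        SELECT roll_no, student_name FROM class;
    """
)

cursor = conn.execute("SELECT * FROM class_roster ORDER BY roll_no")
roster_columns = [column[0] for column in cursor.description]
roster = cursor.fetchall()
```

A user who queries class_roster sees two columns and never needs to know
that marks exist, or how any of the rows are stored on disk.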