0 ratings0% found this document useful (0 votes) 47 views20 pagesDBMS Notes
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content,
claim it here.
Available Formats
Download as PDF or read online on Scribd
DBMS NOTES
What is Database?
The database is a collection of interrelated data which is used to retrieve, insert and delete the data
jently. Itis also used to organize the data in the form of a table, schema, views, and reports, etc.
Database Management System
‘© Database management system is a software which is used to manage the database. For
example: MySQL
Oracle
ete are a very popular commercial database which is used in different applications.
© DBMS provides an interface to perform various operations like database creation, storing data
init, updating data, creating a table in the database and a lot more.
* Itprovides protection and security to the database. In the case of multiple users, it also
maintains data consistency.
Characteristics of DBMS
© Ituses a digital repository established on a server to store and manage the information.
Itcan provide a clear and logical view of the process that manipulates data.
‘© DBMS contains automatic backup and recovery procedures.
‘© Itcontains ACID properties which maintain data in a healthy state in case of failure.
© Itcan reduce the complex relationship between data,
‘* _Itis used to support manipulation and processing of data.
* Itis used to provide security of data.
‘© It can view the database from different viewpoints according to the requirements of the user.
Advantages of DBMS
It can control data redundancy because it stores all the data in
one single database file and that recorded data is placed in the database.
‘© Data sharing: In DBMS, the authorized users of an organization can share the data among
‘multiple users.‘* Easily Maintenance: It ean be easily maintainable due to the centralized nature of the database
system.
‘© Reduce time: It reduces development time and maintenance need.
‘* Backup: It provides backup and recovery subsystems which create automatic backup of data
from hardware and software failures and restores the data if required.
‘© multiple user interface: It provides different types of user interfaces like graphical user
interfaces, application program interfaces
Disadvantages of DBMS
. ware and Software: It requires a high speed of data processor and large memory
Size: Itoccupies a large space of disks and large memory to run them efficiently.
. : Database system creates additional complexity and requirements.
‘© Higher impact of failure: Failure is highly impacted the database because in most of the
organization, all the data stored in a single database and if the database is damaged due to
electric f
lure or database corruption then the data may be lost forever.
‘What is RDSMS (Relational Database Management System)
RDBMS stands for Relational Database Management System. All modern database management
systems like SQL, MS SQL Server, IBM DB2, ORACLE, My-SQL, and Microsoft Access are based on
RDBMS. It is called Relational Database Management System (RDBMS) because it is based on the
relational model introduced by E.F. Codd.
How it works Data is represented in terms of tuples (rows) in RDBMS. A relational database is the most
commonly used database. It contains several tables, and each table has its primary key. Due toa
collection of an organized set of tables, data can be accessed easily in RDBMS.
DBMS Architecture
+ The DBMS design depends upon its architecture. The basic client/server architecture is used to
deal with a large number of PCs, web servers, database servers and other components that are
connected with networks.
* The client/server architecture consists of many PCs and a workstation which are connected via
the network.© DBMS architecture depends upon how users are connected to the database to get their request
done.
‘Types of DBMS Architecture (ons
Database architecture can be seen as a single tier or
multi-tier. But logically, database architecture is of two types. 7
4-Tier Architecture \ )
* Inthis architecture, the database is directly available
to the user. It means the user can directly sit on the DBMS and uses it.
Any changes done here will directly be done on the database itself. It doesnt provide a handy
tool for end users.
The 1-Tier architecture is used for development of the local application, where programmers
can directly communicate with the database for the quick response.
2-Tier Architecture
© The 2-Tier architecture is same as basic
client-server. In the two-tier architecture,
g Databsse system Server
applications on the client end can directly
commur
For th
cate with the database at the server si
interaction, API's like: ODBC, JDBC are used.
‘* The user interfaces and application programs are
run on the client-side.
Application
© The server side is responsible to provide the I Client
functionalities like: query processing and transaction 7
management. _
© To communicate with the DBMS, client-side
application establishes a connection with the server side.3-Tier Architecture
The 3-Tier architecture contains another layer
betw
n the client and server. In this architecture,
client can't
rectly communicate with the server,
The application on the client-end interacts with an
application server which further communicates with
the database system.
End user has no idea about the existence of the
database beyond the application server. The
database also has no idea about any other user
beyond the application.
The 3-Tier architecture is used in case of large web
application.
Three schema Architecture
‘The three-schema architecture is as follows:
Applicaton Server
‘Application Gent
—_
The three schema architecture is also called ANSI/SPARC architecture or three-level
architecture,
This framework is used to describe the structure of a specific database system.
Cllent
The three schema architecture is also used to separate the user applications and physical
database.
The three schema architecture contains three-levels. It breaks the database down into three
different categories.
It shows the DBMS architecture.
Conceptual Schema
Mapping is used to transform the request
and response between various database
levels of architecture. Internal schema
Mapping is not good for small DBMS
because it takes more time.
EsternalSehemna [ External Level )
/
‘Concepiual Level
|
Database
= |
External Conceptual
Mapping
Mapping‘* In External / Conceptual mapping, it is necessary to transform the request from external level to
conceptual schema,
© In Conceptual / Internal mapping, DBMS transform the request from the conceptual to internal
levThe ER-model (Entity Relationship model) is a conceptual model used in database design.
It represents entities as rectangles, relationships as diamonds, and attributes as ovals. The
basic concepts of the ER-model include entities, attributes, and relationships.
Entities are objects or concepts in the real world that have a unique identity and can be
distinguished from other entities. Examples of entities might include customers, orders, products, or
employees. Attributes describe the properties or characteristics of entities. Examples of attributes
might include a customer's name, address, or phone number. Relationships describe how two or
more entities are related to each other. For example, a customer might place an order, which involves
a relationship between the customer and the order. Relationships can have cardinality constraints,
such as one-to-one, one-to-many, or many-to-many.
Weak entity types are entities that do not have a unique identity of their own and depend on another
entity for identification. For example, a line item in an order might depend on the order itself for
identification. Weak entity types are represented using double rectangles. ER-diagrams are diagrams
that depict entities, relationships, and attributes. They are used to model the structure of a database
and provide a visual representation of the data.
Subclasses and superclasses are used to represent hierarchical relationships between entities.
Subclasses inherit attributes and relationships from their parent superclass, For example, a car
might be a subclass of a vehicle superclass, inheriting attributes such as make, model, and year.
Inheritance, specialization, and generalization are used to model complex relationships between
entities. Inheritance involves a subclass inheriting all the attributes and relationships of its parent
superclass. Specialization involves creating new entities from a superclass, while generalization
involves combining entities into a more general entityOverall, the ER-model provides a framework for understanding the structure and relationships of data
in a database, allowing for effective database design and management.
‘Types of entities, types of relationshi
types of attributes
In database design, entities are objects or concepts that are significant and independent and are
represented in the database. Relationships define how entities relate to each other. Attributes
describe the properties of entities and relationships. Here are some common types of entities,
relationships, and attributes: Types of Entities:
1. Strong Entity: Itis an entity that can be uniquely identified by its attributes alone.
2. Weak Entity: It is an entity that cannot be uniquely identified by its attributes alone and
depends on its relationship with a strong entity.
3. Associative Entity: Its an entity that connects the relationships between two or more other
entities. Types of Relationships:
1. One-to-One Relationship: Itis a relationship in which one entity is related to exactly one other
entity.
2. One+to-Many Relationship: tis a relationship in which one entity is related to multiple
instances of another entity.
3. Many-to-Many Relationship: Its a relationship in which multiple instances of one entity are
related to multiple instances of another entity. Types of Attributes:
1. Simple Attribute: It is an attribute that cannot be divided into smaller parts
2. Composite Attribute: It is an attribute that can be divided into smaller parts,
3. Derived Attribute: It is an attribute that can be calculated or derived from other attributes.
4, Key Attribute: itis an attribute that uniquely identifies an entity in a database
5. Null Attribute: Itis an attribute that does not have a value for a particular entity or
relationship instance.DATA MODELS
Data models define how the logical structure of a database is modeled. Data Models are
fundamental entities to introduce abstraction in a DBMS. Data models define how data is connected
to each other and how they are processed and stored inside the system.
The very first data model could be flat data-models, where all the data used are to be kept in the
same plane. Earlier data models were not so scientific, hence they were prone to introduce lots of
duplication and update anomalies
Relational Model
The most popular data model in DBMS is the
Relational Model. It is more scientific a model attributes,
predicate logic and defines a table as an n-ary
relation.
‘The main highlights of this model are —
© Datais stores
tables called vabe (lator)
relations.
‘* Relations can be normalized.
ed relations, values saved are atomic values.
* Each row in a relation contains a unique value.
© Each column in a relation contains values from a same domain
Hierarchical model A hierarchical database model is a data model in which the data are
organized into a tree-like structure. The data are stored as records which are connected to
one another through links. A record is a collection of fields, with each field containing only
one value. The type of a record defines which fields the record contains. The hierarchical
database model mandates that each child record has only one parent, whereas each
parent record can have one or more child records. In order to retrieve data from aarchical database, the whole tree needs to be traversed starting from the root node.
This model is recognized as the first database model created by IBM in the 1960s
Applications of hierarchical model :
‘* Hierarchical models are generally used as semantic models in practice as many
real-world occurrences of events are hierarchical in nature like biological structures,
Political, or social structures.
‘* Hierarchical models are also commonly used as physical models because of the
inherent
jerarchical structure of the disk storage system like tracks, cylinders, ete.
‘There are various examples such as Information Management System (IMS) by IBM,
NOMAD by NCSS, ete.
NETWORK MODEL The network database model was created to solve the shortcomings of
the hierarchical database model. In this type
of model, a child can be Linked to multiple
parents, a feature that was not supported
by the hierarchical data model. The parent
nodes are known as owners and the child
nodes are called members.
‘The network data model can be represented as
Advantages of Network Model
The network model can support many to many relationships as seen in the diagram. D2 and C3
‘each have multiple masters. The masters for D2 are C1 and C2 while for C3 are B1 and B2. In this
way, the network data model can handle many to many relationships where the hierarchical data
model didn't.
Disadvantages of Network ModelThere are some disadvantages in the network model even though it is an improvement over the
hierarchical model. These are
¢ The network model is much more complicated than the Hierarchical model. As such,
it is difficult to handle and maintain.
* Although the Network model is more flexible than the Hierarchical model, it still has
flexi
xy problems. Not all relations can be handled by assigning them in the form of
‘owners and members.
The structure of the Network Model is quite complicated and so the programmer has
‘to understan«
well in order to implement or modify it.
Keys are one of the basic requirements of a relational database model. It is widely used to identify
the tuplesirows) uniquely in the table. We also use keys to set up relations amongst various
columns and tables of a relational database.
Different Types of Key:
‘the Relational Model
1. Candidate Key: The minimal set of attributes that can uniquely identify a tuple is known as a
candidate key. For Example, STUD_NO in STUDENT relation.
It is a minimal super key.
It is a super key with no repeated data is called a candidate key.
The minimal set of attributes that can uniquely identify a record.
t must contain unique values.
lt can contain NULL values.
Every table must have at least a single candidate key.
A table can have multiple candidate keys but only one primary key (the primary key
cannot have a NULL value, so the candidate key with a NULL value can't be the primary
key).
The value of the Candidate Key is unique and may be null for a tuple.
There can be more than one candidate key in a relationship.
The candidate key can be simple (having only one attribute) or composite as well.2. Primary Key: There can be more than one candidate key in relation out of which one can be chosen
as the primary key. For Example, STUD_NO, as well as STUD_PHONE, are candidate keys for
relation STUDENT but STUD_NO can be chosen as the primary key (only one out of many cani
keys).
Itis a unique key.
lt cannot be NULL.
bea primary key for a table.
3. Super Key: The set of attributes that can uniquely
identify a tuple is known as Super Key. For Example,
STUD_NO, (STUD_NO, STUD_NAME), etc. A super
key is a group of single or multiple keys that
identifies rows in a table. It supports NULL values.
* Adding zero or more attributes to the
candidate key generates the super key.
It can identify only one tuple (a record) at a time.
thas no duplicate values, it has unique values.
late
Primary keys are not necessarily to be a single column; more than one column can also
‘* Acandidate key is a super key but vice versa is not true.
4, Alternate Key: The candidate key other than the primary key is called an alternate key.
‘* All the keys which are not primary keys are called alternate keys.
¢ Its a secondary key.
‘¢ It contains two or more fields to
identify two or more records.
‘* These values are repeated.
© Eg SNAME, and ADDRESS is
Alternate keys
Candidate Key
~——~——!____
a
== [nee | preree_oonoe
primal ay
‘Alternate Key5. Foreign Key: If an attribute can only take the values which are present as values of some other
attribute, it will be a foreign key to the attribute to which it refers. The relation which is being
referenced is called referenced relation and the corresponding attribute is called referenced attribute
the relation which refers to the referenced relation is called referencing relation and the
corresponding attribute is called referencing attribute. The referenced attribute of the referenced
relation should be the primary key to it.
It is a key it acts as a primary key in one table and it acts as
secondary key in another table.
‘¢ It combines two or more relations (tables) at a time.
© They act as a cross-reference between the tables.
‘¢ For example, DNO is a primary key in the DEPT table and a non-key in EMP.
It may be worth noting that, unlike the Primary Key of
may contain duplicate tuples i.e. it need not follow ee al
STUDENT_COURSE relation is not unique. It has been Ld — 7 na =
repeated for the first and third tuples. However, the =} —_—
STUD_NO in STUDENT relation is a primary key and it Student Details Student Marks
needs to be always unique, and it cannot be null.
6. Composite Key: Sometimes, a table might not have a single column/attribute that uniquely
identifies all the records of a table. To uniquely identify rows of a table, a combination of two or
more columns/attributes can be used. It still can give duplicate values in rare cases. So, we need to
find the optimal set of attributes that can uniquely identify rows in a table.
© Itacts as a primary key if there is no primary key in a table
‘* Two or more attributes are used together to make a composite key.
'* Different combinations of attributes may give different accuracy in terms of identifying
the rows uniquely.tablet ———F patie
‘Atertte Key
Different Types of Keys
Why are keys necessary for DBMS?
‘* Keys are one of the important aspects of DBMS. Keys help us to find the tuples(rows)
uniquely in the table. It is also used in developing various relations amongst columns or
tables of the database.
What is a Unique Key?
® Unique Keys are the keys that define the record uniquely in the table. It is different from
Primary Keys, as Unique Key can contain one NULL value but Primary Key does not
any NULL values.
What is an Artificial Key?
‘© Artificial Keys are the keys that are used when no attributes contain all the properties of
the Primary Key or if the Primary key is very large and complex.
Data Independence
© Data independence can be explained using the three-schema architecture.
© Data independence refers to the characteristic of being able to modify the schema at one level
of the database system without altering the schema at the next higher level.
‘There are two types of data independence:1. Logical Data Independence
‘© Logical data independence refers to the characteristic of being able to change the conceptual
schema without having to change the external schema.
‘© Logical data independence is used to separate the external level from the conceptual view.
© Ifwe do any changes in the conceptual view of the data, then the user view of the data would
not be affected.
‘© Logical data independence occurs at the user interface level.
2. Physical Data Independence
‘* Physical data independence can be defined as the
capacity to change the internal schema without Loker Data ndenendence
having to change the conceptual schema. Locate
‘If we do any changes in the storage size of the Physic Dats napa
database system server, then the Conceptual Physical vol
structure of the database will not be affected.
‘© Physical data independence is used to separate
conceptual levels from the internal levels.
© Physical data independence occurs at the logical
interface level.
DBMS vs. File System
File System Approach
File based systems were an early attempt to computerize the manual system. It is also called a
traditional based approach in which a decentralized approach was taken where each department stored
and controlled its own data with the help of a data processing specialist. The main role of a data
processing specialist was to create the necessary computer file structures, and also manage the data
within structures and design some appl
ion programs that create reports based on file data.
Consider an example of a student's file system. The student file
will contain information regarding the student
roll no,
student name, course etc.). Similarly, we have a subject file thatcontains information about the subject and the result file which contains the information regarding the
result.
Some fields are duplicated in more than one file, which leads to data redundancy. So to overcome this
problem, we need to create a centralized system, i.e. DBMS approach.
Spanned Unspanned Mapping
In computer storage, spanned and unspanned mapping are two different methods of storing data on a
disk,
Unspanned mapping is a simple method where each file is stored contiguously on the disk. This means
that the file is stored in a single location and can be accessed easily. The main advantage of this method
is that itis fast to access the file as the operating system only needs to access one location on the disk to
retrieve the entire file, However, the downside is that if the file size changes, it may not fit in the original
location and additional space may need to be allocated, leading to fragmentation of the disk
Spanned mapping, on the other hand, is a method where a file is stored in non-contiguous locations on
the disk. This is necessary when a file is too large to fit into a single contiguous location. In this case, the
file is broken into smaller parts, which are then stored in different locations on the disk. The operating
system keeps track of the different parts of the file and retrieves them as needed. This method reduces
fragmentation, but it takes longer to access the file because the operating system needs to access
multiple locations on the disk to retrieve the entire file.
Sorted vs unsorted file structure
In a computer system, data is often stored in files for easy access and retrieval. Two common file
structures are sorted and unsorted files.
In an unsorted file, the records are stored in no particular order. This means that records can be added or
deleted from the file without affecting the order of the other records. However, searching for a specific
record can take longer since all records must be scanned until the desired record is found.
Ina sorted file, records are arranged in a specific order, such as alphabetically or numerically. This makes
searching for a specific record much faster since binary search or other efficient search algorithms can
be used. However, adding or deleting records from a sorted file requires rearranging the order of the
records, which can be time-consuming and require more resources.‘The choice between sorted and unsorted file structure depends on the specific application requirements.
If the main operation is searching for specific records, then a sorted file structure may be more
appropriate. If the main operation is adding or deleting records, then an unsorted file may be more
efficient,
Dems:
A database approach is a well-organized collection of data
that are related in a meaningful way which can be accessed
by different users but stored only once in a system. The
various operations performed by the DBMS system are:
Insertion, deletion, selection, sorting etc.
sere Users
In the above figure, duplication of data is reduced due to centralization of data.
‘There are the following differences between DBMS and File systems:
Basle DBMS Approach Fle System Approach
Meaning DBMS is a collection of data. In DBMS, ‘The file system is a collection of data, In this
the user is not required to write the system, the user has to rite the procedures for
procedures. managing the database.
‘Sharing of data Due to the centralized approach, data Data is distributed in many files, and it may be of
sharing is easy. diferent format, sot isnt easy to she
Data Abstraction DBMS gives an abstract view of data ‘The file system provides the detall of the data
that hides the details representation and storage of data,Security and
Protection
Recovery
Mechanism
‘Manipulation
Techniques
‘concurrency
Problems
Where to use
cost
Data Redundancy
Inconsistency
DBMS provides @ good protection
mechanism.
DBMS. provides a crash recovery
mechanism, Le., DBMS protects the
user from system f
DBMS contains a wide variety of
sophisticated techniques to store and
retrieve the data
DBMS takes care of Concurrent access
of data using some form of locking.
Database approach used in large
systems which interrelate many files
The database system is expensive to
design.
Due to the centralization of the
database, the problems of data
redundancy and Inconsistency are
controled
It Iont easy to protect a file under the file
system,
The file system doesnt have @ crash
mechanism, ie, if the system crashes while
centering some data, then the content ofthe file
‘The fle system cant efficiently store and
retrieve the data,
Inthe File system, concurrent access has many
problems like redirecting the fle while deleting
‘some information or updating some information.
File system approach used in large systems
which interrelate many files
‘The filesystem approach is cheaper to design.
In this, the files and application programs are
created by diferent programmers so that there
‘exists a fot of duplication of data which may lead
to inconsistency.structure
Data
Independence
Integrity
constraints
Data Models
Flexibility
Examples
‘The database structure is complex to
design
In this system, Data Independence
exists, andi can be of two types.
Logical Data Independence
Physical Data Independence
Inegrity Constraints
In the database approach, 3 types of
data models exist:
Hierarchal data models
Network data models
Relational data models,
Changes are often a necessity to the
content of the data stored in any
system, and these changes
casily with a database approach,
Oracle, SQL Server, Sybase et.
‘The fle system approach has a simple structure,
In the Filesystem approach, there exists no Data
Independence.
fe system.
Inthe fil
system approach, ther
is no concept
of data models exists
‘The flexibility of the system is less as compared
tothe DBMS approach.
Cobol, c+ eteDatabase Management Challenges
1. Increased problem in the appearance of data. 2. Limitations on mitigation 3. Increase data
volume.4. Data security.5. Data Management and Distribution.
‘What are the components of DBMS?
Hardware, Software, Data, Database Access
Language, Procedures and Users all together form
Indexing in DBMS.
© Indexing is used to optimize the performance of
a database by mit
the number of | m= |
accesses required when a query is processed.
© The index is a type of data structure. It is used
to locate and access the data in a database table quickly.
Index structure:
Indexes can be created using some database columns. Search key Dam
Reference
Fig: Structure of Index
©The first column of the database is the search key that contains a copy of the primary key or
candidate key of the table. The values of the primary key are stored in sorted order so that the
corresponding data can be accessed easily.
‘The second column of the database is the data reference. It contains a set of pointers holding
the address of the disk block where the value of the particular key can be found.
ees
Indexing Methods
\
/ \.
/\
JN
\
= =The indices are usually sorted to make searching faster. The indices which are sorted are known as
ordered ini
Primary Index
© Ifthe index is created on the basis of the primary key of the table, then it is known as primary
Indexing. These primary keys are unique to each record and contain 1:1 relation between the
records.
As primary keys are stored in sorted order, the performance of the searching operation is quite
efficient.
The primary index can be classified into two types: Dense index and Spars
Dense index
© The dense index contains an index record for every search key value in the data fi
searching faster.
© Inthis, the number of records in the index table is same as the number of records in the main
table.
w 1 ‘Agra | 1604300
© Itneeds more space to store index
Usa usa | Chicago [2,789,378
record itself. The index records have
Nepal Nepal | Kathmandu | 1,456,634
the search key and a pointer to the
UK te UK | Cambridge | 1.360368
actual record on the disk.
Sparse index
© Inthe data file, index record appears only for a
few items. Each item points to a block.
e a
Nepal sy Chie 2789378
main table, the index points to the records in the a ~
main table in a gap. WK Se Kathmandu | 1456634
UW | Cami | 130036
© Inthis, instead of pointing to each record in theClustering Index
© Acluster
index can be defined as an ordered data file. Sometimes the index is cr
non-primary key columns which may not be unique for each record.
© Inthis case, to identify the record faster, we will group two or more columns to get the unique
value and create index out of them. This method is called a clustering index.
9 The records which have si
lar character
jos are grouped, and indexes are created for these
group.
‘The previous schema is little confusing because one disk block is shared by records which belong to the
different cluster. If we use separate disk block for separate clusters, then it is called better technique.
Secondary Index
In the sparse indexing, as the size of the table grows, the size of mapping also grows. These mappings
are usually kept in the primary memory so that address fetch should be faster. Then the secondary
memory searches the actual data based on the address got from mapping. If the mapping size grows
then fetching the address itself becomes slower. In this case, the sparse index will not be efficient. To
overcome this problem, secondary indexing is introduced.
In secondary indexing, to reduce the size of mapping, another level of indexing is introduced. In this
method, the huge range for the columns is selected initially so that the mapping size of the first level
becomes small. Then
ich range is further divide
into smaller ranges. The mapping of the first level is
stored in the primary memory, so that address fetch is faster. The mapping of the second level and
actual data are stored in the secondary memory (hard disk).