UNIT 2 12 MKS
DATA MODELS AND DATABASE TYPES
Data model:
The data model describes the structure of a database. It is a collection of
conceptual tools for describing data, data relationships and consistency
constraints. The main types of data model are:
1. Object based logical model
2. Record based logical model
3. Physical model
Types of data model:
1. Object based logical model
a. ER-model
b. Functional model
c. Object oriented model
d. Semantic model
2. Record based logical model
a. Hierarchical database model
b. Network model
c. Relational model
3. Physical model: These models can be used in describing the data at the
lowest level, i.e. physical level. These models can be classified into
a. Unifying model
b. Frame memory model
HIERARCHICAL MODEL:
In the Hierarchical Database Model, as the name suggests, the data is arranged
in a hierarchical or tree-like structure. Each parent record can have one or
more child records. Data is accessed by following the tree structure down from
the root, i.e. the first parent.
● Data is represented in a tree-like structure.
● Data is stored in the form of records, which are collections of fields.
● The records are connected through links.
● Each field can contain only one value.
● Each child node must have exactly one parent, but a parent node can have
more than one child. Multiple parents are not allowed. This is the major
difference between the hierarchical and network database models.
● The first node of the tree is called the root node.
● To retrieve data, the tree is traversed starting from the root node.
● This model represents one-to-many relationships.
For example, consider a main directory which contains subdirectories. Each
subdirectory contains further files and directories. Each directory or file can
be in one directory only, i.e. it has only one parent.
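The directory example can be sketched in a few lines of Python. This is only an illustrative sketch: the class name Node, the helper find and the sample names (main, docs, src, notes.txt) are assumptions made for the example and do not come from any particular hierarchical DBMS.

```python
class Node:
    """A record in a hierarchical model: one parent at most, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent          # a single parent link, never a list
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Return the path from the root down to this node."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


def find(root, name):
    """Depth-first search that always starts at the root node."""
    if root.name == name:
        return root
    for child in root.children:
        hit = find(child, name)
        if hit is not None:
            return hit
    return None


# Hypothetical sample data: a main directory with two subdirectories.
root = Node("main")
docs = Node("docs", root)
src = Node("src", root)
notes = Node("notes.txt", docs)

print(find(root, "notes.txt").path())  # main/docs/notes.txt
```

Because every node stores one parent link only, a file cannot belong to two directories, which is exactly the one-to-many restriction of this model.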
APPLICATIONS:
● The hierarchical database model was widely used during the mainframe
computer era.
● Today, it is used mainly for storing file systems and geographic information.
● It is used in applications where high performance is required, such as
telecommunications and banking.
● The Windows Registry in Microsoft Windows is also a hierarchical database.
Advantages
The advantages are listed below.
● Data can be retrieved easily due to the explicit links present between the table
structures.
● Referential integrity is always maintained, i.e. any changes made in the parent
table are automatically updated in a child table.
● Allows data sharing.
● It is conceptually simple due to the parent-child relationship.
● Database security is enforced.
● Efficient with 1:N relationships.
● A clear chain of command or authority.
● Increases specialization.
● High performance.
● Clear results.
Disadvantages
Some of the disadvantages are given below.
● Complex relationships are not supported.
● Data redundancy can occur, which results in inaccurate information.
● A change in structure leads to changes in all application programs.
● M:N relationships are not supported.
● No data manipulation or data definition language.
Network Model:
This is an extension of the hierarchical model. In this model data is
organised more like a graph, and child nodes are allowed to have more
than one parent node.
1. Data is more related, as more relationships are established in this database
model.
2. As the data is more related, accessing the data is also easier and
faster.
3. This database model was used to map many-to-many data relationships.
4. This was the most widely used database model before the relational model was
introduced.
[Figure: graph structure of the network model]
CHARACTERISTICS:
● The network model can represent more kinds of relationships than the
hierarchical model.
● It supports many-to-many relationships: many parents can have many children,
and many children can have many parents (as shown in the figure).
● Entities (records) are represented as a network connected with each other.
● One child entity can have more than one parent entity, and one parent can
have several children. For example, in the figure, Subject has two children:
one is Student and the other is Degree.
● Because entities can have multiple parent entities, the model leads to a
complex structure.
● The model is not very flexible to reorganize.
● It offers high performance.
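A minimal Python sketch of the idea that a record may have several parents is given here. The class Network, its method link and the record names (Department, Course, Student, Teacher) are hypothetical and chosen only for illustration; real network DBMSs use owner and member record sets rather than this simplified structure.

```python
from collections import defaultdict


class Network:
    """Records connected as a graph: a child may have many parents."""

    def __init__(self):
        self.children = defaultdict(set)   # parent -> set of children
        self.parents = defaultdict(set)    # child  -> set of parents

    def link(self, parent, child):
        """Record a parent-child link in both directions."""
        self.children[parent].add(child)
        self.parents[child].add(parent)


# Hypothetical sample data, not taken from the figure.
net = Network()
net.link("Department", "Student")
net.link("Course", "Student")      # Student now has two parents
net.link("Course", "Teacher")      # Course now has two children

print(sorted(net.parents["Student"]))   # ['Course', 'Department']
print(sorted(net.children["Course"]))   # ['Student', 'Teacher']
```

In a hierarchical model the second link to Student would be rejected; here it is allowed, which is how many-to-many relationships become possible.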
Advantages of the network model
● Data access is fast with a network model.
● The network model allows more complex and powerful queries than a
hierarchical database model, so a user can execute a wider variety of
database queries.
Disadvantages of a network model
● The network model is a very complex database model, so the user must be very
familiar with the overall structure of the database.
● Updating the database is a difficult and tedious task, and it requires the
help of the application programs that are used to navigate the data.
RELATIONAL MODEL
The relational model (RM) represents the database as a collection of relations.
A relation is nothing but a table of values.
Every row in the table represents a collection of related data values.
The data are represented as a set of relations.
In the relational model, data are stored as tables.
Properties:
● Each column of a table contains the same kind of values.
● Each data value is a simple number or a character string, that is, the table
must be in first normal form.
● All rows of a table are distinct.
● The ordering of rows within a table is immaterial.
● The columns of a table are assigned distinct names, and the ordering of these
columns is immaterial.
Attribute: Each column in a table. Attributes are the properties which define a
relation, e.g. Student_Rollno, NAME, etc.
Table: In the relational model, relations are saved in table format, each
stored along with its entities. A table has rows and columns: rows represent
records and columns represent attributes.
Tuple: A single row of a table, which contains a single record.
Relation schema: The name of the relation together with its attributes.
Degree: The total number of attributes in the relation.
Cardinality: The total number of rows present in the table.
Column: The set of values for a specific attribute.
Relation instance: A finite set of tuples in the RDBMS. Relation instances
never have duplicate tuples.
Relation key: One or more attributes that uniquely identify each row of the
relation.
Attribute domain: The pre-defined set and scope of values that an attribute
may take.
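The terms above can be checked on a small example. The Python sketch below models a relation as a schema plus a set of tuples; the attribute CITY, the sample rows and the helper is_key are made up for illustration and are not part of the notes.

```python
# Relation schema: attribute names (CITY is a hypothetical extra attribute).
schema = ("Student_Rollno", "NAME", "CITY")

# Relation instance: a set of tuples, so duplicate rows cannot exist
# and the order of rows carries no meaning.
rows = {
    (1, "Asha", "Pune"),
    (2, "Ravi", "Delhi"),
    (3, "Meena", "Pune"),
}
rows.add((2, "Ravi", "Delhi"))   # adding a duplicate tuple changes nothing

degree = len(schema)        # number of attributes
cardinality = len(rows)     # number of tuples


def is_key(attribute_names):
    """True if the given attributes identify every row uniquely."""
    positions = [schema.index(name) for name in attribute_names]
    projected = {tuple(row[p] for p in positions) for row in rows}
    return len(projected) == len(rows)


print(degree, cardinality)            # 3 3
print(is_key(["Student_Rollno"]))     # True
print(is_key(["CITY"]))               # False, two students share a city
```

Using a Python set for the instance mirrors the property that all rows of a relation are distinct.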
Object Oriented Data Model
The object oriented data model is based upon real world situations. These
situations are represented as objects with different attributes. These objects
have multiple relationships between them.
Elements of Object oriented data model
Objects
The real world entities and situations are represented as objects in the Object
oriented database model.
Attributes and Method
Every object has certain characteristics. These are represented using Attributes.
The behaviour of the objects is represented using Methods.
Class
Similar attributes and methods are grouped together using a class. An object
is an instance of a class.
Inheritance
A new class can be derived from the original class. The derived class contains
attributes and methods of the original class as well as its own.
For example, two classes, CUSTOMER and EMPLOYEE, can be created as
subclasses from the class PERSON. In this case, CUSTOMER and
EMPLOYEE will inherit all attributes and methods from PERSON.
Another example: Shape, Circle, Rectangle and Triangle are all objects in this
model.
Circle has the attributes Center and Radius.
Rectangle has the attributes Length and Breadth.
Triangle has the attributes Base and Height.
The objects Circle, Rectangle and Triangle inherit from the object Shape.
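The Shape example maps directly onto classes in an object oriented language. The sketch below uses Python; the attributes follow the notes, while the methods area and describe are assumptions added only to show how behaviour is inherited.

```python
import math


class Shape:
    """Base class: behaviour shared by every shape."""

    def area(self):
        raise NotImplementedError("each subclass supplies its own area")

    def describe(self):
        # Inherited unchanged by Circle, Rectangle and Triangle.
        return f"{type(self).__name__} with area {self.area():.2f}"


class Circle(Shape):
    def __init__(self, center, radius):
        self.center = center
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


class Rectangle(Shape):
    def __init__(self, length, breadth):
        self.length = length
        self.breadth = breadth

    def area(self):
        return self.length * self.breadth


class Triangle(Shape):
    def __init__(self, base, height):
        self.base = base
        self.height = height

    def area(self):
        return 0.5 * self.base * self.height


for shape in (Circle((0, 0), 1), Rectangle(2, 3), Triangle(4, 5)):
    print(shape.describe())
```

Each subclass adds its own attributes but reuses describe from Shape, which is what "the derived class contains attributes and methods of the original class as well as its own" means in practice.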
Advantages of Object Oriented Data Model
1. Define their own data types
2. Directly represent aggregate objects
3. Navigate between tables with pointers
4. Create non-first normal form tables with repeating groups
Disadvantages of Object Oriented Data Model
1. Lack of OODM standards
2. Complex navigational data access
On the basis of the number of users:
The database system may be multi-user or single-user. The configuration
of the hardware and the size of the organization will determine whether it is a
multi-user system or a single user system.
In a single-user system the database resides on one computer and is
accessed by only one user at a time. This one user may design, maintain, and write
database programs.
Because of the large amount of data to be managed, most systems are multi-user. In
this situation the data are both integrated and shared. A database is integrated
when the same information is not recorded in two places. For example, both the
Library department and the Account department of the college database may
need student addresses. Even though both departments may access different
portions of the database, the students' addresses should only reside in one place.
It is the job of the DBA to make sure that the DBMS makes the correct
addresses available from one central storage area.
On the basis of the site location
Centralized Database System
The centralized database system consists of a single processor together
with its associated data storage devices and other peripherals. It is physically
confined to a single location. Data can be accessed from the multiple sites with
the use of a computer network while the database is maintained at the central
site.
Disadvantages of Centralized Database System
• When the central site computer or database system goes down, every user
is blocked from using the system until it comes back.
• Communication costs from the terminals to the central site can be expensive.
Parallel Database System
A parallel database system architecture consists of multiple Central Processing
Units (CPUs) and data storage disks working in parallel. Hence, it improves
processing and Input/Output (I/O) speeds. Parallel database systems are used in
applications that have to query extremely large databases or
that have to process an extremely large number of transactions per second.
There are mainly two types of parallel database system:
1. Shared memory parallel database
2. Shared disk parallel database
Advantages of a Parallel Database System
• Parallel database systems are very useful for applications that have to
query extremely large databases (of the order of terabytes, that is,
10^12 bytes) or that have to process an extremely large number of transactions
per second (of the order of thousands of transactions per second).
• In a parallel database system, the throughput (that is, the number of tasks that
can be completed in a given time interval) is very high, and the response time
(that is, the amount of time it takes to complete a single task from the time it is
submitted) is very low.
Disadvantages of a Parallel Database System
• In a parallel database system, there is a startup cost associated with initiating
a single process, and the startup time may overshadow the actual
processing time, affecting speedup adversely.
• Since processes executing in a parallel system often access shared resources, a
slowdown may result from interference, as each new process
competes with existing processes for commonly held resources, such as
shared data storage disks, the system bus and so on.
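The effect of startup cost on speedup can be illustrated with a deliberately simplified model. The Python sketch below assumes that the work divides evenly across processors and that a fixed startup time is paid once; both the formula and the sample figures are assumptions for illustration, not measurements of any real system.

```python
def speedup(work_seconds, processors, startup_seconds):
    """Speedup of a parallel run over a single-processor run.

    Assumed model: parallel time = startup + work / processors.
    """
    parallel_time = startup_seconds + work_seconds / processors
    return work_seconds / parallel_time


# A long task with no startup cost: ideal tenfold speedup.
print(round(speedup(100.0, 10, 0.0), 2))   # 10.0

# The same task with a 10 second startup: the speedup halves.
print(round(speedup(100.0, 10, 10.0), 2))  # 5.0

# A short task: startup dominates and the parallel run is slower.
print(round(speedup(1.0, 10, 10.0), 2))    # 0.1
```

In this model a speedup below 1 means the parallel run takes longer than the serial run, which is the case where startup time overshadows the actual processing time.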
Distributed Database System
A distributed database is a collection of multiple interconnected
databases, which are spread physically across various locations that
communicate via a computer network.
A logically interrelated collection of shared data physically distributed over a
computer network is called a distributed database, and the software
system that permits the management of the distributed database and makes the
distribution transparent to users is called a Distributed DBMS.
It consists of a single logical database that is split into a number of fragments.
Each fragment is stored on one or more computers under the control
of a separate DBMS, with the computers connected by a communications
network. In a distributed database system, data is spread across
a variety of different databases. These are managed by a variety of different
DBMS software running on a variety of different operating systems.
These machines are spread (or distributed) geographically and connected
together by a variety of communication networks.
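The idea of one logical database split into fragments can be sketched as follows. This Python example is a toy illustration only: the site names, the sample rows and the function select are hypothetical, and a real Distributed DBMS would also handle networking, replication and transactions.

```python
# One logical table split into fragments, each held at a different site.
# Site names and rows are hypothetical sample data.
fragments = {
    "site_a": [{"id": 1, "city": "Pune"}, {"id": 2, "city": "Pune"}],
    "site_b": [{"id": 3, "city": "Delhi"}],
}


def select(predicate):
    """Query every fragment and merge the results.

    The caller never names a site, so the distribution stays transparent.
    """
    result = []
    for site in sorted(fragments):
        result.extend(row for row in fragments[site] if predicate(row))
    return result


all_rows = select(lambda row: True)
pune_rows = select(lambda row: row["city"] == "Pune")

print(len(all_rows))                    # 3
print([row["id"] for row in pune_rows]) # [1, 2]
```

The user of select sees a single table, even though the rows are stored in separate fragments.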
Types of Distributed Database Management System
The following are the types of Distributed Database Management System:
Homogeneous DDBMS
In a homogeneous DDBMS, the database management systems across all locations
are uniform and based on the same data model. These database management
systems are much easier to handle and the database can even be scaled if required.
In a homogeneous distributed database, all the sites use identical DBMS and
operating systems. Its properties are −
● The sites use very similar software.
● The sites use identical DBMS or DBMS from the same vendor.
● Each site is aware of all other sites and cooperates with other sites to
process user requests.
● The database is accessed through a single interface as if it is a single
database.
Heterogeneous DDBMS
In a heterogeneous DDBMS, the database management systems across different
locations may be based on different data models such as relational, hierarchical,
object oriented etc. This type of database system is a result of the later integration of
individual database systems. Such systems are quite complicated and difficult to manage.
In a heterogeneous distributed database, different sites have different operating
systems, DBMS products and data models. Its properties are −
● Different sites use dissimilar schemas and software.
● The system may be composed of a variety of DBMSs like relational, network,
hierarchical or object oriented.
● Query processing is complex due to dissimilar schemas.
● Transaction processing is complex due to dissimilar software.
● A site may not be aware of other sites and so there is limited co-operation in
processing user requests.
Advantages
● The distributed database can have the data arranged according to different levels of
transparency, i.e. data with different transparency levels can be stored at different locations.
● If there were a natural catastrophe such as a fire or an earthquake, all the data would
not be destroyed as it is stored at different locations.
● It is cheaper to create a network of systems containing a part of the database. This
database can also be easily increased or decreased.
● Even if some of the data nodes go offline, the rest of the database can continue its
normal functions.
Disadvantages
● The distributed database is quite complex and it is difficult to make sure that a user
gets a uniform view of the database because it is spread across multiple locations.
● It is difficult to provide security in a distributed database as the database needs to be
secured at all the locations it is stored. Moreover, the infrastructure connecting all the
nodes in a distributed database also needs to be secured.
● It is difficult to maintain data integrity in the distributed database because of its
nature. There can also be data redundancy in the database as it is stored at multiple
locations.