9/8/2014
Distributed DBMS Concepts
Concepts
Advantages and disadvantages of distributed databases.
Functions of DDBMS.
Distributed database design.
Distributed Database
A logically interrelated collection of shared data (and a description of this data), physically distributed over a computer network.
Distributed DBMS
Software system that permits the management of the distributed database and makes the distribution transparent to users.
Concepts
Distributed database system (DDBS) = DDB + D–DBMS
A distributed database system consists of loosely coupled sites that share no physical component.
Database systems that run on each site are independent of each other.
Transactions may access data at one or more sites.
Concepts
Collection of logically-related shared data.
Data split into fragments.
Fragments may be replicated.
Fragments/replicas allocated to sites.
Sites linked by a communications network.
Data at each site is under control of a DBMS.
DBMSs handle local applications autonomously.
Each DBMS participates in at least one global application.
Distributed DBMS
Distributed Processing
A centralized database that can be accessed over a computer network.
Advantages of DDBMSs
Organizational Structure
Shareability and Local Autonomy
Improved Availability
Improved Reliability
Improved Performance
Economics
Modular Growth
Disadvantages of DDBMSs
Complexity
Cost
Security
Integrity Control More Difficult
Lack of Standards
Lack of Experience
Database Design More Complex
Types of DDBMS:
Homogeneous DDBMS
Resembles a centralised DB, but data is distributed across a number of sites in a network.
All sites use same DBMS product.
Has multiple data collections.
It integrates multiple data resources.
Much easier to design and manage.
Heterogeneous DDBMS
Sites may run different hardware, DBMS products, data models, or a combination of the above.
Occurs when sites have implemented their own databases and integration is considered later.
Heterogeneous DDBMS
Complete local autonomy.
Translations required to allow for:
Different hardware.
Different DBMS products.
Different hardware and different DBMS products.
Federated Database System
Cross between distributed and centralized DBMS.
Distributed for global users and centralized for local users.
Functions of a DDBMS
Expect DDBMS to have at least the functionality of a centralized DBMS.
Also to have following functionality:
Extended communication services.
Extended Data Dictionary.
Distributed query processing.
Extended concurrency control.
Extended recovery services.
Distributed Database Design
Fragmentation
Relation may be divided into a number of sub-relations, which are then distributed.
Allocation
Each fragment is stored at site with "optimal" distribution.
Replication
Copy of fragment may be maintained at several sites.
Distributed Database Design
Fragmentation
Definition and allocation of fragments
carried out strategically to achieve:
Locality of Reference
Improved Reliability and Availability
Improved Performance
Balanced Storage Capacities and Costs
Minimal Communication Costs.
Correctness of Fragmentation
Completeness
If relation R is decomposed into fragments R1, R2, ... Rn, each data item that can be found in R must appear in at least one fragment.
Reconstruction
Must be possible to define a relational operation that will reconstruct R from the fragments.
Reconstruction for horizontal fragmentation is the Union operation, and for vertical fragmentation the Join operation.
Disjointness
If data item di appears in fragment Ri, then it should not appear in any other fragment.
Exception: vertical fragmentation, where primary key attributes must be repeated to allow reconstruction.
For horizontal fragmentation, data item is a tuple.
For vertical fragmentation, data item is an attribute.
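The three correctness rules can be checked mechanically for a horizontal fragmentation. The sketch below is only an illustration: the relation `R`, its rows, and the helper names `is_complete`, `is_reconstructible`, and `is_disjoint` are assumptions made for this example and do not come from the slides.

```python
# Hypothetical relation: each row is a tuple (id, name, city).
R = {
    (1, "Ann", "London"),
    (2, "Bob", "Paris"),
    (3, "Cy", "London"),
}

# Horizontal fragments: each one holds a subset of the tuples of R.
R1 = {row for row in R if row[2] == "London"}
R2 = {row for row in R if row[2] != "London"}
fragments = [R1, R2]


def is_complete(relation, frags):
    """Completeness: every tuple of R appears in at least one fragment."""
    return all(any(row in f for f in frags) for row in relation)


def is_reconstructible(relation, frags):
    """Reconstruction: the union of the horizontal fragments gives back R."""
    return set().union(*frags) == relation


def is_disjoint(frags):
    """Disjointness: no tuple appears in more than one fragment."""
    total = sum(len(f) for f in frags)
    return total == len(set().union(*frags))


print(is_complete(R, fragments), is_reconstructible(R, fragments), is_disjoint(fragments))
```

For vertical fragments the data item is an attribute rather than a tuple, so the disjointness check would have to ignore the repeated primary key attributes.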
Fragmentation
Types of Fragmentation
Four types of fragmentation:
Horizontal
Vertical
Mixed
Derived.
Other possibility is no fragmentation:
If relation is small and not updated frequently,
may be better not to fragment relation.
Fragmentation
Reconstructing the original relation from the vertical fragments is done via a suitable join operation, and from horizontal fragments via the union operation.
Fragmentation Example
Fragmentation
Horizontal
Consists of a subset of the tuples of a relation.
Defined using Selection operation of relational
algebra.
Vertical
Consists of a subset of attributes of a relation.
Defined using Projection operation of relational
algebra.
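A minimal sketch of horizontal and vertical fragmentation, with selection and projection written as plain Python functions. The `STAFF` relation, its attribute names, and the helpers `select`, `project`, and `natural_join` are hypothetical; `id` is assumed to be the primary key and is repeated in both vertical fragments so that a join can rebuild the relation.

```python
# Hypothetical relation as a list of dicts; "id" is assumed to be the primary key.
STAFF = [
    {"id": 1, "name": "Ann", "city": "London", "salary": 30000},
    {"id": 2, "name": "Bob", "city": "Paris", "salary": 28000},
    {"id": 3, "name": "Cy", "city": "London", "salary": 35000},
]


def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection: keep only the named attributes, dropping duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def natural_join(left, right, key):
    """Join two vertical fragments on a shared key (assumed unique in `right`)."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


# Horizontal fragments: subsets of the tuples, defined by selection.
h_london = select(STAFF, lambda r: r["city"] == "London")
h_other = select(STAFF, lambda r: r["city"] != "London")

# Vertical fragments: subsets of the attributes, defined by projection.
v_personal = project(STAFF, ["id", "name", "city"])
v_pay = project(STAFF, ["id", "salary"])

# Reconstruction: union for horizontal fragments, join for vertical fragments.
union_back = sorted(h_london + h_other, key=lambda r: r["id"])
join_back = natural_join(v_personal, v_pay, "id")
```

Both `union_back` and `join_back` compare equal to `STAFF`, matching the reconstruction rule for each kind of fragment.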
Mixed
Consists of a horizontal fragment that is vertically fragmented, or a vertical fragment that is horizontally fragmented.
Defined using Selection and Projection operations of relational algebra.
Derived
A horizontal fragment that is based on horizontal fragmentation of a parent relation.
Ensures that fragments that are frequently joined together are at same site.
Defined using Semi-Join operation of relational algebra.
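Derived fragmentation can be sketched as a semi-join of the child relation with each horizontal fragment of its parent. The `BRANCH` and `STAFF` relations, their attributes, and the `semi_join` helper are assumptions chosen for illustration only.

```python
# Hypothetical parent and child relations linked by "branch_no".
BRANCH = [
    {"branch_no": "B1", "city": "London"},
    {"branch_no": "B2", "city": "Paris"},
]
STAFF = [
    {"staff_no": "S1", "branch_no": "B1"},
    {"staff_no": "S2", "branch_no": "B2"},
    {"staff_no": "S3", "branch_no": "B1"},
]


def semi_join(child, parent_fragment, attribute):
    """Semi-join: child tuples that match some tuple of the parent fragment."""
    keys = {row[attribute] for row in parent_fragment}
    return [row for row in child if row[attribute] in keys]


# Horizontal fragmentation of the parent relation by city.
branch_london = [b for b in BRANCH if b["city"] == "London"]
branch_paris = [b for b in BRANCH if b["city"] == "Paris"]

# Derived fragments of the child follow the parent's fragments, so staff
# tuples can be stored at the same site as the branch they join with.
staff_london = semi_join(STAFF, branch_london, "branch_no")
staff_paris = semi_join(STAFF, branch_paris, "branch_no")

print([s["staff_no"] for s in staff_london])  # ['S1', 'S3']
print([s["staff_no"] for s in staff_paris])   # ['S2']
```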
Data Allocation
Four alternative strategies regarding placement of data:
Centralized
Partitioned (or Fragmented)
Complete Replication
Selective Replication
Centralized
Consists of single database and DBMS stored at one site with users distributed across the network (not a true distribution!).
Partitioned
Database partitioned into disjoint fragments, each fragment assigned to one site.
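The four allocation strategies listed above can be compared by looking at which fragments end up at which site. The sketch below is a toy model: the function `allocate`, the round-robin placement, and the `extra_copies` parameter are assumptions made for this example, not an allocation algorithm from the slides (a real design would choose sites from usage and cost information).

```python
def allocate(fragments, sites, strategy, extra_copies=None):
    """Return a mapping site -> set of fragment names stored at that site."""
    placement = {site: set() for site in sites}
    if strategy == "centralized":
        # Everything is stored at a single site.
        placement[sites[0]].update(fragments)
    elif strategy == "partitioned":
        # Each fragment goes to exactly one site (round-robin is an assumption).
        for i, frag in enumerate(fragments):
            placement[sites[i % len(sites)]].add(frag)
    elif strategy == "complete_replication":
        # Every site keeps a complete copy.
        for site in sites:
            placement[site].update(fragments)
    elif strategy == "selective_replication":
        # Start from a partition, then add copies of selected fragments.
        for i, frag in enumerate(fragments):
            placement[sites[i % len(sites)]].add(frag)
        for frag, copy_sites in (extra_copies or {}).items():
            for site in copy_sites:
                placement[site].add(frag)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return placement


frags = ["F1", "F2", "F3"]
sites = ["A", "B"]
for name in ("centralized", "partitioned", "complete_replication"):
    print(name, allocate(frags, sites, name))
print("selective_replication",
      allocate(frags, sites, "selective_replication", {"F2": ["A"]}))
```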
Data Allocation
Complete Replication
Consists of maintaining complete copy of database at each site.
Selective Replication
Combination of partitioning, replication, and centralization.
Database Replication
Functionality of DDBMS is attractive, but protocols & algorithms are complex and can cause problems that may outweigh advantages.
An alternative and simpler approach to data distribution is DB Replication.
Database Replication
Replication server:
Every major database vendor has replication solution.
Database Replication:
the process of copying and maintaining database objects, such as relations, in multiple databases that make up a distributed database system.
Benefits of Database Replication
Availability
Reliability
Performance
Load Reduction
Disconnected Computing
Support Multiple Users
Support Advanced Applications