Distributed Database
Management Systems
Different Types of Database System




Contributed by:
                  Isha Kushwah
                  MCA-2008-11
Centralized Database System

Content
• What a distributed database management system
  (DDBMS) is
• DDBMS components
• How database implementation is affected by different
  levels of data and process distribution
• How transactions are managed in a distributed
  database environment
• How database design is affected by the
  distributed database environment

Problems in Centralized Database
             Management
• Performance degradation
• High cost
• Reliability problems




DDBMS Advantages
•   Data located near site with greatest demand
•   Faster data access
•   Faster data processing
•   Growth facilitation
•   Improved communications
•   Reduced operating costs
•   User-friendly interface
•   Less danger of single-point failure
•   Processor independence
DDBMS Disadvantages
•   Complexity of management and control
•   Security
•   Lack of standards
•   Increased storage requirements
•   Greater difficulty in managing data environment
•   Increased training costs




Distributed Processing
Shares the database’s logical processing among two or more
physically independent sites connected through a network




                                         Figure 10.1
Distributed Database
Stores a logically related database over two or more physically
independent sites




                                             Figure 10.2

Distributed Database
      vs. Distributed Processing
• Distributed processing
  – Does not require distributed database
  – May be based on a single database on single
    computer
  – Copies or parts of database processing functions
    must be distributed to all data storage sites
• Distributed database
  – Requires distributed processing
• Both
  – Require a network to connect components
Functions of DDBMS
•   Application/end user interface
•   Validation
•   Transformation
•   Query optimization
•   Mapping
•   I/O interface
•   Formatting
•   Security
•   Backup and recovery
•   DB Administration
•   Concurrency Control
•   Transaction Management


Centralized Database




                       Figure 10.3

Fully Distributed Database
  Management System




                       Figure 10.4
DDBMS Components
•   Computer workstations
•   Network hardware and software components
•   Communications media
•   Transaction processor (TP)
    – Also called application processor (AP) or
      transaction manager (TM)
•   Data processor (DP)
    – Also called data manager (DM)



Distributed Database Components




                        Figure 10.5
DDBMS Protocols
• Interface with network to transport data and
  commands between DPs and TPs
• Synchronize data received from DPs and route to
  appropriate TPs
• Ensure common database functions
   – Security
   – Concurrency control
   – Backup and recovery



Levels of Data and Process
             Distribution
Database systems can be classified based on
process distribution and data distribution




                                       Table 10.1
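
The classification can be sketched as a small lookup. This is only an illustration: the `Distribution` enum and `classify` function are hypothetical names, and the rejected combination follows from the earlier point that a distributed database requires distributed processing.

```python
from enum import Enum


class Distribution(Enum):
    """Whether processing (or data) lives at one site or at several."""
    SINGLE_SITE = "single-site"
    MULTIPLE_SITE = "multiple-site"


def classify(process: Distribution, data: Distribution) -> str:
    """Return the acronym for a process/data distribution pair."""
    if process is Distribution.SINGLE_SITE and data is Distribution.SINGLE_SITE:
        return "SPSD"
    if process is Distribution.MULTIPLE_SITE and data is Distribution.SINGLE_SITE:
        return "MPSD"
    if process is Distribution.MULTIPLE_SITE and data is Distribution.MULTIPLE_SITE:
        return "MPMD"
    # Single-site processing over multiple-site data is not a valid
    # combination: a distributed database requires distributed processing.
    raise ValueError("multiple-site data requires multiple-site processing")


print(classify(Distribution.MULTIPLE_SITE, Distribution.SINGLE_SITE))
```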




Single-Site Processing, Single-Site
                Data (SPSD)

•   All processing on single CPU or host computer
•   All data are stored on host computer disk
•   DBMS located on the host computer
•   DBMS accessed by dumb terminals
•   Typical of mainframe and minicomputer DBMSs
•   Typical of first generation of single-user
    microcomputer databases




Single-Site Processing, Single-Site
            Data (cont’d.)




  Figure 10.6


Multiple-Site Processing, Single-Site
             Data (MPSD)
 • Requires network file server
 • Applications accessed through LAN
 • Variation known as client/server architecture




                                     Figure 10.7
Multiple-Site Processing,
        Multiple-Site Data (MPMD)
• Fully distributed DDBMS with support for multiple
  DPs and TPs at multiple sites
   – Homogeneous
      • Integrate one type of centralized DBMS over the
        network
   – Heterogeneous
      • Integrate different types of centralized DBMSs over a
        network



Heterogeneous Distributed Database
            Scenario




                             Figure 10.8


Distributed DB Transparency
• Allows end users to feel like the database’s only user
• Hides complexities of distributed database
• Transparency features
   –   Distribution
   –   Transaction
   –   Failure
   –   Performance
   –   Heterogeneity


Distribution Transparency
• Allows management of a physically dispersed
  database as though it were centralized
• Three Levels
  – Fragmentation transparency
  – Location transparency
  – Local mapping transparency

                                            Table 10.2
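
The three levels differ in how much the user must spell out. The sketch below assumes the usual textbook reading: with fragmentation transparency the user names only the table, with location transparency the user names fragments but not sites, and with local mapping transparency the user names both. The `CATALOG` contents (table `EMPLOYEE`, fragments `E1` to `E3`, and the site names) are hypothetical.

```python
# Hypothetical distributed data catalog: table -> fragment -> site.
CATALOG = {
    "EMPLOYEE": {"E1": "NEW_YORK", "E2": "ATLANTA", "E3": "MIAMI"},
}


def resolve_fragmentation_transparent(table):
    """User names only the table; the system finds fragments and sites."""
    return sorted(CATALOG[table].items())


def resolve_location_transparent(table, fragments):
    """User names the fragments; the system finds each fragment's site."""
    return [(fragment, CATALOG[table][fragment]) for fragment in fragments]


def resolve_local_mapping(fragments_at_sites):
    """User names both fragment and site; the system resolves nothing."""
    return list(fragments_at_sites)


print(resolve_fragmentation_transparent("EMPLOYEE"))
print(resolve_location_transparent("EMPLOYEE", ["E1", "E3"]))
print(resolve_local_mapping([("E2", "ATLANTA")]))
```

In each case the result is the list of (fragment, site) pairs a query would touch; only the amount of information supplied by the user changes.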




Transaction Transparency
• Ensures transactions maintain integrity and
  consistency
• Completed only if all involved database sites
  complete their part of the transaction
• Management mechanisms
   –   Remote request
   –   Remote transaction
   –   Distributed transaction
   –   Distributed request
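
The four mechanisms can be told apart by how many requests a transaction contains and how many DP sites each request touches. The sketch below assumes the usual textbook definitions (a remote request is one request to one remote site; a remote transaction is several requests to one site; a distributed transaction spans several sites but each request touches one; a distributed request is a single request touching several sites). The function name and input shape are hypothetical.

```python
from __future__ import annotations


def classify_transaction(requests: list[set[str]]) -> str:
    """Classify a transaction given one set of DP site names per request."""
    if not requests or any(not sites for sites in requests):
        raise ValueError("every transaction needs requests that touch a site")
    all_sites = set().union(*requests)
    # At least one single request references data at several DP sites.
    if any(len(sites) > 1 for sites in requests):
        return "distributed request"
    # Several sites overall, but each request references only one of them.
    if len(all_sites) > 1:
        return "distributed transaction"
    # Several requests, all directed at the same single site.
    if len(requests) > 1:
        return "remote transaction"
    return "remote request"


print(classify_transaction([{"B"}, {"C"}]))
```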


Remote Request




                 Figure 10.10




Remote Transaction




                     Figure 10.11


Distributed Transaction
Figure 10.12




Distributed Requests




                  Figure 10.13

Distributed Requests (cont’d.)




                       Figure 10.14

Distributed Concurrency Control
• Multisite, multiple-process operations are more likely
  to create data inconsistencies and deadlocked
  transactions
• Problem scenario
   – Transaction is committed by each local DP
   – One DP cannot commit the transaction’s result
   – The result is an inconsistent database



Two-Phase Commit Protocol
• DO-UNDO-REDO protocol
  – Write-ahead protocol
  – Two kinds of nodes
     • Coordinator
     • Subordinates
• Phases
  – Preparation
     • Coordinator sends message to all subordinates
     • Confirms all are ready to commit or abort
  – Final Commit
     • Ensures all subordinates have committed or aborted
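
The two phases can be sketched in a few lines. This is a minimal in-memory illustration, not a production protocol: the `Subordinate` class, its `can_commit` flag, and the log entry strings are hypothetical, and the sketch omits the coordinator's own log, network messaging, timeouts, and recovery after a crash.

```python
class Subordinate:
    """A participating node that votes in phase 1 and acts in phase 2."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates whether local work succeeded
        self.log = []                 # write-ahead log: entry before state change
        self.state = "ACTIVE"

    def prepare(self):
        """Phase 1: record the vote in the log, then report it."""
        self.log.append("PREPARED" if self.can_commit else "NOT PREPARED")
        return self.can_commit

    def commit(self):
        self.log.append("COMMIT")
        self.state = "COMMITTED"

    def abort(self):
        self.log.append("ABORT")
        self.state = "ABORTED"


def two_phase_commit(subordinates):
    """Coordinator logic: commit only if every subordinate is prepared."""
    # Phase 1 (preparation): ask every subordinate whether it can commit.
    votes = [node.prepare() for node in subordinates]
    # Phase 2 (final commit): all commit, or all abort.
    if all(votes):
        for node in subordinates:
            node.commit()
        return "COMMITTED"
    for node in subordinates:
        node.abort()
    return "ABORTED"


nodes = [Subordinate("site_a"), Subordinate("site_b", can_commit=False)]
print(two_phase_commit(nodes), [node.state for node in nodes])
```

Because one subordinate cannot commit, every node aborts, which avoids the inconsistent state where only some sites have applied the transaction.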
Performance Transparency
        and Query Optimization
• Objective: Minimize total cost associated with
  execution of request
• Main costs
   – Access time
   – Communication
   – CPU time
• Basis for query optimization algorithms
   – Optimum execution order
   – Sites accessed to minimize communication costs
• Automatic or Manual
• Dynamic or static optimization
• Statistically based vs. rule-based query
  optimization algorithms
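
As a rough illustration of the objective, total cost can be modeled as the sum of the three main costs, with the optimizer choosing the cheapest candidate. The `Plan` class, the plan names, and all cost figures below are invented for the example; a real optimizer would estimate them from statistics or rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """A candidate execution plan with its estimated cost components."""
    name: str
    access_cost: float         # disk I/O at the sites involved
    communication_cost: float  # shipping data between sites
    cpu_cost: float            # processing overhead

    @property
    def total_cost(self) -> float:
        return self.access_cost + self.communication_cost + self.cpu_cost


def cheapest(plans):
    """Pick the plan with the lowest total estimated cost."""
    return min(plans, key=lambda plan: plan.total_cost)


# Hypothetical estimates in arbitrary cost units.
plans = [
    Plan("ship whole table, filter locally", 5.0, 40.0, 3.0),
    Plan("filter at remote site, ship result", 5.0, 4.0, 6.0),
]
best = cheapest(plans)
print(best.name, best.total_cost)
```

With these made-up numbers, filtering at the remote site wins because it cuts communication cost, even though it spends more CPU time.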
Distributed Database Design
• Partition database into fragments
   – Horizontal
   – Vertical
   – Mixed
• Fragments to replicate
   – Storage of data copies at multiple sites
   – Fully replicated, partially replicated, or unreplicated databases
• Data allocation
   – Where to locate data
   – Centralized, partitioned, replicated
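
The three fragmentation strategies can be shown on a small in-memory table. This is a sketch under stated assumptions: the `CUSTOMERS` rows and column names are invented, horizontal fragments are taken to be subsets of rows, vertical fragments are subsets of columns that each keep the key, and mixed fragmentation applies both.

```python
# Hypothetical sample table, one dict per row.
CUSTOMERS = [
    {"id": 1, "name": "Ann", "state": "TN", "balance": 120.0},
    {"id": 2, "name": "Bob", "state": "FL", "balance": 0.0},
    {"id": 3, "name": "Cid", "state": "TN", "balance": 75.5},
]


def horizontal(rows, predicate):
    """Horizontal fragment: the rows that satisfy a condition."""
    return [row for row in rows if predicate(row)]


def vertical(rows, columns, key="id"):
    """Vertical fragment: chosen columns, always including the key."""
    keep = [key] + [column for column in columns if column != key]
    return [{column: row[column] for column in keep} for row in rows]


def mixed(rows, predicate, columns, key="id"):
    """Mixed fragment: select rows first, then project columns."""
    return vertical(horizontal(rows, predicate), columns, key)


print(horizontal(CUSTOMERS, lambda row: row["state"] == "TN"))
print(vertical(CUSTOMERS, ["name"]))
print(mixed(CUSTOMERS, lambda row: row["state"] == "TN", ["balance"]))
```

Keeping the key in every vertical fragment is what allows the original rows to be rebuilt by joining the fragments.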
Client/Server Advantages Over DDBMS
• Client/server less expensive
• Client/server solutions allow use of
  microcomputer’s GUI
• More people with PC skills than mainframe skills
• PC is well established in workplace
• Numerous data analysis and query tools exist
• Considerable cost advantages to off-loading
  application development

Client/Server Disadvantages
• Creates more complex environment with different
  platforms
• Increased number of users and sites creates
  security problems
• Training issues become more complex and
  expensive




Date’s 12 Commandments for
   Distributed Databases
1. Local Site Independence
2. Central Site Independence
3. Failure Independence
4. Location Transparency
5. Fragmentation Transparency
6. Replication Transparency


Date’s 12 Commandments for
     Distributed Databases (cont’d.)
 7. Distributed Query Processing
 8. Distributed Transaction Processing
 9. Hardware Independence
10. Operating System Independence
11. Network Independence
12. Database Independence

