Principles of Distributed Database
Systems
                                   M. Tamer Özsu
                                   Patrick Valduriez
© 2020, M.T. Özsu & P. Valduriez                       1
Outline
◼    Introduction
◼    Distributed and parallel database design
◼    Distributed data control
◼    Distributed Query Processing
◼    Distributed Transaction Processing
◼    Data Replication
◼    Database Integration – Multidatabase Systems
◼    Parallel Database Systems
◼    Peer-to-Peer Data Management
◼    Big Data Processing
◼    NoSQL, NewSQL and Polystores
◼    Web Data Management
© 2020, M.T. Özsu & P. Valduriez                    2
Outline
◼    Distributed Query Processing
       ❑   Query Decomposition and Localization
       ❑   Join Ordering
       ❑   Distributed Query Optimization
       ❑   Adaptive Query Processing
© 2020, M.T. Özsu & P. Valduriez                  3
Query Processing in a DDBMS
◼    Generally, a query in distributed DBMS require data from
     multiple sites, and this is called transmission of data that
     causes communication costs.
◼    Query processing in DBMS is different from centralized
     DBMS due to the communication cost of data transfer
     over the network.
◼    The transmission cost is low when the sites are
     connected through high-speed network and is quite
     significant in another network.
© 2020, M.T. Özsu & P. Valduriez                                    4
Query Processing in a DDBMS
◼    In distributed query processing, the data transfer
     cost of distributed query processing means:
       ❑   Cost of transferring intermediate files to other sites for
           processing and
       ❑   Cost of transferring the ultimate result file to the
           location where the results required
© 2020, M.T. Özsu & P. Valduriez                                        5
Distributed DBMS Environment
• If s1 request a query and
  needs data from s2 and
  s3.
• It is decided to execute
  the query at s3.
• 1st communication cost is
  transferring the data from
  s2 to s3 → then s3 will
  execute the query and
  get the result.
• 2nd communication cost is
  transferring the result
  from s3 to s1
© 2020, M.T. Özsu & P. Valduriez   6
Query Processing in a DDBMS
                                      High level user query
                                            Query
                                          Processor
                                   Low-level data manipulation
                                   commands for D-DBMS
© 2020, M.T. Özsu & P. Valduriez                                 7
Query Processing Components
◼    Query language
       ❑   SQL: “intergalactic dataspeak”
◼    Query execution
       ❑   The steps that one goes through in executing high-level
           (declarative) user queries.
◼    Query optimization
       ❑   How do we determine the “best” execution plan?
◼    We assume a homogeneous D-DBMS
© 2020, M.T. Özsu & P. Valduriez                                     8
Selecting Alternatives
     Find the names of employee who are
       managing a project
     Strategy 1
     SELECT               ENAME
     FROM                 EMP , ASG
     WHERE                emp.no=asg.no and RESP = "Manager"
    Strategy 1
     ENAME(RESP=“Manager”EMP.ENO=ASG.ENO(EMP×ASG))
© 2020, M.T. Özsu & P. Valduriez                               9
Selecting Alternatives
     Strategy 2
     SELECT               ENAME
     FROM                 EMP NATURAL JOIN ASG
     WHERE                RESP = "Manager"
   Strategy 2
    ENAME(EMP ⋈ENO (RESP=“Manager” (ASG))
   Strategy 2 avoids Cartesian product, and consumes less
   computing resources, so may be “better”
© 2020, M.T. Özsu & P. Valduriez                            10
Selecting Alternatives
     In a distributed system,
     ◼    Relational algebra is not enough to express execution
          strategies. It must be supplemented with operators for
          exchanging data between sites.
     ◼    The distributed query processor must also select the
          best sites to process data , and possibly the way data
          should be transformed.
© 2020, M.T. Özsu & P. Valduriez                                   11
    What is the Problem?
       Site 1                          Site 2          Site 3             Site 4            Site 5
ASG1=ENO≤“E3”(ASG)      ASG2=   ENO>“E3”(ASG) EMP1= ENO≤“E3”(EMP) EMP2= ENO>“E3”(EMP)   Result
                       Strategy A                                            Strategy B
    © 2020, M.T. Özsu & P. Valduriez                                                                 12
Cost of Alternatives
Assume
   ◼ size(EMP) = 400 row,
     size(ASG) = 1000
   ◼ tuple access cost = 1
     unit (1 operation or 1s);
   ◼ tuple transfer cost = 10
     units
   ◼ There are 20 managers
     in relation ASG
   ◼ Assume that the data is
     uniformly distributed
     among sites
© 2020, M.T. Özsu & P. Valduriez   13
Cost of Alternatives
◼    Strategy A
      ❑    produce ASG': (10+10) tuple access cost      20
      ❑    transfer ASG' to the sites of EMP: (10+10)
           tuple transfer cost                         200
      ❑    produce EMP': (10+10) tuple access cost
           2                                            40
      ❑    transfer EMP' to result site: (10+10) tuple
           transfer cost                               200
              Total Cost                               460
◼    Strategy B
      ❑    transfer EMP to site 5: 400 tuple transfer
           cost                                        4,000
      ❑    transfer ASG to site 5: 1000 tuple transfer
           cost                                       10,000
      ❑    produce ASG': 1000 tuple access (apply
           condition)                                  1,000
      ❑    join EMP and ASG': 400 20(manager)         tuple
           access cost                                 8,000
              Total Cost                              23,000
© 2020, M.T. Özsu & P. Valduriez                               14
Query Optimization Objectives
◼    Minimize a cost function
       ❑   I/O cost + CPU cost + communication cost
       ❑   These might have different weights in different distributed
           environments
◼    Wide area networks
       ❑   Communication cost may dominate or vary much
              ◼   Bandwidth
              ◼   Speed
              ◼   Protocol overhead
◼    Local area networks
       ❑   Communication cost not that dominant, so total cost function
           should be considered
◼    Can also maximize throughput
© 2020, M.T. Özsu & P. Valduriez                                          15
    Complexity of Relational Operations
                                             Operation                   Complexity
                                            Select
                                            Project                          O(n)
◼    Assume                            (without duplicate elimination)
      ❑   Relations of cardinality n       Project
                                       (with duplicate elimination)        O(n  log n)
      ❑   Sequential scan
                                           Group
                                           Join
                                           Semi-join                       O(n  log n)
                                           Division
                                           Set Operators
                                           Cartesian Product                 O(n2)
    © 2020, M.T. Özsu & P. Valduriez                                                  16
Types Of Optimizers
◼    Exhaustive search
       ❑    Cost-based
       ❑    Optimal
       ❑    Combinatorial complexity in the number of relations
◼    Heuristics
       ❑    Not optimal
       ❑    Regroup common sub-expressions
       ❑    Perform selection, projection first
       ❑    Replace a join by a series of semijoins
       ❑    Reorder operations to reduce intermediate relation size
       ❑    Optimize individual operations
© 2020, M.T. Özsu & P. Valduriez                                      17
 Optimization Granularity
◼    Single query at a time
       ❑    Cannot use common intermediate results
◼    Multiple queries at a time
       ❑    Efficient if many similar queries
       ❑    Decision space is much larger
© 2020, M.T. Özsu & P. Valduriez                     18
Optimization Timing
◼    Static : optimization is done at query compilation time
       ❑   Compilation ➔ optimize prior to the execution
       ❑   Difficult to estimate the size of the intermediate resultserror
           propagation
       ❑   Can amortize over many executions
◼    Dynamic: proceeds at query execution time
       ❑   Run time optimization
       ❑   Exact information on the intermediate relation sizes
       ❑   Have to re-optimize for multiple executions
◼    Hybrid: tradeoff between both
       ❑   Compile using a static algorithm
       ❑   If the error in estimate sizes > threshold, re-optimize at run time
© 2020, M.T. Özsu & P. Valduriez                                                 19
Statistics
◼    Relation
       ❑   Cardinality
       ❑   Size of a tuple
       ❑   Fraction of tuples participating in a join with another relation
◼    Attribute
       ❑   Cardinality of domain
       ❑   Actual number of distinct values
◼    Simplifying assumptions
       ❑   Independence between different attribute values
       ❑   Uniform distribution of attribute values within their domain
© 2020, M.T. Özsu & P. Valduriez                                              20
Optimization Decision Sites
◼    Centralized
       ❑   Single site determines the “best” schedule
       ❑   Simple
       ❑   Need knowledge about the entire distributed database
◼    Distributed
       ❑   Cooperation among sites to determine the schedule
       ❑   Need only local information
       ❑   Cost of cooperation
◼    Hybrid
       ❑   One site determines the global schedule
       ❑   Each site optimizes the local subqueries
© 2020, M.T. Özsu & P. Valduriez                                  21
Network Topology
◼    Wide area networks (WAN) – point-to-point
       ❑    Characteristics
             ◼    Relatively low bandwidth (compared to local CPU/IO)
             ◼    High protocol overhead
       ❑    Communication cost may dominate; ignore all other cost factors
       ❑    Global schedule to minimize communication cost
       ❑    Local schedules according to centralized query optimization
◼    Local area networks (LAN)
       ❑    Communication cost not that dominant
       ❑    Total cost function should be considered
       ❑    Broadcasting can be exploited (joins)
       ❑    Special algorithms exist for star networks
© 2020, M.T. Özsu & P. Valduriez                                             22
                                   Questions?
© 2020, M.T. Özsu & P. Valduriez                23