ADB - Unit - III (Chapter-2) - Query Processing and Decomposition

Uploaded by

Tapaswini Desaboina

Advanced Databases

UNIT: III (Chapter-2)

Query Processing and Decomposition
Reference: Chapters 6 & 7, Principles of Distributed Database Systems, M. Tamer Özsu and Patrick Valduriez, 3rd Edition, Springer

Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/1

Outline
• Objectives of Query Processing
• Characterization of query processors
• Layers of query processing
• Query decomposition
• Localization of distributed data


Query Processing in a DDBMS
high-level user query → query processor → low-level data manipulation commands for D-DBMS

Query Processing Components
• Query language that is used

➡ SQL

• Query execution methodology

➡ The steps that one goes through in executing high-level (declarative) user
queries.

• Query optimization

➡ How do we determine the “best” execution plan?

• We assume a homogeneous D-DBMS


Selecting Alternatives
EMP(ENO, ENAME, TITLE)
ASG(ENO, PNO, RESP, DUR)

SELECT ENAME
FROM EMP, ASG
WHERE EMP.ENO = ASG.ENO
AND RESP = "Manager"

Strategy 1
ΠENAME(σRESP="Manager" ∧ EMP.ENO=ASG.ENO(EMP × ASG))
Strategy 2
ΠENAME(EMP ⋈ENO (σRESP="Manager"(ASG)))

Strategy 2 avoids the Cartesian product, so it may be "better"
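The two strategies can be compared on a toy instance. The sketch below is only an illustration: the rows are made up, and the "intermediate size" it reports is a simple tuple count, not a real cost model. It evaluates the Cartesian-product plan and the select-then-join plan and shows that they return the same names while building very different intermediate results.

```python
from itertools import product

# Hypothetical rows for EMP(ENO, ENAME, TITLE) and ASG(ENO, PNO, RESP, DUR).
EMP = [
    ("E1", "J. Doe", "Elect. Eng."),
    ("E2", "M. Smith", "Syst. Anal."),
    ("E3", "A. Lee", "Mech. Eng."),
]
ASG = [
    ("E1", "P1", "Manager", 12),
    ("E2", "P1", "Analyst", 24),
    ("E2", "P2", "Analyst", 6),
    ("E3", "P3", "Manager", 36),
]


def strategy_1(emp, asg):
    """Cartesian product first, then selection, then projection on ENAME."""
    pairs = list(product(emp, asg))  # |EMP| * |ASG| intermediate tuples
    kept = [(e, a) for e, a in pairs if a[2] == "Manager" and e[0] == a[0]]
    return sorted({e[1] for e, _ in kept}), len(pairs)


def strategy_2(emp, asg):
    """Selection on ASG first, then join on ENO, then projection on ENAME."""
    managers = [a for a in asg if a[2] == "Manager"]  # small intermediate result
    by_eno = {}
    for a in managers:
        by_eno.setdefault(a[0], []).append(a)
    joined = [(e, a) for e in emp for a in by_eno.get(e[0], [])]
    return sorted({e[1] for e, _ in joined}), len(managers)


names_1, size_1 = strategy_1(EMP, ASG)
names_2, size_2 = strategy_2(EMP, ASG)
print(names_1, size_1)
print(names_2, size_2)
```

Both plans yield the same answer; only the size of what is materialized before the projection differs.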

What is the Problem?
Site 1: ASG1 = σENO≤"E3"(ASG)
Site 2: ASG2 = σENO>"E3"(ASG)
Site 3: EMP1 = σENO≤"E3"(EMP)
Site 4: EMP2 = σENO>"E3"(EMP)
Site 5: Result

Strategy-A
➡ Site 1: ASG1' = σRESP="Manager"(ASG1); Site 2: ASG2' = σRESP="Manager"(ASG2)
➡ ASG1' is sent to Site 3 and ASG2' is sent to Site 4
➡ Site 3: EMP1' = EMP1 ⋈ENO ASG1'; Site 4: EMP2' = EMP2 ⋈ENO ASG2'
➡ EMP1' and EMP2' are sent to Site 5
➡ Site 5: result = EMP1' ∪ EMP2'

Strategy-B
➡ ASG1, ASG2, EMP1 and EMP2 are sent from Sites 1 to 4 to Site 5
➡ Site 5: result = (EMP1 ∪ EMP2) ⋈ENO σRESP="Manager"(ASG1 ∪ ASG2)

Cost of Alternatives
• Assume
➡ size(EMP) = 400, size(ASG) = 1000, 20 Managers in ASG
➡ tuple access cost = 1 unit; tuple transfer cost = 10 units
• Strategy-A
➡ produce ASG': (10+10) ∗ tuple access cost = 20
➡ transfer ASG' to the sites of EMP: (10+10) ∗ tuple transfer cost = 200
➡ produce EMP': (10+10) ∗ tuple access cost ∗ 2 = 40
➡ transfer EMP' to result site: (10+10) ∗ tuple transfer cost = 200
Total Cost 460
• Strategy-B
➡ transfer EMP to site 5: 400 ∗ tuple transfer cost = 4,000
➡ transfer ASG to site 5: 1000 ∗ tuple transfer cost = 10,000
➡ produce ASG': 1000 ∗ tuple access cost = 1,000
➡ join EMP and ASG': 400 ∗ 20 ∗ tuple access cost = 8,000
Total Cost 23,000
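The arithmetic above can be reproduced in a few lines. This is a minimal sketch, assuming the 20 Manager tuples are split evenly over the two ASG fragments (10 per site, which is what the "(10+10)" terms suggest); the function and constant names are illustrative only.

```python
TUPLE_ACCESS = 1     # cost units per tuple accessed
TUPLE_TRANSFER = 10  # cost units per tuple transferred

SIZE_EMP, SIZE_ASG, MANAGERS = 400, 1000, 20
PER_SITE = MANAGERS // 2  # assumption: Managers evenly split over ASG1 and ASG2


def cost_strategy_a():
    """Select at the ASG sites, join at the EMP sites, union at the result site."""
    produce_asg = (PER_SITE + PER_SITE) * TUPLE_ACCESS
    transfer_asg = (PER_SITE + PER_SITE) * TUPLE_TRANSFER
    produce_emp = (PER_SITE + PER_SITE) * TUPLE_ACCESS * 2
    transfer_emp = (PER_SITE + PER_SITE) * TUPLE_TRANSFER
    return produce_asg + transfer_asg + produce_emp + transfer_emp


def cost_strategy_b():
    """Ship both relations to the result site and do all the work there."""
    transfer_emp = SIZE_EMP * TUPLE_TRANSFER
    transfer_asg = SIZE_ASG * TUPLE_TRANSFER
    produce_asg = SIZE_ASG * TUPLE_ACCESS
    join = SIZE_EMP * MANAGERS * TUPLE_ACCESS
    return transfer_emp + transfer_asg + produce_asg + join


print(cost_strategy_a())  # 460
print(cost_strategy_b())  # 23000
```

With these weights the transfer terms dominate Strategy-B, which is why shipping whole relations is so much more expensive here.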
Objectives of Query Processing
• Minimize a cost function
I/O cost + CPU cost + Communication cost
These might have different weights in different distributed environments
• Wide area networks
➡ Communication cost may dominate or vary widely
✦ bandwidth
✦ speed
✦ high protocol overhead
• Local area networks
➡ communication cost not that dominant
➡ total cost function should be considered
• Can also maximize throughput


Characteristics of Query Processors

• Languages
• Types of Optimizers
• Optimization Timing
• Statistics
• Decision Sites
• Network Topology
• Replicated Fragments
• Use of Semijoins
Characteristics of Query Processors
Complexity of Relational Operations
• Assume
➡ relations of cardinality n
➡ sequential scan

Operation                                         Complexity
Select, Project (without duplicate elimination)   O(n)
Project (with duplicate elimination), Group       O(n ∗ log n)
Join, Semi-join, Division, Set Operators          O(n ∗ log n)
Cartesian Product                                 O(n²)

Characteristics of Query Processors
Types of Optimization
• Exhaustive search – all possible execution strategies are considered
➡ Cost-based

➡ Optimal

➡ Combinatorial complexity in the number of relations

• Heuristics
➡ Not optimal

➡ Regroup common sub-expressions

➡ Perform selection, projection first

➡ Replace a join by a series of semijoins

➡ Reorder operations to reduce intermediate relation size

➡ Optimize individual operations


Characteristics of Query Processors
Optimization Granularity
• Single query at a time

➡ Cannot use common intermediate results

• Multiple queries at a time

➡ Efficient if many similar queries

➡ Decision space is much larger


Characteristics of Query Processors
Optimization Timing
• Static
➡ Compilation → optimize prior to the execution
➡ Difficult to estimate the size of the intermediate results ⇒ error propagation
➡ Can amortize over many executions
• Dynamic
➡ Run time optimization
➡ Exact information on the intermediate relation sizes
➡ Have to reoptimize for multiple executions
➡ Example: Distributed INGRES
• Hybrid
➡ Compile using a static algorithm
➡ If the error in estimated sizes > threshold, reoptimize at run time
➡ Example: Mermaid

Characteristics of Query Processors
Statistics
• Relation
➡ Cardinality
➡ Size of a tuple
➡ Fraction of tuples participating in a join with another relation
• Attribute
➡ Cardinality of domain
➡ Actual number of distinct values
• Common assumptions
➡ Independence between different attribute values
➡ Uniform distribution of attribute values within their domain
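A small sketch of how the two common assumptions turn statistics into size estimates. Under a uniform distribution, an equality selection on an attribute keeps roughly 1 / (number of distinct values) of the tuples; under independence, the selectivities of a conjunction multiply. The distinct-value counts used below (50 and 4) are hypothetical and not taken from the slides.

```python
def selectivity_equals(distinct_values):
    """Selectivity of `A = constant`, assuming values are uniformly distributed."""
    if distinct_values <= 0:
        raise ValueError("distinct_values must be positive")
    return 1.0 / distinct_values


def estimate_cardinality(cardinality, selectivities):
    """Estimated result size of a conjunction, assuming independent predicates."""
    estimate = float(cardinality)
    for s in selectivities:
        estimate *= s
    return estimate


# Hypothetical statistics: 1000 tuples, 50 distinct RESP values, 4 distinct DUR values.
only_resp = estimate_cardinality(1000, [selectivity_equals(50)])
resp_and_dur = estimate_cardinality(1000, [selectivity_equals(50), selectivity_equals(4)])
print(only_resp, resp_and_dur)
```

When the real data is skewed or the attributes are correlated, these estimates can be far off, which is the error propagation risk of static optimization.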


Characteristics of Query Processors
Decision Sites
• Centralized
➡ Single site determines the “best” schedule
➡ Simple
➡ Need knowledge about the entire distributed database

• Distributed
➡ Cooperation among sites to determine the schedule
➡ Need only local information
➡ Cost of cooperation

• Hybrid
➡ One site determines the global schedule
➡ Each site optimizes the local subqueries


Characteristics of Query Processors
Network Topology
• Wide area networks (WAN) – point-to-point
➡ Characteristics
✦ Low bandwidth
✦ Low speed
✦ High protocol overhead
➡ Communication cost will dominate; ignore all other cost factors
➡ Global schedule to minimize communication cost
➡ Local schedules according to centralized query optimization

• Local area networks (LAN)

➡ Communication cost not that dominant
➡ Total cost function should be considered
➡ Broadcasting can be exploited (joins)
➡ Special algorithms exist for star networks


Layering scheme for Distributed Query Processing

Calculus Query on Distributed Relations
  ↓ Query Decomposition (uses GLOBAL SCHEMA), at the CONTROL SITE
Algebraic Query on Distributed Relations
  ↓ Data Localization (uses FRAGMENT SCHEMA), at the CONTROL SITE
Fragment Query
  ↓ Global Optimization (uses STATS ON FRAGMENTS), at the CONTROL SITE
Optimized Fragment Query with Communication Operations
  ↓ Local Optimization (uses LOCAL SCHEMAS), at the LOCAL SITES
Optimized Local Queries
Query Decomposition & Localization of Distributed Data

Query Decomposition
Input : Calculus query on global relations
• Normalization
➡ manipulate query quantifiers and qualification
• Semantic Analysis
➡ detect and reject “incorrect” queries
➡ possible for only a subset of relational calculus
• Simplification
➡ eliminate redundant predicates
• Restructuring
➡ calculus query → algebraic query
➡ more than one translation is possible
➡ use transformation rules


Normalization
• Lexical and syntactic analysis
➡ check validity (similar to compilers)

➡ check for attributes and relations

➡ type checking on the qualification

• Put into normal form

➡ Conjunctive normal form

(p11 ∨ p12 ∨ … ∨ p1n) ∧ … ∧ (pm1 ∨ pm2 ∨ … ∨ pmn)

➡ Disjunctive normal form

(p11 ∧ p12 ∧ … ∧ p1n) ∨ … ∨ (pm1 ∧ pm2 ∧ … ∧ pmn)

➡ OR's mapped into union

➡ AND's mapped into join or selection
Semantic Analysis
• Detect and reject incorrect queries
• Type incorrect
➡ If any of its attribute or relation names are not defined in the global schema
➡ If operations are applied to attributes of the wrong type
• Semantically incorrect
➡ Components do not contribute in any way to the generation of the result
➡ Only a subset of relational calculus queries can be tested for correctness
➡ Those that do not contain disjunction and negation
➡ To detect
✦ connection graph (query graph)
✦ join graph


Semantic Analysis – Example
SELECT ENAME,RESP
FROM EMP, ASG, PROJ
WHERE EMP.ENO = ASG.ENO
AND ASG.PNO = PROJ.PNO
AND PNAME = "CAD/CAM"
AND DUR ≥ 36
AND TITLE = "Programmer"

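Query graph
➡ nodes: EMP, ASG, PROJ, RESULT
➡ join edges: EMP-ASG (EMP.ENO=ASG.ENO), ASG-PROJ (ASG.PNO=PROJ.PNO)
➡ projection edges: EMP-RESULT (ENAME), ASG-RESULT (RESP)
➡ selections (self-loops): ASG (DUR≥36), EMP (TITLE="Programmer"), PROJ (PNAME="CAD/CAM")

Join graph
➡ nodes: EMP, ASG, PROJ
➡ edges: EMP-ASG (EMP.ENO=ASG.ENO), ASG-PROJ (ASG.PNO=PROJ.PNO)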

Semantic Analysis
If the query graph is not connected, the query may be wrong or
use Cartesian product
SELECT ENAME,RESP
FROM EMP, ASG, PROJ
WHERE EMP.ENO = ASG.ENO
AND PNAME = "CAD/CAM"
AND DUR > 36
AND TITLE = "Programmer"

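Query graph (not connected)
➡ join edge: EMP-ASG (EMP.ENO=ASG.ENO)
➡ projection edges: EMP-RESULT (ENAME), ASG-RESULT (RESP)
➡ PROJ carries only the selection PNAME="CAD/CAM"; no edge links it to the other nodes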
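The connectivity test on the query graph can be sketched as a plain graph traversal. The node and edge lists below encode the two example queries (with and without the ASG.PNO = PROJ.PNO predicate); the function name is illustrative, and selection self-loops are left out because they do not affect connectivity.

```python
from collections import defaultdict


def is_connected(nodes, edges):
    """Return True if every node is reachable from the first one (undirected graph)."""
    nodes = list(nodes)
    if not nodes:
        return True
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        current = stack.pop()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == set(nodes)


NODES = ["EMP", "ASG", "PROJ", "RESULT"]
# Join edges plus projection edges to RESULT.
EDGES_OK = [("EMP", "ASG"), ("ASG", "PROJ"), ("EMP", "RESULT"), ("ASG", "RESULT")]
# Same query without the predicate ASG.PNO = PROJ.PNO.
EDGES_MISSING_JOIN = [("EMP", "ASG"), ("EMP", "RESULT"), ("ASG", "RESULT")]

print(is_connected(NODES, EDGES_OK))            # True
print(is_connected(NODES, EDGES_MISSING_JOIN))  # False: PROJ is isolated
```

A disconnected graph does not prove the query is wrong; it flags a missing join predicate or an intended Cartesian product.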

Simplification
• Why simplify?

➡ Remember the example

• How? Use transformation rules

➡ Elimination of redundancy
✦ idempotency rules
p1 ∧ ¬(p1) ⇔ false
p1 ∧ (p1 ∨ p2) ⇔ p1
p1 ∨ false ⇔ p1
…
➡ Application of transitivity
➡ Use of integrity rules


Simplification – Example
SELECT TITLE
FROM EMP
WHERE EMP.ENAME = "J. Doe"
OR (NOT(EMP.TITLE = "Programmer")
AND (EMP.TITLE = "Programmer"
OR EMP.TITLE = "Elect. Eng.")
AND NOT(EMP.TITLE = "Elect. Eng."))

⇓
SELECT TITLE
FROM EMP
WHERE EMP.ENAME = "J. Doe"
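One way to convince oneself that the simplification is valid is a brute-force truth table. In the sketch below, p1, p2 and p3 are stand-ins (names chosen here, not in the slides) for ENAME = "J. Doe", TITLE = "Programmer" and TITLE = "Elect. Eng."; the check treats them as independent Boolean variables, which is stronger than needed.

```python
from itertools import product


def original(p1, p2, p3):
    """p1 OR (NOT p2 AND (p2 OR p3) AND NOT p3), the qualification before simplification."""
    return p1 or ((not p2) and (p2 or p3) and (not p3))


def simplified(p1, p2, p3):
    """The qualification after simplification: just p1."""
    return p1


equivalent = all(
    original(*values) == simplified(*values)
    for values in product([False, True], repeat=3)
)
print(equivalent)  # True
```

The second disjunct is always false: distributing NOT p2 and NOT p3 over (p2 OR p3) leaves only terms of the form p AND NOT p.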


Restructuring
• Convert relational calculus to relational algebra
• Make use of query trees
• Example
Find the names of employees other than J. Doe who worked on the CAD/CAM project for either 1 or 2 years.
SELECT ENAME
FROM EMP, ASG, PROJ
WHERE EMP.ENO = ASG.ENO
AND ASG.PNO = PROJ.PNO
AND ENAME ≠ "J. Doe"
AND PNAME = "CAD/CAM"
AND (DUR = 12 OR DUR = 24)

Query tree (root first)
ΠENAME                      Project
  σDUR=12 ∨ DUR=24          Select
    σPNAME="CAD/CAM"
      σENAME≠"J. Doe"
        ⋈PNO                Join
          PROJ
          ⋈ENO
            ASG
            EMP
Equivalent Query
ΠENAME
  σPNAME="CAD/CAM" ∧ (DUR=12 ∨ DUR=24) ∧ ENAME≠"J. Doe"
    ⋈PNO,ENO
      × (EMP, PROJ)
      ASG
Data Localization
Input: Algebraic query on distributed relations
• Determine which fragments are involved
• Localization program
➡ substitute for each global query its materialization program
➡ optimize

Example
Assume
➡ EMP is fragmented into EMP1, EMP2, EMP3 as follows:
✦ EMP1 = σENO≤"E3"(EMP)
✦ EMP2 = σ"E3"<ENO≤"E6"(EMP)
✦ EMP3 = σENO>"E6"(EMP)
➡ ASG fragmented into ASG1 and ASG2 as follows:
✦ ASG1 = σENO≤"E3"(ASG)
✦ ASG2 = σENO>"E3"(ASG)

Replace EMP by (EMP1 ∪ EMP2 ∪ EMP3) and ASG by (ASG1 ∪ ASG2) in any query

Localized query tree (root first)
ΠENAME
  σDUR=12 ∨ DUR=24
    σPNAME="CAD/CAM"
      σENAME≠"J. Doe"
        ⋈PNO
          PROJ
          ⋈ENO
            ∪ (EMP1, EMP2, EMP3)
            ∪ (ASG1, ASG2)
Provides Parallelism

(EMP1 ⋈ENO ASG1) ∪ (EMP2 ⋈ENO ASG2) ∪ (EMP3 ⋈ENO ASG1) ∪ (EMP3 ⋈ENO ASG2)

Eliminates Unnecessary Work

(EMP1 ⋈ENO ASG1) ∪ (EMP2 ⋈ENO ASG2) ∪ (EMP3 ⋈ENO ASG2)

Reduction for PHF
• Reduction with selection
➡ Relation R and FR = {R1, R2, …, Rw} where Rj = σpj(R)

σpi(Rj) = ∅ if ∀x in R: ¬(pi(x) ∧ pj(x))

➡ Example
SELECT *
FROM EMP
WHERE ENO="E5"

σENO="E5"(EMP1 ∪ EMP2 ∪ EMP3) ⇒ σENO="E5"(EMP2)
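The selection rule can be sketched by modelling each fragment predicate as a range and keeping only the fragments whose range can contain the selected value. The encoding is an assumption made for illustration: employee numbers are treated as integers (E5 becomes 5), each fragment is a half-open range low < ENO ≤ high, and EMP3 is taken as ENO > "E6" so that the fragments are disjoint.

```python
import math

# Hypothetical encoding: fragment name -> (low, high), meaning low < ENO <= high.
EMP_FRAGMENTS = {
    "EMP1": (-math.inf, 3),
    "EMP2": (3, 6),
    "EMP3": (6, math.inf),
}


def relevant_fragments(eno_value, fragments):
    """Keep the fragments whose predicate does not contradict ENO = eno_value."""
    return [
        name
        for name, (low, high) in fragments.items()
        if low < eno_value <= high
    ]


print(relevant_fragments(5, EMP_FRAGMENTS))  # ['EMP2']
```

Every fragment that is dropped corresponds to a selection that is provably empty, so it need not be accessed at all.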

• Reduction with join
➡ Possible if fragmentation is done on join attribute

➡ Distribute join over union

(R1 ∪ R2) ⋈ S ⇔ (R1 ⋈ S) ∪ (R2 ⋈ S)

➡ Given Ri = σpi(R) and Rj = σpj(R)

Ri ⋈ Rj = ∅ if ∀x in Ri, ∀y in Rj: ¬(pi(x) ∧ pj(y))

Reduction for PHF
• Assume EMP is fragmented as before and
➡ ASG1: σENO≤"E3"(ASG)
➡ ASG2: σENO>"E3"(ASG)
• Consider the query
SELECT *
FROM EMP, ASG
WHERE EMP.ENO=ASG.ENO
• Distribute join over unions
• Apply the reduction rule

Generic query: (EMP1 ∪ EMP2 ∪ EMP3) ⋈ENO (ASG1 ∪ ASG2)
Reduced query: (EMP1 ⋈ENO ASG1) ∪ (EMP2 ⋈ENO ASG2) ∪ (EMP3 ⋈ENO ASG2)
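The join reduction can be sketched the same way: after distributing the join over the unions, a pair of fragments is kept only if their predicates on the join attribute can both hold for some value. As an assumption for illustration, fragment predicates are encoded as half-open integer ranges low < ENO ≤ high, with EMP3 taken as ENO > "E6".

```python
import math

# Hypothetical encoding: (low, high) means low < ENO <= high.
EMP_FRAGMENTS = {"EMP1": (-math.inf, 3), "EMP2": (3, 6), "EMP3": (6, math.inf)}
ASG_FRAGMENTS = {"ASG1": (-math.inf, 3), "ASG2": (3, math.inf)}


def ranges_overlap(a, b):
    """Two half-open ranges (low, high] share a value iff max(lows) < min(highs)."""
    return max(a[0], b[0]) < min(a[1], b[1])


def non_empty_joins(left, right):
    """Fragment pairs whose join on the fragmentation attribute may be non-empty."""
    return [
        (l_name, r_name)
        for l_name, l_range in left.items()
        for r_name, r_range in right.items()
        if ranges_overlap(l_range, r_range)
    ]


print(non_empty_joins(EMP_FRAGMENTS, ASG_FRAGMENTS))
# [('EMP1', 'ASG1'), ('EMP2', 'ASG2'), ('EMP3', 'ASG2')]
```

Three of the six possible fragment joins survive, and they can be executed in parallel at different sites.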

Reduction for VF
• Find useless (not empty) intermediate relations

Relation R defined over attributes A = {A1, ..., An} vertically fragmented as Ri = ΠA'(R) where A' ⊆ A:
ΠD,K(Ri) is useless if the set of projection attributes D is not in A'
Example: EMP1 = ΠENO,ENAME(EMP); EMP2 = ΠENO,TITLE(EMP)

SELECT ENAME
FROM EMP

ΠENAME(EMP1 ⋈ENO EMP2) ⇒ ΠENAME(EMP1)
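A sketch of the vertical-fragmentation rule, under one reading of it: a fragment is needed only if it holds at least one requested non-key attribute, and when only key attributes are requested any single fragment suffices. The dictionary layout and function name are illustrative assumptions.

```python
KEY = {"ENO"}
# Attribute sets of the vertical fragments from the example.
EMP_FRAGMENTS = {
    "EMP1": {"ENO", "ENAME"},
    "EMP2": {"ENO", "TITLE"},
}


def needed_fragments(projection, fragments=EMP_FRAGMENTS, key=KEY):
    """Return the fragments that contribute a requested non-key attribute."""
    wanted = set(projection) - key
    if not wanted:
        # Only key attributes requested: every fragment holds the key, pick one.
        return [next(iter(fragments))]
    return [name for name, attrs in fragments.items() if wanted & attrs]


print(needed_fragments({"ENAME"}))           # ['EMP1']
print(needed_fragments({"ENAME", "TITLE"}))  # ['EMP1', 'EMP2']
```

For the example query only EMP1 remains, so the join with EMP2 disappears from the plan.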
Reduction for DHF
• Rule :
➡ Distribute joins over unions
➡ Apply the join reduction for horizontal fragmentation
• Example
ASG1: ASG ⋉ENO EMP1
ASG2: ASG ⋉ENO EMP2
EMP1: σTITLE="Programmer"(EMP)
EMP2: σTITLE≠"Programmer"(EMP)
• Query
SELECT *
FROM EMP, ASG
WHERE ASG.ENO = EMP.ENO
AND EMP.TITLE = "Mech. Eng."
Reduction for DHF
Generic query
(ASG1 ∪ ASG2) ⋈ENO σTITLE="Mech. Eng."(EMP1 ∪ EMP2)

Selections first
(ASG1 ∪ ASG2) ⋈ENO σTITLE="Mech. Eng."(EMP2)

Joins distributed over unions
(ASG1 ⋈ENO σTITLE="Mech. Eng."(EMP2)) ∪ (ASG2 ⋈ENO σTITLE="Mech. Eng."(EMP2))

Elimination of the empty intermediate relations (left sub-tree)
ASG2 ⋈ENO σTITLE="Mech. Eng."(EMP2)
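The steps above can be sketched for an equality selection on TITLE. The encoding is an assumption for illustration: each EMP fragment predicate is a Python function of the title, and each ASG fragment records the EMP fragment it was derived from through the semijoin, so ASGi can only match tuples of EMPi.

```python
# Fragment predicates on TITLE for the primary fragments of EMP.
EMP_FRAGMENTS = {
    "EMP1": lambda title: title == "Programmer",
    "EMP2": lambda title: title != "Programmer",
}
# Derived horizontal fragmentation: ASGi = ASG semijoin EMPi.
DERIVED_FROM = {"ASG1": "EMP1", "ASG2": "EMP2"}


def reduce_dhf(title):
    """Fragment joins that can be non-empty for the selection TITLE = title."""
    # Step 1: push the selection down and drop contradicting EMP fragments.
    surviving_emp = [name for name, pred in EMP_FRAGMENTS.items() if pred(title)]
    # Step 2: distribute the join over the unions and keep only ASGi with its own EMPi.
    return [
        (asg, emp)
        for asg, owner in DERIVED_FROM.items()
        for emp in surviving_emp
        if owner == emp
    ]


print(reduce_dhf("Mech. Eng."))  # [('ASG2', 'EMP2')]
print(reduce_dhf("Programmer"))  # [('ASG1', 'EMP1')]
```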
Reduction for Hybrid Fragmentation
• Combine the rules already specified:
➡ Remove empty relations generated by contradicting selections on horizontal
fragments;

➡ Remove useless relations generated by projections on vertical fragments;

➡ Distribute joins over unions in order to isolate and remove useless joins.

Reduction for Hybrid Fragmentation
Example
Consider the following hybrid fragmentation:
EMP1 = σENO≤"E4"(ΠENO,ENAME(EMP))
EMP2 = σENO>"E4"(ΠENO,ENAME(EMP))
EMP3 = ΠENO,TITLE(EMP)

and the query
SELECT ENAME
FROM EMP
WHERE ENO="E5"

Generic query: ΠENAME(σENO="E5"((EMP1 ∪ EMP2) ⋈ENO EMP3))
Reduced query: ΠENAME(σENO="E5"(EMP2))
