Distributed Database Management Systems
Derived Horizontal Fragmentation(DHF)
Defined on a member relation of a link according to a selection operation specified on its owner
Two important points:
Each link is an equi-join. Equijoin can be implemented by means of semi-joins
So we are interested in defining the partitions of member based on fragmentation of its owner, but want to see attributes only from member, so
Ri = R Si, 1 i w
where w is the maximum number of fragments that will be defined on R and Si = Fi (S), where Fi is formula for PHF on S
DHF Example
PAY
title, sal
L1
EMP
eNo, Name, titke
jNo, jName, budget, loc
PROJ
ASIGN
eNo, jNo, resp, dur
Considering the link L1 above: owner (L1) = PAY member (L1) = EMP We want to group employees on the basis of their salaries one with salary less than or equal to 30,000/and other more than that
eNo E1 E2 E3 E4 E5 E6 E7
eName T Khan W Shah R Dar K Butt F Sahbai A Haq S Farhana
title Elec Eng Sys Ana Mech Eng Programme Sys Ana Elec Eng Mech Eng
E8
M Daud
Sys Ana
10
Title Elect. Eng Sys Analyst Mech. Eng Programmer
Sal 40000 34000 27000 24000
11
eNo
E1 E2 E5 E6 E8
eName
T Khan W Shah F Sahbai A Haq M Daud
title
Elec Eng Sys Ana Sys Ana Elec Eng Sys Ana
E3 E4
R Dar K Butt
Mech Eng Programme
E7
S Farhana
Mech Eng
PAY1 = sal 30000 (PAY) PAY2 = sal > 30000 (PAY) EMP1 = EMP PAY1 EMP2 = EMP PAY2
DHF
The inputs required for DHF
The set of partitions for owner Member relation Semi-join predicates between owner and member
14
Care in case of multiple owners, like ASIGN Fragmentation selection depends:
1- One with better Join Characteristics 2- One used in more applications
DHA
Second one is straight forward, we should try to facilitate heavy users; the first one needs more considerations
15
DHF
For the first point;
Join is performed on smaller relations, that increases efficiency The join can be performed in parallel in case of simple graphs, that improves efficiency as well; simple graph means
PAY1 PAY2
EMP1
EMP2
16
demonstrates two things
DHF
Checking for Correctness
1-Derived fragmentation may follow a chain, like PAY-EMP-ASIGN 2-Typically, more than one fragmentation options are there, which one adopted is an allocation problem discussed later
Completeness: for PHF depends on Pr, and in DHF, completeness of owner Pr, and the referential integrity constraint Reconstruction: Involves Union in both cases Disjointness: Simple in PHF if the pi in Pr are mutually exclusive; in DHF, guaranteed in case of simple join graph, however in case of partitioned join graph it is hard to establish
17
Checking for Correctness
Completeness: for PHF depends on Pr, and in DHF, completeness of owner Pr, and the referential integrity constraint Let R be member S be owner Fs = { S1,S2,Sn} A the common attribute t[A] = t [A]
18
Reconstruction:
Involves Union in both cases FR = {R1,R2,.Rn} R = U Ri Ri FR
19
Disjoint ness:
Simple in PHF if the pi in Pr are mutually exclusive; in DHF, guaranteed in case of simple join graph, however in case of partitioned join graph it is hard to establish
20
Vertical Fragmentation (VF)
Vertical subset of relation A VF of a relation produces fragments R1, R2, . Rn, each of which contains subset of attributes of R and PK of R. Objective is to produce smaller relations, so that most of the applications run on smaller relations; so they become fast.
21
Vertical Fragment
Vertical fragmentation is more complicated, since more alternatives exist. VF is mainly based on heuristics
22
Example of VF
CUST
A/C# Na me AB10 Sae 1 ed AB20 Lae 2 eq AB20 Sal 3 ma AB10 Sha 9 an Bal Bran ch MTN
4535
Delta = A/C#, Name, Branch (CUST) Beta = A/C#, Bal (CUST)
Delta
A/C# Na me AB10 Sae 1 ed AB20 Lae 2 eq Bran ch MTN LHR
45632 LHR .34 67839 LHR .87 45.32 MTN
23
Two Alternatives of VF
Grouping: Starting with single attribute VFs and then combining different attributes
24
Two Alternatives of VF
Splitting: Starting from the whole relation and then breaking it down analyzing the nature of applications Later suits better to DDB environment; results non-overlapping fragments; so discussed here
25
Thanks