FRAGMENTATION:
Fragmentation is a central concept in distributed databases. It is the process of dividing a table into smaller pieces, called
fragments. We fragment a table horizontally, vertically, or both, and distribute the fragments to different sites (servers at
different geographical locations). From a correct fragmentation we expect the following outcomes:
- We should not lose data because of fragmentation.
- We should not get redundant data because of fragmentation.
To ensure these properties, we need to verify whether the fragmentation was performed correctly. For this verification we use
the correctness rules, which are as follows:
1. Completeness - This rule ensures that no data is lost due to fragmentation. It checks that every record of the original table
(before fragmentation) is found in at least one of the fragments after fragmentation.
2. Reconstruction - This rule ensures that the original table can be rebuilt from the fragments, using union for horizontal
fragments and join for vertical fragments. Being able to rebuild the table also means that the dependencies defined on the data
are preserved.
3. Disjointness - This rule ensures that no record becomes part of two or more fragments during fragmentation. In vertical
fragmentation the primary key is repeated in every fragment so that the table can be rejoined, so disjointness applies only to
the non-key attributes.
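As an illustration, the three rules can be checked mechanically for a horizontal fragmentation. The following is a minimal Python sketch, not a standard API: the function names and the sample rows are hypothetical, a table is assumed to be a list of tuples, and reconstruction is assumed to be done by union.

```python
def is_complete(table, fragments):
    """Completeness: every original row appears in at least one fragment."""
    return set(table) <= set().union(*fragments)


def is_reconstructible(table, fragments):
    """Reconstruction: the union of the fragments is exactly the original table."""
    return set().union(*fragments) == set(table)


def is_disjoint(fragments):
    """Disjointness: no row appears in more than one fragment."""
    seen = set()
    for fragment in fragments:
        rows = set(fragment)
        if seen & rows:
            return False
        seen |= rows
    return True


# Hypothetical sample data: (id, region) rows.
table = [(1, "north"), (2, "south"), (3, "north")]
good = [[(1, "north"), (3, "north")], [(2, "south")]]
lossy = [[(1, "north")], [(2, "south")]]              # row 3 is missing
overlapping = [[(1, "north"), (3, "north")],
               [(2, "south"), (3, "north")]]          # row 3 is stored twice
```

Here `good` passes all three checks, `lossy` fails completeness and reconstruction, and `overlapping` fails only disjointness.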
HORIZONTAL FRAGMENTATION:
Horizontal fragmentation divides a table (relation) into subsets of rows. Each fragment holds the rows that satisfy a given
condition and keeps all the columns of the original table.
Here's an example of horizontal fragmentation for a simplified “Employee” table:
Original Employee Table:
EmployeeID  FirstName  LastName  Department  Salary
101         John       Smith     HR          50000
102         Jane       Doe       IT          60000
103         Mike       Johnson   Sales       55000
104         Emily      Wilson    IT          62000
105         David      Brown     Sales       58000
Let's say we want to horizontally fragment this "Employee" table based on the "Department" attribute. We want to create three
horizontal fragments: one for each department (HR, IT, and Sales).
Horizontal Fragmentation by Department:
HR_Employee Table:
EmployeeID  FirstName  LastName  Department  Salary
101         John       Smith     HR          50000
IT_Employee Table:
EmployeeID  FirstName  LastName  Department  Salary
102         Jane       Doe       IT          60000
104         Emily      Wilson    IT          62000
Sales_Employee Table:
EmployeeID  FirstName  LastName  Department  Salary
103         Mike       Johnson   Sales       55000
105         David      Brown     Sales       58000
In this example:
- The original "Employee" table has been horizontally fragmented into three smaller tables: "HR_Employee," "IT_Employee,"
and "Sales_Employee."
- Each fragment contains only the rows where the "Department" matches the specified condition (HR, IT, or Sales).
- This horizontal fragmentation allows for efficient data access and retrieval based on the department, as queries related to a
specific department can be directed to the appropriate fragment.
Horizontal fragmentation is useful for improving data access and managing data distribution in distributed database systems,
particularly when data access patterns are influenced by specific attributes or conditions.
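The department split above can be sketched in a few lines of Python. This is only an illustration: the helper name `fragment_horizontally` is hypothetical, and rows are assumed to be tuples in the column order of the Employee table.

```python
from collections import defaultdict

COLUMNS = ("EmployeeID", "FirstName", "LastName", "Department", "Salary")

EMPLOYEE = [
    (101, "John", "Smith", "HR", 50000),
    (102, "Jane", "Doe", "IT", 60000),
    (103, "Mike", "Johnson", "Sales", 55000),
    (104, "Emily", "Wilson", "IT", 62000),
    (105, "David", "Brown", "Sales", 58000),
]


def fragment_horizontally(rows, columns, attribute):
    """Group whole rows by the value of one attribute (one fragment per value)."""
    position = columns.index(attribute)
    fragments = defaultdict(list)
    for row in rows:
        fragments[row[position]].append(row)
    return dict(fragments)


fragments = fragment_horizontally(EMPLOYEE, COLUMNS, "Department")

# A department-specific query only needs to touch one fragment.
it_salaries = [row[COLUMNS.index("Salary")] for row in fragments["IT"]]

# The union of all fragments gives back the original table.
reconstructed = sorted(row for rows in fragments.values() for row in rows)
```

Each key of `fragments` corresponds to one of the tables HR_Employee, IT_Employee, and Sales_Employee, which in a real system would be stored at different sites.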
VERTICAL FRAGMENTATION is a data distribution strategy used in distributed database systems where a single table is
divided into smaller fragments by columns (attributes). Each fragment contains a subset of the columns of the original table
together with the primary key, so that the original rows can be rebuilt by joining the fragments.
Here's an example of vertical fragmentation:
Consider a university database with a "Student" table that contains various attributes, including:
- StudentID (unique identifier)
- FirstName
- LastName
- DateOfBirth
- Department
In a distributed database scenario, you might want to use vertical fragmentation to separate sensitive student information from
general student data. You could create two vertical fragments:
**Fragment 1 (Sensitive Data):**
- StudentID (unique identifier)
- FirstName
- LastName
- DateOfBirth
**Fragment 2 (Non-Sensitive Data):**
- StudentID (unique identifier)
- Department
In this example:
- Fragment 1 contains sensitive personal information like names and birthdates.
- Fragment 2 contains non-sensitive academic information, such as the department.
- Both fragments keep StudentID, so the original table can be rebuilt by joining them on that key.
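The Student split above can be sketched in Python as a projection per fragment followed by a join on the key. The student rows and the helper names `project` and `join_on_key` are hypothetical; the source gives only the column names.

```python
STUDENTS = [  # hypothetical rows, for illustration only
    {"StudentID": 1, "FirstName": "Asha", "LastName": "Verma",
     "DateOfBirth": "2003-04-12", "Department": "Physics"},
    {"StudentID": 2, "FirstName": "Ravi", "LastName": "Singh",
     "DateOfBirth": "2002-11-30", "Department": "History"},
]


def project(rows, key, attributes):
    """Keep the key plus the chosen attributes of every row."""
    return [{name: row[name] for name in (key, *attributes)} for row in rows]


def join_on_key(left, right, key):
    """Rebuild full rows by matching the two fragments on the shared key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


# Fragment 1 (sensitive) and Fragment 2 (non-sensitive) both carry StudentID.
sensitive = project(STUDENTS, "StudentID", ["FirstName", "LastName", "DateOfBirth"])
general = project(STUDENTS, "StudentID", ["Department"])

# Joining the fragments on StudentID reconstructs the original table.
rebuilt = join_on_key(sensitive, general, "StudentID")
```

Repeating StudentID in both fragments is what makes the join possible; without it the reconstruction rule could not be satisfied.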
HYBRID FRAGMENTATION is a data distribution strategy in distributed database systems that combines elements of both
horizontal and vertical fragmentation.
Hybrid fragmentation can be done in two alternative ways:
- First generate a set of horizontal fragments, then generate vertical fragments from one or more of the horizontal
fragments.
- First generate a set of vertical fragments, then generate horizontal fragments from one or more of the vertical
fragments.
Consider a distributed e-commerce database that stores information about products and customers. The database has a
"Product" table and a "Customer" table.
Product Table (columns):
ProductID, ProductName, Category, Price, SupplierID
Customer Table (columns):
CustomerID, FirstName, LastName, Email, Address
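In this scenario, you could fragment each table according to its access requirements. Note that a fragment is hybrid only when
both techniques are applied to the same table, for example when one of the customer fragments listed here is further split by
columns: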
Horizontal Fragmentation for Customers (By Location):
Fragment 1: Customers located in PUNJAB
Fragment 2: Customers located in UP
Fragment 3: Customers located in DELHI
Vertical Fragmentation for Products (By Attribute Group):
Fragment 1: ProductID, ProductName, Category
Fragment 2: ProductID, Price, SupplierID
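The first of the two orderings described earlier (horizontal first, then vertical) can be sketched on the Customer table. This is an illustration under stated assumptions: the customer rows are hypothetical, the state is assumed to be the text after the last comma in Address, and the column grouping (names in one fragment, contact details in the other) is chosen only for the example.

```python
CUSTOMERS = [  # hypothetical rows, for illustration only
    {"CustomerID": 1, "FirstName": "Aman", "LastName": "Gill",
     "Email": "aman@example.com", "Address": "12 Mall Road, PUNJAB"},
    {"CustomerID": 2, "FirstName": "Neha", "LastName": "Rao",
     "Email": "neha@example.com", "Address": "4 Park Lane, DELHI"},
    {"CustomerID": 3, "FirstName": "Vikram", "LastName": "Yadav",
     "Email": "vikram@example.com", "Address": "9 Civil Lines, UP"},
]


def state_of(customer):
    """Assumption: the state is the text after the last comma in Address."""
    return customer["Address"].rsplit(",", 1)[-1].strip()


def horizontal(rows, classify):
    """Step 1: split rows into fragments by the value classify() returns."""
    fragments = {}
    for row in rows:
        fragments.setdefault(classify(row), []).append(row)
    return fragments


def vertical(rows, key, groups):
    """Step 2: split columns into groups, repeating the key in each group."""
    return [[{name: row[name] for name in (key, *group)} for row in rows]
            for group in groups]


by_state = horizontal(CUSTOMERS, state_of)
hybrid = {
    state: vertical(rows, "CustomerID", [("FirstName", "LastName"), ("Email", "Address")])
    for state, rows in by_state.items()
}
```

Each entry of `hybrid` is one state's horizontal fragment, itself split into two vertical fragments that share CustomerID. Swapping the two steps gives the second ordering.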