Unit 8 – Balanced Trees
Structure:
8.0 Introduction
8.1 Unit Objectives
8.2 Basic Terminology
8.3 AVL Trees
8.4 Weight Balanced Trees
8.5 Summary
8.6 Key Terms
8.7 Check Your Progress
8.0 Introduction
In the previous units, we have already discussed the binary trees and their terminologies in detail. A binary tree is a
special type of tree, which can be either empty or has a finite set of nodes, such that one of the nodes are designated as the
root node and the remaining nodes are partitioned into two subtrees of root node known as left subtree and right subtree.
These left and right subtrees should not be empty and should be binary trees. Binary trees can be represented in the
computer’s memory using an array or a linked list. Binary search trees are a kind of binary tree having values in the left
subtree of a root node smaller than the value of the root node, and values in the right subtree greater than or equal to the
value of the root node. Now, let’s learn about balanced binary trees. Balanced trees are a versatile set of data structures in
which every leaf is “at a certain distance” from the root than any other leaf. As we already know that the maximum number of
nodes possible from the root node to a leaf node is termed as the height of a tree. So, a binary tree is said to be balanced if
the height of the tree is O(log n), where n is the number of nodes of the tree. This unit describes the basic terminology of
balanced trees and also discusses different types of balanced trees in detail.
8.1 Unit Objectives: After going through this unit, the reader will be able to:
Understand the fundamentals of Balanced binary search trees.
Explain the basic terminology of Height balanced trees specifically AVL Trees.
Learn about the properties of Weight-balanced Trees.
8.2 Basic Terminology:
Balancing a binary search tree is beneficial as a balanced tree provides O(log n) time for all the operations like
searching, inserting, and deleting. An unbalanced tree consumes more running time i.e. O(n) as the shape becomes
distorted. It means that if one branch of the tree is much longer than the other, the operations take more running time. Fig.
8.1 represents balanced and unbalanced binary trees. The height of a binary tree is an important parameter in relation to the
efficient operations on the tree. Searching, inserting, and deleting operations in a binary search tree are all O(Height).
Generally, the height (H) and the number of nodes (n) are related to H=(log n). So, the effective operations become
O(H)=O(log n).
Figure 8.1 (a) Balanced Tree (b) Unbalanced Tree
Balancing of the binary trees can be achieved through different approaches. Either height or weight of the tree is
balanced such that the average running time of the operations is maintained to O(log n). The balancing approach can be
partial or complete as per requirement. Though it is difficult to attain a perfectly balanced binary tree. One of the most
commonly used balancing approaches is self-balancing trees. Such trees maintain balance automatically by keeping the
height as small as possible during the insertion and deletion operation on the binary tree. The self-balancing binary search
trees perform rotations
to maintain the balance in the tree, even after the insert and delete operations. Two types of rotations are possible in a
binary search tree without violating its in-order traversal property. Consider figure 8.1 (a) and (b) representing the left and
right rotation of a binary tree.
a) Left Rotation: According to figure 8.2 (a), during the left rotation about node X, the new root of the subtree is now
node Y. Node X becomes the left child of node Y while subtree B is now the right child of node X.
b) Right Rotation: In figure 8.2 (b), during the right rotation about node Y, the new root of the subtree is now node X.
Node Y becomes the right child of node X while subtree B is now the left child of node Y.
Figure 8.2 (a) Left Rotation (b) Right Rotation of a binary search tree
(Source- https://towardsdatascience.com/self-balancing-binary-search-trees-101-fc4f51199e1d)
The balanced binary search trees are classified into two groups, i.e. Height balanced trees and Weight balanced
trees. In height-balanced trees, the height of the siblings of a node is “approximately the same”. In weight balanced trees, the
number of descendants of sibling nodes is “approximately the same”. Different types of height-balanced binary search trees
are there. Some of them like AVL trees, Red-black Trees, Splay Trees will be discussed in this course.
8.3 AVL Trees:
AVL Trees are one of the most commonly known self-balancing binary search trees. These trees were first
introduced by two mathematicians G.M. Adelson-Velsky and E.M. Landis in 1962. AVL tree is named so in honor of its
inventors. Being a self-balanced binary search tree, an AVL tree takes O(log n) time to perform the search, insert, and delete
operations. That means the height of the AVL tree is limited to O(log n). The key property for AVL trees is that the heights of
the two subtrees of one node may differ by at most one. Due to this, the AVL tree is also termed a height-balanced tree.
There is a little difference between the structure of an AVL tree and a simple binary search tree. In an AVL tree, the
additional variable “balance factor” is associated with each node. The balance factor of a node is the difference between the
height of the right subtree and the height of the left subtree.
Balance factor = Height (left subtree) – Height (right subtree)
In a height-balanced tree, every node has a balance factor of -1, 0, or 1. Any other value of the balance factor makes
the binary search tree unbalanced. Some important key points about balance factor are:
If a node has a balance factor of 1, it means that the right subtree of the node is one level lower than the left subtree.
Such a tree is called a lefty-heavy tree as shown in figure 8.3 (a).
If a node has a balance factor of 0, it means that the height of the left subtree is equal to the height of the right
subtree.
If a node has a balance factor -1, it means that the right subtree of the node is one level higher than the left subtree.
Such a tree is called a right-heavy tree as shown in figure 8.3 (b).
In figure 8.3, it should be noted that the balance factor of nodes 18, 39, 54, 63, and 72 is 0; the balance factor for nodes 27,
36, and 45 is 1.
Figure 8.3 (a) Left-heavy AVL Tree (b) Right-heavy AVL Tree (c) Balanced Tree
(Source- Data Structures using C, Reema Thareja, Oxford University Press, 2nd Edition, Chapter- 10, Page No.- 317)
The insertion and deletion operations on the AVL tree may imbalance the tree, resulting in disturbing the balance
factor of the nodes. In such cases, the tree is rebalanced using rotation operation at the critical node.
Searching for a Node in an AVL Tree: The search operation is the same for both
AVL trees and binary search trees. The structure of the tree does not modify, so no special provisions are required.
According to the property, the running time taken by the search operation to be completed is O(log n).
Inserting a New Node in an AVL Tree: The insert operation in an AVL tree is similar to the one in binary search trees. In an
AVL tree, the new node is always inserted as the leaf node and the insertion step is generally followed by an additional step,
i.e. rotation. Rotation is helpful in rebalancing the tree. Though, if the balance factor does not get disturbed due to insert
operation, i.e. it is still -1, 0, or 1, then there is no need for the rotation. In AVL trees, the new node is always inserted as a
leaf node, so the balance factor will always be 0. Change in balance factor can be observed only for those nodes that are in
the path of the root node and newly inserted node. Some possible changes in the path for any node are discussed below:
Initially, the node of an AVL tree was either left-heavy or right-heavy. After the insertion of a new node, it becomes
balanced.
The node that was balanced initially, becomes either left-heavy or right-heavy after inserting a new node.
The node that was either left-heavy or right-heavy initially, becomes unbalanced due to a new node insertion. Such a
node is termed a critical node. The nearest ancestor node on the path between the inserted node and the root with
balance factor neither -1, 0, nor 1 is the critical node.
Four types of rotations are generally used after insert operation in an AVL tree, they are LL rotation, RR rotation, LR
rotation, RL rotation.
1. LL rotation: The new node is inserted in the left subtree of the left subtree of the critical node. LL rotation in an AVL
tree is shown in figure 8.4. In this figure, node A is considered as a critical node, as it is the nearest ancestor whose
balance factor is not -1, 0, or 1. After insert operation, the new node has now become the part of tree T1. Now,
during LL rotation, node B becomes the root; T1 and A are its left and right child respectively; T2 and T3 are now left
and right subtrees of A.
Fig. 8.4 (a) Given AVL tree (b) Insert new node in left subtree of left subtree of critical node (c) LL rotation 4 given AVL tree
2. RR Rotation: The new node is inserted in the right subtree of the right subtree of the critical node. RR rotation in an
AVL tree is shown in figure 8.5. In this figure, node A is considered as a critical node, as it is the nearest ancestor
whose balance factor is not -1, 0, or 1. After insert operation, the new node has now become the part of tree T3.
Now, during RR rotation, node B becomes the root; A and T3 are its left and right child respectively; T1 and T2 are
now left and right subtrees of A.
Fig 8.5 (a) Given AVL tree (b) Insert new node in right subtree of right subtree of critical node (c) RR rotation 4 given AVL
Tree
3. LR rotation: The new node is inserted in the right subtree of the left subtree of the critical node. LR rotation in an
AVL tree is shown in figure 8.6. In this figure, node A is considered as a critical node, as it is the nearest ancestor
whose balance factor is not -1, 0, or 1. After insert operation, the new node has now become the part of tree T2.
Now, during LR rotation, node C becomes the root; B and A are its left and right child respectively; T1 and T2 are
now left subtrees and right subtrees of B and T3 and T4 are now left subtrees and right subtrees of A.
Fig 8.6 (a) Given AVL tree (b) Insert new node in right subtree of left subtree of critical node (c) LR rotation for given AVL
Tree
4. RL rotation: The new node is inserted in the left subtree of the right subtree of the critical node. RL rotation in an
AVL tree is shown in figure 8.7. In this figure, node A is considered as a critical node, as it is the nearest ancestor
whose balance factor is not -1, 0, or 1. After insert operation, the new node has now become the part of tree T2.
Now, during RL rotation, node C becomes the root; A and B are its left and right children, respectively; T1 and T2
are now left subtrees and right subtrees of A and T3 and T4 are now left subtrees and right subtrees of B.
Fig 8.7 (a) Given AVL tree (b) Insert new node in left subtree of right subtree of critical node (c) RL rotation 4 given AVL Tree
Deleting a node from an AVL Tree:
The delete operation in an AVL tree is similar to that of a binary search tree. The only difference is in terms of
maintaining the balance in the tree. After the delete operation, there may be a need to rebalance the AVL tree for which it is
required to perform rotations. There are two types of rotations that can be performed on an AVL tree after deleting a given
node. These rotations are R rotation and L rotation. If node A becomes the critical node while deleting node X from the AVL
tree, then the type of rotation depends on whether X is in the left subtree or in the right subtree of node A. If the node X is in
the left subtree of A, then L rotation is applied. If X is in the right subtree, R rotation is performed.
Program 8.1: Write a C program that shows insertion operation in an AVL tree.
#include <stdio.h>
typedef enum { FALSE ,TRUE } bool;
struct node {
int val; int balance;
struct node *left_child;
struct node *right_child;
};
struct node* search(struct node *ptr, int data) {
if(ptr!=NULL)
if(data < ptr -> val)
ptr = search(ptr -> left_child,data);
else if( data > ptr -> val)
ptr = search(ptr -> right_child, data); return(ptr);
}
struct node *insert (int data, struct node *ptr, int *ht_inc) {
struct node *aptr; struct node *bptr;
if(ptr==NULL) {
ptr = (struct node *) malloc(sizeof(struct node));
ptr -> val = data; ptr -> left_child = NULL; ptr -> right_child = NULL; ptr -> balance = 0;
*ht_inc = TRUE; return (ptr);
}
if(data < ptr -> val) {
ptr -> left_child = insert(data, ptr -> left_child, ht_inc);
if(*ht_inc==TRUE) {
switch(ptr -> balance) {
case -1: /* Right heavy */
ptr -> balance = 0; *ht_inc = FALSE; break;
case 0: /* Balanced */
ptr -> balance = 1; break;
case 1: /* Left heavy */
aptr = ptr -> left_child;
if(aptr -> balance == 1) {
printf(“Left to Left Rotation\n”);
ptr -> left_child= aptr -> right_child;
aptr -> right_child = ptr;
ptr -> balance = 0; aptr -> balance=0; ptr = aptr;
}
else {
printf(“Left to right rotation\n”);
bptr = aptr -> right_child;
aptr -> right_child = bptr -> left_child;
bptr -> left_child = aptr;
ptr -> left_child = bptr -> right_child;
bptr -> right_child = ptr;
if(bptr -> balance == 1 )
ptr -> balance = -1;
else
ptr -> balance = 0;
if(bptr -> balance == -1)
aptr -> balance = 1;
else
aptr -> balance = 0; bptr -> balance=0; ptr = bptr;
}
*ht_inc = FALSE;
}}}
if(data > ptr -> val) {
ptr -> right_child = insert(info, ptr -> right_child, ht_inc);
if(*ht_inc==TRUE) {
switch(ptr -> balance) {
case 1: /* Left heavy */
ptr -> balance = 0;
*ht_inc = FALSE;
break;
case 0: /* Balanced */
ptr -> balance = -1; break;
case -1: /* Right heavy */
aptr = ptr -> right_child;
if(aptr -> balance == -1) {
printf(“Right to Right Rotation\n”);
ptr -> right_child= aptr -> left_child;
aptr -> left_child = ptr;
ptr -> balance = 0; aptr -> balance=0; ptr = aptr;
}
else {
printf(“Right to Left Rotation\n”);
bptr = aptr -> left_child;
aptr -> left_child = bptr -> right_child;
bptr -> right_child = aptr;
ptr -> right_child = bptr -> left_child;
bptr -> left_child = pptr;
if(bptr -> balance == -1)
ptr -> balance = 1;
else
ptr -> balance = 0;
if(bptr -> balance == 1)
aptr -> balance = -1;
else
aptr -> balance = 0; bptr -> balance=0; ptr = bptr;
}/*End of else*/
*ht_inc = FALSE;
}}}
return(ptr);
}
void display(struct node *ptr, int level) {
int i;
if ( ptr!=NULL ) {
display(ptr -> right_child, level+1);
printf(“\n”);
for (i = 0; i < level; i++)
printf(“ “);
printf(“%d”, ptr -> val);
display(ptr -> left_child, level+1);
}}
void inorder(struct node *ptr) {
if(ptr!=NULL) {
inorder(ptr -> left_child);
printf(“%d “,ptr -> val);
inorder(ptr -> right_child);
}}
main()
{
bool ht_inc;
int data; int option;
struct node *root = (struct node *)malloc(sizeof(struct node));
root = NULL;
while(1) {
printf(“1.Insert\n”);
printf(“2.Display\n”);
printf(“3.Quit\n”);
printf(“Enter your option : “);
scanf(“%d”,&option);
switch(choice) {
case 1:
printf(“Enter the value to be inserted : “);
scanf(“%d”, &data);
if( search(root,data) == NULL )
root = insert(data, root, &ht_inc);
else
printf(“Duplicate value ignored\n”);
break; case 2:
if(root==NULL) {
printf(“Tree is empty\n”);
continue;
}
printf(“Tree is :\n”);
display(root, 1);
printf(“\n\n”);
printf(“Inorder Traversal is: “);
inorder(root); printf(“\n”); break;
case 3: exit(1);
default:
printf(“Wrong option\n”);
} } }
8.4 Weight Balanced Trees:
As we already know that the weight balanced trees are the type of self-balancing trees that are dependent on the
number of leaves in the subtrees of a node. A binary search tree is said to be weight-balanced if the weight of the
left and right subtree in each node differ by at most one. Weight balanced binary search trees were introduced by
Nievergelt and Reingold in the 1970s, in the name “trees of bounded balance”. Later, they were modified as weight
balanced trees by Kruth. These trees are generally used to implement dynamic sets, maps, and sequences.
Figure 8.8 An example of Weight Balanced Tree
Like other self-balancing trees, weight-balanced trees also perform rotations to restore the balance between the
nodes, when it becomes unbalanced by search, insert, and delete operations. The size of the subtree rooted at the node is
stored by each node of the tree. The sizes of left and right subtrees are kept approximately the same by some factor. Let this
factor be
α. The types of rotations used to rebalance the binary trees are the same as those used to rebalance AVL trees. According
to the definition, the size of a leaf is zero and the size of an internal node is calculated as adding one to the sum of sizes of
its two children. The weight can be defined as adding one to the size of that internal node.
size[n] = size[n.left] + size[n.right] + 1
weight[n] = size[n] + 1
A node is said to be an α-weight balanced tree if it satisfies the following condition,
weight[n.left] ≥ α·weight[n] & weight[n.right] ≥ α·weight[n]
Program 8.2: Write a C program to check whether the given tree is balanced or not.
#include <stdio.h>
#include <stdlib.h>
struct node
{
int data;
struct node*left;
Struct node*right;
};
bool is Balanced (struct node*root);
int find height (struct node*root) {
int lefth=0, righth=0;
if(
root==NULL) {
return 0;
}
lefth=findheight(root->left);
righth=findheight(root->right);
if(lefth>righth) {
return lefth+1;
}
else
{
return righth+1;
}}
Bool is Balanced(struct node*root) {
int left_height, right_height;
if(root==NULL) {
return true;
}
left_height=findheight(root->left);
right_height=findheight(root->right);
if(abs(left_height-right_height)<=1&&isBalanced(root->left)&& isBalanced(root->right)) {
return true;
}
return false;
}
Int
main()
{
struct node*root;
root=(structnode*)malloc(sizeof(struct node));
root->data=5;
root->left=(structnode*)malloc(sizeof(struct node));
root->left->data=8;
root->left->left=(struct node*)malloc(sizeof(structnode));
root->left->left->data=10;
root->left->left->left=root->left->left->right=NULL;
root->left->right=(struct node*)malloc(sizeof(struct node));
root->left->right->data=15;
root->left->right->left=root->left->right->right=NULL;
root->right=(struct node*)malloc(sizeof(struct node));
root->right->data=34;
root->right->left=root->right->right=NULL;
if(isBalanced(root)) {
printf("\n\n\nThe above given tree is a Balanced Tree\n\n\n");
}
else
{
printf("\n\n\nThe above given tree is not a Balanced Tree\n\n\n");
}
return 0;
}
The output of the program is: The above-given tree is a Balanced Tree. Or The above-given tree is not a Balanced Tree.
8.5 Summary:
Balanced trees are a versatile set of data structures in which every leaf is “at a certain distance” from the root than
any other leaf.
A binary tree is said to be balanced if the height of the tree is O(log n), where n is the number of nodes of the tree.
Self- balancing trees maintain balance automatically by keeping the height as small as possible during the insertion
and deletion operation on the binary tree.
The balance factor of a node is the difference between the height of the right subtree and the height of the left
subtree. In a height-balanced tree, every node has a balance factor of -1, 0, or 1.
Rotations are used to retain the balance in a binary search tree. There are four types of rotations: LL rotation, RR
rotation, LR rotation, and RL rotation.
A binary search tree is said to be weight-balanced if the weight of the left and right subtree in each node differ by at
most one.
8.6 Key Terms:
Height- balanced Trees: In height-balanced trees, the height of the siblings of a node is “approximately the same”.
Weight- balanced Trees: In weight balanced trees, the number of descendants of sibling nodes is “approximately the
same”.
Balance Factor: It is the difference between the height of the right subtree and the height of the left subtree.
Left- heavy Tree: If a node has a balance factor of 1, it means that the right subtree of the node is one level lower
than the left subtree. Such a tree is called a lefty-heavy tree.
Right- heavy Tree: If a node has a balance factor -1, it means that the right subtree of the node is one level higher
than the left subtree. Such a tree is called a right-heavy tree.
8.7 Check Your Progress: Short- Answer type
Q1) Time taken by an AVL tree to perform the search, insert, and delete operations in average as well as worst case is:
(a) O(n) (b) O(log n) (c) O(n2) (d) O(n log n)
Q2) In an AVL tree, searching operation takes ______ time.
Q3) When the new node is inserted in the right subtree of the right subtree of the critical node, then it is called RL
rotation. (True/ False?)
Q4) When the right subtree of a node is one level lower than the left subtree, then the balance factor is
(a) 0 (b) 1 (c) –1 (d) 2
Q5) A new node inserted in a binary search tree, will be added as an internal node. (True/ False?)
Long- Answer type:
Q1) The height of a binary search tree affects its performance. Explain.
Q2) State the advantages of AVL trees.
Q3) Differentiate between Height balanced and Weight balanced Trees.
Q4) Create an AVL tree using the following sequence of data: 16, 27, 9, 11, 36, 54, 81, 63, 72.
Q5) Explain the rotation process in Balanced trees in detail. Also, discuss the types of rotations.
Unit 9 – B- Trees
Structure:
9.0 Introduction
9.1 Unit Objectives
9.2 B- Trees
9.2. 1 Operations on a B- Tree
9.3 B+ Trees
9.3.1 Operations on a B+ Tree
9.4 Red-Black Trees
9.4.1 Inserting a Node in a Red-black Tree 9.4.2 Deleting a Node from a Red-black Tree
9.5 Splay Trees
9.6 Summary 9.7 Key Terms
9.8 Check Your Progress
9.0 Introduction:
We have discussed that in a binary search tree every node has one value and two pointers, that point to the left and
right subtrees of the node, respectively. B-trees are generally used in file systems and databases. A tree data structure that
sorts the data and then performs the insertion and deletion operations, is referred to as B-Tree. The internal nodes of a B-
Tree may have a variable number of child nodes in some predefined range. The number of child nodes varies with the
insertion or deletion of any data from the node. To maintain the predefined range, the internal nodes can be merged or
splitted. As B-trees permit the maintenance of these child nodes, rebalancing is not frequently required in B-trees as other
self-balancing trees require. But, this leads to the wastage of some space in the memory as nodes are not completely full.
This unit deals with the basics of B-Trees, its operations, and its applications. Apart from AVL trees, the fundamentals of
other balanced trees like Red-Black Trees, Splay Trees, and B+ Trees are also discussed.
9.1 Unit Objectives: After going through this unit, the reader will be able to:
Explain the basics of B-Trees.
Understand the operations and applications of B-Trees.
Learn about the distinct balanced trees like Red-black trees and Splay trees.
Discuss the fundamentals of B+ Trees.
9.2 B- Trees: B-trees were developed by Rudolf Bayer and Ed McCreight in 1970. They are widely used for accessing
the disk of computer systems. A B-tree having an order of m consists of m-1 keys and m pointers to the subtrees. The
purpose of using B-trees is to store a large number of keys in a single node to keep the height of the tree relatively small.
The small height of the tree will take less processing time as compared to the tree with more height. In B- trees the number
of child nodes are in a predefined range and can vary with the insert or delete operations. There is a need to maintain this
predefined range by merging or splitting these internal nodes. Unlike other self-balancing trees, B- trees do not require
rebalancing frequently as they focus on maintaining the predefined range of internal nodes. B- trees consist of two limits i.e.
upper bound and lower bound. These two bounds are fixed for the number of child nodes for a particular implementation. As
we already know that the height of all the leaf nodes has to be maintained to keep the tree balanced. Similarly, a B-tree is
also balanced by keeping all the leaf nodes at the same depth/ height. The height of the B- Tree will increase with the
addition of elements to the tree, but the overall height of the tree will not increase frequently. A B- tree may have a variable
number of keys and children, unlike a binary- tree. These keys are arranged in non-decreasing order. Each of these keys is
associated with a child. This child behaves as the root of a subtree having all the nodes with keys less than or equal to the
key but greater than the preceding key. An additional rightmost child is also associated with the node. This rightmost child
behaves as the root for a subtree that has all keys greater than any keys in the node. A B-tree should possess the following
properties:
In a B-tree with order m, every node should have a maximum of m children.
Every node except the root node and leaf nodes should have minimum m/2 children.
The root node should have at least two child nodes if it is not a leaf node.
All leaf nodes should be at the same level. Figure 9.1 shows a B-tree of order 4. It should be noted that the B-tree
shown in the figure fulfills all the properties that are mentioned above.
Fig 9.1 A B-tree of order 4.
(Source- Data Structures using C, Reema Thareja, Oxford University Press, 2nd Edition, Chapter- 11, Page No.- 345)
B-trees are balanced trees that can minimize the number of disk access for the computer system. Certain data is
stored in secondary storage such as magnetic disks and disk access is expensive and time-consuming in such cases. So, B-
trees help in minimizing the number of disk access attempts.
9.2.1 Operations on a B-Tree:
Like other binary trees, searching, inserting, and deleting operations are also supported by B-trees. All these
operations follow single-pass algorithms as they do not traverse back. As the basic motive of the B-tree is to minimize the
disk access, these single pass approaches will support this motive. It is assumed that all the nodes are stored in secondary
storage instead of primary storage. Disk-Read operation is used to read all the given nodes. Similarly, the write operation is
denoted by Disk- Write. Allocate- Node call is used to create new nodes and assign them storage.
Searching in a B-tree: The search operation of a B-tree is similar to that of a binary tree. Unlike a binary tree, B-
tree marks an n-way search instead of choosing between the left and right child of a node. The correct choice is
made by performing a linear search for the values in the node. After obtaining the value greater than or equal to the
required value, the search follows the child pointer to the immediate left of the value. On the other hand, if all the
values are less than the required value, it follows the rightmost child pointer. The search operation is terminated as
soon as the required node is found. The running time of the operation is decided by the height of the tree, i.e. O(log
n). The search algorithm is given below:
B-Tree-Search(x, k) then return (x, i)
i <- 1 if leaf[x]
while i <= n[x] and k > keyi[x] then return NIL
do i <- i + 1 else Disk-Read(ci[x])
if i <= n[x] and k = keyi[x] return B-Tree-Search(ci[x], k)
Insertion in a B-Tree: Before inserting any element in a B-tree, we must locate the appropriate node for the key,
using certain algorithms such as B-tree search. Next, the key is inserted into the node. If the node is not full, no
special action is required while if the node is full, then the node should be split and then the new key is loaded. The
splitting operation moves one key to the parent node. Also, this parent node must not be full otherwise another split
operation will be required. This process may repeat up to the root node.
B-Tree-Insert(T, k)
r <- root[T]
if n[r] = 2t - 1
then s <- Allocate-Node()
root[T] <- s
leaf[s] <- FALSE
n[s] <- 0
c1 <- r
B-Tree-Split-Child(s, 1, r)
B-Tree-Insert-Nonfull(s, k)
else B-Tree-Insert-Nonfull(r, k)
B-Tree-Insert-Nonfull(x, k)
i <- n[x]
if leaf[x]
then while i >= 1 and k < keyi[x]
do keyi+1[x] <- keyi[x]
i <- i - 1
keyi+1[x] <- k
n[x] <- n[x] + 1
Disk-Write(x)
else while i >= and k < keyi[x]
do i <- i - 1
i <- i + 1
Disk-Read(ci[x])
if n[ci[x]] = 2t - 1
then B-Tree-Split-Child(x, i, ci[x])
if k > keyi[x]
then i <- i + 1
B-Tree-Insert-Nonfull(ci[x], k)
Deletion in a B-Tree: The delete operation for a B-Tree is carried out from the leaf node, like in the insert operation.
A leaf node and an internal node can be deleted from a B-Tree. In the case of a leaf node, the following steps are
involved:
a) First, locate the leaf node that has to be deleted.
b) If the leaf node has more than m/2 elements (more than a minimum number of key values), then delete the value.
c) Else, if the leaf node does not have m/2 elements, then first fill the node either from the left or from the right sibling.
If there are more than m/2 elements in the left sibling, then its largest key is pushed into its parent’s node. Also,
the intermediate element of the parent and leaf node is taken down where the key is deleted.
Else, if there are more than m/2 elements in the right sibling, then its smallest key is pushed into its parent’s
node. Also, the intermediate element of the parent and leaf node is taken down where the key is deleted.
d) Else, if there are m/2 elements in both left and right siblings, then a new leaf node is created by combining the two
leaf nodes and the intermediate element of the parent node. It should be ensured that the number of elements
should not exceed the maximum number of elements a node can have, i.e. m. If after pulling down the intermediate
element of the parent node, it has less than m/2 elements, then the process is propagated upwards and the height of
the B-Tree gets reduced.
In the case of an internal node, the successor or predecessor of the key to be deleted is promoted to occupy
the position of the deleted key. The predecessor or successor keys are always in the leaf node, so the operation is
processed according to the deletion in a leaf node.
Example 9.1: Consider the B-Tree of order 3 given below and perform the following operations: (a)insert 121, 87 and then
(b) delete 36, 109.
An underflow condition may occur during the delete operation in a B-Tree. A leaf node underflows if it contains (m/2 -
1) keys after deleting a key from it. On the other hand, an internal node (excluding the root node) underflows if there are
(m/2 - 2) keys in the deletion process. While deleting any element from the B-Tree, either leaf node or internal node,
underflow condition is checked every time.
9.3 B+ Trees: B+ trees are a variant of B-Trees that also store sorted data only in the leaf nodes. In a B-Tree both keys
and records are stored in its internal nodes. In contrast, the B+ tree stores all the records at its leaf node, and internal nodes
contain only the keys. An added advantage of using a B+ tree is that the leaf nodes are often linked to each other in a linked
list. This makes the queries simpler and more efficient. B+ trees allow efficient insertion, retrieval, and deletion of records.
Generally, B+ trees are used to store large data. The leaf nodes of the B+ tree are stored in the secondary storage while the
internal nodes of the tree are stored in the main memory. The internal nodes of a B+ tree are called index nodes or i-nodes.
B+ Trees are simple and used to implement many database systems. B+ trees are always balanced as all the data
appear in the leaf nodes and are sorted. B+ trees also make searching for data-efficient. A B+ tree of order is shown in figure
9.3. Also, a comparison between B- Tree and B+ Tree is depicted in Table 9.1.
Fig. 9.3 A B+tree of order 3
Table 9.1 Comparison between B-trees and B+ trees
B TREE B+ TREE
Search key are not repeated Stores redundant search key
Data is stored internal or leaf nodes Data is stored only in leaf nodes
Searching takes more times as data may be found in a leaf Searching data is very easy as the data can be found in leaf
or non-leaf node. nodes only
Deletion of non-leaf nodes is very complicated Deletion is very simple because data will be in the leaf node
Leaf node cannot be stored using linked list Leaf node data are ordered using sequential linked list
The structure and operations are complicated The structure and operations are simple
9.3.1 Operations on a B+ Tree: Like B- Tree, insert and delete operations can be performed on B+ trees too. Let’s
discuss these operations in detail.
Inserting a new element in a B+ Tree: As we know that B+ trees are relevant to leaf nodes, a new element in a
leaf
node can be simply added if there is space for it. But, if there is no space for a new element, then the node splits into
two nodes. A new index value is added to the parent node so that future queries can arbitrate between the two
nodes. In fact, when a new element is added to a leaf node, it may be possible that all the nodes on the path from a
leaf to the root may split. If a root node splits, a new leaf node gets created and the level of the tree increases by
one. B+ trees follow the given algorithm to insert a new node:
a) Insert the new node as the leaf node.
b) If the leaf node overflows, split the node and copy the middle element to the next index node.
c) If the index node overflows, split that node and move the middle element to the next index page.