KEMBAR78
Module 07 | PDF | Algorithms And Data Structures | Theoretical Computer Science
0% found this document useful (0 votes)
7 views29 pages

Module 07

This document provides an overview of tree data structures, focusing on binary trees, their properties, and operations. It includes definitions, traversal methods, and algorithms for binary tree creation, searching, and deletion. Additionally, it discusses the implementation of binary trees in Java and their applications in real-life scenarios and operating systems.

Uploaded by

jaymalachavan
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOC, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
7 views29 pages

Module 07

This document provides an overview of tree data structures, focusing on binary trees, their properties, and operations. It includes definitions, traversal methods, and algorithms for binary tree creation, searching, and deletion. Additionally, it discusses the implementation of binary trees in Java and their applications in real-life scenarios and operating systems.

Uploaded by

jaymalachavan
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOC, PDF, TXT or read online on Scribd
You are on page 1/ 29

Data Structure and Files

Module: 07

TREES
 Motivation: Tree is a data structure which will help to students to store the data in the memory and
access it according to need. This can also be used in the operating system concepts i.e. the core system
implementation.

 Objective: To understand the working and implementation of tree.

 Syllabus:

Perquisites Syllabus Duration Self Study

Concept of Linked Basic tree concept 2 Hrs 2 Hrs


List, knowledge of Binary Tree operations and applications.
java.
Binary tree representation. 1 Hr 1 Hr

Binary tree traversal 2 Hr 2 Hr

Threaded binary tree. 1 Hr 1 Hr

Huffman algorithm 1 Hr 1 Hr

Binary search tree implementation 2 Hr 2 Hr

Expression trees 1 Hr 1 Hr

Introduction to multiway tree (B tree, B+ 2 Hrs 2 Hrs


tree, AVL tree)

 Definitions:-
Tree:-
A tree represents a hierarchy of information.

Trees
91
Data Structure and Files

In the following figure,


 Root node: Top element of tree. The root node has no parents.
A is the root node.

 Parent node: Immediate node joined by a single line segment when traversing up the tree. B is the
parent of D and E.
 Children: All immediate nodes joined by a line segment when traversing one level down the tree. D
and E are children of B.
 Sibling: Children of the same parent. C is a sibling of B.

• Ancestors: All nodes along the path to the root. H, C, and A are ancestors of I.

 External node: Node without children. Also called a leaf. D, E, F, G, and I are external nodes or leaves.

• Internal node: Node with one or more children. A, B, C and H are internal nodes.

• Depth (level): Number of line segments from the root to a node (or the number of ancestors).
The depth of B is 1. The depth of I is 3.
 Leaf Nodes: Nodes that do not have any children are called leaf nodes. They are also referred to as
terminal nodes. (Nodes D,E,F,G,I are leaf nodes)
 Height: The height of a node is the length of the longest downward path to a leaf from that node. The
height of the root is the height of the tree.

Trees
92
Data Structure and Files

 Subtree: A subtree of a tree T is a tree consisting of a node in T and all of its descendants in T. (This is
different from the formal definition of subtree used in graph theory.) The subtree corresponding to the
root node is the entire tree;
 Proper Subtree: The subtree corresponding to any other node is called a proper subtree (in analogy to
the term proper subset).
 Directed Edge: A directed edge refers to the link from the parent to the child.
 In-Degree: In-degree of a node is the number of edges arriving at that node.
 Out Degree: Out-degree of a node is the number of edges leaving that node.

 Learning:-
Student will learn how tree is used in real life applications and Operating system file structure.

 Theory:

Binary Trees

 A binary tree is an ordered tree in which all internal nodes are of degree 2.

 Recursive definition of binary tree:

 A binary tree is either

an external node (i.e., leaf) or

an internal node (the root of the tree) and two binary trees, the left subtree and
the right subtree.

Trees
93
Data Structure and Files

Binary Tree Example

 arithmetic expression

((((3  (1 + (4 + 6))) + (2 + 8))  5) + (4  (7 + 2)))

Properties of Binary Trees

 (# external nodes) = (# internal nodes) + 1

 (# nodes at level i)  2i

 (# external nodes)  2(height)

 (height)  log2(# external nodes)

 (height)  log2(# nodes) – 1

 (height)  (# internal nodes) = ((# nodes) –1)/2

 The number of nodes n in a perfect binary tree can be found using this formula: n = 2h + 1 − 1 where
h is the height of the tree.
 The number of nodes n in a complete binary tree is minimum: n = 2h and maximum: n = 2h + 1 − 1
where h is the height of the tree.

 The number of nodes n in a perfect binary tree can also be found using this formula: n = 2L − 1
where L is the number of leaf nodes in the tree.

 The number of leaf nodes n in a perfect binary tree can be found using this formula: n = 2h where h is the
height of the tree.

 The number of NULL links in a Complete Binary Tree of n-node is (n+1).

 The number of leaf node in a Complete Binary Tree of n-node is UpperBound(n / 2).
Trees
94
Data Structure and Files

 For any non-empty binary tree with n0 leaf nodes and n2 nodes of degree 2, n0 = n2 + 1.

 n = n0 + n1 + n2 + n4 + n3 + n5 + .... + nB-1 + nB

 B = n - 1, n = 1 + 1*n1 + 2*n2 + 3*n3 + 4*n4 + ... + B*nB, NOT include n0

 Define:
 h = height
 n = # nodes
 ni = # internal nodes
 ne = # external nodes
 Using n = ni + ne, it immediately follows that
 ni = ne – 1, or ne = ni + 1
 n = 2ni + 1

(1)

By definition we have that

(2)

Trees
95
Data Structure and Files

Note that . (3)

Taking logs of the inequality (3),

From (2) and (1),

The Tree ADT

 The nodes of a tree are viewed as positions.


 Generic container methods:
 size(), isEmpty(), elements(), newContainer()
 Positional container methods:
 positions(), replace(p,e), swap(p,q)
 Query methods:
 isRoot(p), isInternal(p), isExternal(p)
 accessor methods
 root(), parent(p), children(p), siblings(p)
 update methods are application specific.

The Binary Tree ADT


 Extends the InspectableTree ADT
 Accessor methods:
 leftChild(p), rightChild(p), sibling(p)
 Update methods:
 expandExternal(p), removeAboveExternal(p), cut(p), link(p,T), replaceSubtree(p,T).
 other, application-specific methods (e.g. AVL trees)

Trees
96
Data Structure and Files

void expandExternal(Position p)
 before:
 Position p points to an external node storing the string “CJ”.

p
“CJ”
 after:
 Position p points to the same node storing the same object. Now, however, the formerly external
node has two external children.

p “CJ”

BINARY TREE OPERATIONS:

Binary tree creation:

The binary tree creation follows a very simple principle -- for the new element to be added, compare it with
the current element in the tree. If its value is less than the current element in the tree then move towards
the left side of that element or else to its right. If there is no sub tree on the left, make your new element as
the left child of that current element or else compare it with the existing left child and follow the same rule.
Exactly same has to done for the case when your new element is greater than the current element in the
tree but this time with the right child. Check out the algo..

Algorithm:-
Step-1: If Value of New element < current element then go to step-2 or else step-3
Step-2: If the Current element does not have a left sub-tree then make your new element the left child of
the current element; else make the existing left child as your current element and go to step-1.
Step-3: If the current element does not have a right sub-tree then make your new element the right child of
the current element; else make the existing right child as your current element and go to step-1.

Trees
97
Data Structure and Files

Binary tree properties:

 The number of nodes n in a perfect binary tree can be found using this formula: n = 2h + 1 − 1 where h is
the height of the tree.
 The number of nodes n in a complete binary tree is minimum: n = 2h and maximum: n = 2h + 1 − 1 where
h is the height of the tree.
 The number of nodes n in a perfect binary tree can also be found using this formula: n = 2L − 1 where L
is the number of leaf nodes in the tree.
 The number of leaf nodes n in a perfect binary tree can be found using this formula: n = 2h where h is
the height of the tree.
 The number of NULL links in a Complete Binary Tree of n-node is (n+1).
 The number of leaf node in a Complete Binary Tree of n-node is UpperBound (n / 2).
 For any non-empty binary tree with n0 leaf nodes and n2 nodes of degree 2, n0 = n2 + 1.

Binary tree traversal:

Traversing a tree means visiting all the nodes of a tree in order.


There are three different methods of traversing a binary tree:
 pre-order traversal
 in-order traversal

 post-order traversal

In each case, the algorithms for traversal are recursive - they call themselves.

Illustration of tree traversal

Pre-order traversal
1. Start at the root node
2. Traverse the left subtree

Trees
98
Data Structure and Files

3. Traverse the right subtree

4. The nodes of the above tree would be visited in the order :

DBACFEG

Algorithm

preorder(node)
print node.value
if node.left ≠ null then preorder(node.left)
if node.right ≠ null then preorder(node.right)
In-order traversal
1. Traverse the left subtree
2. Visit the root node

3. Traverse the right subtree

4. The nodes of the above tree would be visited in the order :

ABCDEFG

Algorithm

inorder(node)
if node.left ≠ null then inorder(node.left)
print node.value
if node.right ≠ null then inorder(node.right)

Post-order traversal

1. Traverse the left subtree


2. Traverse the right subtree

3. Visit the root node

4. The nodes of the above tree would be visited in the order :

Trees
99
Data Structure and Files

ACBEGFD

Algorithm
postorder(node)
if node.left ≠ null then postorder(node.left)
if node.right ≠ null then postorder(node.right)
print node.value

Evaluating Arithmetic Expressions

 Specialization of a postorder traversal on binary trees.


Algorithm evaluateExpression(v)
if v is an external node
return the variable stored at v
else
let o be the operator stored at v
x  evaluateExpression(T,T.leftChild(v))
y  evaluateExpression(T,T.rightChild(v))
return x o y

Euler Tour Traversal

 Generic traversal of a binary tree.

Trees
100
Data Structure and Files

 The preorder, inorder, and postorder traversals are special cases of the Euler tree traversal.
 “Walk around” the tree and visit each node three times:
 on the left
 from below
 on the right

Implementation of binary tree

import java.io.*;
import java.util.*;
class NodeTree
{
int data;
NodeTree leftchild;
NodeTree rightchild;
NodeTree root;
public NodeTree()
{
leftchild=null;
rightchild=null;
}
public void insert(int id)
{
NodeTree n= new NodeTree();
n.data=id;
if(root==null)
Trees
101
Data Structure and Files

root=n;
else
{
NodeTree current=root;
NodeTree parent;
while(true)
{
parent = current;
if(id<current.data)
{
current=current.leftchild;
if(current==null)
{
parent.leftchild=n;
return;
}
}
else
{
current=current.rightchild;
if(current==null)
{
parent.rightchild=n;
return;
}
}
}
}

}
public NodeTree find(int key)
{
NodeTree current = root;
NodeTree t = new NodeTree();
while(current.data!= key)
{
if(key<current.data)
current=current.leftchild;
else
current=current.rightchild;
if(current==null)
return null;
}
return current;
}
Trees
102
Data Structure and Files

public boolean delete(int key)


{
NodeTree current = root;
NodeTree parent = root;
boolean isleftchild=true;
while(current.data!=key)
{
parent=current;
if(key<current.data)
{
isleftchild=true;
current=current.leftchild;
}
else
{
isleftchild=false;
current=current.rightchild;
}
if(current==null)
return false;
}
if(current.leftchild==null && current.rightchild==null)
{
if(current==root)
root=null;
else if(isleftchild)
parent.leftchild=null;
else
parent.rightchild=null;
}
else if(current.leftchild==null)
if(current==root)
root=current.rightchild;
else if(isleftchild)
parent.leftchild=current.rightchild;
else
parent.rightchild=current.rightchild;
else if(current.rightchild==null)
if(current==root)
root=current.leftchild;
else if(isleftchild)
parent.leftchild=current.leftchild;
else
parent.rightchild=current.leftchild;

Trees
103
Data Structure and Files

else
{
NodeTree successor = getsuccessor(current);
if(current==root)
root=successor;
else if(isleftchild)
parent.leftchild=successor;
else
parent.rightchild=successor;
successor.leftchild=current.leftchild;
}
return true;
}
public NodeTree getsuccessor(NodeTree deleNode)
{
NodeTree successorparent = deleNode;
NodeTree successor =deleNode;
NodeTree current=deleNode.rightchild;

while(current!=null)
{
successorparent=successor;
successor=current;
current=current.leftchild;
}
if(successor != deleNode.rightchild)
{
successorparent.leftchild=successor.rightchild;
successor.rightchild=deleNode.rightchild;
}
return successor;
}
public void traverse(int ttype)
{
switch(ttype)
{
case 1: System.out.println("\n Preorder traversal:");
preorder(root);
break;
case 2: System.out.println("\n Inorder traversal:");
inorder(root);
break;
case 3: System.out.println("\n Postorder traversal:");
postorder(root);
break;
Trees
104
Data Structure and Files

}
}
void preorder(NodeTree localroot)
{
if(localroot!=null)
{
System.out.println("\n"+localroot.data);
preorder(localroot.leftchild);
preorder(localroot.rightchild);
}
}
void inorder(NodeTree localroot)
{
if(localroot!=null)
{
inorder(localroot.leftchild);
System.out.println(localroot.data);
inorder(localroot.rightchild);
}
}
void postorder(NodeTree localroot)
{
if(localroot!=null)
{
postorder(localroot.leftchild);
postorder(localroot.rightchild);
System.out.println(localroot.data);
}
}
public static void main(String[] arg)
{
NodeTree t = new NodeTree();
NodeTree root = new NodeTree();
t.insert(50);
t.insert(25);
t.insert(75);
t.insert(10);
t.insert(30);
t.insert(60);
t.insert(80);

int n,key;
System.out.println("Enter a Traverseal method:-\n");
System.out.println("1 Preorder\n 2 Inorder\n 3 Postorder\n");
Scanner sc= new Scanner(System.in);
Trees
105
Data Structure and Files

n= sc.nextInt();
t.traverse(n);

System.out.println("Enter a Key to delete:-\n");


key=sc.nextInt();
t.delete(key);
t.traverse(n);
NodeTree found = t.find(5);
if(found!=null)
System.out.println("Value is found");
else
System.out.println("Value is not found");
}
}
Enter a Traverseal method:-
1 Preorder
2 Inorder
3 Postorder
2
Inorder traversal:
10 25 30 50 60 75 80
Enter a Key to delete:-
60
Inorder traversal:
10 25 30 50 75 80
Value is not found
THREADED BINARY TREE:

A threaded binary tree may be defined as follows:


"A binary tree is threaded by making all right child pointers that would normally be null point to the inorder
successor of the node, and all left child pointers that would normally be null point to the inorder
predecessor of the node."

A threaded binary tree makes it possible to traverse the values in the binary tree via a linear traversal that
is more rapid than a recursive in-order traversal.

Trees
106
Data Structure and Files

It is also possible to discover the parent of a node from a threaded binary tree, without explicit use of
parent pointers or a stack, albeit slowly. This can be useful where stack space is limited, or where a stack of
parent pointers is unavailable (for finding the parent pointer via DFS).

This is possible, because if a node (k) has a right child (m) then m's left pointer must be either a child, or a
thread back to k. In the case of a left child, that left child must also have a left child or a thread back to k,
and so we can follow m's left children until we find a thread, pointing back to k. The situation is similar for
when m is the left child of k

A threaded tree, with the special threading links shown by dashed arrows.

The node structure for a threaded binary tree varies a bit and its like this --
struct NODE
{
struct NODE *leftchild;
int node_value;
struct NODE *rightchild;
struct NODE *thread;
}

Advantages:

• The traversal operation is faster than that of its unthreaded version, because with threaded binary
tree non-recursive implementation is possible which can run faster and does not require the
botheration of stock management.
• The second advantage is more subtle with a threaded binary tree; we can efficiently determine the
predecessor and successor nodes starting from any node. A stack is required to provide upward
pointing information in the tree whereas in a threaded binary tree, without having to incur the
overload of using a stack mechanism the same can be carried out with the threads.

• Any node can be accessible from any other node. Threads are usually more to upward whereas links
are downward. Thus in a threaded tree, one can move in either direction and nodes are in fact
circularly linked. This is not possible in unthreaded counter part because there we can move only in
downward direction starting form root.

Trees
107
Data Structure and Files

• Insertion into and deletions from a threaded tree are all although time consuming operations (since
we have to manipulate both links and threads)

Disadvantages:

• Slower tree creation, since threads need to be maintained. This can partly be alleviated by constructing
the tree as an non-threaded tree, then threading it with a special libavl function.
• In theory, threaded trees need two extra bits per node to indicate whether each child pointer points to
an ordinary node or the node's successor/predecessor node. In libavl, however, these bits are stored in
a byte that is used for structure alignment padding in non-threaded binary trees, so no extra storage is
used.

Huffman Algorithm

Huffman's algorithm is a method for building an extended binary tree with a minimum weighted path
length from a set of given weights. Initially construct a forest of singleton trees, one associated with each
weight. If there are at least two trees, choose the two trees with the least weight associated with their
roots and replace them with a new tree, constructed by creating a root node whose weight is the sum of
the weights of the roots of the two trees removed, and setting the two trees just removed as this new
node's children. This process is repeated until the forest consists of one tree.

Pseudocode

Huffman (W, n)
Input: A list W of n weights.
Output: An extended binary tree T with weights taken from W that gives the minimum
weighted path length.
Procedure:
Create list F from singleton trees formed from {elements} of W
WHILE (F has more than one element) DO
Find T1, T2 in F that have minimum values associated with their roots
Construct new tree T by creating a new node and setting T1 and T2 as its children
Let the sum of the values associated with the roots of T1 and T2 be associated with the root of T
Add T to F
OD
Huffman: = tree stored in F
Trees
108
Data Structure and Files

Huffman's scheme uses a table of frequency of occurrence for each symbol (or character) in the input. This
table may be derived from the input itself or from data which is representative of the input. For instance,
the frequency of occurrence of letters in normal English might be derived from processing a large number
of text documents and then used for encoding all text documents. We then need to assign a variable-length
bit string to each character that unambiguously represents that character. This means that the encoding for
each character must have a unique prefix. If the characters to be encoded are arranged in a binary tree:

An encoding for each character is found by following the tree from


the route to the character in the leaf: the encoding is the string of
symbols on each branch followed.

For example:

String Encoding
TEA 10 00 010
Encoding tree for ETASNO SEA 011 00 010
TEN 10 00 110

Notes:

1. As desired, the highest frequency letters - E and T - have two digit encodings, whereas all the others
have three digit encodings.
2. Encoding would be done with a lookup table.

A divide-and-conquer approach might have us asking which characters should appear in the left and right
subtrees and trying to build the tree from the top down. As with the optimal binary search tree, this will
lead to to an exponential time algorithm.

A greedy approach places our n characters in n sub-trees and starts by combining the two least
weight nodes into a tree which is assigned the sum of the two leaf node weights as the weight for
its root node.

Operation of the Huffman algorithm.

1. Initial data sorted by frequency


Trees
109
Data Structure and Files

2. Combine the two lowest frequencies, F and E, to form a sub-tree of weight 14.
3. Move it into its correct place.

4. Again combine the two lowest frequencies, C and B, to form a sub-tree of weight 25.
5. Move it into its correct place.

6. Now the sub-tree with weight, 14, and D are combined to make a tree of weight, 30.
7. Move it to its correct place.

8. Now the two lowest weights are held by the "25" and "30" sub-trees, so combine them to make
one of weight, 55.
9. Move it after the A.

10. Finally, combine the A and the "55" sub-tree to produce the final tree.
The encoding table is:

A 0
C 100
B 101
F 1100
E 1101
D 111

The time complexity of the Huffman algorithm is O(nlogn). Using a heap to store the weight of
each tree, each iteration requires O(logn) time to determine the cheapest weight and insert the
new weight. There are O(n) iterations, one for each item.

Decoding Huffman-encoded Data


Curious readers are, of course, now asking

"How do we decode a Huffman-encoded bit string? With these variable length strings, it's not possible to
break up an encoded string of bits into characters!"

Trees
110
Data Structure and Files

The decoding procedure is deceptively simple.


Starting with the first bit in the stream, one then uses
successive bits from the stream to determine
whether to go left or right in the decoding tree.
When we reach a leaf of the tree, we've decoded a
character, so we place that character onto the
(uncompressed) output stream. The next bit in the
input stream is the first bit of the next character.

Analysis:
Each iteration of Huffman's algorithm reduces the size of the problem by 1, and so there are exactly n
iterations. The i th iteration consists of locating the two minimum values in a list of length n−i+1. This is a
linear operation, and so Huffman's algorithm clearly has a time complexity of (n2).

However, it would be faster to sort the weights initially, and then maintain two lists. The first list consists of
weights that have not yet been combined, and the second list consists of trees that have been formed by
combining weights. This initial ordering is obtained at a cost of (nlogn) . Obtaining the minimum two trees
at each step then consists of two comparisons (compare the heads of the two lists, and then compare the
larger to the item after the smaller). The ordering of the second list can be maintained cheaply by using a
binary search to insert new elements. Since at step i there are i−1 elements in the second list, (log i)
comparisons are needed for insertion. Over the entire duration of the algorithm the cost of keeping this list
sorted is (n log n). Therefore the overall time complexity of Huffman's algorithm is (n log n).

In terms of space complexity, the algorithm constructs a complete binary tree with exactly n leaves.
Therefore the output can only have at most 2n−1 nodes. Thus Huffman's algorithm requires linear space.

Multiway Trees:

B- Tree:

Trees
111
Data Structure and Files

B-tree is a tree data structure that keeps data sorted and allows searches, sequential access, insertions, and
deletions in logarithmic amortized time. The B-tree is a generalization of a binary search tree in that more
than two paths diverge from a single node. (Comer, p. 123) Unlike self-balancing binary search trees, the B-
tree is optimized for systems that read and write large blocks of data. It is commonly used in databases and
file systems.

A B-tree of order 2 or order 5.

Definition:

A B-tree of order m (the maximum number of children for each node) is a tree which satisfies the following
properties:

1. Every node has at most m children.


2. Every node (except root) has at least m⁄2 children.

3. The root has at least two children if it is not a leaf node.

4. All leaves appear in the same level, and carry information.

5. A non-leaf node with k children contains k−1 keys.

Each internal node's elements act as separation values which divide its subtrees. For example, if an internal
node has three child nodes (or subtrees) then it must have two separation values or elements a1 and a2. All
values in the leftmost subtree will be less than a1 , all values in the middle subtree will be between a1 and
a2, and all values in the rightmost subtree will be greater than a2.

Internal nodes in a B-tree – nodes which are not leaf nodes – are usually represented as an ordered set of
elements and child pointers. Every internal node contains a maximum of U children and – other than the
root – a minimum of L children. For all internal nodes other than the root, the number of elements is one
less than the number of child pointers; the number of elements is between L−1 and U−1. The number U
must be either 2L or 2L−1; thus each internal node is at least half full. This relationship between U and L
Trees
112
Data Structure and Files

implies that two half-full nodes can be joined to make a legal node, and one full node can be split into two
legal nodes (if there is room to push one element up into the parent). These properties make it possible to
delete and insert new values into a B-tree and adjust the tree to preserve the B-tree properties.

Leaf nodes have the same restriction on the number of elements, but have no children, and no child
pointers.

The root node still has the upper limit on the number of children, but has no lower limit. For example, when
there are fewer than L−1 elements in the entire tree, the root will be the only node in the tree, and it will
have no children at all.

A B-tree of depth n+1 can hold about U times as many items as a B-tree of depth n, but the cost of search,
insert, and delete operations grows with the depth of the tree. As with any balanced tree, the cost grows
much more slowly than the number of elements.

Some balanced trees store values only at the leaf nodes, and so have different kinds of nodes for leaf nodes
and internal nodes. B-trees keep values in every node in the tree, and may use the same structure for all
nodes. However, since leaf nodes never have children, a specialized structure for leaf nodes in B-trees will
improve performance.

B+ TREES:

B+ tree (B plus Tree) is a type of tree which represents sorted data in a way that allows for efficient
insertion, retrieval and removal of records, each of which is identified by a key. It is a dynamic, multilevel
index, with maximum and minimum bounds on the number of keys in each index segment (usually called a
"block" or "node"). In a B+ tree, in contrast to a B-tree, all records are stored at the leaf level of the tree;
only keys are stored in interior nodes.

The primary value of a B+ tree is in storing data for efficient retrieval in a block-oriented storage context in
particular, file systems. This is primarily because unlike binary search trees, B+ trees have very high fanout
(typically on the order of 100 or more), which reduces the number of I/O operations required to find an
element in the tree.

Trees
113
Data Structure and Files

Example of a simple B+ Tree with 7 key entries pointing to data d1 to d7

A simple B+ tree example linking the keys 1–7 to data values d 1-d7. Note the linked list (red) allowing rapid
in-order traversal.

The algorithm to perform a search for a record r follows pointers to the correct child of each node until a
leaf is reached. Then, the leaf is scanned until the correct record is found (or until failure).

function search(record r)
u := root
while (u is not a leaf) do
choose the correct pointer in the node
move to the first node following the pointer
u := current node
scan u for r

This pseudocode assumes that no repetition is allowed.

Insertion

 do a search to determine what bucket the new record should go in


 if the bucket is not full, add the record.

 otherwise, split the bucket.

 allocate new leaf and move half the bucket's elements to the new bucket

 insert the new leaf's smallest key and address into the parent.

 if the parent is full, split it also

 now add the middle key to the parent node


Trees
114
Data Structure and Files

 repeat until a parent is found that need not split

 if the root splits, create a new root which has one key and two pointers.

Characteristics

For a b-order B+ tree with h levels of index:

 The maximum number of records stored is nmax = bh − bh − 1


 The minimum number of keys is

 The space required to store the tree is O(n)

 Inserting a record requires O(logbn) operations in the worst case

 Finding a record requires O(logbn) operations in the worst case

 Removing a (previously located) record requires O(logbn) operations in the worst case

 Performing a range query with k elements occurring within the range requires O(logbn + k)
operations in the worst case.

AVL TREE:

An AVL tree is a self-balancing binary search tree. In an AVL tree, the heights of the two child subtrees of
any node differ by at most one; therefore, it is also said to be height-balanced. Lookup, insertion, and
deletion all take O(log n) time in both the average and worst cases, where n is the number of nodes in the
tree prior to the operation. Insertions and deletions may require the tree to be rebalanced by one or more
tree rotations.

The balance factor of a node is the height of its left subtree minus the height of its right subtree
(sometimes opposite) and a node with balance factor 1, 0, or −1 is considered balanced. A node with any
other balance factor is considered unbalanced and requires rebalancing the tree. The balance factor is
either stored directly at each node or computed from the heights of the subtrees.

AVL trees are often compared with red-black trees because they support the same set of operations and
because red-black trees also take O(log n) time for the basic operations. AVL trees perform better than red-
black trees for lookup-intensive applications.

Trees
115
Data Structure and Files

INSERTION:

After inserting a node, it is necessary to check each of the node's ancestors for consistency with the rules of
AVL. For each node checked, if the balance factor remains −1, 0, or +1 then no rotations are necessary.
However, if the balance factor becomes ±2 then the subtree rooted at this node is unbalanced. If insertions
are performed serially, after each insertion, at most two tree rotations are needed to restore the entire
tree to the rules of AVL.

There are four cases which need to be considered, of which two are symmetric to the other two. Let P be
the root of the unbalanced subtree. Let R be the right child of P. Let L be the left child of P.

Right-Right case and Right-Left case: If the balance factor of P is −2, then the right subtree outweighs the
left subtree of the given node, and the balance factor of the right child (R) must be checked. If the balance
factor of R is ≤ 0, a left rotation is needed with P as the root. If the balance factor of R is +1, a double left
rotation (with respect to P) is needed. The first rotation is a right rotation with R as the root. The second is
a left rotation with P as the root.

Left-Left case and Left-Right case: If the balance factor of P is +2, then the left subtree outweighs the right
subtree of the given node, and the balance factor of the left child (L) must be checked. If the balance factor
of L is ≥ 0, a right rotation is needed with P as the root. If the balance factor of L is −1, a double right
rotation (with respect to P) is needed. The first rotation is a left rotation with L as the root. The second is a
right rotation with P as the root.

Trees
116
Data Structure and Files

Pictorial description of how rotations cause rebalancing tree, and then retracing one's steps toward the
root updating the balance factor of the nodes.

Deletion

If the node is a leaf, remove it. If the node is not a leaf, replace it with either the largest in its left subtree
(inorder predecessor) or the smallest in its right subtree (inorder successor), and remove that node. The
node that was found as a replacement has at most one subtree. After deletion, retrace the path back up
the tree (parent of the replacement) to the root, adjusting the balance factors as needed.

As with all binary trees, a node's in-order successor is the left-most child of its right subtree, and a node's
in-order predecessor is the right-most child of its left subtree. In either case, this node will have zero or one
children. Delete it according to one of the two simpler cases above.

Trees
117
Data Structure and Files

Deletion from AVL trees

In addition to the balancing described above for insertions, if the balance factor for the tree is 2 and that of
the left subtree is 0, a right rotation must be performed on P. The mirror of this case is also necessary.

The retracing can stop if the balance factor becomes −1 or +1 indicating that the height of that subtree has
remained unchanged. If the balance factor becomes 0 then the height of the subtree has decreased by one
and the retracing needs to continue. If the balance factor becomes −2 or +2 then the subtree is unbalanced
and needs to be rotated to fix it. If the rotation leaves the subtree's balance factor at 0 then the retracing
towards the root must continue since the height of this subtree has decreased by one. This is in contrast to
an insertion where a rotation resulting in a balance factor of 0 indicated that the subtree's height has
remained unchanged.

The time required is O(log n) for lookup, plus a maximum of O(log n) rotations on the way back to the root,
so the operation can be completed in O(log n) time.

 Objective Questions:-

A. The number of nodes on the longest path from the root to a leaf.
a) Height of tree b) Depth of tree c) length of tree

B. Each node in a tree have a 0 , 1 or 2 children is called

a) Binary Tree b.) Full Binary Tree c.) Complete Binary Tree

C. Search time for binary tree is

O(n) b) O(lg n) c) O(2n)

D. A top of tree is known as

Trees
118
Data Structure and Files

a)Leaf b) Root c) child d) parent

E. A bottom of tree is known as


a)Leaf b) Root c) child d) parent

F. What is the minimum height of a binary tree with 31 nodes?

a) 2 b)3 c)4 d)5

ANS:- A.a B.a C.b D.b E.a F.c

 Subjective Questions:-

 ADT tree

 Binary Tree

 University Exam Questions:-

Write a Program to implement a binary tree using all traversal techniques.

 References:-

 Michael T. Goodrich, “Data Structures and Algorithms in Java”, Wiley Publication

Trees
119

You might also like