Transform and Conquer: AVL & 2-3 Trees
Transform and Conquer: AVL & 2-3 Trees
Module- 3
Transform and Conquer, Space- Time Tradeoffs
Transform-and-conquer methods work as two-stage procedures. First, in the transformation stage, the
problem’s instance is modified to be, for one reason or another, more amenable to solution. Then, in the
second or conquering stage, it is solved.
There are three major variations of this idea
1. Instance simplification: Transformation to a simpler or more convenient instance of the same
problem
2. Representation change : Transformation to a different representation of the same instance.
3. problem reduction- Transformation to an instance of a different problem for which an
algorithm is already available
Binary search tree is one of the principal data structures for implementing dictionaries.
A binary Search tree is a binary tree whose nodes contain elements of a set of orderable items, one
element per node, so that all elements in the left subtree are smaller than the element in the subtree’s
root, and all the elements in the right subtree are greater than it.
This Transformation gives us a time efficiency of searching, insertion, and deletion, which are all in
(logn), but only in the average case. In the worst case, these operations are in (n) because the tree can
degenerate into a severely unbalanced one with its height equal to n − 1.
CSE@HKBKCE 1 2023-24
ADA Module-3
2. Representation-change variety approach: It allow more than one element in a node of a search
tree. Specific cases of such trees are 2-3 trees, 2-3-4 trees, and more general and important B-trees.
They differ in the number of elements admissible in a single node of a search tree, but all are
perfectly balanced.
AVL Trees
Balance factor of a node, is defined as the difference between the heights of the node’s left and right
subtrees.An AVL tree is a binary search tree in which the balance
is either 0 or +1 or −1
Example
If an insertion of a new node makes an AVL tree unbalanced, we transform the tree by a rotation.
A rotation in an AVL tree is a local transformation of its subtree rooted at a node whose balance has
become either +2 or −2. If there are several such nodes, we rotate the tree rooted at the unbalanced node
that is the closest to the newly inserted leaf. There are only four types of rotations
Single right rotation, or R-rotation :This rotation is performed after a new key is inserted into the left
subtree of the left child of a tree whose root had the balance of +1 before the insertion.
CSE@HKBKCE 2 2023-24
ADA Module-3
The symmetric single left rotation, or L-rotation, is the mirror image of the single R-rotation. It is
performed after a new key is inserted into the right subtree of the right child of a tree whose root had the
balance of −1 before the insertion.
Double left-right rotation (LR-rotation): It is a combination of two rotations: The L-rotation of the left
subtree of root r followed by the R-rotation of the new tree rooted at r . It is performed after a new key is
inserted into the right subtree of the left child of a tree whose root had the balance of +1 before the
insertion.
Double right-left rotation (RL-rotation) is the mirror image of the double LR-rotation. The R-rotation of
the Right subtree of root r followed by the L-rotation of the new tree rooted at r. It is performed after a
new key is inserted into the left subtree of the right child of a tree whose root had the balance of -1 before
the insertion.
CSE@HKBKCE 3 2023-24
ADA Module-3
Note: if there are several nodes with the ±2 balance, the rotation is done for the tree rooted at the
unbalanced node that is the closest to the newly inserted leaf.
Time Efficiency of AVL tree: The efficiency of the dictionary operations depends on the tree’s height.
The operations of searching and insertion are (log n) in the worst case
The drawbacks of AVL trees are frequent rotations and the need to maintain balances for its nodes.
These drawbacks have prevented AVL trees from becoming the standard structure for implementing
dictionaries
But their underlying idea of rebalancing a binary search tree via rotations has proved to be very fruitful
and has led to discoveries of other interesting variations of the classical binary search tree.
CSE@HKBKCE 4 2023-24
ADA Module-3
2-3 Trees
the second idea of balancing a search tree is to allow more than one key in the same node of such a tree.
The simplest implementation of this idea is 2-3 trees, introduced by the U.S. computer scientist John
Hopcroft in 1970.
A 2-3 tree is a tree that can have nodes of two kinds: 2-nodes and 3-nodes.
• A 2-node contains a single key K and has two children . The left child serves as the root of a
subtree whose keys are less than K, and the right child serves as the root of a subtree whose keys
are greater than K.
• A 3-node contains two ordered keys K1 and K2 (K1<K2) and has three children. The leftmost
child serves as the root of a subtree with keys less than K1, the middle child serves as the root of
a subtree with keys between K1 and K2, and the rightmost child serves as the root of a subtree with
keys greater than K2
The last requirement of the 2-3 tree is that all its leaves must be on the same level. In other words, a 2-3
tree is always perfectly height-balanced, the length of a path from the root to a leaf is the same for every
leaf.
If the root is a 3- node, we know after no more than two key comparisons whether the search can be
stopped if K is equal to one of the root’s keys or in which of the root’s three subtrees it needs to be
continued.
Always insert a new key K in a leaf, except for the empty tree. The appropriate leaf is found by performing
a search for K.
• If the leaf in question is a 2-node, we insert K th aere as either the first or the second key, depending
on whether K is smaller or larger than the node’s old key.
• If the leaf is a 3-node, we split the leaf in two:
o The smallest of the three keys and the new is put in the first leaf,
o The largest key is put in the second leaf, and
o The middle key is promoted to the old leaf’s parent. If the leaf happens to be the tree’s root,
a new root is created to accept the middle key. The promotion of a middle key to its parent
CSE@HKBKCE 5 2023-24
ADA Module-3
can cause the parent’s overflow (if it was a 3-node) and hence can lead to several node
splits along the chain of the leaf’s ancestors.
Time Efficiency: The efficiency of the dictionary operations depends on the tree’s height.
h ≤ log2(n + 1) − 1.
A 2-3 tree of height h with the largest number of keys is a full tree of 3-nodes, each with two keys and
three children. Therefore, for any 2-3 tree with n nodes
h ≥ log3(n + 1) − 1.
log3(n + 1) − 1≤ h ≤ log2(n + 1) − 1
Implies that the time efficiencies of searching, insertion, and deletion are all in (log n) in both the worst
and average case
Definition - A heap can be defined as a binary tree with keys assigned to its nodes, one key per node,
and satisfying the following conditions
1. The shape property—the binary tree is essentially complete (or simply complete), i.e., all its levels
are full except possibly the last level, where only some rightmost leaves may be missing.
2. The parental dominance or heap property—the key in each node is greater than or equal to the
keys in its children.
Example:
CSE@HKBKCE 6 2023-24
ADA Module-3
Properties of heap
1. There exists exactly one essentially complete binary tree with n nodes. Its height is equal to log2 n.
2. The root of a heap always contains its largest element.
3. A node of a heap considered with all its descendants is also a heap.
4. A heap can be implemented as an array by recording its elements in the top down,left-to-right
fashion. It is convenient to store the heap’s elements in positions 1 through n of such an array,
leaving H[0] either unused or putting there a sentinel whose value is greater than every element in
the heap. In such a representation,
a. The parental node keys will be in the first n/2 positions of the array, while the leaf keys will
occupy the last n/2 positions;
b. The children of a key in the array’s parental position i (1≤ i ≤ _n/2_) will be in positions 2i and
2i + 1, and, correspondingly, the parent of a key in position i (2 ≤ i ≤ n) will be in position i/2.
Consider the example given below, and verify the properties listed above
Construction of a Heap
CSE@HKBKCE 7 2023-24
ADA Module-3
• This method first initializes the essentially complete binary tree with n nodes by placing keys in
the order given and then “heapifies” the tree as follows.
• Starting with the last parental node, the algorithm checks whether the parental. dominance holds
for the key in this node.
• If it does not, the algorithm exchanges the node’s key K with the larger key of its children and
checks whether the parental dominance holds for K in its new position.
• This process continues until the parental dominance for K is satisfied.
• After completing the heapification” of the subtree rooted at the current parental node, the
algorithm proceeds to do the same for the node’s immediate predecessor.
• The algorithm stops after this is done for the root of the tree.
ALGORITHM HeapBottomUp(H[1..n])
//Constructs a heap from elements of a given array
//Input: An array H[1..n] of orderable items
//Output: A heap H[1..n]
for i ←n/2 downto 1 do
k←i; v←H[k]
heap←false
while not heap and 2 ∗ k ≤ n do
j ←2 ∗ k
if j <n //there are two children
if H[j ]<H[j + 1] j ←j + 1
if v ≥ H[j ]
heap←true
else H[k]←H[j ]; k←j
H[k]←v
Effeciency
Assume n = 2k − 1 so that a heap’s tree is full, i.e., the largest possible number of nodes occurs on each
level. Let h be the height of the tree. According to the property of heaps h= log2 n.
CSE@HKBKCE 8 2023-24
ADA Module-3
• Each key on level i of the tree will travel to the leaf level h in the worst case of the heap
construction algorithm.
• Moving to the next level down requires two comparisons one to find the larger child and the
other to determine whether the exchange is required—the total number of key comparisons
involving a key on level i will be 2(h − i).
• Therefore, the total number of key comparisons in the worst case will be
•
Algorithm HeapTopup
This method constructs a heap by successive insertions of a new key into a previously constructed heap.
1. Attach a new node with key K in it after the last leaf of the existing heap.
2. Then shift K up to its appropriate place in the new heap as follows.
a. Compare K with its parent’s key: if the latter is greater than or equal to K, stop(the
structure is a heap); otherwise
b. swap these two keys and compare K with its new parent. This swapping continues until K
is not greater than its last parent or it reaches the root
Example:
Step 1 Exchange the root’s key with the last key K of the heap.
Step 2 Decrease the heap’s size by 1.
Step 3 “Heapify” the smaller tree. That is, verify the parental dominance for K: if it holds, we can stop if not,
swap K with the larger of its children and repeat this operation until the parental dominance condition holds
for K in its new position.
CSE@HKBKCE 9 2023-24
ADA Module-3
The efficiency of deletion is determined by the number of key comparisons needed to ―heapify the tree
after the swap has been made and the size of the tree is decreased by 1. Since this cannot require more key
comparisons than twice the heap’s height, therefore the time efficiency of deletion is in O(log n)
Heapsort
CSE@HKBKCE 10 2023-24
ADA Module-3
The different techniques that exploits space time trade offs are:
Input Enhancement: The idea is to preprocess the problem’s input, in whole or in part, and store the
additional information obtained to accelerate solving the problem afterward. We call this approach input
enhancement. Example Counting methods for sorting , Boyer-Moore algorithm for string matching
Pre-structuring: This technique uses extra space to facilitate faster and/or more flexible access to the
data. We call this approach pre-structuring. Here some processing is done before a problem in question is
actually solved example hashing
the two resources—time and space—do not have to compete with each other in all design situations. In
fact, they can align to bring an algorithmic solution that minimizes both the running time and the space
consumed. Such a situation arises, in particular, when an algorithm uses a space efficient data structure to
represent a problem’s input, which leads, in turn, to a faster algorithm. Consider, as an example, the
problem of traversing graphs. Re call that the time efficiency of the two principal traversal algorithms—
depth-first search and breadth-first search—depends on the data structure used for representing graphs: it
CSE@HKBKCE 11 2023-24
ADA Module-3
is n2 for the adjacency matrix representation and (n+ m) for the adjacency list representation, where n
and m are the numbers of vertices and edges, respectively.
Sorting By counting
It is an example of applying the input-enhancement technique. The idea is to count, for each element of a
list to be sorted, the total number of elements smaller than this element and record the results in a table.
These numbers will indicate the positions of the elements in the sorted list
Time Complexity
CSE@HKBKCE 12 2023-24
ADA Module-3
Thus, the algorithm makes the same number of key comparisons as selection sort and in addition uses a
linear amount of extra space. On the positive side, the algorithm makes the minimum number of key
moves possible, placing each of them directly in their final position in a sorted array. The counting idea
does work productively in a situation in which elements to be sorted belong to a known small set of values.
Distribution Counting
EXAMPLE Consider sorting the array
13 11 12 13 12 12
whose values are known to come from the set {11, 12, 13} and should not be overwritten in the process
of sorting.
CSE@HKBKCE 13 2023-24
ADA Module-3
Here we see how the technique of input enhancement can be applied to the problem of string matching.
Horspool’s Algorithm
Starting with the last R of the pattern and moving right to left, we compare the corresponding pairs of
characters in the pattern and the text. If all the pattern’s characters match successfully, a matching
substring is found. Then the search can be either stopped altogether or continued if another occurrence of
the same pattern is desired.
If a mismatch occurs, we need to shift the pattern to the right. Clearly, we would like to make as large a
shift as possible without risking the possibility of missing a matching substring in the text.
Horspool’s algorithm determines the size of such a shift by looking at the character c of the text that is
aligned against the last character of the pattern. This is the case even if character c itself matches its
counterpart in the pattern.
Case 1 If there are no c’s in the pattern—e.g., c is letter S in our example— we can safely shift the pattern
by its entire length
Case2 If there are occurrences of character c in the pattern but it is not the last one there—e.g., c is letter
B in our example—the shift should align the rightmost occurrence of c in the pattern with the c in the text:
Case 3 If c happens to be the last character in the pattern but there are no c’s among its other
m−1characters—e.g., c is letter R in our example—the situation is similar to that of Case 1 and the pattern
should be shifted by the entire pattern’s length m:
CSE@HKBKCE 14 2023-24
ADA Module-3
Case 4 Finally, if c happens to be the last character in the pattern and there are other c’s among its first
m − 1characters—e.g., c is letter R in our example— the situation is similar to that of Case 2 and the
rightmost occurrence of c among the first m −1characters in the pattern should be aligned with the text’s
c
However, if such an algorithm had to check all the characters of the pattern on every trial, it would lose
much of this superiority. But the idea of input enhancement makes repetitive comparisons unnecessary.
We can pre compute shift sizes and store them in a table. The table will be indexed by all possible
characters that can be encountered in a text, including, for natural language texts, the space, punctuation
symbols, and other special characters The table’s entries will indicate the shift sizes computed by the
formula
CSE@HKBKCE 15 2023-24
ADA Module-3
EXAMPLE As an example of a complete application of Horspool’s algorithm, consider searching for the
pattern BARBER in a text that comprises English letters and spaces (denoted by underscores). The shift
table, as we mentioned, is filled as follows:
CSE@HKBKCE 16 2023-24