Binary Search Trees
Binary Trees
• Recursive definition
1. An empty tree is a binary tree
2. A node with two child subtrees is a binary tree
3. Only what you get from 1 by a finite number of applications of 2 is a binary
tree.
5
6
2
2
0
Is this a binary tree? 6
0
1 2
1 2
9 1
8 8
0 3
1 2 2
2 4 7
Binary Search Trees
• Viewed as data structures that can support dynamic set operations.
• Search, Minimum, Maximum, Predecessor, Successor, Insert, and Delete.
• Can be used to build
• Dictionaries.
• Priority Queues.
• Basic operations take time proportional to the height of the tree –
O(h).
BST – Representation
• Represented by a linked data structure of nodes.
• root(T) points to the root of tree T.
• Each node contains fields:
• key
• left – pointer to left child: root of left subtree.
• right – pointer to right child : root of right subtree.
• p – pointer to parent. p[root[T]] = NIL (optional).
Binary Search Tree Property
• Stored keys must satisfy the 5
binary search tree property. 6
• ∀ y in left subtree of x, then 2
2
key[y] ≤ key[x]. 0
6
• ∀ y in right subtree of x, then 0
key[y] ≥ key[x]. 1 2
1 2
9 1
8 8
0 3
1 2 2
2 4 7
Inorder Traversal
The binary-search-tree property allows the keys of a binary search
tree to be printed, in (monotonically increasing) order, recursively.
i.e the Inorder of a BST will be a sorted array of its 5
6
keys(ascending) 2
2
0
6
0
Inorder-Tree-Walk (x) 1 2
1
9
2
1
8 8
0 3
1. if x ≠ NIL 1 2 2
2 4 7
2. then Inorder-Tree-Walk(left[p])
3. print key[x]
4. Inorder-Tree-Walk(right[p])
Querying a Binary Search Tree
• All dynamic-set search operations can be supported in O(h)
time.
• h = Θ(lg n) for a balanced binary tree (and for an average tree
built by adding nodes in random order.)
• h = Θ(n) for an unbalanced tree that resembles a linear chain
of n nodes in the worst case.
Tree Search
Tree-Search(x, k)
1. if x = NIL or k = key[x]
2. then return x
3. if k < key[x]
5
4. then return Tree-Search(left[x], k) 6
2
5. else return Tree-Search(right[x], k) 2
6
0
0
1 2
1 2
9 1
Running time: O(h) 8 8
0 3
1 2 2
Aside: tail-recursion 2 4 7
Iterative Tree Search
Iterative-Tree-Search(x, k) 5
6
1. while x ≠ NIL and k ≠ key[x] 2
2
0
2. do if k < key[x] 6
0
3. then x ← left[x] 1 2
1
9
2
1
8 8
0 3
4. else x ← right[x]
1 2 2
5. return x 2 4 7
The iterative tree search is more efficient on most computers.
The recursive tree search is more straightforward.
Finding Min & Max
⬥The binary-search-tree property guarantees that:
» The minimum is located at the left-most node.
» The maximum is located at the right-most node.
Tree-Minimum(x) Tree-Maximum(x)
1. while left[x] ≠ NIL 1. while right[x] ≠ NIL
2. do x ← left[x] 2. do x ← right[x]
3. return x 3. return x
Predecessor and Successor
• Successor of node x is the node y such that key[y] is the smallest
key greater than key[x].
• The successor of the largest key is NIL.
• Search consists of two cases.
• If node x has a non-empty right subtree, then x’s successor is the
minimum in the right subtree of x.
• If node x has an empty right subtree, then:
• As long as we move to the left up the tree (move up through right children), we are
visiting smaller keys.
• x’s successor y is the node that x is the predecessor of (x is the maximum in y’s left
subtree).
• In other words, x’s successor y, is the lowest ancestor of x whose left child is also
an ancestor of x.
Pseudo-code for Successor
Tree-Successor(x)
• if right[x] ≠ NIL
2. then return Tree-Minimum(right[x])
3. y ← p[x]
5
4. while y ≠ NIL and x = right[y] 6
2
5. do x ← y 2
6
0
0
6. y ← p[y]
1 2
1 2
7. return y 8 8
9
0
1
3
Code for predecessor is symmetric. 1 2 2
2 4 7
Running time: O(h)
BST Insertion – Pseudocode
Tree-Insert(T, z)
1. y ← NIL
• Change the dynamic set
represented by a BST. 2. x ← root[T]
3. while x ≠ NIL
• Ensure the 4. do y ← x
binary-search-tree 5. if key[z] < key[x]
property holds after
6. then x ← left[x]
change. 5
6 7. else x ← right[x]
• Insertion is easier than
2
2 8. p[z] ← y
deletion. 6
0
0 9. if y = NIL
2
1 2 10. then root[t] ← z
1 9 1
8 8
0 3
11. else if key[z] < key[y]
12. then left[y] ← z
1
2
2
4
2
7
13. else right[y] ← z
Analysis of Insertion Tree-Insert(T, z)
1. y ← NIL
2. x ← root[T]
3. while x ≠ NIL
4. do y ← x
5. if key[z] < key[x]
6. then x ← left[x]
7. else x ← right[x]
8. p[z] ← y
9. if y = NIL
10. then root[t] ← z
11. else if key[z] < key[y]
12. then left[y] ← z
13. else right[y] ← z
Analysis of Insertion Tree-Insert(T, z)
1. y ← NIL
• Initialization: O(1) 2. x ← root[T]
3. while x ≠ NIL
• While loop in lines 3-7 searches 4. do y ← x
for place to insert z, maintaining 5. if key[z] < key[x]
parent y. 6. then x ← left[x]
This takes O(h) time. 7. else x ← right[x]
8. p[z] ← y
• Lines 8-13 insert the value: O(1) 9. if y = NIL
10. then root[t] ← z
⇒ TOTAL: O(h) time to insert a 11. else if key[z] < key[y]
node. 12. then left[y] ← z
13. else right[y] ← z
Exercise: Sorting Using BSTs
Sort (A)
for i ← 1 to n
do tree-insert(A[i])
inorder-tree-walk(root)
• What are the worst case and best case running times?
• In practice, how would this compare to other sorting algorithms?
Tree-Delete (T, x)
if x has no children ♦ case 0
then remove x
if x has one child ♦ case 1
then make p[x] point to child
if x has two children (subtrees) ♦ case 2
then swap x with its successor
perform case 0 or case 1 to delete it
⇒ TOTAL: O(h) time to delete a node
Deletion – Pseudocode
Tree-Delete(T, z)
/* Determine which node to splice out: either z or z’s successor. */
• if left[z] = NIL or right[z] = NIL
• then y ← z
• else y ← Tree-Successor[z]
/* Set x to a non-NIL child of x, or to NIL if y has no children. */
4. if left[y] ≠ NIL
5. then x ← left[y]
6. else x ← right[y]
/* y is removed from the tree by manipulating pointers of p[y] and x
*/
7. if x ≠ NIL
8. then p[x] ← p[y]
/* Continued on next slide */
Deletion – Pseudocode
Tree-Delete(T, z) (Contd. from previous slide)
9. if p[y] = NIL
10. then root[T] ← x
11. else if y ← left[p[i]]
12. then left[p[y]] ← x
13. else right[p[y]] ← x
/* If z’s successor was spliced out, copy its data into z */
14. if y ≠ z
15. then key[z] ← key[y]
16. copy y’s satellite data into z.
17. return y
Correctness of Tree-Delete
• How do we know case 2 should go to case 0 or case 1
instead of back to case 2?
• Because when x has 2 children, its successor is the
minimum in its right subtree, and that successor has no
left child (hence 0 or 1 child).
• Equivalently, we could swap with predecessor instead
of successor. It might be good to alternate to avoid
creating lopsided tree.
Binary Search Trees
• View today as data structures that can support
dynamic set operations.
• Search, Minimum, Maximum, Predecessor, Successor, Insert, and
Delete.
• Can be used to build
• Dictionaries.
• Priority Queues.
• Basic operations take time proportional to the height of the
tree – O(h).
Red-black trees: Overview
• Red-black trees are a variation of binary search trees to ensure that
the tree is balanced.
• Height is O(lg n), where n is the number of nodes.
• Operations take O(lg n) time in the worst case.
Red-black Tree
• Binary search tree + 1 bit per node: the attribute color, which is either
red or black.
• All other attributes of BSTs are inherited:
•key, left, right, and p.
• All empty trees (leaves) are colored black.
• We use a single sentinel, nil, for all the leaves of red-black tree T, with
color[nil] = black.
• The root’s parent is also nil[T ].
Red-black Tree – Example
2
6
1
41
7
3 4
0 7
3 5
8 0
nil[T]
Red-black Properties
1. Every node is either red or black.
2. The root is black.
3. Every leaf (nil) is black.
4. If a node is red, then both its children are black.
5. For each node, all paths from the node to descendant leaves
contain the same number of black nodes.
Height of a Red-black Tree
• Height of a node:
• Number of edges in a longest path to a leaf.
• Black-height of a node x, bh(x):
• bh(x) is the number of black nodes (including nil[T ]) on the path from x to
leaf, not counting x.
• Black-height of a red-black tree is the black-height of its root.
• By Property 5, black height is well defined.
Height of a Red-black Tree 2 h=4
bh=2
6
• Example:
1 h=3
h=1 41 bh=2
7 bh=1
• Height of a node:
• Number of edges in a longest path
to a leaf. h=2 3 4 h=2
0 bh=1
bh=1 h=1 7
• Black-height of a node bh(x) is
bh=1
the number of black nodes on 3 h=1 5
path from x to leaf, not counting 8 bh=1 0
x.
nil[T]