Advanced Data Structures and Implementation
• Top-Down Splay Trees
• Red-Black Trees
• Top-Down Red-Black Trees
• Top-Down Deletion
• Deterministic Skip Lists
• AA-Trees
• Treaps
• k-d Trees
• Pairing Heaps
Top-Down Splay Tree
• The direct strategy requires a traversal from the root down the tree, and then a bottom-up traversal to implement the splaying step.
• This can be implemented by storing parent links, or by storing the access path on a stack.
• Both methods require a large amount of overhead and must handle many special cases.
• Top-down splaying instead performs its rotations on the initial access path. It uses only O(1) extra space, but retains the O(log N) amortized time bound.
Case 1: Zig
[Figure: before and after diagrams of the Zig case, showing the trees L and R, node X with child Y, and the subtrees XR, YL and YR.]
If Y should become the root, then X and its right subtree are made the left child of the smallest value in R, and Y is made the root of the "center" tree. Y does not have to be a leaf for the Zig case to apply.
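Case 2: Zig-Zig
[Figure: before and after diagrams of the Zig-Zig case, showing the trees L and R, nodes X, Y and Z, and the subtrees XR, YR, ZL and ZR.]
The value to be splayed is in the tree rooted at Z. Rotate Y about X and attach the result as the left child of the smallest value in R.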
Case 3: Zig-Zag (Simplified)
[Figure: before and after diagrams of the simplified Zig-Zag case, showing the trees L and R, nodes X, Y and Z, and the subtrees XR, YL, ZL and ZR.]
The value to be splayed is in the tree rooted at Z. To make the code simpler, the Zig-Zag rotation is reduced to a single Zig. This results in more iterations in the splay process.
Splay Trees Implementation
• See page 457 for the generic SplayTree class implementation.
• Page 458 shows the top-down splaying method.
• Refer to page 459 for the insertion method into the top-down splay tree.
• Page 460 shows the deletion method from the tree.
Reassembling the Splay Tree
[Figure: the trees L and R and the center tree rooted at X with subtrees XL and XR, before and after reassembly.]
When the value to be splayed to the root is at the root of the "center" tree, we are ready to reassemble the tree. This is accomplished by a) making XL the right child of the maximum element in L, b) making XR the left child of the minimum element in R, and then c) making L and R the left and right children of X.
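The three cases and the reassembly step can be sketched in a few lines. The following is a minimal Python sketch, not the book's SplayTree class: the names `Node`, `splay`, `insert` and `inorder` are illustrative assumptions, and a temporary header node is assumed as one convenient way to hold the L and R trees. It uses the simplified Zig-Zag described above (a Zig-Zag is handled as a single Zig).

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def splay(t, key):
    """Top-down splay: returns the new root. If key is absent, the last
    node on the access path becomes the root."""
    if t is None:
        return None
    header = Node(None)           # header.right is the root of L, header.left of R
    left_max = right_min = header
    while True:
        if key < t.key:
            if t.left is None:
                break
            if key < t.left.key:  # Zig-Zig: rotate t.left about t
                y = t.left
                t.left, y.right = y.right, t
                t = y
                if t.left is None:
                    break
            right_min.left = t    # attach t as left child of the minimum of R
            right_min = t
            t = t.left
        elif key > t.key:
            if t.right is None:
                break
            if key > t.right.key:  # mirror-image Zig-Zig
                y = t.right
                t.right, y.left = y.left, t
                t = y
                if t.right is None:
                    break
            left_max.right = t    # attach t as right child of the maximum of L
            left_max = t
            t = t.right
        else:
            break
    # Reassemble: XL under max(L), XR under min(R), then L and R under X.
    left_max.right = t.left
    right_min.left = t.right
    t.left, t.right = header.right, header.left
    return t


def insert(root, key):
    """Splay on key, then make the new node the root (duplicates ignored)."""
    if root is None:
        return Node(key)
    root = splay(root, key)
    if key == root.key:
        return root
    n = Node(key)
    if key < root.key:
        n.left, n.right, root.left = root.left, root, None
    else:
        n.right, n.left, root.right = root.right, root, None
    return n


def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)
```

Only the header node and two pointers are needed beyond the tree itself, which is where the O(1) extra space comes from.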
Operation 1: Zig-Zig
[Figure: worked example of top-down splaying for X in a tree with nodes A, B, C, D, E, F, G, H and X, shown before and after the first Zig-Zig step, together with the trees L and R.]
Rotate B around A and make the result the left child of the minimum element in R (which is currently empty).
L is still empty, and R is now the tree rooted at B. Note that R contains nodes > X but not in the right subtree of X.
Operation 2: Zig-Zag
[Figure: the same example before and after the second step, together with the trees L and R.]
Just perform a Zig (simplified Zig-Zag).
L was previously empty, and it now consists of node D and D's left subtree.
After X reaches root:
[Figure: the trees L and R and the center tree rooted at X, followed by the reassembled tree with X at the root.]
This configuration was achieved by doing a Zig-Zig (of F, G) followed by a Zig (node H).
Reassemble: XL becomes the right subtree of H, XR becomes the left subtree of E, and then L and R are reattached to X.
Note that this is not the same tree as was obtained by doing bottom-up (BU) splaying.
Red-Black Tree
• Popular alternative to the AVL tree.
• Operations take O(log N) time in the worst case.
• Height is at most 2 log(N + 1).
• A red-black tree is a binary search tree with one extra attribute for each node: the color, which is either red or black.
• The root is black.
• If a node is red, its children must be black.
• Every path from a node to a null reference must contain the same number of black nodes.
• The basic operations used to conform with the rules are color changes and tree rotations.
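The coloring rules above can be checked mechanically. The following is a minimal Python sketch of such a checker; the names `RBNode`, `black_height` and `is_red_black` are illustrative assumptions, and the sketch checks only the color rules, not the binary search tree ordering.

```python
RED, BLACK = "red", "black"


class RBNode:
    def __init__(self, key, color, left=None, right=None):
        self.key = key
        self.color = color
        self.left = left
        self.right = right


def black_height(node):
    """Number of black nodes on every path from node down to a null
    reference. Raises ValueError if a color rule is broken below node."""
    if node is None:
        return 0
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                raise ValueError("red node %r has a red child" % node.key)
    left = black_height(node.left)
    right = black_height(node.right)
    if left != right:
        raise ValueError("unequal black counts below %r" % node.key)
    return left + (1 if node.color == BLACK else 0)


def is_red_black(root):
    """True if the root is black and both path rules hold everywhere."""
    if root is not None and root.color != BLACK:
        return False
    try:
        black_height(root)
    except ValueError:
        return False
    return True
```

For example, a black root with two red children passes, while a black root with a single black child fails because the two sides have different black counts.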
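Theorem 1 – In a red-black tree, at least half the nodes on any path from the root to a leaf must be black.
Proof – The root is black and a red node cannot have a red child, so every red node on the path has a black parent. Each red node can therefore be paired with the distinct black node directly above it.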
Theorem 2 – In a red-black tree, no path from any node, N, to a leaf is more than twice as long as any other path from N to any other leaf.
Proof: By definition, every path from a node to any leaf contains the same number of black nodes. By Theorem 1, at least half the nodes on any such path are black. Therefore, there can be no more than twice as many nodes on any path from N to a leaf as on any other path, so no path is more than twice as long as any other path.
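Theorem 3 – A red-black tree with n internal nodes has height h <= 2 lg(n + 1).
Proof: Let h be the height of the red-black tree with root x, and let bh(x) be the black height of x. By Theorem 1,
bh(x) >= h/2
The subtree rooted at x contains at least 2^bh(x) - 1 internal nodes (by induction on the height of x), so
n >= 2^bh(x) - 1
Therefore
n >= 2^(h/2) - 1
n + 1 >= 2^(h/2)
lg(n + 1) >= h/2
2 lg(n + 1) >= h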
Bottom-Up Insertion
Cases (X is the newly inserted red node):
• Case 0: X is the root. Color it black.
• Case 1: Both parent and uncle are red. Color parent and uncle black, color grandparent red, point X to the grandparent, and check the new situation.
• Case 2 (zig-zag): Parent is red, but uncle is black. X and its parent are opposite-type children. Color grandparent red, color X black, rotate left on parent, then rotate right on grandparent.
• Case 3 (zig-zig): Parent is red, but uncle is black. X and its parent are both left or both right children. Color parent black, color grandparent red, rotate right on grandparent.
• The rotation directions in cases 2 and 3 are for a parent that is a left child; mirror them when the parent is a right child.
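The four cases can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the book's code: it assumes parent pointers and `None` for null references, and the names `Node`, `RedBlackTree`, `_rotate`, `_fix`, `black_height`, `height` and `inorder` are illustrative. The last three are helpers for inspecting the result.

```python
RED, BLACK = "red", "black"


class Node:
    def __init__(self, key):
        self.key = key
        self.color = RED                    # new nodes start red
        self.left = self.right = self.parent = None


class RedBlackTree:
    def __init__(self):
        self.root = None

    def _rotate(self, x, to_left):
        """Rotate x down; to_left=True raises x's right child."""
        y = x.right if to_left else x.left
        inner = y.left if to_left else y.right
        if to_left:
            x.right, y.left = inner, x
        else:
            x.left, y.right = inner, x
        if inner is not None:
            inner.parent = x
        y.parent = x.parent
        if x.parent is None:
            self.root = y
        elif x.parent.left is x:
            x.parent.left = y
        else:
            x.parent.right = y
        x.parent = y

    def insert(self, key):
        parent, cur = None, self.root
        while cur is not None:              # ordinary BST descent
            parent, cur = cur, (cur.left if key < cur.key else cur.right)
        x = Node(key)
        x.parent = parent
        if parent is None:
            self.root = x
        elif key < parent.key:
            parent.left = x
        else:
            parent.right = x
        self._fix(x)

    def _fix(self, x):
        while x.parent is not None and x.parent.color == RED:
            p = x.parent
            g = p.parent                    # exists: a red node is never the root
            p_is_left = g.left is p
            uncle = g.right if p_is_left else g.left
            if uncle is not None and uncle.color == RED:    # case 1
                p.color = uncle.color = BLACK
                g.color = RED
                x = g
                continue
            if (p.left is x) != p_is_left:                  # case 2 (zig-zag)
                self._rotate(p, to_left=p_is_left)
                x, p = p, x                 # old X is now the child of g
            p.color = BLACK                                 # case 3 (zig-zig)
            g.color = RED
            self._rotate(g, to_left=not p_is_left)
            break
        self.root.color = BLACK                             # case 0


def black_height(n):
    """Black count on every path below n; raises ValueError on a violation."""
    if n is None:
        return 0
    if n.color == RED and any(c is not None and c.color == RED
                              for c in (n.left, n.right)):
        raise ValueError("two consecutive red nodes")
    lh, rh = black_height(n.left), black_height(n.right)
    if lh != rh:
        raise ValueError("unequal black counts")
    return lh + (n.color == BLACK)


def height(n):
    return 0 if n is None else 1 + max(height(n.left), height(n.right))


def inorder(n):
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)
```

In case 2 the first rotation turns the zig-zag into a zig-zig, so the sketch falls through into the case 3 recoloring and rotation instead of repeating them.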
Top-Down Red-Black Trees
• In top-down insertion, the corrections are done while traversing down the tree to the insertion point.
• When the actual insertion is done, no further corrections are needed, so there is no need to traverse back up the tree.
• So, top-down insertion can be done iteratively, which is generally faster.
• Insertion is always done as a leaf (as in ordinary BST insertion).
Process
• On the way down, when we see a node X that has two red children, we make X red and its two children black.
• If X's parent is red, we can apply either the single or the double rotation to keep us from having two consecutive red nodes.
• X's parent and the parent's sibling cannot both be red, since their colors would already have been flipped in that case.
Example: Insert 45
[Figure: red-black tree containing the keys 5, 10, 15, 20, 30, 40, 50, 55, 60, 65, 70, 80, 85 and 90; one node is annotated "Two red children".]
Example (Cont.)
[Figure: the same tree after the color flip, annotated "flip colors: two red nodes".]
Example (Cont.): Do a single rotation
[Figure: the tree after the single rotation.]
Example (Cont.): Now Insert 45
[Figure: the tree after 45 has been inserted as a leaf.]
Note
• Since the parent of the newly inserted node was black, we are done.
• Had the parent of the inserted node been red, one more rotation would have had to be performed.
• Although red-black trees have slightly weaker balancing properties, their performance is experimentally almost identical to that of AVL trees.
Implementation of Top-Down Red-Black Trees
• See pages 464-467
Top-Down Deletions
• Recall that in deleting from a binary search tree, the only nodes which are actually removed are leaves or nodes with exactly one child.
• Nodes with two children are never removed. Their contents are just replaced.
• If the node to be deleted is red, there is no problem: just delete the node.
• If the node to be deleted is black, its removal will violate the property that every path to a null reference contains the same number of black nodes.
• The solution is to ensure that any node to be deleted is red.
Deterministic Skip Lists
• A skip list is a probabilistically balanced linked list.
• Invented in 1986 by William Pugh.
• Definition: Two elements are linked if there exists at least one link going from one to another.
• Definition: The gap size between two elements linked at height h is equal to the number of elements of height h-1 between them.
Skip List
[Figure: two linked lists over the keys 3, 6, 7, 9, 12, 17, 19, 21, 25 and 26, each ending in NIL. (d) Extra pointers every eighth item: the full structure. (e) Skip list: same link distribution, random choice.]
Search time
• In the deterministic version (a-d):
  – in a, we need to check at most n nodes
  – in b, at most n/2 + 1 nodes
  – in c, at most n/4 + 2 nodes
  – in general, at most log n nodes
• Efficient search, but impractical insertion and deletion.
Levels
• A node with k forward pointers is called a level k node.
• If every (2^i)th node has a pointer 2^i nodes ahead, the levels have the following distribution:

level   percent
1       50
2       25
3       12.5
…       …
Central idea in skip lists
• Choose levels of nodes randomly, but in the same proportions (as in e).
• A node's i-th forward pointer points to the next node of level i or higher.
• Insertions and deletions require only local modifications.
• A node's level never changes after first being chosen.
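The randomized scheme can be sketched as follows. This is a minimal Python sketch, not the implementation from the referenced pages: the names `SkipNode`, `SkipList`, `MAX_LEVEL` and the `seed` parameter are illustrative assumptions. A level is chosen by repeated coin flips, which gives the 50 / 25 / 12.5 percent distribution in expectation.

```python
import random


class SkipNode:
    def __init__(self, key, level):
        self.key = key
        # forward[i] points to the next node whose level is at least i + 1
        self.forward = [None] * level


class SkipList:
    MAX_LEVEL = 16  # assumed cap on node levels, chosen for this sketch

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.head = SkipNode(None, self.MAX_LEVEL)
        self.level = 1  # highest level currently in use

    def _random_level(self):
        """Level k with probability 1/2^k (up to the cap)."""
        level = 1
        while level < self.MAX_LEVEL and self.rng.random() < 0.5:
            level += 1
        return level

    def contains(self, key):
        node = self.head
        for i in reversed(range(self.level)):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key

    def insert(self, key):
        # update[i] is the last node at level i + 1 that precedes the new key
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in reversed(range(self.level)):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        level = self._random_level()
        self.level = max(self.level, level)
        new = SkipNode(key, level)
        for i in range(level):          # only local pointer changes
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def keys(self):
        out, node = [], self.head.forward[0]
        while node is not None:
            out.append(node.key)
            node = node.forward[0]
        return out
```

The level of a node is fixed when it is created; insertion touches only the predecessors recorded in `update`.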
Insertion
• To perform insertion, we must make sure that when a new node of height h is added, it doesn't create a gap of four nodes of height h (in a 1-2-3 deterministic skip list).
• See page 269 fig. 12.19
• For implementation of Skip List see pages 472-474.
AA-Trees
• Also known as binary B-tree (BB-tree).
• A BB-tree is a red-black tree with one extra condition: any node may have at most one red child.
• Some conditions to make it simpler (p.475):
  - only a right child can be red
  - code functions recursively
  - instead of a color, store a level in a small integer:
    - one if the node is a leaf
    - the level of its parent, if the node is red
    - one less than the level of its parent, if the node is black
Advantages
• AA-trees simplify algorithms by:
  - eliminating half of the restructuring cases
  - simplifying deletion by removing an annoying case
• If an internal node has only one child, that child must be a red right child.
• We can always replace a node with the smallest node in its right subtree (it will either be a leaf or have a red child).
Links in AA-tree
• A horizontal link is a connection between a node and a child with equal levels.
• Horizontal links are right references.
• There cannot be two consecutive horizontal links.
• Nodes at level 2 or higher must have two children.
• If a node has no right horizontal link, its two children are at the same level.
Example
[Figure: AA-tree containing the keys 5, 10, 15, 20, 30, 35, 40, 50, 55, 60, 65, 70, 80, 85 and 90.]
Insertion in AA-tree
• A new item is always inserted at the bottom level.
• In the previous example, inserting 2 will create a horizontal left link.
• In the previous example, inserting 45 generates consecutive right links.
• After inserting at the bottom level, we may need to perform rotations to restore the horizontal link properties.
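skew – remove left horizontal links
[Figure: nodes P and X with subtrees A, B and C, before and after skew; the left horizontal link becomes a right horizontal link.]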
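split – remove consecutive horizontal links
[Figure: nodes X, P and G with subtrees A and B, before and after split; the middle node of the two consecutive right horizontal links is raised one level.]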
More on skew & split
• skew removes a left horizontal link.
• skew might also create consecutive right horizontal links.
• First we must apply skew and then use split, if necessary.
• After a split, the middle node increases a level, which may create a problem for the original parent.
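The skew-then-split order can be sketched in a short recursive insertion. This is a minimal Python sketch under stated assumptions, not the code from the referenced pages: the names `AANode`, `skew`, `split`, `insert` and `inorder` are illustrative, `None` stands for a null reference, and duplicate keys are ignored.

```python
class AANode:
    def __init__(self, key):
        self.key = key
        self.level = 1          # new items always go in at the bottom level
        self.left = None
        self.right = None


def skew(t):
    """Remove a left horizontal link with a right rotation."""
    if t is not None and t.left is not None and t.left.level == t.level:
        left = t.left
        t.left, left.right = left.right, t
        return left
    return t


def split(t):
    """Remove two consecutive right horizontal links with a left rotation,
    raising the middle node one level."""
    if (t is not None and t.right is not None
            and t.right.right is not None
            and t.right.right.level == t.level):
        right = t.right
        t.right, right.left = right.left, t
        right.level += 1
        return right
    return t


def insert(t, key):
    if t is None:
        return AANode(key)
    if key < t.key:
        t.left = insert(t.left, key)
    elif key > t.key:
        t.right = insert(t.right, key)
    # Repair on the way back up: skew first, then split.
    return split(skew(t))


def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)
```

Because the repair runs at every node on the way back up the recursion, a level increase caused by a split is handled by the original parent in the next step.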
Implementation of AA-trees
• Refer to pages 476-480 for detailed implementation techniques.
• See pages 477 and 479 for more examples of left and right rotations of AA-trees.
Treaps
• Binary search tree.
• Like skip list, it uses random numbers and
gives O (log N) expected time for any
input.
• Slower than balanced search tree.
• Although deletion is much slower, it is still
O (log N) expected time.
Definition of a treap
• Each node stores an item, left and right links, and a priority that is randomly assigned when the node is created.
• A treap is a binary search tree with the property that the node priorities satisfy heap order: any node's priority must be at least as large as its parent's.
• See pages 481-483 for implementation details.
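The definition suggests a simple insertion: insert as in a binary search tree, then rotate the new node up while its priority is smaller than its parent's. The following is a minimal Python sketch, not the code from the referenced pages; the names `TreapNode`, `insert`, `inorder` and `is_heap_ordered` are illustrative assumptions, and the `rng` parameter is assumed to be anything with a `random()` method.

```python
import random


class TreapNode:
    def __init__(self, key, priority):
        self.key = key
        self.priority = priority
        self.left = None
        self.right = None


def rotate_right(t):
    left = t.left
    t.left, left.right = left.right, t
    return left


def rotate_left(t):
    right = t.right
    t.right, right.left = right.left, t
    return right


def insert(t, key, rng=random):
    """BST insertion followed by rotations that restore heap order
    (smaller priority nearer the root). Duplicate keys are ignored."""
    if t is None:
        return TreapNode(key, rng.random())
    if key < t.key:
        t.left = insert(t.left, key, rng)
        if t.left.priority < t.priority:
            t = rotate_right(t)
    elif key > t.key:
        t.right = insert(t.right, key, rng)
        if t.right.priority < t.priority:
            t = rotate_left(t)
    return t


def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)


def is_heap_ordered(t):
    """True if no child has a smaller priority than its parent."""
    if t is None:
        return True
    for child in (t.left, t.right):
        if child is not None and child.priority < t.priority:
            return False
    return is_heap_ordered(t.left) and is_heap_ordered(t.right)
```

Even when keys arrive in sorted order, the random priorities determine the shape, which is what gives the expected O(log N) depth.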
k-d Trees
• Multidimensional binary search tree.
• In a 2-d tree, branching on odd levels is done with respect to the first key, and branching on even levels is done with respect to the second key.
• The root is arbitrarily chosen to be an odd level.
• Can be visually represented:
[Figure: 2-d tree diagram.]
Some facts about k-d trees
• Can have any number of dimensions.
• In practice searches tend to be very efficient.
• For a randomly constructed tree, the average running time of a partial match query is O(M + kN^(1-1/k)).
• See pages 484-485 for implementation details.
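The alternating-key branching can be sketched as follows. This is a minimal Python sketch, not the code from the referenced pages: the names `KdNode`, `insert` and `range_query` are illustrative assumptions, points are tuples, and depth 0 corresponds to the root (an odd level), which branches on the first key.

```python
class KdNode:
    def __init__(self, point):
        self.point = point
        self.left = None
        self.right = None


def insert(t, point, depth=0):
    """Insert a k-dimensional point; the key compared cycles with depth."""
    if t is None:
        return KdNode(point)
    axis = depth % len(point)
    if point[axis] < t.point[axis]:
        t.left = insert(t.left, point, depth + 1)
    else:
        t.right = insert(t.right, point, depth + 1)
    return t


def range_query(t, low, high, depth=0, out=None):
    """Collect all points p with low[i] <= p[i] <= high[i] for every i."""
    if out is None:
        out = []
    if t is None:
        return out
    if all(lo <= c <= hi for lo, c, hi in zip(low, t.point, high)):
        out.append(t.point)
    axis = depth % len(low)
    # The left subtree holds smaller values on this axis, the right
    # subtree holds values that are equal or larger.
    if low[axis] < t.point[axis]:
        range_query(t.left, low, high, depth + 1, out)
    if high[axis] >= t.point[axis]:
        range_query(t.right, low, high, depth + 1, out)
    return out
```

A subtree is skipped whenever the query rectangle lies entirely on the other side of the current node's key, which is why searches tend to be efficient in practice.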
Pairing Heaps
• A min (max) pairing heap is a min (max) tree in which operations are done in a specified manner.
[Figure: example pairing heap.]
Insert
• Create a 1-element max tree with the new item and meld it with the existing max pairing heap.
[Figure: a max pairing heap with root 9, before and after insert(2).]
Insert (Cont.)
• Create a 1-element max tree with the new item and meld it with the existing max pairing heap.
[Figure: a max pairing heap with root 9, before and after insert(14); 14 becomes the new root.]
IncreaseKey (Node, theAmount)
• Since nodes do not have parent fields, we cannot easily check whether the key in the Node becomes larger than that in its parent.
• So, detach the Node from its doubly linked sibling list and meld.
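Meld, insert and IncreaseKey for a max pairing heap can be sketched as follows. This is a minimal Python sketch, not the code from the referenced pages. The names `PNode`, `meld`, `insert`, `increase_key` and `keys` are illustrative, and the node layout is an assumption: each node keeps a leftmost-child link, a next-sibling link, and a `prev` link that points to the previous sibling, or to the parent when the node is a leftmost child.

```python
class PNode:
    def __init__(self, key):
        self.key = key
        self.child = None   # leftmost child
        self.next = None    # next sibling
        self.prev = None    # previous sibling, or parent if leftmost child


def meld(a, b):
    """Meld two max trees: the root with the smaller key becomes the
    leftmost child of the other root."""
    if a is None:
        return b
    if b is None:
        return a
    if a.key < b.key:
        a, b = b, a
    b.prev = a
    b.next = a.child
    if a.child is not None:
        a.child.prev = b
    a.child = b
    return a


def insert(root, key):
    """Meld a 1-element tree with the heap; returns (new_root, new_node)."""
    node = PNode(key)
    return meld(root, node), node


def increase_key(root, node, amount):
    """Increase node's key, detach its subtree, and meld it back in."""
    node.key += amount
    if node is root:
        return root
    if node.prev.child is node:         # node is a leftmost child
        node.prev.child = node.next
    else:                               # node has a previous sibling
        node.prev.next = node.next
    if node.next is not None:
        node.next.prev = node.prev
    node.prev = node.next = None
    return meld(root, node)


def keys(node):
    """All keys in the heap, sorted (helper for inspection only)."""
    out = []
    while node is not None:
        out.append(node.key)
        out.extend(keys(node.child))
        node = node.next
    return sorted(out)
```

The subtree is detached and melded unconditionally, so the sketch never needs to compare the increased key with the key of the node's parent.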
IncreaseKey (Node, theAmount)
[Figure: a max pairing heap with root 9; theNode marks the node whose key is to be increased.]
• If theNode is not the root, remove the subtree rooted at theNode from its sibling list.
IncreaseKey (Node, theAmount)
[Figure: the detached subtree, whose root key is now 18, shown beside the remaining heap with root 9.]
IncreaseKey (Node, theAmount)
[Figure: the result of melding the two trees; 18 is the new root.]
Pairing heaps
• See pages 488 – 491 for implementation
details.
