5.4 randomized datastructures

CSE 326
Randomized Data Structures
David Kaplan
Dept of Computer Science & Engineering
Autumn 2001

Randomized Data
Structures
CSE 326 Autumn 2001
2
Motivation for
We’ve seen many data structures with good average
case performance on random inputs, but bad behavior
on particular inputs
 e.g. Binary Search Trees
 Instead of randomizing the input (since we cannot!),
consider randomizing the data structure
 No bad inputs, just unlucky random numbers
 Expected case good behavior on any input

Randomized Data
Structures
CSE 326 Autumn 2001
3
Average vs. Expected Time
Average (1/N) • ∑ xi
Expectation ∑ Pr(xi) • xi
Deterministic with good average time
 If your application happens to always (or often) use the
“bad” case, you are in big trouble!
Randomized with good expected time
 Once in a while you will have an expensive operation, but no
inputs can make this happen all the time
Like an insurance policy for your algorithm!

Randomized Data
Structures
CSE 326 Autumn 2001
4
 Define a property (or subroutine) in an algorithm
 Sample or randomly modify the property
 Use altered property as if it were the true property
Can transform average case runtimes into expected
runtimes (remove input dependency).
Sometimes allows substantial speedup in exchange for
probabilistic unsoundness.

Randomized Data
Structures
CSE 326 Autumn 2001
5
Randomization in Action
 Quicksort
 Randomized data structures
 Treaps
 Randomized skip lists
 Random number generation
 Primality testing

Randomized Data
Structures
CSE 326 Autumn 2001
6
Treap
Dictionary Data Structure
Treap is a BST
 binary tree property
 search tree property
Treap is also a heap
 heap-order property
 random priorities
15
12
10
30
9
15
7
8
4
18
6
7
2
9
priority
key

Randomized Data
Structures
CSE 326 Autumn 2001
7
Treap Insert
 Choose a random priority
 Insert as in normal BST
 Rotate up until heap order is restored
6
7
insert(15)
7
8
2
9
14
12
6
7
7
8
2
9
14
12
9
15
6
7
7
8
2
9
9
15
14
12

Randomized Data
Structures
CSE 326 Autumn 2001
8
Tree + Heap… Why Bother?
 Insert data in sorted order into a treap …
What shape tree comes out?
6
7
insert(7)
6
7
insert(8)
7
8
6
7
insert(9)
7
8
2
9
6
7
insert(12)
7
8
2
9
15
12

Treap Delete
 Find the key
 Increase its value to ∞
 Rotate it to the fringe
 Snip it off
delete(9)
6
7
7
8
2⇒∞
9
9
15
15
12
7
8
6
7
∞
9
9
15
15
12
7
8
6
7
9
15
∞
9
15
12
7
8
6
7
9
15
15
12
∞
9
7
8
6
7
9
15
15
12

Randomized Data
Structures
CSE 326 Autumn 2001
10
Treap Summary
 Implements Dictionary ADT
 insert in expected O(log n) time
 delete in expected O(log n) time
 find in expected O(log n) time
 but worst case O(n)
 Memory use
 O(1) per node
 about the cost of AVL trees
 Very simple to implement
 little overhead – less than AVL trees

Randomized Data
Structures
CSE 326 Autumn 2001
11
Perfect Binary Skip List
 Sorted linked list
 # of links of a node is its height
 The height i link of each node (that has one)
links to the next node of height i or greater
8
2
11
10
19
13
20
22
29
23

Randomized Data
Structures
CSE 326 Autumn 2001
12
Find in a Perfect Binary Skip List
 Start i at the maximum height
 Until the node is found or i is one and the next
node is too large:
 If the next node along the i link is less than the
target, traverse to the next node
 Otherwise, decrease i by one
Runtime?

Randomized Data
Structures
CSE 326 Autumn 2001
13
Randomized Skip List Intuition
It’s far too hard to insert into a perfect skip list
But is perfection necessary?
What matters in a skip list?
 We want way fewer tall nodes than short ones
 Make good progress through the list with each
high traverse

Randomized Data
Structures
CSE 326 Autumn 2001
14
Randomized Skip List
 Sorted linked list
 # of links of a node is its height
 The height i link of each node (that has one) links to
the next node of height i or greater
 There should be about 1/2 as many height i+1 nodes
as height i nodes
2
19
23
8
13
29
20
10
22
11

Randomized Data
Structures
CSE 326 Autumn 2001
15
Find in a RSL
 Start i at the maximum height
 Until the node is found or i is one and the
next node is too large:
 If the next node along the i link is less than the
target, traverse to the next node
 Otherwise, decrease i by one
Same as for a perfect skip list!
Runtime?

Randomized Data
Structures
CSE 326 Autumn 2001
16
Insertion in a RSL
 Flip a coin until it comes up heads; that takes
i flips. Make the new node’s height i.
 Do a find, remembering nodes where we go
down
 Put the node at the spot where the find ends
 Point all the nodes where we went down (up
to the new node’s height) at the new node
 Point the new node’s links where those
redirected pointers were pointing

RSL Insert Example
2
19
23
8
13
29
20
10
11
insert(22)
with 3 flips
2
19
23
8
13
29
20
10
22
11
Runtime?

Randomized Data
Structures
CSE 326 Autumn 2001
18
Range Queries and Iteration
Range query: search for everything that falls between
two values
 find start point
 walk list at the bottom
 output each node until the end point
Iteration: successively return (in order) each element in
the structure
 start at beginning, walk list to end
 Just like a linked list!
Runtimes?

Randomized Data
Structures
CSE 326 Autumn 2001
19
Randomized Skip List
Summary
 Implements Dictionary ADT
 insert in expected O(log n)
 find in expected O(log n)
 Memory use
 expected constant memory per node
 about double a linked list
 Less overhead than search trees for iteration
over range queries

Randomized Data
Structures
CSE 326 Autumn 2001
20
Random Number Generation
Linear congruential generator
 xi+1 = Axi mod M
Choose A, M to minimize repetitions
 M is prime
 (Axi mod M) cycles through {1, … , M-1} before repeating
 Good choices:
M = 231
– 1 = 2,147,483,647
A = 48,271
Must handle overflow!

Randomized Data
Structures
CSE 326 Autumn 2001
21
Some Properties of Primes
If:
 P is prime
 0 < X < P
Then:
 XP-1
= 1 (mod P)
 the only solutions to X2
= 1 (mod P) are:
X = 1 and X = P - 1

Randomized Data
Structures
CSE 326 Autumn 2001
22
Checking Non-Primality
Exponentiation (pow function, Weiss 2.4.4) can be
modified to detect non-primality (composite-ness)
// Computes x^n mod(M)
HugeInt pow(HugeInt x, HugeInt n, HugeInt M) {
if (n == 0) return 1;
if (n == 1) return x;
HugeInt squared = x * x % M;
// 1 < x < M - 1 && squared == 1 --> M isn’t prime!
if (isEven(n)) return pow(squared, n/2, M);
else return (pow(squared, n/2, M) * x);
}

Randomized Data
Structures
CSE 326 Autumn 2001
23
Checking Primality
Systematic algorithm
For prime P, for all A such that 0 < A < P
Calculate AP-1
mod P using pow
Check at each step of pow and at end for primality conditions
Randomized algorithm
Use one random A
 If the randomized algorithm reports failure, then P isn’t prime.
 If the randomized algorithm reports success, then P might be prime.
 At most (P-9)/4 values of A fool this prime check
 if algorithm reports success, then P is prime with probability > ¾
Each new A has independent probability of false positive
 We can make probability of false positive arbitrarily small

Randomized Data
Structures
CSE 326 Autumn 2001
24
Evaluating
Randomized Primality Testing
Your probability of being struck by lightning this year:
0.00004%
Your probability that a number that tests prime 11 times
in a row is actually not prime: 0.00003%
Your probability of winning a lottery of 1 million people
five times in a row: 1 in 2100
Your probability that a number that tests prime 50 times
in a row is actually not prime: 1 in 2100

Randomized Data
Structures
CSE 326 Autumn 2001
25
Cycles
Prime numbers
Random numbers
Randomized
algorithms

5.4 randomized datastructures

More Related Content

What's hot

Viewers also liked

Similar to 5.4 randomized datastructures

More from Krish_ver2

Recently uploaded

5.4 randomized datastructures

Editor's Notes