HASHING FUNCTIONS
Hashing Function
• Hashing function is a function which is applied on a key by which it
produces an integer, which can be used as an address in hash table.
• A simple hashing function: h(k) = k mod m
UCS405 "Discrete Mathematical Structures"
Properties of Hashing Functions
• Easy to compute
• Uniform distribution
• Less collisions
UCS405 "Discrete Mathematical Structures"
Hash Table: Example
• Example: phone book with table size N = 5
• hash function h(w) = (length of the word w) mod 5
0 (Alice,9578 - -)
Alice
1
John 2
• Problem: collisions
3 (Ria,1234 - -)
Ria
• Where to store Joe (collides with Ria) 4 (John,8649 - -)
UCS405 "Discrete Mathematical Structures"
Collisions
• Collisions occur when different elements are mapped to the same cell.
• Keys k1, k2 with h(k1) = h(k2) are said to collide
What should we do now?
• Find a better hashing algorithm
• Use a bigger table
• Need a system to deal with collisions
UCS405 "Discrete Mathematical Structures"
Resolving Collisions
• Two different methods for collision resolution:
• Separate Chaining: Use a dictionary data structure (such as a linked
list) to store multiple items that hash to the same slot.
• Closed Hashing (or Open Addressing): search for empty slots using
a second function and store item in first empty slot that is found.
UCS405 "Discrete Mathematical Structures"
Separate Chaining
• Each cell of the hash table points to a linked list of elements that are mapped to this cell.
• Simple, but requires additional memory outside of the table
0 (Alice,9578 - -) X
Alice
1
John 2
3 (Ria,8976 - -) (Joe,3182 - -)
Ria
4 (John,4743 - -) X
Joe
UCS405 "Discrete Mathematical Structures"
Closed Hashing or Open Addressing
• Open addressing does not introduce a new structure.
• If a collision occurs then we look for availability in the next spot generated by an
algorithm.
• There are many implementations of open addressing, using different strategies for where
to probe next:
1. Linear Probing
2. Quadratic Probing
3. Double Hashing
UCS405 "Discrete Mathematical Structures"
Contd..
• Given an item X, try cells h0(X), h1(X), h2(X), …, hi(X)
• hi(X) = (Hash(X) + F(i)) mod TableSize
• F(0) = 0
• F is the collision resolution function. Some possibilities:
• Linear: F(i) = i
• Quadratic: F(i) = i2
• Double Hashing: F(i) = i*Hash2(X)
UCS405 "Discrete Mathematical Structures"
Linear Probing Example
insert(14) insert(8) insert(21) insert(2)
14%7 = 0 8%7 = 1 21%7 =0 2%7 = 2
0 0 0 0
14 14 14 14
1 1 1 1
8 8 8
2 2 2 2
21 21
3 3 3 3
2
4 4 4 4
5 5 5 5
6 6 6 6
UCS405 "Discrete Mathematical Structures"
Quadratic Probing Example
insert(14) insert(8) insert(21) insert(2)
14%7 = 0 8%7 = 1 21%7 =0 2%7 = 2
0 0 0 0
14 14 14 14
1 1 1 1
8 8 8
2 2 2 2
2
3 3 3 3
4 4 4 4
21 21
5 5 5 5
6 6 6 6
UCS405 "Discrete Mathematical Structures"
Double Hashing
• Double hashing can be done using :
(hash1(key) + i * hash2(key)) % TABLE_SIZE
• First hash function is typically
hash1(key) = key % TABLE_SIZE
• A popular second hash function is :
hash2(key) = PRIME – (key % PRIME)
where PRIME is a prime smaller than the TABLE_SIZE.
UCS405 "Discrete Mathematical Structures"
UCS405 "Discrete Mathematical Structures"
Double Hashing Example
insert(19) insert(27) insert(36) insert(10)
19%13 = 6 27%13 = 1 36%13 = 10 10%13 = 10
0 0 0 0
1 1 27 1 27 1 27 Collision 2
2 2 2 2
3 3 3 3 Let Hash2(key)=7-(key % 7)
4 4 4 4
5 5 5
Hash1(10)=10%13=10 (Collision1)
5 10
Hash 2(10)=7-(10%7)=4
6 19 6 19 6 19 6 19
7 7 7 7 (Hash1(10)+1*Hash2(10))%13=1(Collision 2)
8 8 8 8 (Hash1(10)+2*Hash2(10))%13=5
9 9 9 9
10 10 10 36 10 36 Collision 1
11 11 11 11
12 12 12 12 UCS405 "Discrete Mathematical Structures"
PRACTICE QUESTIONS ON
GROWTH OF FUNCTIONS
Q1. Give a big-O notation to estimate the sum of the first n positive integers.
UCS405 "Discrete Mathematical Structures"
Q2. Give a big-O estimate for the factorial function.
UCS405 "Discrete Mathematical Structures"
Q3. Give a big-O estimate for the following function:
f(n) = 3n log (n!) + (n2 + 3) log n
UCS405 "Discrete Mathematical Structures"
As mentioned before, big-O notation is used to estimate the number of operations needed to solve a problem using a
specified procedure or algorithm. The functions used in these estimates often include the following:
1, log n, n, n log n, n2, 2n, n!
Using calculus it can be shown that each function in the list is smaller than the succeeding function, in the sense that the
ratio of a function and the succeeding function tends to zero as n grows without bound. Figure displays the graphs of these
functions, using a scale for the values of the functions that doubles for each successive marking on the graph. That is, the
vertical scale in this graph is logarithmic.
UCS405 "Discrete Mathematical Structures"
A Display of the Growth of Functions Commonly
Used in Big-O Estimates.
Thank You