MODULE -5
1. What is hashing? List any three applications of hashing
In data structures,
Hashing is a well-known technique to search any particular element among several
elements.
It minimizes the number of comparisons while performing the search
Hashing Mechanism-
In hashing,
• Hashing is the process of mapping large amount of data item to smaller table
with the help of hashing function.
• Hashing is also known as Hashing Algorithm or Message Digest Function.
• An array data structure called as Hash table is used to store the data items.
• Based on the hash key value, data items are inserted into the hash table.
Applications
Database indexing: Hashing is used to index and retrieve data efficiently in
databases and other data storage systems.
Dictionaries : To implement a dictionary so that we can quickly search a word
Password storage: Hashing is used to store passwords securely by applying a hash
function to the password and storing the hashed result, rather than the plain text
password.
Network Routing: Determining the best path for data packets
Bloom Filters : Bloom filter is a space optimized and probabilistic version of
hashing and has huge applications like spam filtering, recommendations.
Cryptography: Hashing is used in cryptography to generate digital signatures,
message authentication codes (MACs), and key derivation functions.
Load balancing: Hashing is used in load-balancing algorithms, such as consistent
hashing, to distribute requests to servers in a network.
Blockchain: Hashing is used in blockchain technology, such as the proof-of-work
algorithm, to secure the integrity and consensus of the blockchain.
Image processing: Hashing is used in image processing applications, such as
perceptual hashing, to detect and prevent image duplicates and modifications.
File comparison: Hashing is used in file comparison algorithms, such as the MD5
and SHA-1 hash functions, to compare and verify the integrity of files.
Fraud detection: Hashing is used in fraud detection and cybersecurity applications,
such as intrusion detection and antivirus software, to detect and prevent malicious
activities.
Caching: Storing frequently accessed data for faster retrieval. For example browser
caches, we can use URL as keys and find the local storage of the URL.
Symbol Tables: Mapping identifiers to their values in programming languages
Write any 3 applications
2. What is meant by collision? Give an example.
3. Write an algorithm to sort a set of numbers using insertion sort
Insertion Algorithms: Steps on how it works:
1. If it is the first element, it is already sorted.
2. Pick the next element.
3. Compare with all the elements in sorted sub-list.
4. Shift all the the elements in sorted sub-list that is greater than the value to be sorted.
5. Insert the value.
6. Repeat until list is sorted.
4. Why is Merge Sort preferred for Linked List
Merge Sort is well-suited for linked lists because of its sequential access nature,
efficient merging using pointers, and absence of dependency on random access. These
features make it a natural choice for sorting linked list data structures.
5. How Folding method can be used for Hashing
The key k is partitioned into no. of parts
Then add these parts together and ignoring the last carry.
One can also reverse the first part before adding (right or left justified. Mostly right)
H(k) = k1 + k2 + ………. + kn
Example:
H(3205)=32+05=37 or H(3250)=32+50=82
H(7148)=71+48=19 or H(7184)=71+84=55
H(2345)=23+45=68 or H(2354)=23+54=77
6. What you mean by Open Addressing and Closed Addressing?
7. A hash table contains 7 buckets and uses linear probing to solve collision. The key
values are integers and the hash function used is key%7. Draw the table that results after
inserting in the given order the following values: 16,8,4,13,29,11,22.
8. Construct a max heap with the following set of numbers entered sequentially. 10, 5,
14, 7, 12, 18, 15, 13.
9. What are the limitations of linear probing? How can double hashing be used to resolve
these limitations? Illustrate with examples.
Limitation of Linear Probing
The main problem with linear probing is clustering.
Many consecutive elements form groups.
Then, it takes time to search an element or to find an empty bucket.
Double hashing resolves the limitations of linear probing by addressing its primary
weaknesses—primary clustering and poor performance at high load factors—through the use of
a second, independent hash function. This creates a more dispersed probing sequence,
minimizing clustering and improving efficiency.
Linear Probing example
Quadratic Probing example
10. What is hash table? What are the properties of hash function?
Hash tables are a type of data structure in which the address/ index value of the data element is
generated from a hash function. This enables very fast data access as the index value behaves
as a key for the data value.
One-way: It's computationally infeasible to find an input that maps to a pre-specified output.
Collision resistant: It's computationally infeasible to find two distinct inputs that map to the
same output.
Deterministic: The same input message always produces the same hash value.
Efficient: The hash value is computed quickly, regardless of the input size.
Hidden: It's difficult to guess the input value for a hash function from its output.
Puzzle-friendly: It's difficult to select an input that provides a pre-defined output.
11. What is max heap?Write an algorithm to perform heap sort. Give example.
A max heap is a complete binary tree in which the value of a node is greater than or equal to the
values of its children. Max Heap data structure is useful for sorting data using heap sort.
12. Show all the passes using insertion sort for the following list
54,26,93,17,77,31,44,55,20.
13. Briefly explain any 4 hashing functions.
Division Method:
Definition: The key is divided by the size of the hash table, and the remainder is used
as the hash value.
Formula: h(k)=kmod mh(k) = k \mod mh(k)=kmodm where kkk is the key and mmm
is the size of the hash table.
Advantages:
o Simple and efficient to compute.
Limitations:
o Works best when mmm (hash table size) is a prime number to avoid clustering
and improve distribution.
Example:
For k=25k = 25k=25 and m=7m = 7m=7:
h(25)=25mod 7=4h(25) = 25 \mod 7 = 4h(25)=25mod7=4.
Mid-Square Method:
Definition: The key is squared, and some middle bits of the resulting number are
extracted as the hash value.
Steps:
1. Square the key.
2. Extract the middle rrr bits.
Advantages:
o Works well for keys with patterns, as squaring spreads the bits.
Limitations:
o Requires consistent bit extraction.
Example:
k: 3205 7148 2345
k2: 10272025 51093904 5499025
H(k): 72 93 99
4th and 5th digits have been selected. From the right side.
Folding Method
The key k is partitioned into no. of parts
Then add these parts together and ignoring the last carry.
One can also reverse the first part before adding (right or left justified. Mostly right)
H(k) = k1 + k2 + ………. + kn
Example:
H(3205)=32+05=37 or H(3250)=32+50=82
H(7148)=71+48=19 or H(7184)=71+84=55
H(2345)=23+45=68 or H(2354)=23+54=77
Multiplication Method:
Formula:
Advantages:
o Works well even if mmm is not a prime number.
Limitations:
o Slightly more computationally expensive.
Example:
14. Write the algorithm for Quicksort. Analyse its worst case and best case
performances. Let the size of the hash table be 12. Consider the keys 43, 24, 57,
12, 10, 64, 19, 82, 36, 39 in the order. Show how the keys are occupied using
chaining method.
Quicksort is a sorting algorithm based on the divide and conquer approach where
1. An array is divided into subarrays by selecting a pivot element (element selected from
the array).
While dividing the array, the pivot element should be positioned in such a way that
elements less than pivot are kept on the left side and elements greater than pivot are on
the right side of the pivot.
2. The left and right subarrays are also divided using the same approach. This process
continues until each subarray contains a single element.
At this point, elements are already sorted. Finally, elements are combined to form a
sorted array.
Algorithm
• ..Step 1 - Consider the first element of the list as pivot (i.e., Element at first position in
the list).
• Step 2 - Define two variables i and j. Set i and j to first and last elements of the list
respectively.
• Step 3 - Increment i until list[i] > pivot then stop.
• Step 4 - Decrement j until list[j] < pivot then stop.
• Step 5 - If i < j then exchange list[i] and list[j].
• Step 6 - Repeat steps 3,4 & 5 until i > j.
• Step 7 - Exchange the pivot element with list[j] element.
15. Explain Merge Sort algorithm/pseudocode with the help of an example? Mention the
best case and worst case time complexity of Merge sort algorithm?
Merge Sort is one of the most popular sorting algorithms that is based on the principle
of Divide and Conquer Algorithm.
Here, a problem is divided into multiple sub-problems. Each sub-problem is solved
individually. Finally, sub-problems are combined to form the final solution.
#include<stdio.h>
int l,a[10],n,b[10],i;
void merge(int a,int b,int c);
void mergesort(int a,int b);
int main()
{
printf("Enter the no.of elements: ");
scanf("%d",&n);
printf("\nEnter the elements: ");
for(i=0;i<n;i++)
scanf("%d",&a[i]);
mergesort(0,n-1);
printf("\nSorted elements are: ");
for(i=0;i<n;i++)
printf("%d ",a[i]);
}
void mergesort(int l,int h)
{
int mid;
if(l<h)
{
mid=(l+h)/2;
mergesort(l,mid);
mergesort(mid+1,h);
merge(l,mid,h);
}
}
void merge(int l,int mid,int h)
{
int hi=l,i=l,j=mid+1,k;
while(hi<=mid&&j<=h)
{
if(a[hi]<=a[j])
{
b[i]=a[hi];
hi=hi+1;
}
else
{
b[i]=a[j];
j=j+1;
}
i=i+1;
}
if(hi<=mid)
{
for(k=hi;k<=mid;k++)
{
b[i]=a[k];
i=i+1;
}
}
else
{
for(k=j;k<=h;k++)
{
b[i]=a[k];
i++;
}
}
for(k=l;k<=h;k++)
{
a[k]=b[k];
}
Time Complexity
Best O(n*log n)
Worst O(n*log n)
Average O(n*log n)