0% found this document useful (0 votes)

44 views13 pages

06 Design Consistent Hashing

The document discusses the problem of rehashing keys when servers are added or removed from a distributed system using consistent hashing. Consistent hashing assigns keys to servers by taking the hash of each key and applying a modulus operation based on the number of servers. However, if servers are removed, keys may be remapped inconsistently. The document provides an example to illustrate how keys would be remapped to different servers if one server went offline, disrupting the even distribution of data. Consistent hashing aims to minimize data redistribution when the number of servers changes over time.

Uploaded by

srawat

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

44 views13 pages

06 Design Consistent Hashing

Uploaded by

srawat

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 13

ByteByteGo | Technical Interview Prep 31/10/23, 3:43 PM

[yByteByteGo Products Newsletter YouTube Jobs

System Design Interview

00 Foreword 06
01 Join the Community
Design Consistent Hashing
To achieve horizontal scaling, it is important to distribute requests/data efficiently and evenly across servers.
Scale From Zero To Millions Of Consistent hashing is a commonly used technique to achieve this goal. But first, let us take an in-depth look at
02
Users the problem.

03
Back-of-the-envelope
Estimation
The rehashing problem
If you have n cache servers, a common way to balance the load is to use the following hash method:
A Framework For System Design
04 serverIndex = hash(key) % N, where N is the size of the server pool.
Interviews

Let us use an example to illustrate how it works. As shown in Table 1, we have 4 servers and 8 string keys with

05 Design A Rate Limiter their hashes.

key hash hash % 4

06 Design Consistent Hashing

key0 18358617 1

key1 26143584 0
07 Design A Key-value Store

key2 18131146 2

Design A Unique ID Generator In

08 key3 35863496 0
Distributed Systems

key4 34085809 1
09 Design A URL Shortener
key5 27581703 3

key6 38164978 2
https://bytebytego.com/courses/system-design-interview/design-consistent-hashing Page 1 of 13
ByteByteGo | Technical Interview Prep 31/10/23, 3:43 PM

key6 38164978 2

key7 22530351 3

Table 1

To fetch the server where a key is stored, we perform the modular operation f(key) % 4. For instance,
hash(key0) % 4 = 1 means a client must contact server 1 to fetch the cached data. Figure 1 shows the
distribution of keys based on Table 1.

serverIndex = hash % 4

Server Index 0 1 2 3

Servers server 0 server 1 server 2 server 3

Keys key1 key0 key2 key5

key3 key4 key6 key7

Figure 1

This approach works well when the size of the server pool is fixed, and the data distribution is even. However,
problems arise when new servers are added, or existing servers are removed. For example, if server 1 goes
offline, the size of the server pool becomes 3. Using the same hash function, we get the same hash value for a
key. But applying modular operation gives us different server indexes because the number of servers is reduced
by 1. We get the results as shown in Table 2 by applying hash % 3:

key hash hash % 3

key0 18358617 0

key1 26143584 0

key2 18131146 1

key3 35863496 2

key4 34085809 1

key5 27581703 0

key6 38164978 1
https://bytebytego.com/courses/system-design-interview/design-consistent-hashing Page 2 of 13
ByteByteGo | Technical Interview Prep 31/10/23, 3:43 PM

key6 38164978 1

key7 22530351 0

Table 2

Figure 2 shows the new distribution of keys based on Table 2.

serverIndex = hash % 3

Server Index 0 1 2

Servers server 0 server 1 server 2 server 3

Keys key0 key2 key3

key1 key4
key5 key6
key7

Figure 2

As shown in Figure 2, most keys are redistributed, not just the ones originally stored in the offline server (server
1). This means that when server 1 goes offline, most cache clients will connect to the wrong servers to fetch
data. This causes a storm of cache misses. Consistent hashing is an effective technique to mitigate this
problem.

Consistent hashing
Quoted from Wikipedia: "Consistent hashing is a special kind of hashing such that when a hash table is re-sized
and consistent hashing is used, only k/n keys need to be remapped on average, where k is the number of keys,
and n is the number of slots. In contrast, in most traditional hash tables, a change in the number of array slots
causes nearly all keys to be remapped [1]”.

Hash space and hash ring

Now we understand the definition of consistent hashing, let us find out how it works. Assume SHA-1 is used as
the hash function f, and the output range of the hash function is: x0, x1, x2, x3, …, xn. In cryptography, SHA-1’s
hash space goes from 0 to 2^160 - 1. That means x0 corresponds to 0, xn corresponds to 2^160 – 1, and all
the other hash values in the middle fall between 0 and 2^160 - 1. Figure 3 shows the hash space.

x0 xn

https://bytebytego.com/courses/system-design-interview/design-consistent-hashing Page 3 of 13
ByteByteGo | Technical Interview Prep 31/10/23, 3:43 PM

Figure 3

By collecting both ends, we get a hash ring as shown in Figure 4:

xn x0

Figure 4

Hash servers
Using the same hash function f, we map servers based on server IP or name onto the ring. Figure 5 shows that
4 servers are mapped on the hash ring.

f (server 0)

Servers
s3 s0
s0 = server 0
s1 = server 1
server 0 s2 = server 2
s3 = server 3

server 1

server 2

server 3 s2 s1

https://bytebytego.com/courses/system-design-interview/design-consistent-hashing Page 4 of 13
ByteByteGo | Technical Interview Prep 31/10/23, 3:43 PM

Figure 5

Hash keys
One thing worth mentioning is that hash function used here is different from the one in “the rehashing problem,”
and there is no modular operation. As shown in Figure 6, 4 cache keys (key0, key1, key2, and key3) are hashed
onto the hash ring

Servers
s3 s0
s0 = server 0
server 0 s1 = server 1
s2 = server 2
s3 = server 3
server 1
k0 = key0
k1 = key1
k2 = key2
k3 k3 = key3
server 2 k1

server 3 s2 s1

Figure 6

Server lookup
To determine which server a key is stored on, we go clockwise from the key position on the ring until a server is
found. Figure 7 explains this process. Going clockwise, key0 is stored on server 0; key1 is stored on server 1;
key2 is stored on server 2 and key3 is stored on server 3.

Servers
s3 s0
s0 = server 0
server 0 s1 = server 1
s2 = server 2
s3 = server 3

https://bytebytego.com/courses/system-design-interview/design-consistent-hashing Page 5 of 13
ByteByteGo | Technical Interview Prep 31/10/23, 3:43 PM

server 1 k0 = key0
k1 = key1
k2 = key2
k3 k3 = key3
server 2 k1

server 3 s2 s1

Figure 7

Add a server
Using the logic described above, adding a new server will only require redistribution of a fraction of keys.

In Figure 8, after a new server 4 is added, only key0 needs to be redistributed. k1, k2, and k3 remain on the
same servers. Let us take a close look at the logic. Before server 4 is added, key0 is stored on server 0. Now,
key0 will be stored on server 4 because server 4 is the first server it encounters by going clockwise from key0’s
position on the ring. The other keys are not redistributed based on consistent hashing algorithm.

key0
k0
s4
Servers
s3 s0
s0 = server 0
server 0
s1 = server 1
s2 = server 2
s3 = server 3
server 1
k0 = key0
k1 = key1
k2 = key2
k3
k3 = key3
server 2 k1

server 3 s2 s1

k2
server 4 key2

Figure 8

https://bytebytego.com/courses/system-design-interview/design-consistent-hashing Page 6 of 13
ByteByteGo | Technical Interview Prep 31/10/23, 3:43 PM

Remove a server
When a server is removed, only a small fraction of keys require redistribution with consistent hashing. In Figure
9, when server 1 is removed, only key1 must be remapped to server 2. The rest of the keys are unaffected.

Servers
s3 s0
s0 = server 0
server 0
s1 = server 1
s2 = server 2
s3 = server 3
server 1
k0 = key0
k1 = key1
k2 = key2
k3
k3 = key3
server 2 k1

server 3 s2 s1

Figure 9

Two issues in the basic approach

The consistent hashing algorithm was introduced by Karger et al. at MIT [1]. The basic steps are:

Map servers and keys on to the ring using a uniformly distributed hash function.

To find out which server a key is mapped to, go clockwise from the key position until the first server on the
ring is found.

Two problems are identified with this approach. First, it is impossible to keep the same size of partitions on the
ring for all servers considering a server can be added or removed. A partition is the hash space between
adjacent servers. It is possible that the size of the partitions on the ring assigned to each server is very small or
fairly large. In Figure 10, if s1 is removed, s2’s partition (highlighted with the bidirectional arrows) is twice as
large as s0 and s3’s partition.

https://bytebytego.com/courses/system-design-interview/design-consistent-hashing Page 7 of 13
ByteByteGo | Technical Interview Prep 31/10/23, 3:43 PM

Servers
s3 s0
s0 = server 0
server 0 s1 = server 1
s2 = server 2
s3 = server 3

server 1

server 2

server 3 s2 s1

Figure 10

Second, it is possible to have a non-uniform key distribution on the ring. For instance, if servers are mapped to
positions listed in Figure 11, most of the keys are stored on server 2. However, server 1 and server 3 have no
data.

s3
Servers
s0
s0 = server 0
server 0 s1 = server 1
s2
s2 = server 2
s1 s3 = server 3

server 1

server 2

server 3

Figure 11

A technique called virtual nodes or replicas is used to solve these problems.

https://bytebytego.com/courses/system-design-interview/design-consistent-hashing
Virtual nodes Page 8 of 13
ByteByteGo | Technical Interview Prep 31/10/23, 3:43 PM

Virtual nodes
A virtual node refers to the real node, and each server is represented by multiple virtual nodes on the ring. In
Figure 12, both server 0 and server 1 have 3 virtual nodes. The 3 is arbitrarily chosen; and in real-world
systems, the number of virtual nodes is much larger. Instead of using s0, we have s0_0, s0_1, and s02 to
represent _server 0 on the ring. Similarly, s1_0, s1_1, and s1_2 represent server 1 on the ring. With virtual
nodes, each server is responsible for multiple partitions. Partitions (edges) with label s0 are managed by server
0. On the other hand, partitions with label s1 are managed by server 1.

Servers
s0_2 s1_0
s0 s0 = server 0
server 0 s1 = server 1

s0_0

server 1
s0

s1_2 s1_1

s0_1
s0
s1

Figure 12

To find which server a key is stored on, we go clockwise from the key’s location and find the first virtual node
encountered on the ring. In Figure 13, to find out which server k0 is stored on, we go clockwise from k0’s
location and find virtual node s1_1, which refers to server 1.

Servers
s0_2 s1_0
s0 = server 0
server 0 s1 = server 1

s0_0

server 1

https://bytebytego.com/courses/system-design-interview/design-consistent-hashing Page 9 of 13
ByteByteGo | Technical Interview Prep 31/10/23, 3:43 PM

s1_2 s1_1

s0_1

Figure 13

As the number of virtual nodes increases, the distribution of keys becomes more balanced. This is because the
standard deviation gets smaller with more virtual nodes, leading to balanced data distribution. Standard
deviation measures how data are spread out. The outcome of an experiment carried out by online research [2]
shows that with one or two hundred virtual nodes, the standard deviation is between 5% (200 virtual nodes)
and 10% (100 virtual nodes) of the mean. The standard deviation will be smaller when we increase the number
of virtual nodes. However, more spaces are needed to store data about virtual nodes. This is a tradeoff, and we
can tune the number of virtual nodes to fit our system requirements.

Find affected keys

When a server is added or removed, a fraction of data needs to be redistributed. How can we find the affected
range to redistribute the keys?

In Figure 14, server 4 is added onto the ring. The affected range starts from s4 (newly added node) and moves
anticlockwise around the ring until a server is found (s3). Thus, keys located between s3 and s4 need to be
redistributed to s4.

key0
k0
s4
Servers
s3 s0
s0 = server 0
server 0
s1 = server 1
s2 = server 2
s3 = server 3
server 1
k0 = key0
k1 = key1
k2 = key2
k3
k3 = key3
server 2 k1

https://bytebytego.com/courses/system-design-interview/design-consistent-hashing Page 10 of 13
ByteByteGo | Technical Interview Prep 31/10/23, 3:43 PM

server 3 s2 s1

k2
server 4 key2

Figure 14

When a server (s1) is removed as shown in Figure 15, the affected range starts from s1 (removed node) and
moves anticlockwise around the ring until a server is found (s0). Thus, keys located between s0 and s1 must be
redistributed to s2.

Servers
s3 s0
s0 = server 0
server 0
s1 = server 1
s2 = server 2
s3 = server 3
server 1
k0 = key0
k1 = key1
k2 = key2
k3
k3 = key3
server 2 k1

server 3 s2 s1

Figure 15

Wrap up
In this chapter, we had an in-depth discussion about consistent hashing, including why it is needed and how it
works. The benefits of consistent hashing include:

Minimized keys are redistributed when servers are added or removed.

It is easy to scale horizontally because data are more evenly distributed.

Mitigate hotspot key problem. Excessive access to a specific shard could cause server overload. Imagine

https://bytebytego.com/courses/system-design-interview/design-consistent-hashing Page 11 of 13
ByteByteGo | Technical Interview Prep 31/10/23, 3:43 PM

data for Katy Perry, Justin Bieber, and Lady Gaga all end up on the same shard. Consistent hashing helps to
mitigate the problem by distributing the data more evenly.

Consistent hashing is widely used in real-world systems, including some notable ones:

Partitioning component of Amazon’s Dynamo database [3]

Data partitioning across the cluster in Apache Cassandra [4]

Discord chat application [5]

Akamai content delivery network [6]

Maglev network load balancer [7]

Congratulations on getting this far! Now give yourself a pat on the back. Good job!

Reference materials
[1] Consistent hashing:
https://en.wikipedia.org/wiki/Consistent_hashing

[2] Consistent Hashing:

https://tom-e-white.com/2007/11/consistent-hashing.html

[3] Dynamo: Amazon’s Highly Available Key-value Store:

https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf

[4] Cassandra - A Decentralized Structured Storage System:

http://www.cs.cornell.edu/Projects/ladis2009/papers/Lakshman-ladis2009.PDF

[5] How Discord Scaled Elixir to 5,000,000 Concurrent Users:

https://blog.discord.com/scaling-elixir-f9b8e1e7c29b

[6] CS168: The Modern Algorithmic Toolbox Lecture #1: Introduction and Consistent Hashing:
http://theory.stanford.edu/~tim/s16/l/l1.pdf

[7] Maglev: A Fast and Reliable Software Network Load Balancer:

https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44824.pdf

https://bytebytego.com/courses/system-design-interview/design-consistent-hashing Page 12 of 13
ByteByteGo | Technical Interview Prep 31/10/23, 3:43 PM

Become a Contributor Be an affiliate Suggest a new topic

Our Team

https://bytebytego.com/courses/system-design-interview/design-consistent-hashing Page 13 of 13

Consistent Hashing
No ratings yet
Consistent Hashing
2 pages
28 Consistent Hashing
No ratings yet
28 Consistent Hashing
6 pages
Design Consistent Hashing: The Rehashing Problem
No ratings yet
Design Consistent Hashing: The Rehashing Problem
2 pages
Consistent Hashihhhhng Explained
No ratings yet
Consistent Hashihhhhng Explained
28 pages
Consistent Hashing With Bounded Load
No ratings yet
Consistent Hashing With Bounded Load
9 pages
Binomial
No ratings yet
Binomial
14 pages
Leveraging Consistent Hashing in Your Python Applications
No ratings yet
Leveraging Consistent Hashing in Your Python Applications
39 pages
Dxhash: A Scalable Consistent Hashing Based On The Pseudo-Random Sequence
No ratings yet
Dxhash: A Scalable Consistent Hashing Based On The Pseudo-Random Sequence
12 pages
Memcached & Consistent Hashing: Distributing Keys & Values
No ratings yet
Memcached & Consistent Hashing: Distributing Keys & Values
3 pages
Consistent Hashing - Explanation and Implementation
No ratings yet
Consistent Hashing - Explanation and Implementation
7 pages
Hashing in Networked Systems: Mike Freedman
No ratings yet
Hashing in Networked Systems: Mike Freedman
24 pages
Consistent Hashing
No ratings yet
Consistent Hashing
19 pages
Data Partitioning
No ratings yet
Data Partitioning
39 pages
Hashing Essentials for Tech Enthusiasts
No ratings yet
Hashing Essentials for Tech Enthusiasts
4 pages
CS143: Hash Index
No ratings yet
CS143: Hash Index
26 pages
Dynamo Data Partitioning Guide
No ratings yet
Dynamo Data Partitioning Guide
5 pages
It Is A Very Efficient Method To Search The Exact Data Items Based On Hash Table
No ratings yet
It Is A Very Efficient Method To Search The Exact Data Items Based On Hash Table
49 pages
Presentation 7
No ratings yet
Presentation 7
16 pages
HLD Basics & Consistent Hashing Exercises
No ratings yet
HLD Basics & Consistent Hashing Exercises
9 pages
Data and File Structures: Hashing
No ratings yet
Data and File Structures: Hashing
24 pages
DHTLookup
No ratings yet
DHTLookup
56 pages
Foundational Concepts in System Design - Part 2
No ratings yet
Foundational Concepts in System Design - Part 2
25 pages
Lect26 After
No ratings yet
Lect26 After
28 pages
Hashing
No ratings yet
Hashing
16 pages
ADBMS - Unit3 (Autosaved)
No ratings yet
ADBMS - Unit3 (Autosaved)
12 pages
10 More Hashing
No ratings yet
10 More Hashing
5 pages
Memento
No ratings yet
Memento
16 pages
Hash New 23july
No ratings yet
Hash New 23july
14 pages
L19 Mod6 ReplicationPartitioning P2
No ratings yet
L19 Mod6 ReplicationPartitioning P2
27 pages
Hashing
No ratings yet
Hashing
38 pages
Hashing
No ratings yet
Hashing
47 pages
Anchorhash: A Scalable Consistent Hash
No ratings yet
Anchorhash: A Scalable Consistent Hash
12 pages
Hashing - Datastructures and Algorithms
No ratings yet
Hashing - Datastructures and Algorithms
32 pages
MODULE 5 - BCS304 - HASHING - Leftisht Trees - OBST - Notes
No ratings yet
MODULE 5 - BCS304 - HASHING - Leftisht Trees - OBST - Notes
32 pages
15-440 Distributed Systems: Hashing and Cdns
No ratings yet
15-440 Distributed Systems: Hashing and Cdns
38 pages
Class 7 - Scaling, Sharding, Consistent Hashing
No ratings yet
Class 7 - Scaling, Sharding, Consistent Hashing
4 pages
Hashing PPT For Student
No ratings yet
Hashing PPT For Student
53 pages
Hashing
No ratings yet
Hashing
14 pages
Hashing Muddasir
No ratings yet
Hashing Muddasir
43 pages
Eee DS (Unit 6)
No ratings yet
Eee DS (Unit 6)
33 pages
DSimp 2
No ratings yet
DSimp 2
21 pages
Presentation Dsa
No ratings yet
Presentation Dsa
40 pages
A Fast, Minimal Memory, Consistent Hash Algorithm
No ratings yet
A Fast, Minimal Memory, Consistent Hash Algorithm
12 pages
Hashing Techniques in Data Structures
No ratings yet
Hashing Techniques in Data Structures
78 pages
Chapter 8 - Hashing
No ratings yet
Chapter 8 - Hashing
78 pages
Modue 5
No ratings yet
Modue 5
10 pages
Module 5
No ratings yet
Module 5
25 pages
7 Algorithms To Know Before Your Next System Design Interview
100% (1)
7 Algorithms To Know Before Your Next System Design Interview
18 pages
Linear Hash
No ratings yet
Linear Hash
15 pages
Hashing
No ratings yet
Hashing
40 pages
Chapter 8 - Hashing
No ratings yet
Chapter 8 - Hashing
78 pages
University Institute of Engineering CSE-2 Year: Advanced Data Structures and Algorithms
No ratings yet
University Institute of Engineering CSE-2 Year: Advanced Data Structures and Algorithms
26 pages
Dsa Module 6 Ktuassist
No ratings yet
Dsa Module 6 Ktuassist
9 pages
UNIT 8 Hashing
No ratings yet
UNIT 8 Hashing
24 pages
Mod 5
No ratings yet
Mod 5
13 pages
4
No ratings yet
4
29 pages
Cse373 10 Hashing
No ratings yet
Cse373 10 Hashing
36 pages
Hashing
No ratings yet
Hashing
30 pages
82182A Time Module Owner's Manual
No ratings yet
82182A Time Module Owner's Manual
87 pages
DIALux evo: Office Lighting Design
0% (1)
DIALux evo: Office Lighting Design
51 pages
Manual Ut350
No ratings yet
Manual Ut350
88 pages
Accordion Arduino Mega Code
No ratings yet
Accordion Arduino Mega Code
9 pages
Statistical Analysis Homework Guide
No ratings yet
Statistical Analysis Homework Guide
3 pages
M514 M516 AU400: EL Hardware Manual Rev. 0700
No ratings yet
M514 M516 AU400: EL Hardware Manual Rev. 0700
15 pages
Statistical Process Monitoring Using Advanced Data-Driven and Deep Learning Approaches: Theory and Practical Applications 1st Edition Fouzi Harrou
No ratings yet
Statistical Process Monitoring Using Advanced Data-Driven and Deep Learning Approaches: Theory and Practical Applications 1st Edition Fouzi Harrou
55 pages
Presentazione Di KUKA Roboter Italia - Kuka
No ratings yet
Presentazione Di KUKA Roboter Italia - Kuka
24 pages
Call Center Culture
No ratings yet
Call Center Culture
2 pages
Q1 - LE - TLE 7 - Lesson 2 - Week 2
No ratings yet
Q1 - LE - TLE 7 - Lesson 2 - Week 2
13 pages
Microcontroller Basics Course
No ratings yet
Microcontroller Basics Course
5 pages
Apple’s Unique Social Media Strategy
No ratings yet
Apple’s Unique Social Media Strategy
6 pages
Oracle Control File Recreation Guide
100% (1)
Oracle Control File Recreation Guide
3 pages
Now I Know My Colors
No ratings yet
Now I Know My Colors
80 pages
Journal On Water Level Indicator
No ratings yet
Journal On Water Level Indicator
24 pages
Transformation: General Key Points
No ratings yet
Transformation: General Key Points
11 pages
Module 3 - NC II - Solving and Addressing General Workplace Problems - ForTrainingOnly
100% (9)
Module 3 - NC II - Solving and Addressing General Workplace Problems - ForTrainingOnly
86 pages
Pydroid
No ratings yet
Pydroid
3 pages
Smart Meter Verification Emerson Recommended Calibration Practices For Coriolis Meters Used To Comply Aga 11 Api Mpms CH 14 9 Micro Motion en 4928858
No ratings yet
Smart Meter Verification Emerson Recommended Calibration Practices For Coriolis Meters Used To Comply Aga 11 Api Mpms CH 14 9 Micro Motion en 4928858
2 pages
4 - Tamerlane and Other Poems
No ratings yet
4 - Tamerlane and Other Poems
93 pages
Java - Developer
No ratings yet
Java - Developer
4 pages
Logic Circuit & Switching Theory Sequencial Logic Circuits
No ratings yet
Logic Circuit & Switching Theory Sequencial Logic Circuits
67 pages
Datasheet ModEvaPac en V2.3
No ratings yet
Datasheet ModEvaPac en V2.3
6 pages
Fit AP v200r005c10spcd00 (&ac) Upgrade Guide
No ratings yet
Fit AP v200r005c10spcd00 (&ac) Upgrade Guide
10 pages
March 2025 Lept Exam Program (Prof Teachers) Revised March 2025
No ratings yet
March 2025 Lept Exam Program (Prof Teachers) Revised March 2025
2 pages
Air Regulations RK Bali PDF
38% (8)
Air Regulations RK Bali PDF
1 page
Small Business Color Printer Guide
No ratings yet
Small Business Color Printer Guide
2 pages
Java Unit Wise Questions
No ratings yet
Java Unit Wise Questions
4 pages
Bug 10114837 Rman Deletes The Unapplied Archived Redo Logs
No ratings yet
Bug 10114837 Rman Deletes The Unapplied Archived Redo Logs
4 pages
Aadhaar Details for Mehtab
No ratings yet
Aadhaar Details for Mehtab
1 page

06 Design Consistent Hashing

Uploaded by

06 Design Consistent Hashing

Uploaded by

ByteByteGo | Technical Interview Prep 31/10/23, 3:43 PM

[yByteByteGo Products Newsletter YouTube Jobs

System Design Interview

05 Design A Rate Limiter their hashes.

key hash hash % 4

06 Design Consistent Hashing

Design A Unique ID Generator In

Servers server 0 server 1 server 2 server 3

Keys key1 key0 key2 key5

key hash hash % 3

Figure 2 shows the new distribution of keys based on Table 2.

Servers server 0 server 1 server 2 server 3

Keys key0 key2 key3

Hash space and hash ring

By collecting both ends, we get a hash ring as shown in Figure 4:

Two issues in the basic approach

A technique called virtual nodes or replicas is used to solve these problems.

Find affected keys

Minimized keys are redistributed when servers are added or removed.

It is easy to scale horizontally because data are more evenly distributed.

Partitioning component of Amazon’s Dynamo database [3]

Data partitioning across the cluster in Apache Cassandra [4]

Akamai content delivery network [6]

[2] Consistent Hashing:

[3] Dynamo: Amazon’s Highly Available Key-value Store:

[4] Cassandra - A Decentralized Structured Storage System:

[5] How Discord Scaled Elixir to 5,000,000 Concurrent Users:

[7] Maglev: A Fast and Reliable Software Network Load Balancer:

Become a Contributor Be an affiliate Suggest a new topic

Copyright ©2023 ByteByteGo Inc. All rights reserved.

You might also like