RapidChain - A Fast Blockchain Protocol Via Full Sharding
Abstract
A major approach to overcoming the performance and scalability limitations of current blockchain
protocols is to use sharding, which is to split the overheads of processing transactions among multiple,
smaller groups of nodes. These groups work in parallel to maximize performance while requiring signif-
icantly smaller communication, computation, and storage per node, allowing the system to scale to large
networks. However, existing sharding-based blockchain protocols still require a linear amount of communication (in the number of participants) per transaction, and hence attain only part of the potential benefits of sharding. We show that this introduces a major bottleneck to the throughput and latency
of these protocols. Aside from the limited scalability, these protocols achieve weak security guarantees
due to either a small fault resiliency (e.g., 1/8 and 1/4) or high failure probability, or they rely on strong
assumptions (e.g., trusted setup) that limit their applicability to mainstream payment systems.
We propose RapidChain, the first sharding-based public blockchain protocol that is resilient to Byzan-
tine faults from up to a 1/3 fraction of its participants, and achieves complete sharding of the communica-
tion, computation, and storage overhead of processing transactions without assuming any trusted setup.
We introduce an optimal intra-committee consensus algorithm that can achieve very high throughputs
via block pipelining, a novel gossiping protocol for large blocks, and a provably-secure reconfigura-
tion mechanism to ensure robustness. Using an efficient cross-shard transaction verification technique,
RapidChain avoids gossiping transactions to the entire network. Our empirical evaluations suggest that
RapidChain can process (and confirm) more than 7,300 tx/sec with an expected confirmation latency of
roughly 8.7 seconds in a network of 4,000 nodes, with an expected time-to-failure of more than
4,500 years.
1 Introduction
Our global financial system is highly centralized, making it resistant to change, vulnerable to failures and at-
tacks, and inaccessible to billions of people in need of basic financial tools [61, 27]. On the other hand, decentral-
ization poses new challenges of ensuring a consistent view among a group of mutually-distrusting participants.
The permissionless mode of operation which allows open membership and entails constant churn (i.e.,
join/leave) of its participants further complicates this task. Additionally, any agile financial system, including
a distributed one, should be able to adequately serve realistic market loads. This implies that it should scale
easily to a large number of participants, and it should handle a high throughput of transactions with relatively
low delays in making their outputs available. Achieving these properties together should also not require
significant resources from each of the participants since otherwise, it runs contrary to the idea of constructing
a tool easily accessible to anyone.
Existing solutions currently either fail to solve the above challenges or make security/performance trade-
offs that, unfortunately, make them no longer truly-decentralized solutions. In particular, traditional Byzantine
consensus mechanisms such as [42, 20, 17] can only work in a closed membership setting, where the set of
participants is fixed and their identities are known to everyone via a trusted third party. If used in an open
setting, these protocols can be easily compromised using Sybil attacks [24], where the adversary repeatedly
rejoins malicious parties with fresh identities to gain significant influence on the protocol outcome. Moreover,
most traditional schemes assume a static adversary who can select the set of corrupt parties only at the start of
the protocol. Existing protocols that are secure against an adaptive adversary such as [13, 18, 36] either scale
poorly with the number of participants or are inefficient.
Most cryptocurrencies such as Bitcoin [52] and Ethereum [16] maintain a distributed transaction ledger
called the blockchain over a large peer-to-peer (P2P) network, where every node maintains an updated, full copy
of the entire ledger via a Byzantine consensus protocol, dubbed the Nakamoto consensus. Unlike traditional
consensus mechanisms, the Nakamoto consensus allows new participants to join the protocol using a proof-of-
work (PoW) process [26], where a node demonstrates that it has done a certain amount of work by presenting
a solution to a computational puzzle. The use of PoW not only allows the consensus protocol to impede Sybil
attacks by limiting the rate of malicious participants joining the system, but also provides a lottery mechanism
through which a random leader is elected in every round to initiate the consensus process.
Unfortunately, it is now well-known that Bitcoin’s PoW-based consensus comes with serious drawbacks
such as very low transaction throughput, high latency, poor energy efficiency [44], and mining-pool central-
ization [32, 2]. Moreover, the protocol cannot scale out its transaction processing capacity with the number of
participants joining the protocol [45, 40]. Another major scalability issue of Bitcoin is that every party needs
to initially download the entire blockchain from the network to independently verify all transactions. The size
of the blockchain is currently about 150 GB and has nearly doubled in the past year [1]. One can expect a
larger growth in the size of blockchains that are updated via higher-throughput consensus protocols than that
of Bitcoin.
Recently, several protocols have been proposed to mitigate the performance and scalability issues of Bit-
coin’s blockchain [23, 28, 39, 50, 54, 45, 30, 4, 40] using hybrid architectures that combine the open-membership
nature of Bitcoin with traditional Byzantine fault tolerance [55, 19]. While most of these protocols can report-
edly improve the throughput and latency of Bitcoin, all of them still require the often-overlooked assumption
of a trusted setup to generate an unpredictable initial common randomness in the form of a common genesis
block to bootstrap the blockchain. Similar to Bitcoin, these protocols essentially describe how one can ensure
agreement on new blocks given an initial agreement on some genesis block. Such an assumption plays a crucial
role in achieving consistency among nodes in these protocols, and if compromised, can easily affect the security
of the entire consensus protocol, contradicting the decentralized nature of cryptocurrencies.
In addition to being partially decentralized, most of these solutions have either large per-node storage re-
quirements [23, 28, 39, 50, 54, 45, 4], low fault resiliency [45, 40], incomplete specifications [23, 39], or other se-
curity issues [39, 45, 40] (see Section 2.2 for more details). Furthermore, all previous protocols require every par-
ticipant in the consensus protocol to broadcast a message to the entire network to either submit their consensus
votes [28, 50], verify transactions [39, 45, 40], and/or update every node’s local blockchain replica [45, 54, 30, 4].
While the large overhead of such a broadcast for every participant is usually reduced from a linear number of
messages (with respect to the number of participants) to nearly a constant using a peer-to-peer gossiping
protocol [34], the relatively high latency of such a “gossip-to-all” invocation (e.g., 12.6 seconds per block on
average [22]) increases the overall latency of the consensus protocol significantly (e.g., the gossip-to-all latency
roughly quadruples the consensus time in [40]). Moreover, due to the very high transaction throughput of most
scalable blockchain protocols (e.g., about 3,500 tx/sec in [40]), the bandwidth usage of each node becomes very
large (e.g., at least 45 Mbps in [40] – see Section 5 for more details), if all transactions are gossiped to the entire
network.
Protocol        | # Nodes   | Resiliency | Complexity [1]    | Throughput   | Latency | Storage [2] | Shard Size | Time to Fail
----------------|-----------|------------|-------------------|--------------|---------|-------------|------------|-------------
Elastico [45]   | n = 1,600 | t < n/4    | Ω(m²/b + n)       | 40 tx/sec    | 800 sec | 1x          | m = 100    | 1 hour
OmniLedger [40] | n = 1,800 | t < n/4    | Ω(m²/b + n)       | 500 tx/sec   | 14 sec  | 1/3x        | m = 600    | 230 years
OmniLedger [40] | n = 1,800 | t < n/4    | Ω(m²/b + n)       | 3,500 tx/sec | 63 sec  | 1/3x        | m = 600    | 230 years
RapidChain      | n = 1,800 | t < n/3    | O(m²/b + m log n) | 4,220 tx/sec | 8.5 sec | 1/9x        | m = 200    | 1,950 years
RapidChain      | n = 4,000 | t < n/3    | O(m²/b + m log n) | 7,380 tx/sec | 8.7 sec | 1/16x       | m = 250    | 4,580 years

Table 1: Comparison of RapidChain with state-of-the-art sharding blockchain protocols (b is the block size)

[1] Total message complexity of consensus per transaction (see Section 6.6).
[2] Reduction in the amount of storage required for each participant after the same number of transactions are processed (and confirmed) by the network.
Let n denote the number of participants in the protocol at any given time, and m ≪ n denote the size of
each committee. RapidChain creates k = n/m committees each of size m = c log n nodes, where c is a constant
depending only on the security parameter (in practice, c is roughly 20). In summary, RapidChain provides the
following novelties:
• Sublinear Communication. RapidChain is the first sharding-based blockchain protocol that requires
only a sublinear (i.e., o(n)) number of bits exchanged in the network per transaction. In contrast, all previ-
ous work incur an Ω(n) communication overhead per transaction (see Table 1).
• Higher Resiliency. RapidChain is the first sharding-based blockchain protocol that can tolerate corrup-
tions from less than a 1/3 fraction of its nodes (rather than 1/4) while exceeding the throughput and latency
of previous work [45, 40].
• Super-Fast Committee Consensus. Building on [5, 58], we reduce the communication overhead and
latency of P2P consensus on large blocks gossiped in each committee by roughly 3-10 times compared to
previous solutions [45, 30, 4, 40].
• Secure Reconfiguration. RapidChain builds on the Cuckoo rule [8, 60] to provably protect against
join/leave attacks during node churn, an important property missing in previous sharding-based proto-
cols [45, 40]. RapidChain also allows new nodes to join the protocol in a seamless way without any inter-
ruptions or delays in the protocol execution.
• Fast Cross-Shard Verification. We introduce a novel technique for partitioning the blockchain such
that each node is required to store only a 1/k fraction of the entire blockchain. To verify cross-shard
transactions, RapidChain’s committees discover each other via an efficient routing mechanism inspired by
Kademlia [46] that incurs only a logarithmic (in number of committees) latency and storage. In contrast,
the committee discovery in existing solutions [45, 40] requires several “gossip-to-all” invocations.
• Decentralized Bootstrapping. RapidChain operates in the permissionless setting that allows open mem-
bership, but unlike most previous work [28, 39, 54, 45, 40, 4], does not assume the existence of an initial
common randomness, usually in the form of a common genesis block. While common solutions for generating such a block require exchanging Ω(n²) messages, RapidChain can bootstrap itself with only O(n√n) messages without assuming any initial randomness.
We also implement a prototype of RapidChain to evaluate its performance and compare it with the state-
of-the-art sharding-based protocols. Table 1 shows a high-level comparison between the results. In this table,
we assume 512 B/tx, one-day long epochs, 100 ms network latency for all links, and 20 Mbps bandwidth for all
nodes in all three protocols. The choices of 1,600 and 1,800 nodes for [45] and [40] respectively are based on
the maximum network sizes reported in these works. Unfortunately, the time-to-failure of the protocol of [45]
decreases rapidly for larger network sizes. For [40], we expect larger network sizes will, at best, only slightly
increase the throughput due to the large committee sizes (i.e., m) required. The latency numbers reported in
Table 1 refer to block (or transaction) confirmation times, i.e., the delay from the time a node proposes
a block to the network until it can be confirmed by all honest nodes as a valid transaction. We refer the reader
to Section 2.2 and Section 5 for details of our evaluation and comparison with previous work.
1.2 Overview of RapidChain
RapidChain proceeds in fixed time periods called epochs. The first epoch starts by running a randomized
committee-election protocol (described in Section 4.6) that allows the participants to agree on a committee
of m = O (log n) nodes in a constant number of rounds. Assuming t < n/3 nodes are controlled by a slowly-
adaptive Byzantine adversary, the committee-election protocol samples a committee from the set of all nodes
in a way that the fraction of corrupt nodes in the sampled set is bounded by 1/2 with high probability. At
the beginning of every epoch, this committee, which we refer to as the reference committee, generates a fresh
random string, called the epoch randomness. This randomness is used by the protocol to (1) sample a set of
1/2-resilient committees, referred to as the sharding committees, in the first epoch, (2) allow new nodes to join
the system in every epoch, and (3) re-organize the existing committees to prevent adversarial take-over after
nodes join and leave the system at the beginning of every epoch.
The nodes belonging to the same sharding committee discover each other via a peer-discovery algorithm.
Each sharding committee is responsible for maintaining a disjoint transaction ledger known as a shard, which
is stored as a blockchain by every member of the committee. Each transaction tx is submitted by a user of the
system to a small number of arbitrary RapidChain nodes who route tx, via an inter-committee routing protocol,
to a committee responsible for storing tx. We refer to this committee as the output committee for tx and denote
it by C_out. This committee is selected deterministically by hashing the ID of tx to a number corresponding to C_out.
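As an illustration, here is a minimal sketch (in Go, the language of our prototype; the function name and parameters are ours and not part of the protocol specification) of how a transaction ID can be mapped deterministically to an output committee by the prefix of its hash:

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// outputCommittee hashes a transaction ID and returns the committee whose
// ID matches the logK-bit prefix of the hash. With k = 2^logK committees,
// every node computes the same mapping with no coordination.
func outputCommittee(txID []byte, logK uint) uint32 {
	h := sha256.Sum256(txID)
	v := binary.BigEndian.Uint32(h[:4])
	return v >> (32 - logK) // the logK-bit prefix selects C_out
}

func main() {
	fmt.Println(outputCommittee([]byte("TX9"), 4)) // one of 16 committees
}
```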
The members of C_out batch several transactions into a large block (about 2 MB), and then append it to their
own ledger. Before the block can be appended, the committee has to verify the validity of every transaction
in the block. In Bitcoin, such a verification usually depends on other (input) transactions that record some
previously-unspent money being spent by the new transaction. Since transactions are stored into disjoint
ledgers, each stored by a different committee, the members of C_out need to communicate with the corresponding
input committees to ensure the input transactions exist in their shards. Inspired by Kademlia [46], the verifying
committee in RapidChain communicates with only a logarithmic number of other committees to discover the
ones that store the related transactions.
Once all of the transactions in the block are verified, the members of C_out participate in an intra-committee consensus protocol to append the block to their shard. The consensus protocol proceeds as follows. First, the members of C_out choose a local leader using the current epoch randomness. Second, the leader sends the block to all the members of C_out using a fast gossiping protocol that we build based on the information dispersal algorithm (IDA) of Alon et al. [6, 5] for large blocks. Third, to ensure the members of C_out agree on the same block, they participate in a Byzantine consensus protocol that we construct based on the synchronous protocol of Ren et al. [58]. This protocol allows RapidChain to obtain an intra-committee consensus with the optimal resiliency of 1/2, and thus, achieve a total resiliency of 1/3 with small committees. While the protocol of [58] requires exchanging O(m²ℓ) bits to broadcast a message of length ℓ to m parties, our intra-committee consensus protocol requires only O(m²h log m + mℓ) bits, where h is the length of a hash that depends only on the security parameter.
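For concreteness, plugging in RapidChain's parameters from Table 1 (m = 250, ℓ = 2 MB) and a 32-byte hash (our assumption for h), and ignoring the hidden constants, shows the gap between the two costs:

```latex
% Broadcast cost of [58] versus our intra-committee consensus, assuming
% m = 250, \ell = 2\,\mathrm{MB}, and h = 32 bytes (a SHA-256 digest).
\underbrace{O(m^2 \ell)}_{\text{[58]}} \approx 250^2 \times 2\,\mathrm{MB} = 125\,\mathrm{GB},
\qquad
\underbrace{O(m^2 h \log m + m \ell)}_{\text{ours}}
  \approx 250^2 \times 32\,\mathrm{B} \times 8 + 250 \times 2\,\mathrm{MB}
  \approx 516\,\mathrm{MB}.
```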
A consensus decision in RapidChain is made on either a block of transactions or on a periodic reconfiguration
of committees. Reconfiguration events are deterministically marked by the epochs and allow RapidChain to
re-organize its committees in response to a slowly-adaptive adversary [54]. Such an adversary is allowed to
corrupt new nodes (and hence, take over committees) only at the end of epochs, i.e., the set of committees
is fixed during each epoch. Unfortunately, repeating the entire committee-election protocol to reconfigure
committees in response to only a few-but-frequent joins and leaves imposes a huge overhead on the network.
Therefore, at the end of each epoch, RapidChain performs a minimally-expensive reconfiguration protocol
based on the Cuckoo rule [8, 60] that moves only a constant number of nodes between committees while
provably guaranteeing security against join/leave attacks.
During the reconfiguration protocol happening at the end of the i-th epoch, each committee generates a
fresh randomness, r i+1 , for the next epoch. The fresh randomness not only allows the protocol to move a
certain number of nodes between committees in an unpredictable manner, thus hindering malicious takeovers,
but also allows creation of fresh computational puzzles for current nodes and also new nodes who want to join
at the end of the next epoch (i.e., epoch i + 1). A node who wishes to participate in the protocol for the next
epoch will be admitted to the system only if it can solve a fresh puzzle before the next reconfiguration event.
Otherwise, it has to solve another fresh puzzle, which will be revealed at the end of the next epoch.
Paper Organization. In Section 2, we review related work and present a background on previous work that
RapidChain builds on. In Section 3, we state our network and threat models and define the general problem we
aim to solve. We present our protocol design in Section 4. We formally analyze the security and performance
of RapidChain in Section 6. Finally, we describe our implementation and evaluation results in Section 5 and
conclude in Section 8.
2.2 Sharding-Based Consensus
Unlike Bitcoin, a sharding-based blockchain protocol can increase its transaction processing power with the number of participants joining the network by allowing multiple committees of nodes to process incoming transactions in parallel. Thus, the total number of transactions processed in each consensus round by the entire protocol is multiplied by the number of committees. While there are multiple exciting, parallel works on sharding-
based blockchain protocols such as [31, 62, 63], we only study peer-reviewed results that focus on handling
sharding in the Bitcoin transaction model.
Elastico. Luu et al. [45] propose Elastico, the first sharding-based consensus protocol for public blockchains.
In every consensus epoch, each participant solves a PoW puzzle based on an epoch randomness obtained from
the last state of the blockchain. The PoW’s least-significant bits are used to determine the committees which
coordinate with each other to process transactions.
While Elastico can improve the throughput and latency of Bitcoin by several orders of magnitude, it still
has several drawbacks: (1) Elastico requires all parties to re-establish their identities (i.e., solve PoWs) and
re-build all committees in “every” epoch. Aside from a relatively large communication overhead, this incurs a
significant latency that scales linearly with the network size as the protocol requires more time to solve enough
PoWs to fill up all committees. (2) In practice, Elastico requires a small committee size (about 100 parties) to
limit the overhead of running PBFT in each committee. Unfortunately, this increases the failure probability of
the protocol significantly and, using a simple analysis (see [40]), this probability can be as high as 0.97 after
only six epochs, rendering the protocol completely insecure in practice.
(3) The randomness used in each epoch of Elastico can be biased by an adversary, and hence, compromise the
committee selection process and even allow malicious nodes to precompute PoW puzzles. (4) Elastico requires
a trusted setup for generating an initial common randomness that is revealed to all parties at the same time.
(5) While Elastico allows each party to only verify a subset of transactions, it still has to broadcast all blocks to
all parties and requires every party to store the entire ledger. (6) Finally, Elastico can only tolerate up to a 1/4 fraction of faulty parties, and even then with a high failure probability. Elastico requires this low resiliency bound to allow practical committee sizes.
OmniLedger. In a more recent work, Kokoris-Kogias et al. [40] propose OmniLedger, a sharding-based dis-
tributed ledger protocol that attempts to fix some of the issues of Elastico. Assuming a slowly-adaptive adver-
sary that can corrupt up to a 1/4 fraction of the nodes at the beginning of each epoch, the protocol runs a global
reconfiguration protocol at every epoch (about once a day) to allow new participants to join the protocol.
The protocol generates identities and assigns participants to committees using a slow identity blockchain
protocol that assumes synchronous channels. A fresh randomness is generated in each epoch using a bias-
resistant random generation protocol that relies on a verifiable random function (VRF) [49] for unpredictable
leader election in a way similar to the lottery algorithm of Algorand [48]. The consensus protocol assumes
partially-synchronous channels to achieve fast consensus using a variant of ByzCoin [39], where the epoch
randomness is further used to divide a committee into smaller groups. The ByzCoin’s design is known to
have several security/performance issues [54, 4], notably that it falls back to all-to-all communication in the
Byzantine setting. Unfortunately, due to incomplete (and changing) specification of the new scheme, it is
unclear how the new scheme used in OmniLedger can address these issues.
Furthermore, there are several challenges that OmniLedger leaves unsolved: (1) Similar to Elastico, Om-
niLedger can only tolerate t < n/4 corruptions. In fact, the protocol can only achieve low latency (less than 10
seconds) when t < n/8. (2) OmniLedger’s consensus protocol requires O (n) per-node communication as each
committee has to gossip multiple messages to all n nodes for each block of transactions. (3) OmniLedger requires
a trusted setup to generate an initial unpredictable configuration to “seed” the VRF in the first epoch. Trivial
algorithms for generating such a common seed require Ω(n²) bits of communication. (4) OmniLedger requires
the user to participate actively in cross-shard transactions which is often a strong assumption for typically
light-weight users. (5) Finally, OmniLedger seems vulnerable to denial-of-service (DoS) attacks by a malicious
user who can lock arbitrary transactions leveraging the atomic cross-shard protocol.
When t < n/4, OmniLedger can achieve a high throughput (i.e., more than 500 tx/sec) only when an opti-
mistic trust-but-verify approach is used to trade-off between throughput and transaction confirmation latency.
In this approach, a set of optimistic validators process transactions quickly providing provisional commitments
that are later verified by a set of core validators. While such an approach seems useful for special scenarios such
as micropayments to quickly process low-stake small transactions, it can be considered as a high-risk approach
in regular payments, especially due to the lack of financial liability mechanisms in today’s decentralized sys-
tems. Nevertheless, any blockchain protocol (including Bitcoin’s) has a transaction confirmation latency that
has to be considered in practice to limit the transaction risk.
3 Model

Network Model. Consider a peer-to-peer network with n nodes who establish identities (i.e., public/private
keys) through a Sybil-resistant identity generation mechanism such as that of [7], which requires every node to
solve a computationally-hard puzzle on their locally-generated identities (i.e., public keys) verified by all other
(honest) nodes without the assumption of a trusted randomness beacon. Without loss of generality and similar
to most hybrid blockchain protocols [23, 54, 45, 40], we assume all participants in our consensus protocol have
equivalent computational resources.
We assume all messages sent in the network are authenticated with the sender’s private key. The mes-
sages are propagated through a synchronous gossip protocol [34] that guarantees a message sent by an honest
node will be delivered to all honest nodes within a known fixed time, ∆, but the order of these messages is
not necessarily preserved. This is the standard synchronous model adopted by most public blockchain proto-
cols [45, 30, 40, 4]. We require synchronous communication only during our intra-committee consensus. In
other parts of our protocol, we assume partially-synchronous channels [20] between nodes with exponentially-
increasing time-outs (similar to [40]) to minimize latency and achieve responsiveness.
Threat Model. We consider a probabilistic polynomial-time Byzantine adversary who corrupts t < n/3 of the
nodes at any time. The corrupt nodes not only may collude with each other but also can deviate from the
protocol in any arbitrary manner, e.g., by sending invalid or inconsistent messages, or remaining silent. Similar
to most committee-based protocols [23, 39, 54, 40, 4], we assume the adversary is slowly adaptive, meaning that
it is allowed to select the set of corrupt nodes at the beginning of the protocol and/or between each epoch but
cannot change this set within an epoch. Nodes may disconnect from the network during an epoch or between
two epochs due to any reason such as internal failure or network jitter. However, at any moment, at least a 2/3
fraction of the computational resources belong to uncorrupted participants that are online (i.e., respond within
the network time bound). Finally, our protocol does not rely on any public-key infrastructure or any secure
broadcast channel, but assumes the existence of a random oracle needed for collision-resistant hash functions.
Problem Definition. We assume a set of transactions are sent to our protocol by a set of users that are external
to the protocol. Similar to Bitcoin [52], a transaction consists of a set of inputs and outputs that reference other
transactions, and a signature generated by their issuer to certify its validity. The set of transactions is divided
into k disjoint blocks. Let x_{i,j} represent the j-th transaction in the i-th block. All nodes have access to an external function g that, given any transaction, outputs 0 if the transaction is invalid and 1 otherwise, e.g., checking that the sum of all outputs of a transaction is equal to the sum of its inputs. The protocol Π outputs a set X containing k disjoint subsets or shards X_i = {x_{i,j}}, for j ∈ {1..|X_i|}, such that the following conditions hold:
• Agreement: For every i ∈ {1..k}, Ω(log n) honest nodes agree on X_i with a high probability of at least 1 − 2^{−λ}, where λ is the security parameter.
• Efficiency: The per-node communication and computation complexity is o(n) and the per-node storage
complexity is o(s), where s is the total number of transactions.
4 Our Protocol
In this section, we present RapidChain in detail. We start by defining notations and terms used in the rest of
the paper.
Notation and Terminology. We say an event occurs with high probability meaning that it occurs with probability 1 − O(1/2^λ), where λ is the security parameter. We refer to any set of o(n) nodes as a committee if at least a 1/2 fraction of its members belongs to honest nodes. Let node P be a member of a group C. We refer to other members of C as the neighbors of P in C. When we say a committee runs a protocol, we mean all honest members of the committee participate in an execution of the protocol. Let C_1, C_2 be two committees. When we say C_1 sends a message M to C_2, we mean every honest member of C_1 sends M to every member of C_2 whom he knows. Since each member of C_2 may receive different messages due to malicious behavior, it chooses the message sent by at least 1/2 · |C_1| + 1 of the members of C_1 (i.e., a majority).
4.1 Overview

At a high level, RapidChain consists of three components: a one-time Bootstrap phase, followed in every epoch by a Consensus phase and a Reconfiguration phase. We now explain each component in more detail.
Bootstrap. The initial set of participants start RapidChain by running a committee election protocol, where all nodes agree on a group of O(√n) nodes which we refer to as the root group. The group is responsible for generating and distributing a sequence of random bits that are used to establish a reference committee of size O(log n). Next, the reference committee creates k committees {C_1, ..., C_k}, each of size O(log n). The bootstrap phase runs only once at the start of RapidChain.
Consensus. Once members of each committee are done with the epoch reconfiguration, they wait for external
users to submit their transactions. Each user sends its transactions to a subset of nodes (found via a P2P
discovery protocol) who batch and forward the transactions to the corresponding committee responsible for
processing them. The committee runs an intra-committee consensus protocol to approve the transaction and
add it to its ledger.
Reconfiguration. Reconfiguration allows new nodes to establish identities and join the existing committees
while ensuring all the committees maintain their 1/2 resiliency. In Section 4.5, we describe how to achieve this
goal using the Cuckoo rule [60] without regenerating all committees.
In the following, we first describe our Consensus component in Section 4.2 assuming a set of committees
exists. Then, we describe how cross-shard transactions can be verified in Section 4.3, and how committees can
communicate with each other via an inter-committee routing protocol in Section 4.4. Next, we describe the
Reconfiguration component in Section 4.5, and finally, finish this section by describing how to bootstrap the
committees in Section 4.6.
In RapidChain, the source chooses a random subset of size d/(1 − f) of its neighbors and only sends the digest to the nodes in that subset. We refer to this process as sparsification. As a result, a node may receive a message from the source that does not contain all of the digests needed to verify the validity of the gossiped message. Therefore, the node may not be able to validate the message immediately. However, since at least one honest node receives each intermediate digest, it will forward the digest. This guarantees that all the nodes will eventually have all the correct intermediate digests.
In Section 6.2, we show that if an honest node starts the IDA-Gossip protocol for message M in a committee, all honest nodes in that committee will receive M correctly with high probability. We also show that, using sparsification, it suffices to send each intermediate digest to a number of nodes sublinear in the depth of the tree. This guarantees that all nodes can verify the message with high probability.
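The sketch below illustrates the digest side of this mechanism, in a deliberately simplified form: it shows only chunking and Merkle authentication, and omits the erasure coding that lets a receiver reconstruct the block from a subset of chunks. All names are illustrative.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// proof carries the Merkle path a receiver needs to check one chunk.
type proof struct {
	index  int      // position of the chunk among the leaves
	hashes [][]byte // sibling hashes from leaf to root
}

func hashPair(a, b []byte) []byte {
	h := sha256.Sum256(append(append([]byte{}, a...), b...))
	return h[:]
}

// merkleRoot builds the tree over chunk hashes (len(chunks) a power of two).
func merkleRoot(chunks [][]byte) []byte {
	level := make([][]byte, len(chunks))
	for i, c := range chunks {
		h := sha256.Sum256(c)
		level[i] = h[:]
	}
	for len(level) > 1 {
		next := make([][]byte, len(level)/2)
		for i := range next {
			next[i] = hashPair(level[2*i], level[2*i+1])
		}
		level = next
	}
	return level[0]
}

// verifyChunk checks a received chunk against the gossiped root.
func verifyChunk(root, chunk []byte, p proof) bool {
	h := sha256.Sum256(chunk)
	cur := h[:]
	idx := p.index
	for _, sib := range p.hashes {
		if idx%2 == 0 {
			cur = hashPair(cur, sib)
		} else {
			cur = hashPair(sib, cur)
		}
		idx /= 2
	}
	return bytes.Equal(cur, root)
}

func main() {
	chunks := [][]byte{[]byte("c0"), []byte("c1"), []byte("c2"), []byte("c3")}
	root := merkleRoot(chunks)
	l := make([][]byte, 4)
	for i, c := range chunks {
		h := sha256.Sum256(c)
		l[i] = h[:]
	}
	// Proof for chunk 0: its sibling leaf and the opposite subtree hash.
	p := proof{index: 0, hashes: [][]byte{l[1], hashPair(l[2], l[3])}}
	fmt.Println(verifyChunk(root, chunks[0], p)) // true
}
```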
The leader then initiates the consensus protocol on H_i. Before describing the consensus protocol, we remark that all the messages that the leader or other nodes send during the consensus are signed with the sender's private key, and thus the sender of each message and its integrity can be verified.
Our consensus protocol consists of four synchronous rounds. First, the leader gossips a message containing H_i and a tag in the header of the message that the leader sets to propose. Second, all other nodes in the network echo the headers they received from the leader, i.e., they gossip H_i again with the tag echo. This step ensures that all the honest nodes will see all versions of the header that other honest nodes received in the first round. Thus, if the leader equivocates and gossips more than one version of the message, it will be noticed by the honest nodes. In the third round, if an honest node receives more than one version of the header for iteration i, it knows that the leader is corrupt and will gossip H_i' with the tag pending, where H_i' contains a null Merkle root and the iteration number i.
Finally, in the last round, if an honest node receives f + 1 echoes of the same and only header H_i for iteration i, it accepts H_i and gossips H_i with the tag accept along with all the f + 1 echoes of H_i. The f + 1 echoes serve as a proof of why the node accepted H_i. Clearly, it is impossible for any node to create this proof if the leader has not gossiped H_i to at least one honest node. If an honest node accepts a header, then all other honest nodes either accept the same header or reject any header from the leader. In the above scenario, if the leader is corrupt, then some honest nodes reject the header and tag it as pending.
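A schematic sketch of a node's decision rule in the final round might look as follows (the types and names are ours; real messages are signed and gossiped, which is abstracted away here):

```go
package main

import "fmt"

// tag labels the four message kinds of the four synchronous rounds.
type tag int

const (
	propose tag = iota
	echo
	pending
	accept
)

type vote struct {
	header string // Merkle root of the proposed block ("" = null header)
	t      tag
}

// decide runs a node's last-round rule: accept a header iff it is the only
// header echoed for this iteration and it has at least f+1 echoes, which
// proves at least one honest node echoed it.
func decide(echoes map[string]int, f int) vote {
	if len(echoes) != 1 { // leader equivocated: tag the iteration pending
		return vote{header: "", t: pending}
	}
	for h, n := range echoes {
		if n >= f+1 {
			return vote{header: h, t: accept}
		}
	}
	return vote{header: "", t: pending} // too few echoes to prove anything
}

func main() {
	fmt.Println(decide(map[string]int{"H_i": 3}, 2))          // accept
	fmt.Println(decide(map[string]int{"H_i": 3, "H'": 1}, 2)) // pending
}
```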
Definition 1 (Pending Block). A block is pending at iteration i if it is proposed by a leader at some iteration j
before i, while there are honest nodes that have not accepted the block header at iteration i.
Since less than 1/2 of the committee members are corrupt, the leader will be corrupt with a probability less
than 1/2. Thus, to ensure a block header gets accepted, two leaders have to propose it in expectation. One
way to deal with this issue is to ask the leader of the next iteration to propose the same block again if it is still
pending. This, however, reduces the throughput by roughly half.
Figure 1: An example transaction (ID=TX9) spending two input UTXOs (TX1:row2 and TX5:row6, with their signatures) and creating two outputs (TX9:row1 and TX9:row2); the UTXO state before and after shows the spent inputs removed and the new outputs added.

Partitioning the ledger state across committees makes the verification of transactions challenging, because the inputs and outputs of each transaction might reside in multiple committees.
Similar to Bitcoin, each transaction in RapidChain has a unique identity, a list of inputs (referenced by their identities), and a list of outputs, each identified by the transaction ID and a row number (see Figure 1). All inputs to a transaction must be unspent transaction outputs (UTXOs), which are unused coins from previous transactions. The outputs of the transaction are new coins generated for the recipients of the exchanged money. After receiving a transaction, the nodes verify that it is valid by checking (1) that each input is unspent; and (2) that the sum of the outputs does not exceed the sum of the inputs. The nodes add valid transactions to the next block they are accepting. RapidChain partitions the transactions based on their transaction IDs among the committees, which are responsible for storing the transaction outputs in their UTXO databases. Each committee only stores transactions whose IDs have the committee ID as their prefix.
Let tx denote the transaction sent by the user. In the verification process, multiple committees may be involved to ensure all the input UTXOs to tx are valid. We refer to the committee that stores tx and its possible UTXOs as the output committee, and denote it by C_out. We refer to the committees that store the input UTXOs to tx as the input committees, and denote them by C_in^(1), ..., C_in^(N).
To verify the input UTXOs, OmniLedger [40] proposes that the user obtain a proof-of-acceptance from every
input committee and submit the proof to the output committee for validation. If each input committee commits
to tx (and marks the corresponding input UTXO as "spent") independently from other input committees, then
tx may be committed partially, i.e., some of its inputs UTXOs are spent while the others are not. To avoid
this situation and ensure transaction atomicity, OmniLedger takes a two-phase approach, where each input
committee first locks the corresponding input UTXO(s) and issues a proof-of-acceptance, if the UTXO is valid.
The user collects responses from all input committees and issues an “unlock to commit”.
While this allows the output committee to verify tx independently, the transaction has to be gossiped to the
entire network and one proof needs to be generated for every transaction, incurring a large communication
overhead. Another drawback of this scheme is that it depends on the user to retrieve the proof which puts
extra burden on typically lightweight user nodes.
In RapidChain, the user does not attach any proof to tx. Instead, we let the user communicate with any committee, which routes tx to the output committee via the inter-committee routing protocol. Without loss of generality, we assume tx has two inputs I_1, I_2 and one output O. If I_1, I_2 belong to committees other than C_out, then the leader of C_out creates three new transactions: for i ∈ {1, 2}, tx_i with input I_i and output I_i', where |I_i'| = |I_i| (i.e., the same amount) and I_i' belongs to C_out; and tx_3 with inputs I_1', I_2' and output O. The leader sends tx_i to C_in^(i) via the inter-committee routing protocol, and C_in^(i) adds tx_i to its ledger. If tx_i succeeds, C_in^(i) sends I_i' to C_out. Finally, C_out adds tx_3 to its ledger.
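The following sketch shows this splitting step as a simplified model with illustrative types; signatures, routing, and failure handling are omitted:

```go
package main

import "fmt"

// utxo is a simplified coin: its "TXid:row" identity, amount, and the
// committee (shard) that stores it.
type utxo struct {
	id    string
	value int
	shard int
}

type tx struct {
	inputs  []utxo
	outputs []utxo
}

// splitCrossShard returns tx1, tx2 (one per input committee) and tx3,
// which spends the relocated inputs I1', I2' locally inside C_out.
func splitCrossShard(in1, in2, out utxo, outShard int) (tx, tx, tx) {
	i1p := utxo{id: in1.id + "'", value: in1.value, shard: outShard}
	i2p := utxo{id: in2.id + "'", value: in2.value, shard: outShard}
	tx1 := tx{inputs: []utxo{in1}, outputs: []utxo{i1p}}      // sent to C_in^(1)
	tx2 := tx{inputs: []utxo{in2}, outputs: []utxo{i2p}}      // sent to C_in^(2)
	tx3 := tx{inputs: []utxo{i1p, i2p}, outputs: []utxo{out}} // local to C_out
	return tx1, tx2, tx3
}

func main() {
	i1 := utxo{"TX1:row2", 5, 3}
	i2 := utxo{"TX5:row6", 7, 9}
	o := utxo{"TX9:row1", 12, 1}
	t1, t2, t3 := splitCrossShard(i1, i2, o, 1)
	fmt.Println(t1, t2, t3)
}
```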
Batching Verification Requests. At each round, the output committee combines the transactions that use
UTXOs belonging to the same input committee into batches and sends a single UTXO request to the input
committee. The input committee checks the validity of each UTXO and sends the result of the batch to the
output committee. Since multiple UTXO requests are batched into the same request, a single response generated by the input committee can cover the requests of many transactions.
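A minimal sketch of this batching step, assuming a committeeOf mapping like the one shown earlier (all names are illustrative):

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// utxoRef names one input UTXO a transaction wants to spend.
type utxoRef struct {
	txID string // ID of the transaction that created the UTXO
	row  int    // row number within that transaction's outputs
}

// committeeOf maps the UTXO's transaction ID to the committee storing it
// (here: first byte of the hash modulo the number of committees k).
func committeeOf(txID string, k int) int {
	h := sha256.Sum256([]byte(txID))
	return int(h[0]) % k
}

// groupByCommittee batches all input UTXOs referenced in a block by their
// input committee, so only one request per committee is sent per round.
func groupByCommittee(refs []utxoRef, k int) map[int][]utxoRef {
	batches := make(map[int][]utxoRef)
	for _, r := range refs {
		c := committeeOf(r.txID, k)
		batches[c] = append(batches[c], r)
	}
	return batches
}

func main() {
	refs := []utxoRef{{"TX1", 2}, {"TX5", 6}, {"TX7", 3}}
	fmt.Println(groupByCommittee(refs, 16))
}
```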
Figure 2: (Left) Each committee in RapidChain maintains a routing table containing log n other committees. (Right) Committee C_0 wants to locate committee C_7 (via C_4 and C_6) responsible for transactions with prefix 0x111.
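A toy version of this lookup for the 8-committee example of Figure 2 (our own simplification; the real routing tables hold node identities, not just committee IDs):

```go
package main

import "fmt"

// Committees are numbered 0..2^bits-1 by their ID prefix. Committee i keeps
// links to committees at distances 2^0, 2^1, ..., 2^(bits-1) (mod 2^bits),
// so a greedy lookup shrinks the remaining distance geometrically: O(log k)
// hops for k committees.

const bits = 3 // 2^3 = 8 committees, C0 .. C7, as in Figure 2

func dist(a, b int) int { return (b - a + (1 << bits)) % (1 << bits) }

// neighbors returns committee id's routing table.
func neighbors(id int) []int {
	var out []int
	for j := 0; j < bits; j++ {
		out = append(out, (id+(1<<j))%(1<<bits))
	}
	return out
}

// route returns the hop sequence from committee src to committee dst,
// greedily forwarding to the known committee closest to dst.
func route(src, dst int) []int {
	path := []int{src}
	for cur := src; cur != dst; {
		best := cur
		for _, n := range neighbors(cur) {
			if dist(n, dst) < dist(best, dst) {
				best = n
			}
		}
		cur = best
		path = append(path, cur)
	}
	return path
}

func main() {
	fmt.Println(route(0, 7)) // [0 4 6 7]: C0 reaches C7 via C4 and C6
}
```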
Protocol 1 Epoch Reconfiguration
1. Randomness generation during epoch i − 1
(a) The reference committee (C_R) runs the DRG protocol to generate a random string r_i for the next epoch.
(b) Members of C_R reveal r_i at the end of epoch i − 1.
2. Join during epoch i
(a) Invariant: All committees at the start of epoch i receive the random string r_i from C_R.
(b) Each new node P locally chooses a public key PK and contacts a random committee C to request a PoW puzzle.
(c) C sends the r_i for the current epoch along with a timestamp and the identities of 16 random nodes in C_R to P.
(d) All nodes who wish to participate in the next epoch find x such that O = H(timestamp||PK||r_i||x) ≤ 2^{γ−d} and send x to C_R.
(e) C_R confirms a solution if it is received before the end of epoch i.
3. Cuckoo exchange at epoch i + 1
(a) Invariant: All members of C_R participate in the DRG protocol during epoch i and hold the value r_{i+1}.
(b) Invariant: During epoch i, all members of C_R receive all the confirmed transactions (PoW solutions) for the active nodes of epoch i + 1.
(c) Members of C_R create the list of all active nodes for epoch i + 1 as well as A, the set of active committees, and I, the set of inactive committees.
(d) C_R uses r_{i+1} to assign a committee in A to each new node.
(e) For each committee C with a new member, C_R chooses a constant number of members of C uniformly at random (using r_{i+1} as the seed) and evicts them.
(f) For each evicted node, C_R chooses a committee in I uniformly at random (using r_{i+1} as the seed) and assigns the node to its new committee.
(g) C_R adds r_i and the new list of all members and their committees to the first block of the epoch.
(h) C_R gossips the first block to all committees in the system using the inter-committee routing protocol.
Offline PoW. RapidChain relies on PoW only to protect against Sybil attacks by requiring every node who
wants to join or stay in the protocol to solve a PoW puzzle. In each epoch, a fresh puzzle is generated based on
the epoch randomness so that the adversary cannot precompute the solutions ahead of the time to compromise
the committees. In RapidChain, all nodes solve a PoW offline without making the protocol stop and wait for
the solution. Thus, the expensive PoW calculations are performed off the critical latency path.
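A sketch of the puzzle search of Protocol 1, step 2(d), with placeholder inputs (γ = 256 for SHA-256; the concatenation format is illustrative, not the protocol's wire format):

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/big"
)

// solvePuzzle searches for x with H(timestamp || PK || r_i || x) <= 2^(256-d),
// where d is the difficulty. The epoch randomness r_i keeps the puzzle fresh,
// so solutions cannot be precomputed before the epoch starts.
func solvePuzzle(timestamp, pk, ri []byte, d uint) uint64 {
	target := new(big.Int).Lsh(big.NewInt(1), 256-d) // 2^(256-d)
	for x := uint64(0); ; x++ {
		msg := fmt.Sprintf("%s||%s||%s||%d", timestamp, pk, ri, x)
		h := sha256.Sum256([]byte(msg))
		if new(big.Int).SetBytes(h[:]).Cmp(target) <= 0 {
			return x
		}
	}
}

func main() {
	// d = 16 keeps this demo fast; real deployments use a larger difficulty.
	x := solvePuzzle([]byte("ts"), []byte("PK"), []byte("r_i"), 16)
	fmt.Println("solution:", x) // verified by C_R before the epoch ends
}
```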
Since the adversary controls at most a 1/3 fraction of the total computational power during each epoch, the fraction of adversarial nodes is strictly less than 1/3. In RapidChain, the reference committee (C_R) is responsible for checking the PoW solutions of all nodes. At the start of each epoch, C_R agrees on a reference block consisting of the list of all active nodes for that epoch as well as their assigned committees. C_R also informs the other committees by sending them the reference block.
Epoch Randomness Generation. In each epoch, the members of the reference committee run a distributed
random generation (DRG) protocol to agree on an unbiased random value. C R includes the randomness in the
reference block so other committees can randomize their epochs. RapidChain uses a well-known technique
based on the verifiable secret sharing (VSS) of Feldman [29] to generate unbiased randomness within the ref-
erence committee.
Let F_p denote a finite field of prime order p, m denote the size of the reference committee, and r denote the randomness for the current epoch to be generated by the protocol. For all i ∈ [m], node i chooses ρ_i ∈ F_p uniformly at random and VSS-shares it to all other nodes. Next, for all j ∈ [m], let ρ_{1j}, ..., ρ_{mj} be the shares node j receives from the previous step. Node j computes its share of r by calculating Σ_{l=1}^{m} ρ_{lj}. Finally, nodes
exchange their shares of r and reconstruct the result using the Lagrange interpolation technique [33]. The
random generation protocol consists of two phases: sharing and reconstruction. The sharing phase is more
expensive in practice but is executed in advance before the start of the epoch.
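The structure of the protocol can be sketched as follows. This is a deliberately simplified, non-robust version: it sums the contributions directly and omits the Feldman VSS commitments and the Lagrange reconstruction that make the real protocol verifiable and abort-resistant.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

func main() {
	// A prime field F_p (here the Mersenne prime 2^127 - 1).
	p := new(big.Int).Sub(new(big.Int).Lsh(big.NewInt(1), 127), big.NewInt(1))

	m := 7 // size of the reference committee
	r := new(big.Int)
	for i := 0; i < m; i++ {
		rho, err := rand.Int(rand.Reader, p) // node i's contribution rho_i
		if err != nil {
			panic(err)
		}
		r.Add(r, rho).Mod(r, p) // r = rho_1 + ... + rho_m mod p
	}
	// r is uniform as long as one contribution is uniform and independent;
	// in the real protocol, VSS prevents parties from choosing rho_i after
	// seeing others' values.
	fmt.Println("epoch randomness r =", r)
}
```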
Any new node who wishes to join the system can contact any node in any committee at any time and request the randomness of the current epoch as a fresh PoW puzzle. The nodes who solve the puzzle send a
transaction with their solution and public key to the reference committee. If the solution is received by the
reference committee before the end of the current epoch, the solution is accepted and the reference committee
adds the node to the list of active nodes for the next epoch.
Committee Reconfiguration. Partitioning the nodes into committees for scalability introduces a new chal-
lenge when dealing with churn. Corrupt nodes could strategically leave and rejoin the network, so that even-
tually they can take over one of the committees and break the security guarantees of the protocol.
One approach to prevent this attack is to re-create all the committees periodically faster than the adversary’s
ability to generate churn. However, there are two drawbacks to this solution. First, re-generating all of the
committees is very expensive due to the large overhead of the bootstrapping protocol (see Section 5). Second,
maintaining a separate ledger for each committee is challenging when several committee members may be
replaced in every epoch.
To handle this problem, RapidChain uses a modified version of the Cuckoo rule [8, 60] to re-organize
only a subset of committee members during the reconfiguration event at the beginning of each epoch. We refer
to this method as the bounded Cuckoo rule.
Bounded Cuckoo Rule. At the start of each epoch, after defining the set of active nodes who remain in the protocol for the new epoch, the reference committee defines the set of the k/2 largest committees (i.e., those with the most active members) as the active committee set, which we denote by A. We refer to the remaining k/2 smaller committees as the inactive committee set, denoted by I. Active committees accept the nodes that joined the network in the previous epoch as new members. Inactive committees, however, only accept members who were already part of the network. Both active and inactive committees fulfill all other responsibilities in the protocol (such as consensus on blocks and routing transactions) indifferently. For each new node, the reference committee chooses a random committee C_a from the set A. At the same time, the reference committee evicts (cuckoos) a constant number of members from C_a and assigns them to committees chosen randomly from I.
In Section 6.5, we show two invariant properties that are maintained for each committee during the reconfiguration protocol: at any moment, the committees are balanced and honest. The first property ensures a bounded deviation in the sizes of the committees. The second property ensures that each committee maintains its honest majority.
We choose values for |R| and d_R such that the failure probability of our bootstrap phase, i.e., the probability of E(T, S), is minimized. For example, for 4,000 nodes (i.e., |L| = 4,000), we set d_R = 828 (i.e., a group size of 828 nodes) and |R| = 642 to get a failure probability of 2^{−26.36}. In Section 6.1, we use this probability to bound the probability
that each epoch of RapidChain fails.
Once the groups of nodes are formed using the sampler graph, they participate in a randomized election
procedure. Before describing the procedure, we describe how a group of nodes can agree on an unbiased
random number in a decentralized fashion.
Subgroup Election. During the election, members of each group run the DRG protocol to generate a random string s and use it to elect the parties associated with the next-level groups: each node with identity ID computes h = H(s||ID) and announces itself elected if h ≤ 2^{256−e}, where H is a hash function modeled as a random oracle. All nodes sign the (ID, s) of the e elected nodes who have the smallest h and gossip their signatures in the group as a proof of election for the elected nodes. We set e = 2 in practice.
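The election check itself is a one-line predicate; a sketch with γ = 256 for SHA-256 and illustrative inputs:

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/big"
)

// elected returns true if h = H(s || ID) <= 2^(256-e), so each node passes
// independently with probability about 2^-e (e = 2 in practice); the group
// then keeps the e elected nodes with the smallest h.
func elected(s, id []byte, e uint) bool {
	h := sha256.Sum256(append(append([]byte{}, s...), id...))
	threshold := new(big.Int).Lsh(big.NewInt(1), 256-e)
	return new(big.Int).SetBytes(h[:]).Cmp(threshold) <= 0
}

func main() {
	fmt.Println(elected([]byte("epoch-randomness"), []byte("node-42"), 2))
}
```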
Subgroup Peer Discovery. After each subgroup election, all nodes must learn the identities of the elected nodes from each group. The elected nodes gossip this information along with a proof, which consists of d_R/2 signatures on (ID, s) from different members of the group, to all the nodes. If more than e nodes from a group correctly announce that they were elected, the group is dishonest, and all honest parties refuse to accept any message from any elected member of that group.
Committee Formation. The result of executing the above election protocol is a group with an honest majority, which we call the root group. The root group selects the members of the first committee, the reference committee. The reference committee then partitions the set of all nodes at random into sharding committees, which are guaranteed to have at least a 1/2 fraction of honest nodes, and which store the shards as discussed in Section 4.3.
Election Network. The election network is constructed by chaining ℓ sampler graphs {G(L_1, R_1), ..., G(L_ℓ, R_ℓ)} together. The definitions of all sampler graphs are included in the protocol specification. Initially, the n nodes are in L_1. Based on the edges in the graph, every node is assigned to a set of groups in R_1. Then, each group runs the subgroup election protocol (described above) to elect a random subset of its members. The elected members then serve as the nodes in L_2 of G(L_2, R_2). This process continues to the last sampler graph G(L_ℓ, R_ℓ), at which point only a single group is formed. We call this last group the leader group, and we construct the election network such that the leader group has an honest majority with high probability.
To construct the election network, we set

    |L_i| = ⌊|L_{i−1}|^{α_i + β_i γ_i}⌋,   |R_i| = |L_i|^{α_i},        (1)

where |L_1| = n, |R_1| = n^{α_1}, 0 < α_i, β_i, γ_i < 1, and i ∈ {2, ..., ℓ}. It can be easily shown that for some constant ℓ,
|R_ℓ| = 1. From Equation 3, we can bound the error probability for every level i of the election network, denoted by p_i, where

    p_i ≤ 2e^{|L_i| + |R_i| − δ² d_{R_i} |S_i| / 2}.        (2)
In Section 6.4, we discuss how to set the parameters α, β, and γ to instantiate such a graph and present a novel analysis that allows us to obtain better bounds for their sizes.
5 Evaluation
Experimental Setup. We implement a prototype of RapidChain in Go1 to evaluate its performance and com-
pare it with previous work. We simulate networks of up to 4,000 nodes by oversubscribing a set of 32 machines
each running up to 125 RapidChain instances. Each machine has a 64-core Intel Xeon Phi 7210 @ 1.3GHz pro-
cessor and a 10-Gbps communication link. To simulate geographically-distributed nodes, we consider a latency
of 100 ms for every message and a bandwidth of 20 Mbps for each node. Similar to Bitcoin Core, we assume
each node in the global P2P network can accept up to 8 outgoing connections and up to 125 incoming connec-
tions. The global P2P overlay is only used during our bootstrapping phase. During consensus epoch, nodes
communicate through much smaller P2P overlays created within every committee, where each node accepts
up to 16 outgoing connections and up to 125 incoming connections.
1 https://golang.org
Figure 3: The election network: nodes (level 0) form groups (level 1), whose elected members form subgroups at the next level (level 2).
Figure 4: Latency of gossiping an 80-byte message for different committee sizes (left); Impact of block size on through-
put and latency (right)
Unless otherwise mentioned, all numbers reported in this section refer to the expected behavior of the sys-
tem when less than half of all nodes are corrupted. In particular, in our implementation of the intra-consensus
protocol of Section 4.2, the leader gossips two different messages in the same iteration with probability 0.49.
Also, in our inter-committee routing protocol, 49% of the nodes do not participate in the gossip protocol (i.e.,
remain silent).
To obtain synchronous rounds for our intra-committee consensus, we set ∆ (see Section 3 for definition)
conservatively to 600 ms based on the maximum time to gossip an 80-byte digest to all nodes in a P2P network of
250 nodes (our largest committee size) as shown in Figure 4 (left). Recall that synchronous rounds are required
only during the consensus protocol of Ren et al. [58] to agree on a hash of the block resulting in messages of
up to 80 bytes size including signatures and control bits.
We assume each block of transactions consists of 4,096 transactions, where each transaction consists of 512 bytes, resulting in a block size of 2 MB. To implement our IDA-based gossiping protocol to gossip 2-MB blocks
within committees, we split each block into 128 chunks and use the Jerasure library [3] to encode messages us-
ing erasure codes based on Reed-Solomon codes [57] with the decoding algorithm of Berlekamp and Welch [10].
Choice of Block Size. To determine a reasonable block size, we measure the throughput and latency of Rapid-
Chain with various block sizes between 512 KB and 8,192 KB for our target network size of 4,000 nodes. As
shown in Figure 4 (right), larger block sizes generally result in higher throughput but also in higher confirma-
tion latency. To obtain a latency of less than 10 seconds common in most mainstream payment systems while
obtaining the highest possible throughput, we set our block size to 2,048 KB, which results in a throughput of roughly 7,380 tx/sec for our target network of 4,000 nodes (see Table 1).

Figure 5: Throughput of RapidChain for various network sizes (left); impact of batching cross-shard verifications on throughput (right)
Figure 6: Reconfiguration latency when 1, 5, or 10 nodes join (left); Impact of batching cross-shard verifications
(right)
Since multiple joins happen in parallel, the latency does not increase significantly with more joins. Moreover, the network size impacts the reconfiguration latency only slightly because churn mostly affects the committees involved in the reconfiguration process. In contrast, Elastico [45] cannot handle churn in an incremental manner and requires re-initialization of all committees. For a network of 1,800 nodes, epoch transition in OmniLedger [40] takes more than 1,000 seconds, while it takes less than 380 seconds for RapidChain. In practice, OmniLedger's epoch transition takes more than 3 hours, since its distributed random generation protocol has to be repeated at least 10 times to succeed with high probability. Finally, it is unclear how this latency would be affected by the number of nodes joining (and hence redistributing nodes between committees) in OmniLedger.
Impact of Cross-Shard Batching. One of the important features of RapidChain is that it allows batching
cross-shard verification requests in order to limit the amount of inter-committee communications to verify
transactions. This is especially crucial when the number of shards is large because, as we show in Section 6.7,
in our target network size of 4,000 nodes with 16 committees, roughly 99.98% of all transactions are expected to
be cross-shard, meaning that at least one of every transaction’s input UTXOs is expected to be located in a shard
other than the one that will store the transaction itself. Since transactions are assigned to committees based
on their randomly-generated IDs, transactions are expected to be distributed uniformly among committees. As
a result, the size of a batch of cross-shard transactions for each committee for processing every block of size
2 MB is expected to be equal to 2 MB/16 = 128 KB. Figure 5 (right) shows the impact of batching cross-shard
verifications on the throughput of RapidChain for various network sizes.
Storage Overhead. We measure the amount of data stored by each node after 1,250 blocks (about 5 million
transactions) are processed by RapidChain. To compare with previous work, we estimate the storage required
by each node in Elastico and OmniLedger based on their reported throughput and number of shards for similar
network sizes as shown in Table 2.
The storage overhead required by a high-throughput blockchain protocol such as RapidChain can become
problematic in practice. One can employ a ledger pruning/checkpointing mechanism, such as those described
in [43, 40], as an orthogonal mechanism to significantly reduce the storage and node initialization overhead of
RapidChain. For example, a large percentage of storage is currently used to store spent transactions, which can
be removed and replaced with a much smaller aggregate signature from committee members. Since designing
such a mechanism requires extensive effort, we leave it as future work. We refer the reader to the recent
work of Leung et al. [43] that focuses on minimizing the storage/initialization overhead of high throughput
blockchain protocols.
Overhead of Bootstrapping. We measure the overheads of our bootstrapping protocol to setup committees
for the first time in two different experiments with 500 and 4,000 nodes. The measured latencies are 2.7 hours
and 18.5 hours for each experiment respectively. Each participant in these two experiments consumes a band-
width of roughly 29.8 GB and 86.5 GB respectively. Although these latency and bandwidth overheads are sub-
stantial, we note that the bootstrapping protocol is executed only once, and therefore, its overhead can be
Protocol Network Size Storage
Elastico [45] 1,600 nodes 2,400 MB (estimated)
OmniLedger [40] 1,800 nodes 750 MB (estimated)
RapidChain 1,800 nodes 267 MB
RapidChain 4,000 nodes 154 MB
Table 2: Storage required per node after processing 5 M transactions without ledger pruning
amortized over several epochs. Elastico and OmniLedger assume a trusted setup for generating an initial ran-
domness, and therefore, do not report any measurements for such a setup.
Note that one can sample a committee with or without replacement from the total population of nodes.
If the sampling is done with replacement (i.e., committees can overlap), then the failure probability for one
committee can be calculated from the cumulative binomial distribution function,
    Pr[X ≥ ⌊m/2⌋] = Σ_{x=⌊m/2⌋}^{m} (m choose x) f^x (1 − f)^{m−x},

which calculates the probability that no less than ⌊m/2⌋ nodes are corrupt in a committee of m nodes sampled from an infinite pool of nodes, where the probability of each node being corrupt is f = t/n. If the sampling is
done without replacement (as in RapidChain), then the binomial distribution can still be used to approximate
(and bound) the failure probability for one committee. However, when the committee size gets larger relative
to the population size, the hypergeometric distribution yields a better approximation (e.g., roughly 3x smaller
failure probability for n = 2, 000, m = 200, t < n/3).
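For concreteness, these two tail probabilities can be evaluated directly. The following is a minimal sketch (in Python with scipy, our tooling choice rather than anything used in the paper), where a committee fails when at least ⌊m/2⌋ of its m members are corrupt:

```python
from scipy.stats import binom, hypergeom

def p_fail_binomial(n: int, m: int, t: int) -> float:
    # Sampling with replacement: corrupt count ~ Binomial(m, f) with f = t/n.
    # sf(k) = Pr[X > k], so sf(m//2 - 1) = Pr[X >= floor(m/2)].
    return binom.sf(m // 2 - 1, m, t / n)

def p_fail_hypergeom(n: int, m: int, t: int) -> float:
    # Sampling without replacement (as in RapidChain) from a population
    # of n nodes, of which t are corrupt, drawing a committee of m.
    return hypergeom.sf(m // 2 - 1, n, t, m)

n, m, t = 2000, 200, 666  # t < n/3
print(p_fail_binomial(n, m, t), p_fail_hypergeom(n, m, t))
```

The hypergeometric value is the smaller of the two, consistent with the roughly 3x gap noted above.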
Unlike the binomial distribution, the hypergeometric distribution depends directly on the total population
size (i.e., n). Since n can change over time in an open-membership network, the failure probability may change accordingly. To maintain the desired failure probability, each committee in RapidChain runs a consensus at pre-determined intervals, e.g., once a week, to agree on a new committee size, based on which the committee accepts more nodes to join in future epochs.
Figure 7 shows the probability of failure calculated using the hypergeometric distribution to sample a com-
mittee (with various sizes) from a population of 2,000 nodes for two scenarios: (1) n/3 total resiliency and n/2
committee resiliency (as in RapidChain); and (2) n/4 total resiliency and n/3 committee resiliency (as in previ-
ous work [45, 40]). As shown in the figure, the failure probability decreases much faster with the committee
size in the RapidChain scenario.
[Figure 7 plot: log-scale failure probability (1E−08 to 1E+00) versus committee size (0 to 240) for three settings: 1/3 total to 1/2 committee resiliency (RapidChain), 1/4 to 1/3 (previous work), and 1/5 to 1/3.]

Figure 7: Log-scale plot of the probability of failure to sample one committee from a population of 2,000 nodes in RapidChain and previous work [45, 40] using the hypergeometric distribution.

To bound the failure probability of each epoch, we calculate the union bound over k = n/m committees, each of which can fail with the probability p_committee calculated previously. In the first epoch, the committee election procedure (from the bootstrapping protocol) can fail with probability p_bootstrap ≤ 2^{−26.36}. The random generation protocol executed by the reference committee at the beginning of each epoch is guaranteed to generate an unbiased coin with probability one, and the consensus protocol executed by each committee is guaranteed to terminate with probability one. By setting n = 4,000, m = 250, and t < n/3, we have p_committee < 3.7 · 10^{−8}. Therefore, the failure probability of each epoch is at most

$$p_{bootstrap} + k \cdot p_{committee} \le 2^{-26.36} + 16 \cdot 3.7 \cdot 10^{-8} \approx 6 \cdot 10^{-7},$$

which, assuming one epoch per day, corresponds to an expected time-to-failure of more than 4,500 years.
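The same union bound can be checked numerically; the sketch below (ours, in Python) makes the one-epoch-per-day assumption explicit:

```python
k = 4000 // 250        # number of committees (k = n/m)
p_committee = 3.7e-8   # per-committee bound from the hypergeometric tail
p_bootstrap = 2 ** -26.36

p_epoch = p_bootstrap + k * p_committee  # union bound over one epoch
print(f"epoch failure probability ~ {p_epoch:.2e}")          # ~6.0e-07
print(f"time to failure ~ {1 / p_epoch / 365.25:.0f} years")  # >4,500 years at one epoch/day
```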
Ideally, we would hope that the adversary creates only a t/n fraction of the blocks in each epoch, while honest nodes create the remaining 1 − t/n fraction. However, this is not the case for committee-based sharding protocols such as RapidChain. The increase in the adversary's effective power comes from the fact that the upper bound on the fraction of adversarial IDs inside each shard is higher than in the whole system. Thus, we define and calculate the effective power of the adversary.

Definition 2. Adversarial effective power. The ratio of the number of blocks that are created by the adversary and added to the chain to the total number of blocks.

Theorem 1. The effective power of the adversary is 1/2; i.e., the adversary can create half of the blocks in the system.

The proof of this theorem follows from the choices of the target failure probability and committee sizes discussed in this section.
Next, we show that sparsification does not change the correctness or security of the gossiping protocol, and only slightly increases its failure probability.

Lemma 2. Assume the gossiper sparsifies all Merkle tree nodes up to some level i. The probability that a large message is not received correctly after gossiping with sparsification is at most 0.1 + 2^{−(s−i−1)}, where s is the size of the subset of nodes to whom the gossiper sends each sparsified node; s can be set based on the desired failure probability (as a function of the importance of the message).
Proof. The reconstruction stage fails if there is a node in the tree for which the hash at that node is distributed only to corrupt nodes. We call a tree node sparsified if the hash T(M_i, M_j) at the node is not sent along with all of the leaves that require that hash for verification of the block. We sparsify all nodes up to some level i. The sender can calculate s, the size of the subset of nodes to whom he sends each sparsified node, to guarantee that with probability at least 1 − 2^{−c} the node is sent to at least one honest node. Let l(x) count the number of leaf nodes in the sub-tree rooted at node x, and u(x) count the number of corrupt nodes in the sub-tree rooted at node x.

If a sparsified node is distributed to s nodes at random, the probability that only corrupt nodes receive it is at most f^s. Therefore, taking the union bound over all 2^{i+1} − 1 sparsified nodes and setting f < 1/2, the probability that reconstruction fails due to sparsification is at most (2^{i+1} − 1) f^s < 2^{i+1−s} = 2^{−(s−i−1)}. Adding the failure probability of the gossiping protocol itself yields the bound 0.1 + 2^{−(s−i−1)}.
The difference between using and not using sparsification is that, with sparsification, the gossiper slightly decreases the chance of a successful gossip but, in return, puts less communication burden on the nodes and the network. Since gossiping blocks is crucial to the system, we do not use sparsification for them. However, users can use sparsification for their large transactions if the transaction is not time-sensitive. In the (low-probability) case that the transaction fails to be gossiped correctly due to sparsification, the user can re-send the Merkle tree nodes later.
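As an illustration of how a sender might choose s, the helper below (our sketch in Python; the function name is hypothetical) returns the smallest s whose extra failure term 2^{−(s−i−1)} from Lemma 2 is at most a target ε:

```python
from math import ceil, log2

def min_subset_size(i: int, eps: float) -> int:
    """Smallest s such that the extra failure term 2**-(s - i - 1)
    from Lemma 2 is at most eps, when sparsifying up to level i."""
    return i + 1 + ceil(log2(1 / eps))

# e.g., sparsifying up to level i = 4 with a 2**-20 target needs s = 25
print(min_subset_size(4, 2 ** -20))
```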
since it is safe (see the theorem for safety). Thus, all honest nodes will vote for the proposed header for all the pending blocks; subsequently, they will receive f + 1 votes for them since we have f + 1 honest nodes. Therefore, all honest nodes have finalized the block at the end of this iteration.
Security of Cross-Shard Verification. Without loss of generality, we assume tx has two inputs I_1, I_2 and one output O. If the user provides valid inputs, both input committees successfully verify their sub-transactions and send the new UTXOs to the output committee. Thus, tx_3 will be successful. If both inputs are invalid, neither input committee sends I_i', and as a result tx_3 will not be accepted. If I_1 is valid but I_2 is invalid, then C_1^in successfully transfers I_1' but C_2^in will not send I_2'. Thus, tx_3 will not be accepted in C^out. However, I_1' is a valid UTXO in the output committee and the user can spend it later. The output committee can send the resulting UTXO to the user.
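To make the case analysis concrete, here is a minimal sketch (in Python; the UTXO fields and function are our illustration, not the paper's transaction format) of splitting a two-input transaction into tx_1, tx_2, and tx_3:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class UTXO:
    owner: str   # illustrative fields only
    value: int
    shard: int   # index of the committee that stores this UTXO

Tx = Tuple[List[UTXO], List[UTXO]]  # (inputs, outputs)

def split_cross_shard_tx(i1: UTXO, i2: UTXO, recipient: str,
                         out_shard: int) -> Tuple[Tx, Tx, Tx]:
    """Decompose tx = (I_1, I_2 -> O): tx_1 and tx_2 move each input into
    the output committee as I_1' and I_2'; tx_3 spends them inside C^out."""
    i1p = UTXO(i1.owner, i1.value, out_shard)  # I_1', sent by C_1^in if I_1 is valid
    i2p = UTXO(i2.owner, i2.value, out_shard)  # I_2', sent by C_2^in if I_2 is valid
    o = UTXO(recipient, i1.value + i2.value, out_shard)  # the final output O
    tx1: Tx = ([i1], [i1p])
    tx2: Tx = ([i2], [i2p])
    tx3: Tx = ([i1p, i2p], [o])
    return tx1, tx2, tx3
```

If an input committee declines, tx_3 never receives the corresponding I_i' and cannot be finalized, matching the cases above.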
Original Analysis. Recall that L represents the set of parties and R represents the set of groups, where each group is formed from the parties in L connected to it by edges.

Lemma 3. In a sampler graph with a random assignment of honest parties and an adversarial assignment of dishonest parties, the probability that all the formed groups in any subset of size |S| are corrupted is less than 2e^{(|L|+|R|) − δ² d_R |S| / 2}.
Consider two fixed subsets of nodes T ⊆ L and S ⊆ R. Let N denote the number of edges between T and S. One can think of T as the largest coalition of faulty parties and S as any subset of groups. Let E(T, S) denote the event that every node in S has more than a (|T|/|L| + δ) fraction of its edges incident to nodes in T. The number of edges incident to S is equal to d_R |S|. By linearity of expectation, E[N] = (|T|/|L|) d_R |S|. Suppose that we add the edges to S one by one. Let {X_i} denote a sequence of random variables such that X_i = 1 if and only if the i-th edge is incident to T. The sequence {Z_i = E[N | X_0, ..., X_i]} defines a Doob martingale [25, Chapter 5], where Z_0 = E[N]. If any of the trials is altered, N changes by at most one. For some positive δ, the failure event E happens whenever

$$N > \left(\frac{|T|}{|L|} + \delta\right) d_R |S| = E[N] + \delta d_R |S|.$$

By Azuma's inequality,

$$\Pr\!\left[N - E[N] \ge \delta d_R |S|\right] \le 2e^{-(\delta d_R |S|)^2 / (2 d_R |S|)} = 2e^{-\delta^2 d_R |S| / 2}.$$

Taking a union bound over all 2^{|L|} choices of T and all 2^{|R|} choices of S yields the bound of Lemma 3.
Lemma 4. In a sampler graph with a random assignment of honest parties and an adversarial assignment of dishonest parties, the probability that all the formed groups in any subset of size |R'| are corrupted is less than β 2^{|L|} 2^{|R|} (e^{−µ_b}(eµ_b/x)^x + e^{−µ_g}(eµ_g/(2x))^{2x})^{|R'|}.
Proof. The sampler graph selection process can be seen as the classic balls-and-bins process: |L| d_L balls (parties) are thrown independently and uniformly at random into |R| bins (groups). Without loss of generality, we can assume we first throw all dishonest parties (bad balls) and then all honest parties (good balls).

For a fixed committee C, let X_b be a random variable representing the number of dishonest parties assigned to C, let X_g be a random variable representing the number of honest parties assigned to C, and let µ_g and µ_b be the expected number of honest and dishonest parties per group, respectively. It is well known that the distribution of the number of (good/bad) balls in a bin is approximately Poisson with means µ_b = d_L |L| / (4|R|) and µ_g = 3 d_L |L| / (4|R|) [51, Chapter 5]. Let X̃ be the Poisson random variable approximating X; we have µ = E[X] = E[X̃]. We use the following Chernoff bounds from [51, Chapter 5] for a Poisson random variable X̃ with mean µ:

$$\Pr[\tilde{X} \ge x] \le e^{-\mu}\left(\frac{e\mu}{x}\right)^{x} \text{ for } x > \mu, \qquad \Pr[\tilde{X} \le x] \le e^{-\mu}\left(\frac{e\mu}{x}\right)^{x} \text{ for } x < \mu.$$
We consider a group to be good if X_g > x and X_b < x; this ensures that a good group has an honest majority. Note that this definition is an under-estimate, and we do not count some of the good groups. Based on this definition, a group is bad if X_g ≤ x or X_b ≥ x.

The probability that a fixed committee is bad is therefore at most

$$\Pr[X_b \ge x] + \Pr[X_g \le x] \le e^{-\mu_b}\left(\frac{e\mu_b}{x}\right)^{x} + e^{-\mu_g}\left(\frac{e\mu_g}{x}\right)^{x}.$$

Now, consider a subset of groups of size |R'|; the probability that all of them are bad is at most (e^{−µ_b}(eµ_b/x)^x + e^{−µ_g}(eµ_g/x)^x)^{|R'|}.

Since the adversary can choose the bad parties and the bad groups, we take a union bound over all possible adversarial choices, i.e., over all 2^{|L|} choices of bad parties and all 2^{|R|} subsets of groups, to find the probability that all the groups in any subset of size |R'| are corrupted: 2^{|L|} 2^{|R|} (e^{−µ_b}(eµ_b/x)^x + e^{−µ_g}(eµ_g/x)^x)^{|R'|}.
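For intuition, the per-group bound can be evaluated numerically; the sketch below is ours, and the parameters are purely illustrative rather than taken from the paper:

```python
from math import e, exp

def poisson_tail(mu: float, x: float) -> float:
    # Chernoff bound for X ~ Poisson(mu):
    #   Pr[X >= x] <= e^-mu (e*mu/x)^x for x > mu, and
    #   Pr[X <= x] <= e^-mu (e*mu/x)^x for x < mu.
    return exp(-mu) * (e * mu / x) ** x

# illustrative: 20 expected bad and 60 expected good parties per group, threshold x = 40
mu_b, mu_g, x = 20.0, 60.0, 40.0
p_bad_group = poisson_tail(mu_b, x) + poisson_tail(mu_g, x)
print(p_bad_group)  # bound on Pr[X_b >= x] + Pr[X_g <= x]
```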
Tighter Bound. Not all choices give the adversary the same probability of success. Thus, we can identify strategies that are strictly worse than some other strategy and remove them from the union bound, since it is not beneficial for the adversary to choose them. We consider the following random process:

1. All good parties are assigned randomly to groups.

2. The adversary assigns an α fraction of its bad parties to the groups such that the assignment corrupts the maximum number of groups.

3. The adversary assigns the remaining (1 − α) fraction of bad parties such that each party is assigned to at least one good group.

We claim that any strategy that does not follow this process, i.e., one that assigns a bad party only to already-bad committees in step (3), is strictly worse, since assigning bad parties to groups that are already bad does not increase the adversary's chance of corrupting a new group.

Similar to the previous analysis, we can calculate the probability that a set of size |R'| contains only bad committees after throwing all good parties and an α fraction of the bad parties.
Protocol          ID Generation   Bootstrap   Consensus            Storage per Node
Elastico [45]     O(n²)           Ω(n²)       O(m²/b + n)          O(|B|)
OmniLedger [40]   O(n²)           Ω(n²)       Ω(m²/b + n)          O(m · |B|/n)
RapidChain        O(n²)           O(n√n)      O(m²/b + m log n)    O(m · |B|/n)
Now, we can calculate the fraction of strategies that the adversary ignores due to the rule in step (3),

$$\beta = \frac{|R'|!\,(|R'| - \log n)!}{|R|!\,(|R| - \log n)!}. \qquad (9)$$

Thus, in our union bound, we can ignore this fraction.
Lemma 6. Any k-region in active committees has age at most λ(n/2k ) log n.
Proof. The probability that a k-region R_i is evicted at any round is 2k/n, since at any round we have m/2 active committees and, as a result, half of the k-regions will accept a new join. Conditioned on the event that the committee does not become inactive during this time, we can assume this probability is independent across rounds. Note that this condition considers the worst-case scenario, since otherwise the committee becomes inactive during this time. Hence, the probability that R_i has age at least λ(n/2k) log n is (1 − 2k/n)^{λ(n/2k) log n} ≤ e^{−(2k/n) λ(n/2k) log n} = n^{−λ}.
Lemma 7. Any fixed node v in an active committee gets replaced at most (1 + δ) λ log n times within λ(n/2k) log n rounds.

Proof. We prove this lemma conditioned on the event that the node is placed in an active committee with probability 1/2, i.e., half of the committees become active after one node joins them. This condition considers the worst-case scenario, in which we assume that half of the inactive committees have numbers of nodes very close to being active in the next round. Let the indicator random variable z_t = 1 if node v is replaced in round t, and z_t = 0 otherwise. We have Pr[z_t = 1] = (1/2)(2k/n) = k/n, since at any round we randomly choose a region to evict from all active regions. Let Z = Σ_{t=0}^{λ(n/2k) log n} z_t. We can compute E[Z] = λ(n/2k) log n · (k/n) = (1/2) λ log n. Using the Chernoff bound, we can show that Z < (1 + δ) E[Z] with high probability.
We state the following lemma from [8] (Lemma 2.8), which we use directly since our construction does not change the fraction of honest and corrupt nodes in any way relative to their construction.

Lemma 8. Let t be the maximum number of corrupt nodes at any point in the system. At any time, a fixed committee has within (1 − t/n)(1 ± δ) ck log n / 2 old honest nodes and (t/(n − t))(1 ± δ) ck log n / 2 old corrupt nodes with high probability.
The balancedness and honesty properties of committees follow, but we omit the details here.

Proof. The proof is similar to the proof of Lemma 2.8 from [8], with different parameters based on Lemmas 5, 6, and 7.
Theorem 5. At any time during the protocol, all committees satisfy the balancing and honesty conditions.
Proof. To prove the theorem, it is enough to prove the balancing and honesty properties for any committee.
First note that the number of new nodes in each committee is at most c log n. We also calculated the number
of old nodes in Lemma 8.
Balanced Committees. The maximum number of nodes in each committee is c log n + (c/2)(1 + δ)(3 − t/n + t/(n − t)) k log n, and the minimum load is (c/2)(1 − δ) log n.

Honest Committees. Choosing k such that t/(n − t) < 1 − 1/k, any committee has (1 − t/n)(1 − δ) ck log n / 2 honest and (t/(n − t))(1 + δ) ck log n / 2 corrupt nodes with high probability. Note that these values are calculated for the worst-case scenario, in which the adversary targets a committee of size (c log n) k/n.
a cross-shard transaction, and the leader requests a verification proof from a constant number of committees, assuming tx has a constant number of inputs and outputs. The cost of routing the requests and their responses is O(m log n), and the cost of consensus on the request and response is O(m²/b) (amortized over the block size due to batching), which happens in every input and output committee. Thus, the total per-transaction communication complexity of a consensus iteration is O(m²/b + m log n).
Complexity of Bootstrap Protocol. Assuming a group size of O(√n) and O(√n) groups, we count the total communication complexity of our bootstrap protocol as follows. Each group runs the DRG protocol, which requires O(n) communication since each group has O(√n) nodes; across O(√n) groups, the total complexity is O(n√n). After DRG, a constant number of members from each group gossip the result, incurring an O(n√n) overhead, where O(n) is the cost of each gossip. Since the number of nodes in the root group is also O(√n), its message complexity to generate a randomness and to gossip it to all of its members for electing a reference committee is O(n).
Storage Complexity. Let |B| denote the size of the blockchain. We divide the ledger among n/m shards; thus, each node stores O(m · |B|/n) amount of data. Note that we store a reference block at the start of each epoch that contains the list of all nodes and their corresponding committees. This cost is asymptotically negligible in the size of the ledger that each node has to store, as long as n = o(|B|).
$$F(u, v, k) = \begin{cases} 1, & \text{if } u = v = 1 \\ (1/k)^{u}, & \text{if } v = 1 \\ \frac{k - v}{k}\, F(u-1, v-1, k), & \text{if } u = v \\ \frac{k - v}{k}\, F(u-1, v-1, k) + \frac{v}{k}\, F(u-1, v, k), & \text{otherwise.} \end{cases} \tag{10}$$

For our target network of 4,000 nodes, where we create k = 16 committees, almost all transactions are expected to be cross-shard because 1 − F(3, 1, 16) = 99.98%. In comparison, for a smaller network of 500 nodes, where we create only 3 committees, this probability is equal to 1 − F(3, 1, 3) = 96.3%.
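As a sanity check, a direct transcription of equation (10) (our sketch, in Python) reproduces both numbers; 1 − F(u, 1, k) is the cross-shard probability for a transaction with u inputs and k committees:

```python
def F(u: int, v: int, k: int) -> float:
    """Direct transcription of the recursion in equation (10)."""
    if u == 1 and v == 1:
        return 1.0
    if v == 1:
        return (1.0 / k) ** u
    if u == v:
        return (k - v) / k * F(u - 1, v - 1, k)
    return (k - v) / k * F(u - 1, v - 1, k) + v / k * F(u - 1, v, k)

print(1 - F(3, 1, 16))  # ~0.9998 for 16 committees (4,000 nodes)
print(1 - F(3, 1, 3))   # ~0.963 for 3 committees (500 nodes)
```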
7 Acknowledgment
The authors would like to acknowledge support from NSF grants CNS-1633282, 1562888, 1565208, and DARPA
SafeWare W911NF-15-C-0236 and W911NF-16-1-0389. We are also grateful for kind help from Loi Luu (NUS)
and Aditya Sinha (Yale), and invaluable comments from Dominic Williams (Dfinity), Timo Hanke (Dfinity),
Bryan Ford (EPFL), and Eleftherios Kokoris Kogias (EPFL).
¹A similar calculation is done in [40] but the presented formula is, unfortunately, incorrect.
8 Conclusion
We present RapidChain, the first 1/3-resilient sharding-based blockchain protocol that is highly scalable to large
networks. RapidChain uses a distributed ledger design that partitions the blockchain across several commit-
tees along with several key improvements that result in significantly-higher transaction throughput and lower
latency. RapidChain seamlessly handles churn, introducing minimal changes to committee membership without affecting transaction latency. Our system also features several improved protocols for fast gossip of
large messages and inter-committee routing. Finally, our empirical evaluation demonstrates that RapidChain
scales smoothly to network sizes of up to 4,000 nodes showing better performance than previous work.
References
[1] Blockchain charts: Blockchain size, March 2017. Available at https://blockchain.info/charts/
blocks-size.
[2] Blockchain charts: Hashrate distribution, March 2017. Available at https://blockchain.info/pools.
[3] Jerasure: Erasure coding library, May 2018. Available at http://jerasure.org.
[4] Ittai Abraham, Dahlia Malkhi, Kartik Nayak, Ling Ren, and Alexander Spiegelman. Solida: A blockchain
protocol based on reconfigurable byzantine consensus. In Proceedings of the 21st International Conference
on Principles of Distributed Systems, OPODIS ’17, Lisboa, Portugal, 2017.
[5] Noga Alon, Haim Kaplan, Michael Krivelevich, Dahlia Malkhi, and JP Stern. Addendum to scalable secure
storage when half the system is faulty. Information and Computation, 2004.
[6] Noga Alon, Haim Kaplan, Michael Krivelevich, Dahlia Malkhi, and Julien Stern. Scalable secure storage
when half the system is faulty. In Proceedings of the 27th International Colloquium on Automata, Languages
and Programming, 2000.
[7] Marcin Andrychowicz and Stefan Dziembowski. PoW-Based Distributed Cryptography with No Trusted
Setup, pages 379–399. Springer Berlin Heidelberg, Berlin, Heidelberg, 2015.
[8] Baruch Awerbuch and Christian Scheideler. Towards a scalable and robust DHT. In Proceedings of the
Eighteenth Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA ’06, pages 318–
327, New York, NY, USA, 2006. ACM.
[9] Shehar Bano, Alberto Sonnino, Mustafa Al-Bassam, Sarah Azouvi, Patrick McCorry, Sarah Meiklejohn,
and George Danezis. Consensus in the age of blockchains. CoRR, abs/1711.03936, 2017.
[10] E Berlekamp and L Welch. Error correction for algebraic block codes, US Patent 4,633,470, December 1986.
[11] Richard E Blahut. Theory and practice of error control codes, volume 126. Addison-Wesley Reading (Ma)
etc., 1983.
[12] Gabriel Bracha. An O(log n) expected rounds randomized Byzantine generals protocol. In Proceedings of the
Seventeenth Annual ACM Symposium on Theory of Computing, STOC ’85, pages 316–326, New York, NY,
USA, 1985. ACM.
[13] Gabriel Bracha. An asynchronous [(n − 1)/3]-resilient consensus protocol. In Proceedings of the Third
Annual ACM Symposium on Principles of Distributed Computing, PODC ’84, pages 154–162, New York, NY,
USA, 1984. ACM.
[14] Gabriel Bracha. Asynchronous Byzantine agreement protocols. Information and Computation, 75(2):130–
143, November 1987.
[15] Gabriel Bracha and Sam Toueg. Resilient consensus protocols. In Proceedings of the Second Annual ACM
Symposium on Principles of Distributed Computing, PODC ’83, pages 12–26, New York, NY, USA, 1983.
ACM.
[16] Vitalik Buterin. Ethereum’s white paper. https://github.com/ethereum/wiki/wiki/White-Paper,
2014.
[17] Christian Cachin, Klaus Kursawe, and Victor Shoup. Random oracles in constantinople: Practical asyn-
chronous Byzantine agreement using cryptography. In Proceedings of the 19th ACM Symposium on Prin-
ciples of Distributed Computing (PODC), pages 123–132, 2000.
[18] Ran Canetti and Tal Rabin. Fast asynchronous byzantine agreement with optimal resilience. In Proceedings
of the Twenty-fifth Annual ACM Symposium on Theory of Computing, STOC ’93, pages 42–51, New York,
NY, USA, 1993. ACM.
[19] M. Castro and B. Liskov. Practical Byzantine fault tolerance and proactive recovery. ACM Transactions on
Computer Systems (TOCS), 20(4):398–461, 2002.
[20] Miguel Castro and Barbara Liskov. Practical byzantine fault tolerance. In Proceedings of the Third Sympo-
sium on Operating Systems Design and Implementation, OSDI ’99, pages 173–186, 1999.
[21] James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, J. J. Furman, Sanjay
Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson Hsieh, Sebastian Kanthak,
Eugene Kogan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan,
Rajesh Rao, Lindsay Rolig, Yasushi Saito, Michal Szymaniak, Christopher Taylor, Ruth Wang, and Dale
Woodford. Spanner: Google's globally-distributed database. In Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation, OSDI '12, pages 251–264, 2012.
[22] C. Decker and R. Wattenhofer. Information propagation in the Bitcoin network. In P2P, pages 1–10. IEEE,
2013.
[23] Christian Decker, Jochen Seidel, and Roger Wattenhofer. Bitcoin meets strong consistency. In Proceedings
of the 17th International Conference on Distributed Computing and Networking, ICDCN ’16, pages 13:1–
13:10, New York, NY, USA, 2016. ACM.
[24] John Douceur. The Sybil attack. In Proceedings of the Second Internation Peer-to-Peer Symposium (IPTPS),
2002.
[25] Devdatt P. Dubhashi and Alessandro Panconesi. Concentration of Measure for the Analysis of Randomized
Algorithms. Cambridge University Press, New York, NY, USA, 2009.
[26] Cynthia Dwork and Moni Naor. Pricing via processing or combatting junk mail. In Advances in Cryptology
— CRYPTO’ 92: 12th Annual International Cryptology Conference Santa Barbara, California, USA August 16–
20, 1992 Proceedings, pages 139–147, Berlin, Heidelberg, 1993. Springer Berlin Heidelberg.
[27] David S. Evans. Economic aspects of Bitcoin and other decentralized public-ledger currency platforms. In
Coase-Sandor Working Paper Series in Law and Economics, No. 685. The University of Chicago Law School,
2014.
[28] Ittay Eyal, Adem Efe Gencer, Emin Gün Sirer, and Robbert Van Renesse. Bitcoin-NG: A scalable blockchain
protocol. In Proceedings of the 13th Usenix Conference on Networked Systems Design and Implementation,
NSDI’16, pages 45–59, Berkeley, CA, USA, 2016. USENIX Association.
[29] Paul Feldman. A practical scheme for non-interactive verifiable secret sharing. In Proceedings of the 28th
Annual Symposium on Foundations of Computer Science, SFCS ’87, pages 427–438, Washington, DC, USA,
1987. IEEE Computer Society.
[30] Yossi Gilad, Rotem Hemo, Silvio Micali, Georgios Vlachos, and Nickolai Zeldovich. Algorand: Scaling
byzantine agreements for cryptocurrencies. In Proceedings of the 26th Symposium on Operating Systems
Principles, SOSP ’17, pages 51–68, New York, NY, USA, 2017. ACM.
[31] Timo Hanke. Dfinity white paper: Our consensus algorithm. https://medium.com/dfinity/
dfinity-white-paper-our-consensus-algorithm-a11adc0a054c, 2018.
[32] Egor Homakov. Stop. calling. bitcoin. decentralized. https://medium.com/@homakov/
stop-calling-bitcoin-decentralized-cb703d69dc27, 2017.
[33] Min Huang and Vernon J. Rego. Polynomial evaluation in secret sharing schemes, 2010. URL: http:
//csdata.cs.purdue.edu/research/PaCS/polyeval.pdf.
[34] R. Karp, C. Schindelhauer, S. Shenker, and B. Vocking. Randomized rumor spreading. In Proceedings of
the 41st Annual Symposium on Foundations of Computer Science, FOCS ’00, pages 565–, Washington, DC,
USA, 2000. IEEE Computer Society.
[35] Jonathan Katz and Chiu-Yuen Koo. On expected constant-round protocols for Byzantine agreement. In
Advances in Cryptology - CRYPTO 2006, volume 4117 of Lecture Notes in Computer Science, pages 445–462.
Springer Berlin Heidelberg, 2006.
[36] Valerie King and Jared Saia. Breaking the o(n 2 ) bit barrier: Scalable byzantine agreement with an adap-
tive adversary. In Proceedings of the 29th ACM SIGACT-SIGOPS Symposium on Principles of Distributed
Computing, PODC ’10, pages 420–429, New York, NY, USA, 2010. ACM.
[37] Valerie King, Jared Saia, Vishal Sanwalani, and Erik Vee. Scalable leader election. In Proceedings of the
Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithm, SODA ’06, pages 990–999, Philadelphia,
PA, USA, 2006.
[38] Valerie King, Jared Saia, Vishal Sanwalani, and Erik Vee. Towards secure and scalable computation in
peer-to-peer networks. In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer
Science, FOCS ’06, pages 87–98, Washington, DC, USA, 2006. IEEE Computer Society.
[39] Eleftherios Kokoris-Kogias, Philipp Jovanovic, Nicolas Gailly, Ismail Khoffi, Linus Gasser, and Bryan
Ford. Enhancing bitcoin security and performance with strong consistency via collective signing. In
25th USENIX Security Symposium, USENIX Security ’16, pages 279–296, 2016.
[40] Eleftherios Kokoris-Kogias, Philipp Jovanovic, Linus Gasser, Nicolas Gailly, Ewa Syta, and Bryan Ford.
OmniLedger: A secure, scale-out, decentralized ledger via sharding. In 2018 IEEE Symposium on Security
and Privacy (S&P), pages 19–34, 2018.
[41] Hugo Krawczyk. Distributed fingerprints and secure information dispersal. In Proceedings of the Twelfth
Annual ACM Symposium on Principles of Distributed Computing, PODC ’93, pages 207–218, New York, NY,
USA, 1993. ACM.
[42] Leslie Lamport. The part-time parliament. ACM Trans. Comput. Syst., 16(2):133–169, May 1998.
[43] Derek Leung, Adam Suhl, Yossi Gilad, and Nickolai Zeldovich. Vault: Fast bootstrapping for cryptocur-
rencies. Cryptology ePrint Archive, Report 2018/269, 2018. https://eprint.iacr.org/2018/269.
[44] Eric Limer. The world’s most powerful computer network is be-
ing wasted on Bitcoin. May 2013. Available at http://gizmodo.com/
the-worlds-most-powerful-computer-network-is-being-was-504503726.
[45] Loi Luu, Viswesh Narayanan, Chaodong Zheng, Kunal Baweja, Seth Gilbert, and Prateek Saxena. A secure
sharding protocol for open blockchains. In Proceedings of the 2016 ACM SIGSAC Conference on Computer
and Communications Security, CCS ’16, pages 17–30, New York, NY, USA, 2016. ACM.
[46] Petar Maymounkov and David Mazières. Kademlia: A peer-to-peer information system based on the xor
metric. In Revised Papers from the First International Workshop on Peer-to-Peer Systems, IPTPS ’01, pages
53–65, London, UK, UK, 2002. Springer-Verlag.
[47] Ralph C. Merkle. A digital signature based on a conventional encryption function. In A Conference on
the Theory and Applications of Cryptographic Techniques on Advances in Cryptology, CRYPTO ’87, pages
369–378, London, UK, UK, 1988. Springer-Verlag.
[48] Silvio Micali. ALGORAND: the efficient and democratic ledger. CoRR, abs/1607.01341, 2016.
[49] Silvio Micali, Salil Vadhan, and Michael Rabin. Verifiable random functions. In Proceedings of the 40th
Annual Symposium on Foundations of Computer Science, FOCS ’99, pages 120–, Washington, DC, USA,
1999. IEEE Computer Society.
[50] Andrew Miller, Yu Xia, Kyle Croman, Elaine Shi, and Dawn Song. The honey badger of bft protocols.
In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, CCS ’16,
pages 31–42, New York, NY, USA, 2016. ACM.
[51] Michael Mitzenmacher and Eli Upfal. Probability and Computing: Randomized Algorithms and Probabilistic
Analysis. Cambridge University Press, 2005.
[52] Satoshi Nakamoto. Bitcoin: A peer-to-peer electronic cash system, 2008. Available at https://bitcoin.
org/bitcoin.pdf.
[53] Rafail Ostrovsky, Sridhar Rajagopalan, and Umesh Vazirani. Simple and efficient leader election in the
full information model. 1994.
[54] Rafael Pass and Elaine Shi. Hybrid consensus: Efficient consensus in the permissionless model. Cryptology
ePrint Archive, Report 2016/917, 2016. http://eprint.iacr.org/2016/917.
[55] M. Pease, R. Shostak, and L. Lamport. Reaching agreements in the presence of faults. Journal of the ACM,
27(2):228–234, April 1980.
[56] Michael O. Rabin. Efficient dispersal of information for security, load balancing, and fault tolerance. J.
ACM, 36(2):335–348, April 1989.
[57] Irving Reed and Gustave Solomon. Polynomial codes over certain finite fields. Journal of the Society for
Industrial and Applied Mathematics (SIAM), pages 300–304, 1960.
[58] Ling Ren, Kartik Nayak, Ittai Abraham, and Srinivas Devadas. Practical synchronous byzantine consensus.
CoRR, abs/1704.02397, 2017.
[59] Alexander Russell and David Zuckerman. Perfect information leader election in log∗ N + o(1) rounds.
In Proceedings of the 39th Annual Symposium on Foundations of Computer Science, FOCS ’98, pages 576–,
Washington, DC, USA, 1998. IEEE Computer Society.
[60] S. Sen and M.J. Freedman. Commensal cuckoo: secure group partitioning for large-scale services. ACM
SIGOPS Operating Systems Review, 46(1):33–39, 2012.
[61] Alex Tapscott and Don Tapscott. How blockchain is changing finance. Harvard Business Review, March
2017. Available at https://hbr.org/2017/03/how-blockchain-is-changing-finance.
[62] The Zilliqa Team. The zilliqa technical whitepaper. https://docs.zilliqa.com/whitepaper.pdf, Au-
gust 2017.
[63] Hsiao-Wei Wang. Ethereum sharding: Overview and finality. https://medium.com/@icebearhww/
ethereum-sharding-and-finality-65248951f649, 2017.