Introduction To Quantum Cryptography
Cryptography in Action
This syllabus contains the papers discussed at a small symposium called Cryptography in Action,
held on January 30, 2005, at Utrecht University. Presentation at the symposium and contribution
to this syllabus were obligatory requirements in the course Cryptography of the Master of
Algorithmic Systems.
Sixteen presentations were given at the symposium, arranged in three tracks as indicated in
the schedule:
The track on Foundations was chaired by Erik Jan van Leeuwen and contained the presentations
of Johnson Leow, Johan van Rooij, Petra Rutgers, Marco Streng, and Eric Ho (Chapters 1
through 5). The track on Designs was chaired by Thomas Wolle and contained the presentations
of Maarten van Steenbergen, Michael Beye, Jules van Kempen, Teun Slijkerman, and Wouter
Slob (Chapters 6 through 10). The track on Systems was chaired by Marjan van den Akker
and contained the presentations of Jan-Pieter van den Heuvel, Meindert Kamphuis, Nelleke
Steendam, Jeffrey van Helden, Nico Munting, and Gerben Oostra (Chapters 11 through 16).
At http://www.cs.uu.nl/~gerard/FotoAlbum/F2005/Uithof1 a few photos of the symposium
can be seen.
This photo shows most of the speakers. From left to right, they are Jeffrey van Helden, Nico
Munting, Jan-Pieter van den Heuvel, Wouter Slob, Gerard Tel, Jules van Kempen, Nelleke Steendam,
Eric Ho, Petra Rutgers, Teun Slijkerman, Michael Beye, Marco Streng, Gerben Oostra, Johan
van Rooij, and Maarten van Steenbergen.
I hope that the reader will get an impression of what we did in the course and in the symposium.
Preface
Contents

1 Introduction to Quantum Cryptography (Johnson Leow)
  1.1 Principles of Quantum Mechanics
  1.2 The BB84 Protocol
  1.3 The B92 Protocol
  1.4 Other Protocols
  1.5 Eavesdropping tactics
  1.6 Technological Challenges of Quantum Cryptography
  Summary and Conclusions
2 Cryptography Using Bilinear Maps (Johan van Rooij)
  2.1 Non-Degenerate Bilinear Maps
  2.2 Applications in cryptography
  2.3 Concrete Pairings
  2.4 Pairings and the Elliptic Curve Diffie-Hellman problems
  2.5 Short Signatures
3 Provable Security in practical situations (Petra Rutgers)
  3.1 Provable security and practical security
  3.2 Needed basics from class
  3.3 Security proofs
  Summary and Conclusions
4 Factorization (Marco Streng)
  4.1 Factorization
  4.2 The Quadratic Sieve
  4.3 Overview
  4.4 The Number Field Sieve
  Summary and Conclusions
5 How Improving makes it Worse (Eric Ka Yan Ho)
  5.1 Introduction
  5.2 Paradoxical Improvements based on RSA
  5.3 Methodologies towards Secure System Design
  Summary and Conclusions
One of the most important problems of cryptography is the key distribution problem: how do
sender and recipient agree upon a secret key while being sure that a third party cannot acquire
any information about it? Although public key cryptography provides algorithms like RSA or
ElGamal to solve this problem, there is still no mathematical proof of the non-existence of
a polynomial-time algorithm for cracking the keys of those algorithms. Another point of concern
is that public key cryptography will become obsolete once it is possible to build a quantum
computer, because there already exist quantum algorithms which perform factorization and discrete
logarithm calculations in polynomial time. However, the key distribution problem is provably
secure if quantum communication is used. The procedure that is used to agree upon a secure key
with the help of quantum communication is called "quantum key distribution", and the field is
generally called "quantum cryptography". One of the most important features of quantum cryptography
is that a third party cannot acquire any reliable information by eavesdropping on the quantum
channel, and that any attempt to do so will be detectable. It is this last feature that makes
quantum cryptography very special, because none of the protocols of classical cryptography has
this property.
The origins of quantum cryptography date back to the work of Stephen Wiesner in the seventies,
who proposed to use single-quantum states as counterfeit-proof money. Wiesner published
his ideas in 1983, and Bennett and Brassard used them in 1984 to create the first quantum
cryptographic protocol, now known as the BB84 protocol. However, it wasn't until 1989
that it could be implemented in a laboratory environment, where polarized photons were used to
create a secure communication channel of 32 cm.
An advance was made at the beginning of the nineties when Ekert proposed a new protocol
based on the Einstein-Podolsky-Rosen (EPR) paradox [Eke91]. Around the same period, in 1992,
Bennett published a protocol called the B92 protocol, which could be implemented with single
photon interference [Ben92].
Although the technological difficulties are still a big issue in the present day, there has
been rapid development in the areas of quantum computing and quantum cryptography towards
solving them.
In this article I will first explain some principles of quantum mechanics and from there
continue with the quantum cryptographic protocols BB84 and B92 and the basis of EPR-based
protocols. Finally, I will conclude this paper with a few comments on eavesdropping strategies
and address some technological issues which still need to be solved before quantum cryptography
can be implemented in practice.
The quantum states are elements of a Hilbert space H and are generally called kets. The kets
are denoted by |label⟩, where the label stands for the name we want to give the state. The
Hilbert space is defined as a vector space over the complex numbers C with an inner product

⟨ , ⟩ : H × H → C

(see [Zei99] for more on Hilbert spaces). A qubit is just a ket in a two-dimensional Hilbert
space. For example, if |0⟩ and |1⟩ denote an arbitrary orthonormal basis of a two-dimensional
Hilbert space H, then each qubit can be written as

|qubit⟩ = α|0⟩ + β|1⟩ with α, β ∈ C

Because any scalar multiple of a ket represents the same state of a quantum system, we can
assume that |qubit⟩ is normalized to unit length. In other words, we can restrict ourselves to

|α|² + |β|² = 1.

Given a Hilbert space H, we can define the space H∗ as H∗ = Hom(H, C). H∗ is a Hilbert
space, called the dual space of H, and denotes the set of all linear maps from H to C. The
elements of H∗ are called bras and they are denoted by ⟨label|. We can now define a bilinear
map H∗ × H → C by (⟨ψ|)(|φ⟩) = ⟨ψ|φ⟩ ∈ C, where we call the last expression a bracket.
Furthermore, if we have an orthonormal basis |φi⟩ of a Hilbert space H, then ⟨φi| is an orthonormal
basis of the dual space H∗. The two bases are related via ⟨φi| = (|φi⟩)†. An observable is
represented by a Hermitian operator A on H, that is,

A† = A with A† := ĀT
The eigenkets |φi⟩ and eigenvalues λi of an observable A satisfy

A|φi⟩ = λi|φi⟩

In the cases I consider here, the eigenkets form a complete orthonormal basis of the Hilbert space
H, and therefore every state |ψ⟩ ∈ H can be written as a linear combination of those eigenkets:

|ψ⟩ = Σi ai|φi⟩ with Σi |ai|² = 1

The completeness of the eigenbasis of A is expressed by

Σi |φi⟩⟨φi| = 1

Therefore, we have

A = Σi λi|φi⟩⟨φi|
For two observables A and B we define the commutator

[A, B] = AB − BA

The two observables A and B are called compatible if they commute, [A, B] = 0, and
incompatible if they do not. Finally, let △A = A − ⟨A⟩; then Heisenberg's Uncertainty
Principle states that

⟨(△A)²⟩⟨(△B)²⟩ ≥ (1/4)|⟨[A, B]⟩|² with ⟨(△A)²⟩ = ⟨ψ|(△A)²|ψ⟩

This means that if A and B are incompatible, then we cannot measure both A and B with
unlimited precision. This property of quantum mechanics will be used a lot in the examples later
on.
We consider the example of the polarization states of a photon in a two-dimensional Hilbert space
H. We take (for example) an orthonormal basis consisting of the ket states |↕⟩ and |↔⟩, and we
take a second orthonormal basis consisting of the states |ր⟩ and |ց⟩. The two bases are related
via:

|ր⟩ = (1/√2)(|↕⟩ + |↔⟩) and |ց⟩ = (1/√2)(|↕⟩ − |↔⟩)
and suppose we have a normalized state |ψ⟩ = |↕⟩. According to quantum measurement
theory, the measurement of observable A of the state |ψ⟩ will return the eigenvalue λ↕ with
probability

|⟨↕|ψ⟩|² = |⟨↕|↕⟩|² = 1

and the eigenvalue λ↔ with probability

|⟨↔|ψ⟩|² = |⟨↔|↕⟩|² = 0

because of the orthonormality of the states. Suppose now that we want to measure observable
B of the state |ψ⟩. Because we know that

|↕⟩ = (1/√2)(|ր⟩ + |ց⟩)

the measurement of B will return the eigenvalue λր with probability

|⟨ր|ψ⟩|² = |⟨ր|(1/√2)(|ր⟩ + |ց⟩)|² = 1/2

The second term is zero because of the orthonormality of the states. By the same argument,
the measurement will return λց with probability 1/2. If we denote by |ψ1⟩ the resulting
state after the measurement, then we see that |ψ1⟩ is either |ր⟩ or |ց⟩, which we (ambiguously)
write as

|ψ1⟩ = (1/√2)(|↕⟩ ± |↔⟩)

If we measure observable A of the state |ψ1⟩, then we will get λ↕ or λ↔ with probability 1/2 for
each. This illustrates Heisenberg's Uncertainty Principle, and from this example we can see that
measurements of observables can alter a state, which affects future measurements. This principle
will be used in the next section to detect unwelcome eavesdroppers.
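These probabilities can be checked numerically by representing the kets as real 2-vectors and applying the Born rule. The following sketch is my own illustration (the ASCII names stand in for the arrow kets of the text):

```python
from math import isclose, sqrt

# Rectilinear basis: |v> = (1, 0) for the vertical state, |h> = (0, 1).
KET_V = (1.0, 0.0)
KET_H = (0.0, 1.0)

# Diagonal basis, related to the rectilinear one as in the text.
KET_NE = (1 / sqrt(2), 1 / sqrt(2))    # (|v> + |h>)/sqrt(2)
KET_SE = (1 / sqrt(2), -1 / sqrt(2))   # (|v> - |h>)/sqrt(2)

def prob(basis_ket, state):
    """Born rule: |<basis|state>|^2 for real amplitudes."""
    inner = basis_ket[0] * state[0] + basis_ket[1] * state[1]
    return inner * inner

psi = KET_V
assert isclose(prob(KET_V, psi), 1.0)    # measuring A returns its eigenvalue surely
assert isclose(prob(KET_H, psi), 0.0)
assert isclose(prob(KET_NE, psi), 0.5)   # measuring B: 1/2 each
assert isclose(prob(KET_SE, psi), 0.5)

# After the B measurement the state has collapsed to, say, the NE ket;
# a subsequent A measurement is then 50/50 again.
psi1 = KET_NE
assert isclose(prob(KET_V, psi1), 0.5)
assert isclose(prob(KET_H, psi1), 0.5)
```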
Suppose that Oscar is an eavesdropping third party. We have seen that if the alphabets chosen
by Alice and Bob are the same, then Ai and Bi will also be the same. However, if
Oscar decides to eavesdrop on the quantum channel, then some Bi will differ from Ai. For example,
suppose that Alice is using the state |↕⟩ of her alphabet to send the one bit to Bob while Oscar
is performing measurements using the other alphabet. The resulting state after his measurement
will be either |ր⟩ or |ց⟩. In either case, if Bob performs a measurement using the same alphabet
Alice used, he will find the state |↔⟩ with probability 1/2. This means that Bi = 0 while Ai = 1.
From this example we see that Oscar's eavesdropping will generate errors in the key of Alice and Bob.
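Averaged over random basis choices, such an intercept-resend attack induces an error rate of about 25% on the sifted key. A small simulation of my own (assuming Oscar measures every photon in a random basis and resends what he saw) makes this visible:

```python
import random

random.seed(7)  # reproducible run

def measure(bit, prep_basis, meas_basis):
    """Matching bases reveal the encoded bit; mismatched bases give a coin flip."""
    return bit if prep_basis == meas_basis else random.randint(0, 1)

n = 20000
errors = sifted = 0
for _ in range(n):
    a_bit, a_basis = random.randint(0, 1), random.randint(0, 1)
    # Oscar intercepts in a random basis and resends the state he observed.
    o_basis = random.randint(0, 1)
    o_bit = measure(a_bit, a_basis, o_basis)
    b_basis = random.randint(0, 1)
    b_bit = measure(o_bit, o_basis, b_basis)
    if a_basis == b_basis:          # sifting: keep only matching bases
        sifted += 1
        errors += (a_bit != b_bit)

rate = errors / sifted
assert 0.22 < rate < 0.28           # close to the expected 0.25
```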
In the previous section we assumed that communication via the quantum channel is noise-free.
In practice, however, there will always be noise present, which influences the error rate of the
sequences A and B. We therefore cannot tell whether the errors are caused by noise or by
eavesdropping of a third party Oscar. The solution to this problem is to assume that all
the errors are caused by Oscar; the final key will then only be partially secret. To agree
upon a secret key, Alice and Bob must delete all errors from their raw key. One way to do this is
by dividing the raw key into blocks of a certain length and then performing parity checks on them.
This is the process of error correction. If the parity check does not agree, they perform a
binary search for the error by bisecting the block into two subblocks and comparing the parities
of those subblocks. After comparing all the blocks, the step is repeated by performing a random
common permutation on the remaining raw key, dividing it into blocks and comparing parities.
Afterwards, Alice and Bob publicly select a random subset from their remaining raw key
to compare parities, and employ the binary search strategy if an error is detected. If no error has
been found, Alice and Bob can declare their remaining key a reconciled key. Finally,
Alice and Bob select n random subsets of their reconciled key without revealing their contents.
Their final secret key is then defined as the parities of those subsets. Because Oscar is assumed
to have caused all the errors, the number n depends on the error rate of the original raw key.
This last step is called privacy amplification.
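The bisective parity search used in reconciliation can be sketched as follows. This is my own illustration; it locates a single flipped bit, and in general a block's parity check only fails when the block contains an odd number of errors:

```python
def parity(bits):
    return sum(bits) % 2

def locate_error(alice, bob, lo=0, hi=None):
    """Binary search for the index where alice and bob disagree,
    comparing only block parities, as in the reconciliation step."""
    if hi is None:
        hi = len(alice)
    if hi - lo == 1:
        return lo
    mid = (lo + hi) // 2
    if parity(alice[lo:mid]) != parity(bob[lo:mid]):
        return locate_error(alice, bob, lo, mid)
    return locate_error(alice, bob, mid, hi)

alice = [1, 0, 1, 1, 0, 0, 1, 0]
bob = alice.copy()
bob[5] ^= 1                           # one transmission error in the block
assert parity(alice) != parity(bob)   # the block parity check fails ...
assert locate_error(alice, bob) == 5  # ... and bisection finds the error
```

Note that every parity exchanged this way leaks one bit of information to Oscar, which is one reason the privacy amplification step is needed afterwards.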
In this section we use the same relations between the bases as in the previous sections.
Furthermore, we define the two projection operators

P|↔⟩ = |↔⟩⟨↔| and P|ց⟩ = |ց⟩⟨ց|

We see that

⟨↕|P|↔⟩|↕⟩ = ⟨ր|P|ց⟩|ր⟩ = 0 and ⟨ր|P|↔⟩|ր⟩ = ⟨↕|P|ց⟩|↕⟩ = 1/2
Now suppose that Alice and Bob want to establish a secret key. They can start communicating
over the quantum channel using the B92 protocol as follows.
1. Alice and Bob generate random bit sequences A and B which will be used as part of the
secret key, as in the BB84 protocol.
2. Alice sends her bit sequence A using the quantum alphabet, and Bob measures the received
states according to his bit sequence B. If the bit Bi of B is 0, then Bob makes the
measurement with P|↔⟩, and if Bi is 1 he makes the measurement with P|ց⟩.
3. If Bob measures the states Alice has sent, he will find one of the following results: the state
|↕⟩, which represents the one bit; the state |ր⟩, which represents the zero bit; or a state
which isn't defined in our alphabet, like |↔⟩ or |ց⟩. In the last case we say that the
measurement failed and produced an erasure. We see that if Bi is different from Ai, then
Bob's measurement will fail, because Bob's measurement operator is orthogonal to the
state Alice sent. If Bi and Ai have the same bit value, then there is a probability of 1/2
that a |↕⟩ or a |ր⟩ comes out of the measurement. So we see that only one out of four
measurements will give a result; in all other cases the measurement gives an erasure.
The erasure rate is therefore 75%.
4. Bob tells Alice which of his measurements succeeded, and Alice keeps only the bits from A
for which Bob's measurement was a success. Alice and Bob now obtain a raw key.
5. Finally, Alice and Bob compare small portions of their raw key, as in the BB84 protocol,
to estimate the error rate. If they discover any errors, they know that a third party
has been eavesdropping on them.
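The sifting statistics of steps 2 and 3 can be reproduced with a small simulation. This is my own sketch; it models only the Born-rule click probabilities computed at the start of this section, with the convention from step 3 that a conclusive result implies Bi = Ai:

```python
import random

random.seed(42)

# Click probabilities <state| P |state> from the text:
# Alice sends the vertical ket for bit 1 and the NE-diagonal ket for bit 0;
# Bob applies P_horizontal when B_i = 0 and P_SE when B_i = 1.
CLICK = {  # (alice_bit, bob_bit) -> probability of a conclusive result
    (1, 0): 0.0,   # orthogonal projector: always an erasure
    (0, 1): 0.0,   # orthogonal projector: always an erasure
    (0, 0): 0.5,   # conclusive with probability 1/2
    (1, 1): 0.5,   # conclusive with probability 1/2
}

n = 40000
clicks = 0
for _ in range(n):
    a, b = random.randint(0, 1), random.randint(0, 1)
    if random.random() < CLICK[(a, b)]:
        clicks += 1
        assert a == b   # a conclusive result implies B_i = A_i

assert 0.22 < clicks / n < 0.28   # only one in four measurements succeeds
```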
We have seen that if Bob's measurement succeeds, in other words, if the result of the measurement
is a state in our alphabet, then Bi and Ai are the same. However, if Oscar decides to
eavesdrop by making a measurement with, for example, the P|ր⟩ operator, then some successful
measurements will have a Bi that differs from Ai. We can see this with the following example:
if Alice sends a one-bit state |↕⟩ to Bob while Oscar is in the middle, then Oscar will alter the
state to a |ր⟩ or a |ց⟩. This means that if Bob measures the resulting state, he can get the
zero-bit state |ր⟩ with probability ⟨ր|P|↔⟩|ր⟩ = 1/2, instead of the usual erasure result he
would get when Oscar wasn't present.
This variation of the BB84 protocol uses the so-called Einstein-Podolsky-Rosen (EPR)
paradox, which Einstein, Podolsky and Rosen published in 1935 to challenge the foundations of
quantum mechanics. The idea is due to Artur Ekert, who published it in 1991 [Eke91]. The
proposed scheme uses two qubits instead of one. The two qubits come from a common
source, where one qubit goes to Alice and one to Bob. The qubits are so-called EPR pairs,
whose states are correlated in such a way that the measurement of a chosen observable A of
one of the states automatically determines the result of the measurement of A of the other state.
This means that if Alice and Bob measure their own qubit in the same basis, they will get
correlated results. Furthermore, they know what the correlation is, so they are able to agree
upon a common key in this way. Because the EPR paradox reaches deep into the foundations of
quantum physics, I will not go any further into the explanation of EPR-based quantum
cryptography.
There is a large collection of variations on the BB84 protocol. For example, one can assume that
the two bases are not chosen with equal probability. This leads to a bigger probability of
Alice and Bob choosing the same basis, but it also makes it easier for Oscar to correctly guess the
basis used. Therefore, it is still not clear whether the net result is better or not. There are many
other variations which I will not mention here. The net result of some of those is still questionable,
but there are other variations which make practical implementation easier.
For all these eavesdropping schemes, there exist algorithms to detect intrusion. However, what
we really want is an algorithm that can handle large classes of eavesdropping schemes instead of
one specific detection algorithm for one specific eavesdropping scheme. Therefore, the analysis
of eavesdropping schemes is still far from finished. (See [AEP94] for more info on this topic)
the pulse contains one photon. There are other ways to produce photons, but they still can’t
overcome the fact that the photon creation process is inefficient and that the devices are still too
complicated to use for practical situations.
After a single photon is created, the photon is sent through a quantum channel. The quantum
channel can be an optical fiber or just free space. Photon propagation using optical fibers
would be the most logical option to consider, since fibers are widely used in telecommunications
and are of high quality. The wavelength used for photons in optical fibers is near 1300 or 1500
nm. For these wavelengths the attenuation of optical fibers is of the order of 0.35 and 0.20
dB/km respectively, which means that half of the photons are lost after about 9 and 15 km.
However, one major drawback of using optical fibers is that there still aren't any good detectors
for photons above a 1000 nm wavelength. The development of these detectors is still
going on, and I will come back to this later. Although there aren't any good detectors for photon
wavelengths above 1000 nm, there are efficient photon detectors for wavelengths around 800 nm,
which are also commercially available. However, if the 800 nm wavelength were used, the
currently available optical fibers couldn't serve as a quantum channel because of the wavelength
incompatibility. In that case, photon propagation requires free space transmission or the
use of special fibers. Unfortunately, the quality of special fibers isn't as high as that of standard
optical fibers. Besides the qualitative inferiority, special fibers also have a practical disadvantage:
optical fiber networks already exist, while special fibers are not commonly used.
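The quoted half-loss distances follow directly from the attenuation figures: half the photons survive at the distance where the accumulated loss reaches 10·log10(2) ≈ 3.01 dB. A quick check:

```python
from math import log10

def half_loss_km(attenuation_db_per_km):
    """Distance at which the power is down 10*log10(2) ~ 3.01 dB, i.e. halved."""
    return 10 * log10(2) / attenuation_db_per_km

assert round(half_loss_km(0.35), 1) == 8.6    # "about 9 km" near 1300 nm
assert round(half_loss_km(0.20), 1) == 15.1   # "about 15 km" near 1500 nm
```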
Besides optical fibers, we can also consider free space transmission, which has some advantages
compared to the use of optical fibers. The atmosphere has a high transmission window at a
wavelength of around 770 nm, where photons can be detected using commercial, high-efficiency
counting modules. Furthermore, the polarization state of a photon will not be altered by the
atmosphere at these wavelengths. However, free space transmission has a few disadvantages as
well. In contrast to optical fibers, the energy transmitted in free space will spread out, leading
to varying transmission losses. Furthermore, ambient light can enter the receiver, leading to a
higher error rate. Fortunately, these errors can be reduced by using spectral filters and timing
discrimination. Finally, free space transmission depends on atmospheric conditions and is only
possible in clear weather. Therefore, free space channels have to be improved before they can
be used in practice.
Although the safest bet is to use optical fibers as our photon carriers, we have seen that
a new detector has to be developed for this. The detection of photons can in principle be
achieved using techniques like photomultipliers, avalanche photodiodes, multichannel plates or
superconducting Josephson junctions. The ideal detector should fulfill the following requirements:
1. the quantum efficiency, that is the probability that an arriving photon is registered, should
be high.
2. the noise rate, that is signal creation without a photon arriving, should be small.
3. the time jitter, that is the time variation between photon detection and the electric signal
generation, should be small.
4. the recovery time should be small to make high data rates possible.
Unfortunately, it is impossible to meet all the mentioned criteria. The best choice currently
is silicon avalanche photodiodes. For photons below 1100 nm, there are commercial detectors
which have quantum efficiencies of 70% at a wavelength of 700 nm, a time jitter of around 300
psec, a maximum count rate of more than 5 MHz and a noise rate of 50 Hz at temperatures of
−20 °C. For photons above 1100 nm, the only available avalanche photodiodes are made
from germanium or InGaAs. These detectors have a relatively poor performance compared
to photodiodes made of silicon. Unfortunately, no industrial effort has been made to optimize
photodiodes for wavelengths above 1100 nm. However, there is no physical reason why
photodiodes working at wavelengths above 1100 nm should be more delicate than those below. The
practical reason for the lack of commercial products is that the bandgap of silicon is too large
for it to be sensitive at these wavelengths, so other materials are needed, and the market for
photon counting is not yet mature. But if these problems are solved, quantum cryptography
will make a great leap forward.
This chapter is about the use of bilinear maps in cryptography. Since bilinear maps that
fulfill the security requirements are not very easy to comprehend, I will focus on the consequences
and applications of these maps. In section 2.1 I introduce the concept of a bilinear map, and
the variant we use in cryptography: a pairing. Next I demonstrate two applications of these
maps in 2.2, namely identity based signatures and a tripartite Diffie-Hellman protocol. In order
to introduce two concrete pairings in section 2.3.3, we first need the mathematics
of finite fields and elliptic curves. These are all discussed in section 2.3. Next I show that
these pairings have a great impact on the complexity of the Diffie-Hellman problems applied to
two different groups (2.4). Finally I present a signature scheme for short signatures and prove
it to be secure in 2.5.
Definition 2.1 For additively written groups G1, G2 and a multiplicatively written group GT, a
map f : G1 × G2 → GT is bilinear if for all x, x′ ∈ G1 and y, y′ ∈ G2:

f(x + x′, y) = f(x, y) · f(x′, y) and f(x, y + y′) = f(x, y) · f(x, y′)

It is important to remind yourself that in the above definition G1 and G2 are written additively
and GT multiplicatively; these notations are purely conventional and independent of the group
structure.
In cryptography we use non-degenerate bilinear maps, which we call pairings. A map
f : G1 × G2 → GT is a pairing if:
1. f is bilinear.
2. f is non-degenerate: for every x ∈ G1 with x ≠ 0 there is a y ∈ G2 such that f(x, y) ≠ 1.
For a pairing to be of any cryptographic use we also need it to be efficiently computable, while
for any random x ∈ G1 or y ∈ G2, and a z ∈ GT, it should be infeasible to solve f(x, y) = z.
This all seems pretty straightforward, but there are no 'easy' cryptographic pairings.
We know bilinear maps from linear algebra, but applied to Zp (p prime) they do not give us
an appropriate pairing. For example, in:

f : Zp × Zp → Zp : (x, y) → cxy

we use the fact that addition in Zp equals addition in the field Fp (see: 2.3.1), and therefore we
can use the multiplication in Fp to define the function. But since we can also divide by anything
except zero in Fp, it does not give us a useful pairing. It is bilinear and non-degenerate, but also
easy to invert and therefore useless to us.
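The weakness of this toy map is easy to demonstrate concretely. A sketch of my own (the modulus and constant are arbitrary; note the target group Zp is written additively here):

```python
# Toy "pairing" f(x, y) = c*x*y over Z_p: bilinear and non-degenerate,
# but trivially invertible, hence cryptographically useless.
p, c = 101, 7

def f(x, y):
    return (c * x * y) % p

# Bilinearity (additive target group):
x1, x2, y = 13, 58, 44
assert f((x1 + x2) % p, y) == (f(x1, y) + f(x2, y)) % p

# Easy inversion: given z = f(x, y) and y, anyone recovers x by dividing,
# because every nonzero element of F_p is invertible.
z = f(29, y)
x = z * pow(c * y, -1, p) % p   # pow(a, -1, p): modular inverse (Python 3.8+)
assert x == 29
```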
The only suitable pairings found so far are derived from the Weil and Tate pairings, which are
defined on subgroups of elliptic curves over finite fields and take values in finite fields. These
are quite complex mathematical topics, which I will only discuss superficially. Before going
into the mathematical details I will first introduce some applications.
Traditional cryptographic signature schemes use a public-private key pair, where the binding
between the public key and the signer forms a new problem: how do we prevent an attacker from
supplying the verifier with a false public key? In practice this is done using certificates from a
Trust Authority. This is not only inefficient, but also still allows creation of false certificates if an
attacker can supply the verifier with a false public key of the Trust Authority. An algorithm
in which the public key can be derived from the signer's identity would be useful indeed. More
accurately, we need a signature scheme in which the public key of a signer can easily be
derived, using a public deterministic algorithm, from a string representing the signer's identity.
The first signature scheme that implements this and was proven secure was proposed
by Hess [Hes] (Algorithm 2.1). It works on groups G1, G2 and GT as in the previous section and
uses two hash functions H : {0, 1}∗ → G1\{0} and h : {0, 1}∗ × G2 → Z∗p for a large prime p.
In the scheme, the secret integer held by the Trust Authority is the relation between the
Trust Authority's two public values, and also the relation between a signer's public and private
key. This value cannot be computed from those values, since that would require solving a discrete
logarithm problem on an elliptic curve. But the value can be used by the Trust Authority to give
a signer his private key. During the evaluation of the bilinear map this value is not used, but the
relation it imposes is implicitly transferred from the first argument to the second argument of the
map. The rest of the scheme is pretty straightforward and nothing new compared to what we have
already seen during our course.
An attacker who provides the verifier with a false public Trust Authority key still cannot
sign a message, for the verifier still has the correct public key of the signer, and the attacker
cannot create a false private key such that a forged signature will be accepted due to the false
public Trust Authority key.
The Diffie-Hellman protocol is an efficient way to give two participants a common key. It has been
used as a building block for many complex cryptographic tools. A generalisation of this protocol
would enable us to construct many new and efficient protocols. The protocol I will introduce
here is due to Joux [Jou04].
When dealing with more than two participants we can construct a protocol based on the
usual Diffie-Hellman protocol, but it will require multiple communication rounds and could
require some participants to be on-line at the same time.
What we want is a protocol where three participants choose integers a, b, c from Z∗p
and publish A = aP, B = bP, C = cP for a generator P of G1. Next we need a map F
which allows each participant to calculate the key from the published values and his own chosen
integer: K = F(a, B, C) = F(A, b, C) = F(A, B, c).
A straightforward approach would be to use F(x, Y, Z) = e(Y, Z)^x, for this results in:

F(a, B, C) = e(bP, cP)^a
F(b, A, C) = e(aP, cP)^b = e(P, P)^(abc)
F(c, A, B) = e(aP, bP)^c
The only problem is that neither the Weil nor the Tate pairing can be used directly in this
context. The Weil pairing has the property that when G1 = G2, e(P, P) = 1 for arbitrary
P ∈ G1, which results in all keys being equal to 1 (See: 2.12). Even worse: any linearly
dependent P and λP give e(P, λP) = e(P, P)^λ = 1. Therefore we use two different subgroups
of a larger group, with linearly independent generators P and Q. With PA = aP, QA = aQ,
etc., we now define F(x, Y, Z) = e(Y, Z)^x and get:

F(a, PB, QC) = e(bP, cQ)^a    F(a, PC, QB) = e(cP, bQ)^a
F(b, PA, QC) = e(aP, cQ)^b = e(P, Q)^(abc) = F(b, PC, QA) = e(cP, aQ)^b
F(c, PA, QB) = e(aP, bQ)^c    F(c, PB, QA) = e(bP, aQ)^c

The Tate pairing is defined somewhat differently, but also requires two elements in order to
avoid possible divisions by zero. I will not go into these mathematical details.
What should be clear is that we can create new protocols for obtaining a common key by
exploiting the properties of a pairing.
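The algebra of this one-round key agreement can be mimicked with a stand-in "pairing" e(x, y) = g^(xy) in Z∗q. This is my own toy model: it is trivially invertible and therefore insecure, but it is bilinear in exactly the way the protocol needs:

```python
# Toy tripartite Diffie-Hellman in the style of Joux's protocol.
# Stand-in pairing: represent xP and yQ simply by the integers x and y,
# and define e(x, y) = g^(x*y) mod q.
q = 2**13 - 1        # 8191, a Mersenne prime (arbitrary choice)
g = 3

def e(x, y):
    return pow(g, x * y, q)

def F(priv, Y, Z):
    """F(x, Y, Z) = e(Y, Z)^x, as in the text."""
    return pow(e(Y, Z), priv, q)

a, b, c = 1234, 271, 3141   # the three participants' secrets
# Each participant combines the other two published values with his secret:
k_a = F(a, b, c)
k_b = F(b, a, c)
k_c = F(c, a, b)
assert k_a == k_b == k_c == pow(g, a * b * c, q)   # one common key
```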
Definition 2.3 An Abelian group (G, +) is a set of elements G and a binary operation + :
G × G → G with the following properties:
1. + is associative: (a + b) + c = a + (b + c), and commutative: a + b = b + a.
2. there is an identity element 0 ∈ G with a + 0 = a for all a ∈ G.
3. every a ∈ G has an inverse −a ∈ G with a + (−a) = 0.

Definition 2.4 A field (F, +, ·) is a set of elements F with two binary operations defined on it,
·, + : F × F → F, with the following properties:
1. (F, +) is an Abelian group.
2. a · (b + c) = (a · b) + (a · c) and (a + b) · c = (a · c) + (b · c).
3. if 0 is the identity element of the Abelian group (F, +), then (F\{0}, ·) is an Abelian
group as well.
A field is nothing more than a set in which we can not only add, as we do in an additive group,
but multiply as well, without violating a few basic rules. A few examples of fields are the rational
and real numbers (Q and R). These, however, are of no use for cryptography, since we can only
store elements from a finite set. Therefore we use finite fields.

Fact 2.5 For a prime number p we can create the finite field Fp by combining the operations of
addition and multiplication modulo p of Zp and Z∗p.

This can be done since Z∗p (p prime) contains all the elements of Zp except 0. An alternative
notation for Fp is GF(p), where GF stands for Galois Field. These are not the only finite
fields; there is a generalisation of this concept, but we need some theory about polynomials first.
Definition 2.6 F[X] is the set of polynomials in X with coefficients from a field F.

Definition 2.7 A polynomial p ∈ F[X] for some field F is irreducible if there are no polynomials
f, g ∈ F[X], both of degree at least 1, such that p = fg.

Not every polynomial f ∈ Fp[X] has an r ∈ Fp such that f(r) = 0. Such an r is called
a root of f in Fp. Polynomials have the property that for any root r ∈ F of f ∈ F[X] we can
write f = g(X − r) for some other polynomial g ∈ F[X]. Therefore we know that an irreducible
polynomial f ∈ Fp[X] does not have any roots in Fp.
For example, X² + 1 ∈ F3[X] is irreducible and does not have any roots, while X² + 1 ∈ F2[X]
does have a root: in F2[X] we have (X + 1)(X + 1) = X² + 1, and 1² + 1 = 0.
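The root computations in this example are easy to check by brute force. A sketch of my own (note that having no roots implies irreducibility only for polynomials of degree 2 or 3):

```python
def roots(coeffs, p):
    """Roots in F_p of a polynomial given by coefficients [c0, c1, c2, ...]."""
    return [r for r in range(p)
            if sum(c * pow(r, i, p) for i, c in enumerate(coeffs)) % p == 0]

x2_plus_1 = [1, 0, 1]               # 1 + 0*X + 1*X^2
assert roots(x2_plus_1, 3) == []    # no roots in F_3, so irreducible there
assert roots(x2_plus_1, 2) == [1]   # 1^2 + 1 = 0 in F_2
```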
We can use this to extend a field F by adding the roots of a irreducible polynomial. This
is done by calculating in F [X] modulo our irreducible polynomial f . The arithmetic modulo a
polynomial is not very different from modulo a number: if the degree of the polynomial is greater
or equal to the degree of the modulus we use the remainder of division of the polynomial by the
modulus. This new field is denoted as F [X]/(f ).
Following the previous example of X^2 + 1 ∈ F3[X], we can form the field F3[X]/(X^2 + 1).
This field consists of the following nine elements: {0, 1, 2, X, X + 1, X + 2, 2X, 2X + 1, 2X + 2}, and
by calculating modulo f = X^2 + 1 we get our root X: f(X) = X^2 + 1 ≡ 0
(mod X^2 + 1).
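The arithmetic in this small field can be sketched directly in Python; the pair representation and the helper names `add` and `mul` below are my own illustrative choices, not from the text.

```python
# Arithmetic in F3[X]/(X^2 + 1): an element a + b*X is stored as the pair
# (a, b), coefficients are taken modulo p = 3, and X^2 reduces to -1.
p = 3

def add(u, v):
    return ((u[0] + v[0]) % p, (u[1] + v[1]) % p)

def mul(u, v):
    # (a + bX)(c + dX) = ac + (ad + bc)X + bd*X^2, with X^2 = -1 mod (X^2 + 1)
    a, b = u
    c, d = v
    return ((a * c - b * d) % p, (a * d + b * c) % p)

# X itself is the element (0, 1); squaring it gives -1 = 2 in F3,
# i.e. X really is a root of X^2 + 1 in the extension field.
X = (0, 1)
print(mul(X, X))  # (2, 0), the field element 2 = -1
```

Every nonzero element of the resulting nine-element field has a multiplicative inverse, which is exactly what makes it a field rather than just a ring.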
Fact 2.8 The number of elements of a finite field is always a power p^k of a prime number p.
We write F_{p^k} for such a field, and it can be constructed by choosing an irreducible polynomial f of
degree k over F_p and constructing F_{p^k} ≅ F_p[X]/(f).
One can see that the X in the example above behaves just like the i in the complex numbers. To
keep things insightful one can remember that, in any finite field constructed this way, the
polynomial variable plays the same role: an adjoined root of the chosen irreducible polynomial.
Elements of these fields can be stored as vectors of a k-dimensional vector space over F_p. When using
F_{2^k} we can represent a single element of this field by a row of k bits.
16 Cryptography Using Bilinear Maps
The dots correspond to the points on the elliptic curve y^2 = x^3 + x + 1 over F_23. We
demonstrate the addition (3,10) + (9,7): the line through both intersects the curve at (17,3),
and therefore the point opposite (17,3), namely (17,20), is the result.
The groups that the pairings work on are subgroups of elliptic curves. Therefore we will now introduce
them.
Definition 2.9 An elliptic curve E/F_p (p > 3) is the set of solutions (x, y) to an equation of
the form y^2 = x^3 + ax + b, where a, b ∈ F_p, together with a point at infinity (∞), on which we
define a group law in the following way:
• ∞ is the identity of the group: ∀(x, y) : (x, y) + ∞ = (x, y) = ∞ + (x, y), and ∞ + ∞ = ∞.
The group law in the definition appears counter-intuitive, but it has a geometric interpreta-
tion (see: casus 2.2). First of all observe that for each x there are at most two y such that (x, y) ∈ E.
For two points (x_1, y_1), (x_2, y_2) we draw a line through both and find a third point intersecting
the curve at (x', y'). Now (x_3, y_3) = (x_1, y_1) + (x_2, y_2) = (x', −y') (the point opposite the third
intersection point). If this third point does not exist (y_1 = −y_2), we take ∞ as the resulting point.
When doubling a point ((x, y) + (x, y)) the geometric interpretation is only visible when taking an
elliptic curve over a continuous field such as R or Q. Since there are no two points to draw a
line through, we take the limit of two points moving towards our single point, and obtain the line
tangent to the curve at (x, y). The above definition contains these geometric operations in the form
of formulae.
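The chord-and-tangent rule just described can be sketched in Python. The helper `ec_add` and the tuple representation are my own, but the numbers reproduce the example above: (3,10) + (9,7) = (17,20) on y^2 = x^3 + x + 1 over F_23.

```python
# Point addition on y^2 = x^3 + a*x + b over F_p, following the geometric
# rule: draw the chord (or tangent), take the third intersection, and flip it.
# None represents the point at infinity.
p, a, b = 23, 1, 1

def ec_add(P, Q):
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # opposite points: the line is vertical, result is infinity
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p  # tangent slope
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p          # chord slope
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p                         # flip the y sign
    return (x3, y3)

print(ec_add((3, 10), (9, 7)))  # (17, 20), as in the figure
```

The modular inverses via `pow(..., -1, p)` require Python 3.8 or later.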
Definition 2.10 For an elliptic curve E/Fp we define E(Fpk ) to be the set of points (x, y) on
the curve with x, y ∈ Fpk .
We often refer to E(F_{p^k}) as being the elliptic curve, since the only difference with E is that it
specifies which points are allowed.
In this section we will define the Weil and Tate pairings and the subgroups of elliptic curves
useful for constructing our required pairings. Since the exact formulas to compute these
pairings require a lot more mathematics, and showing how to compute them efficiently requires
even more math than I can tell in this chapter, I will keep things brief. For more information on
the Weil pairing and its computation see [Mil04], and for the Tate pairing see [PSB04].
Definition 2.11 We define G[p] to be the subgroup of G of elements whose order divides p.
Fact 2.12 Let p be a prime number, E/F_q an elliptic curve, and α an integer such that p | q^α − 1
and for all k = 1 . . . α − 1 : p ∤ q^k − 1. The Weil pairing is a map e_p : E[p] × E[p] → F^×_{q^α} with the
following properties:
• Identity: for all R ∈ E[p] : e_p(R, R) = 1.
• Bilinear: for all R_1, R_2 ∈ E[p] and a, b ∈ Z we have e_p(aR_1, bR_2) = e_p(R_1, R_2)^{ab}.
• Non-degenerate: if for an R ∈ E[p] we have e_p(R, R') = 1 for all R' ∈ E[p], then R = ∞.
So if P, Q are linearly independent: e_p(P, Q) ≠ 1.
• Computable: for all R_1, R_2 ∈ E[p] the pairing can be computed in polynomial time.
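A genuine Weil pairing is beyond a few lines of code, but bilinearity itself can be illustrated with a toy map of my own construction, not from the text: take G_1 = G_2 = Z_p written additively and e(a, b) = g^{ab} in F_q^×. Unlike the Weil pairing it does not satisfy the identity property (e(R, R) ≠ 1 in general), and it is cryptographically useless, because discrete logs in Z_p are trivial to read off.

```python
# Toy "pairing" e : Z_p x Z_p -> F_q^*, defined as e(a, b) = g^(a*b) mod q,
# where g has prime order p in F_q^*. It is bilinear, but insecure: the
# source groups are Z_p itself, where discrete logarithms are trivial.
q, p, g = 23, 11, 2      # 2 has order 11 modulo 23 (2^11 = 2048 = 89*23 + 1)

def e(a, b):
    return pow(g, (a * b) % p, q)

# Bilinearity: e(x*a, y*b) = e(a, b)^(x*y)
a, b, x, y = 3, 5, 4, 7
assert e(x * a, y * b) == pow(e(a, b), x * y, q)
```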
We now use a result from Balasubramanian and Koblitz in [BK98] to construct the required
groups.
Theorem 2.13 Let E/F_q be an elliptic curve and P ∈ E(F_q) a point of prime order p with
p ∤ q. Let α be such that p | q^α − 1 and for all k = 1 . . . α − 1 : p ∤ q^k − 1. If α > 1 then there exists
a Q ∈ E(F_{q^α}) of order p that is linearly independent of P.
Using this theorem we can create groups G_1 = <P> ⊆ E(F_q), G_2 = <Q> ⊆ E(F_{q^α}) and
G_T = F^×_{q^α} (F_{q^α} with multiplication as the group operation) on which the Weil pairing has the
required properties.
In practical situations we mostly use the Tate pairing, since it is much more efficiently com-
putable. The Tate pairing is also a non-degenerate bilinear map, but it is even more difficult to
explain the groups on which it works; therefore I will only use the Weil pairing here.
What is missing in the above discussion is a claim or proof about the pairings being non-
invertible. We will provide one while dealing with the Diffie-Hellman problems on elliptic curves
(Lemma 2.15).
Lemma 2.15 A pairing defined on two groups G_1, G_2 for which the co-Diffie-Hellman problem cannot
be solved efficiently, and for which there exists an efficiently computable isomorphism φ : G_2 → G_1,
cannot be inverted efficiently.
Algorithm 2.3 The Boneh-Lynn-Shacham (BLS) signature scheme.
Setup:
Pick a random private key x ∈ Z_p and compute the public key v := g_2^x.
Sign: Message M.
Evaluate the hash function h = H(M) ∈ G_1 and compute the signature σ = h^x.
Verify:
Evaluate the hash function h = H(M) and accept if the co-Diffie-Hellman relation
holds for h, σ from G_1 and g_2, v from G_2.
Proof. In order to prove that pre-images for the second argument cannot be computed, we need
our map φ. Let a be an integer such that g = φ(h)^a. We compute a new v := e(φ(h), h); now
e(g, h^b) = e(φ(h)^a, h^b) = v^{ab}. If we can solve for y in e(φ(h), y) = v^{ab}, then y = h^{ab}, from which
we can compute g^b = φ(h)^{ab} = φ(h^{ab}) = φ(y). This would solve the co-Diffie-Hellman problem
efficiently, contradicting our assumption. Thus the pairing cannot be efficiently inverted. △
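To see the mechanics of the scheme in Algorithm 2.3, here is a sketch instantiated with the same toy pairing e(a, b) = g^{ab} mod q introduced earlier; this is again my own toy construction, with G_1 = G_2 = Z_p (generator 1, so φ is the identity) and a stand-in "hash". With these groups anyone can read the private key off the public key, so the sketch shows only why the verification equation holds, not any security.

```python
# Algorithm 2.3 with the toy pairing: completely insecure, but the
# verification relation e(sigma, g_2) = e(h, v) is the real one.
q, p, g = 23, 11, 2

def e(a, b):
    return pow(g, (a * b) % p, q)

def H(msg):
    # Stand-in "hash" into G_1 = Z_p; NOT a real cryptographic hash.
    return sum(msg.encode()) % (p - 1) + 1

x = 7                      # private key
v = (x * 1) % p            # public key: x times the generator of G_2
h = H("hello")             # hash of the message, an element of G_1
sigma = (x * h) % p        # signature
# Verify: the co-Diffie-Hellman relation holds for (h, sigma) and (g_2, v).
assert e(sigma, 1) == e(h, v)
```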
The required isomorphism exists for our groups G1 and G2 but since it is constructed using Galois
maps of Fqα over Fq , I will spare you the details.
A pair of groups on which the Decision co-Diffie-Hellman problem is easy but the Computational co-
Diffie-Hellman problem is computationally infeasible is called a pair of Gap Diffie-Hellman groups.
The (in)feasibility of the different Diffie-Hellman problems allows us to construct entirely new
protocols, one of which allows short signatures and will be the main topic of the next section.
We can create a signature scheme based on the fact that Computational co-Diffie-Hellman is
infeasible, but Decision co-Diffie-Hellman is easy. This was done by Boneh, Lynn and Shacham
in [DBS04] (Algorithm 2.3). Before applying concrete groups and the Weil pairing to the scheme,
I will first prove it to be secure. We will prove that it is hard to create an existential forgery
even if the attacker is given a number of signatures on messages of his choice. The proof will use
the Random Oracle Model [PS00], which means that we assume that the attacker cannot use
any information about the hash function used: it is random, but gives the same answers to
identical queries (an oracle). We will use the algorithm of the attacker to solve the Computational
co-Diffie-Hellman problem by simulating the hash function so that it gives answers with some
extra meaning, while we keep them uniformly distributed. This will become clear in the proof.
For more information about the Random Oracle Model or provable security in general, see 3.3.3.
Theorem 2.16 If G_1 and G_2 are Gap Diffie-Hellman groups on which we can only break the
Computational co-Diffie-Hellman problem with probability smaller than δ, then an attacker can only
construct an existential forgery with probability smaller than ε = e(q + 1)δ, even if it is given q
signatures on messages of its choice. The forgery created by the attacker may of course not be a
signature he asked for.
Proof. Suppose we have an algorithm that constructs an existential forgery with probability greater
than ε. I will now show how to construct an algorithm that solves the Computational co-Diffie-Hellman problem.
Setup We start by giving the attacker the public key u · g_2^r for a random r ∈ Z_p, where p
is the order of G_1 and G_2.
Queries to the hash function To keep our responses to the queries to the hash function con-
sistent, we maintain an initially empty list of tuples (M_j, w_j, b_j, c_j). If the attacker asks for a hash
value for a message M_i which is in the list, we respond with the corresponding w_i ∈ G_1. If M_i is not in
the list, we pick a random b_i ∈ Z_p uniformly and a c_i ∈ {0, 1} with Pr[c_i = 0] = 1/(q+1). Next we
respond with w_i := h^{1−c_i} · φ(g_2)^{b_i} ∈ G_1 and add (M_i, w_i, b_i, c_i) to the list.
Remember the isomorphism φ from our proof of the pairing being non-invertible, and note that
w_i is picked uniformly in G_1 and is independent of the attacker's view.
Requested signatures When constructing a signature for a message M_i, we query our simulated
hash function for the corresponding w_i and c_i. If c_i = 0 we abort our construction; otherwise we
know c_i = 1 and so w_i = φ(g_2)^{b_i}. We create the signature σ_i = φ(u)^{b_i} · φ(g_2)^{r·b_i}, which equals w_i^{a+r}
and therefore is a valid signature on M_i for the public key u · g_2^r = g_2^{a+r}. The signature is sent
to the attacker.
Solving Computational co-Diffie-Hellman When the attacker succeeds and produces his
existential forgery σ_f on message M_f, we once again query our own simulated hash func-
tion for the corresponding tuple (M_f, w_f, b_f, c_f). If c_f = 1 we abort our construction; other-
wise c_f = 0 and therefore w_f = h · φ(g_2)^{b_f} and σ_f = h^{a+r} · φ(g_2)^{b_f(a+r)}. We now construct
h^a = σ_f · (h^r · φ(u)^{b_f} · φ(g_2)^{r·b_f})^{−1}.
We now only need to show that this algorithm solves the Computational co-Diffie-Hellman prob-
lem with the claimed probability. For this we introduce three claims, which together prove the
theorem.
Claim 2.17 The probability that our algorithm does not abort during the production of signatures
as a result of the attacker's requests is at least 1/e.
Proof. We assume without loss of generality that the attacker does not ask for the same signature
twice. By induction we will prove that after the attacker has requested i signatures, the probability
that our algorithm did not abort is at least (1 − 1/(q+1))^i. For i = 0 this is true, since we do
not abort when no signatures are requested. When signature i is requested, the value of c_i is
independent of the information of the attacker, so the probability that we will abort is 1/(q+1). Using the
induction hypothesis we now know that our probability of not aborting is at least (1 − 1/(q+1))^i. And
since the attacker is only allowed to ask for q signatures, our probability of not aborting is greater
than (1 − 1/(q+1))^q ≥ 1/e. △
Claim 2.18 If our algorithm has not aborted during the production of signatures on the attacker's
requests, then the attacker cannot know it is being used by our algorithm, and hence it produces
a valid signature with probability greater than ε.
Proof. The public key given to the attacker is from the same distribution as a public key
produced by the signature scheme. Responses of the hash function are also uniformly distributed
in G_1, and all responses to requests for signatures are valid. Thus there is no difference the attacker
can notice, and he will produce a valid signature with probability greater than ε. △
Claim 2.19 The probability that our algorithm does not abort after the attacker gives us a valid forgery
is 1/(q+1).
Proof. Since the attacker did not ask for the signature he forged, he can only have some
information about c_f from the simulated hash function evaluation. But w_f has the same
distribution whether c_f = 0 or c_f = 1, so he cannot know whether our algorithm will abort
due to his forgery or not. Hence the probability of not aborting is Pr[c_f = 0] = 1/(q+1). △
You might have noticed that each claim only gives us information if the event in the
foregoing claim happened. So if we did not abort, Claim 2.18 says something about our chance
of getting a forgery, and if this happened as well, Claim 2.19 says something about our chance of
solving the Computational co-Diffie-Hellman problem.
Therefore the probability C of solving the problem (and not aborting) satisfies

C > 1/e [2.17] × ε [2.18] × 1/(q+1) [2.19] = ε/(e(q+1)).

When δ is the upper bound on the probability of solving the Computational co-Diffie-Hellman problem
(so δ ≥ C ≥ ε/(e(q+1))), then the attacker can never construct such a forgery with probability greater than
ε = e(q + 1)δ. △
If we truly want to prove this theorem nicely, we also have to take the running time of
our algorithm into account, for it might be infeasible to run. If one does so, one can see that our
algorithm takes only an amount of time linear in the number of times we have to simulate the hash
function or answer the attacker's queries, where the linear factor is approximately the time
required for an exponentiation in G_1.
Short signatures allow new applications while keeping the security level high enough. They can
for example be used for product registration systems where a user must enter the key, or they
can be used when we want to encode a signature in a bar code.
In the previous section we introduced a secure signature scheme based on Gap co-Diffie-
Hellman groups. We will now use elliptic curves over finite fields and the Weil pairing to
construct such a scheme, while slightly modifying it so that it allows short signatures. The result
is also from Boneh, Lynn and Shacham [DBS04] and is shown in Algorithm 2.4.
We use an elliptic curve E/F_q and the result of Balasubramanian and Koblitz (Theorem 2.13)
to obtain a point P ∈ E(F_q) of prime order p and a Q ∈ E(F_{q^α}), both of order p, that are
linearly independent, and we remind ourselves of the involved α. Once again
we take G_1 = <P> and G_2 = <Q>. Notice that we use only a single element from F_q for a
signature and reconstruct the elliptic curve element upon verification.
This is possible because this α gives a smart way of reducing the elliptic curve discrete log
problem to a discrete log problem in F^×_{q^α}, a field of q^α elements (see 2.4).
By Theorem 2.16 we know this signature scheme is secure if the Computational co-Diffie-
Hellman problem is hard enough. We remind ourselves of the requirements from 2.4: p and q
must be large enough to withstand generic discrete logarithm attacks, and the security
multiplier α must be large enough to defend against the reduction to F_{q^α}, to which the Number
Field Sieve algorithm can be applied.
A family of elliptic curves found by Miyaji et al. in [AMT01] allows for a security multiplier of
α = 6. It remains an open question whether higher security multipliers can be reached. There
exists for example an elliptic curve with α = 6 over F_q with q a 168-bit prime. This allows 168-bit
signatures with a 166-bit prime p, so 166-bit security against a generic discrete log attack and
1008-bit (6 × 168) protection against the Number Field Sieve algorithm (see chapter 4).

Algorithm 2.4 The short signature scheme.
Setup:
Pick a random private key x ∈ Z_p and compute the public key V := xQ ∈ E(F_{q^α}).
Sign: Message M.
Evaluate the hash function R = H(M) ∈ G_1 and compute σ = xR ∈ E(F_q).
The signature s is the x-coordinate of σ: s ∈ F_q.
Verify:
Find a y ∈ F_q such that σ = (s, y) is a point on E(F_q).
If no such y exists or σ does not have order p, the signature is invalid.
Evaluate the hash function R = H(M) and accept if and only if
either e(σ, Q) = e(R, V) or e(σ, Q)^{−1} = e(R, V).
Chapter 3
Provable Security in practical situations
Joke An elderly Frenchman rises every morning at 5, goes out to the street in front of his house,
and sprinkles a white powder up and down the street. One day, a neighbour, who has watched
his routine for many years, confronts him. ’What is this powder you sprinkle on the street every
morning, Pierre?’ ’It is elephant powder, mon ami,’ the gentleman says. ’It keeps the elephants
away.’ ’But Pierre,’ says the neighbour. ’Everybody knows that there are no elephants in France.’
Says the older man matter-of-factly, ’I guess it must be working, then.’
The joke has a serious message. Cryptography is like elephant powder. If it seems to work,
and keeps the attackers, our elephants, away, then we trust it. This isn’t really very satisfactory.
[Mil04]
Since the beginning of public-key cryptography, many suitable algorithmic problems for
cryptography have been proposed. Many cryptographic schemes have then been designed,
together with more or less heuristic proofs of their security. However, most of those schemes have
thereafter been broken.
The simple fact that a cryptographic algorithm withstands cryptanalytic attacks for several
years is often considered a kind of validation procedure, but some schemes take a long time
before being broken. Therefore, the lack of attacks at some point in time should never be considered
a security validation of any proposal.
Organisation of the chapter In the next section, I will explain what provable and practical
security is. In section 2 I will recall some basics we have seen in class which are necessary for
provable security, namely the classical assumptions on which the security may rely. In section
3 I will describe several signature and encryption schemes with their security results. All the
information comes from [Poi02].
complexity theory, providing polynomial reductions. However, their aim was mainly theoretical,
and thus they tried to minimize the required assumptions on the primitives (one-way functions,
etc.). Therefore, they just needed to exhibit polynomial reductions from the basic assumption on
the primitive to an attack on the security notion. However, such a result has no practical impact
on the actual security of proposed schemes. Indeed, even with a polynomial reduction, one may be
able to break the cryptographic protocol within a few hours, whereas the reduction just leads
to an algorithm against the underlying problem which requires many years. Therefore, those
reductions only prove the security when very large parameters are used, under the assumption
that no polynomial-time algorithm exists to solve the underlying problem.
For a few years, more efficient reductions have been expected, which provide more practical
security results. The perfect situation is reached when one is able to prove that, from an attack,
one can describe an algorithm against the underlying problem with almost the same success
probability within almost the same amount of time. We have then achieved practical security.
Unfortunately, in many cases, provable security comes at the cost of an important loss in terms of
efficiency for the cryptographic protocol, or relies on a weaker computational problem. Therefore,
a classical way to give some evidence about the security of an efficient scheme relative to a strong
computational problem is to make some hypotheses on the adversary's behaviour: the attack is
generic, independent of the actual implementation of some objects:
- of the hash function, in the ”random oracle model”;
- of the group, in the ”generic (group) model”. (I will skip this part)
• G - key generation
• E - encryption
• D - decryption
The ciphertext comes from c = E_{ke}(M, r). The encryption key k_e is public, and a unique message
M satisfies the relation (possibly with several random values r). So an exhaustive search over M and
r can always recover M, though maybe a better attack is possible. Hence unconditional secrecy is impossible
and we need algorithmic assumptions.
Integer factoring
The most famous intractable problem is the factorization of integers: while it is easy to multiply
two prime integers p and q to get the product N = p · q, it is not simple to decompose N into its
prime factors p and q. Unfortunately, it just provides a one-way function, without any possibility
to invert the process.
The RSA problem: let N = pq be the product of two large primes of similar sizes and e an integer
relatively prime to φ(N). For a given y ∈ Z^∗_N, find x ∈ Z^∗_N such that x^e = y mod N. The RSA
assumption then says that this problem is intractable for any large enough modulus N = pq: the
success probability Succ^RSA(A) of any adversary A within a reasonable running time is small.
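On toy numbers, the asymmetry of the RSA problem can be sketched as follows; the parameters are illustrative only, and inverting is shown to be easy here solely because we hold the factorization.

```python
# Toy RSA instance: the forward direction x -> x^e mod N is easy; inverting
# it without p and q is the RSA problem. These numbers are far too small
# to be secure and serve only as illustration.
p, q, e = 61, 53, 17
N = p * q                        # 3233: multiplying is easy
phi = (p - 1) * (q - 1)          # 3120: requires knowing the factorization
d = pow(e, -1, phi)              # private exponent e^-1 mod phi(N)

x = 65
y = pow(x, e, N)                 # easy forward computation
assert pow(y, d, N) == x         # easy to invert only *with* the trapdoor
```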
Discrete logarithm
Some other classical problems are related to the discrete logarithm. The setting is quite general:
one is given a finite cyclic group G of prime order q and a generator g (i.e. G = <g>). In such
a group, one considers the following problem (using multiplicative notation):
The Discrete Logarithm problem (DL): given y ∈ G, compute x ∈ Z_q such that y = g^x; one
then writes x = log_g y.
Trapdoor
The Discrete Logarithm problem is difficult and no extra information can help. Related to it are the
Computational Diffie-Hellman problem (CDH): given two elements of the group G, A = g^a and B = g^b,
compute C = g^{ab}; one then writes C = DH(A, B). And the Decisional Diffie-Hellman problem (DDH):
given three elements of the group G, A = g^a, B = g^b and C = g^c, decide whether C = DH(A, B).
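A small numeric sketch of CDH, with toy parameters of my own choosing: in a tiny group, a generic attacker simply brute-forces the discrete logarithm, which is exactly what large parameters must prevent.

```python
# Toy Diffie-Hellman: g generates the subgroup of order 11 in Z_23^*.
q, g, order = 23, 2, 11
a, b = 4, 7
A, B = pow(g, a, q), pow(g, b, q)    # public values g^a and g^b
shared = pow(B, a, q)                # DH(A, B) = g^(ab)
assert shared == pow(A, b, q)

# Generic attack, feasible only because the group is tiny: brute-force the
# discrete log of A, then compute DH(A, B) from it.
x = next(i for i in range(order) if pow(g, i, q) == A)
assert pow(B, x, q) == shared
```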
• The key generation algorithm K: the algorithm K produces a pair (x, y) of matching private
and public keys. Algorithm K is probabilistic.
• The encryption algorithm E: given a message M and a public key y, E produces a ciphertext
C of M. This algorithm may be probabilistic; in that case, we write E(y, M; r)
where r is the random input.
• The decryption algorithm D: given a ciphertext C and the private key x, D gives back the
plaintext M . This algorithm is necessarily deterministic.
Security notions. The goals of the adversary may vary. There are three security notions:
• The first common security notion that one would like for an encryption scheme is one-
wayness (OW): with just public data, an attacker cannot recover the whole plaintext
of a given ciphertext. More formally, this means that for any adversary A, her success in
inverting E without the private key should be negligible over the message space M.
• However, many applications require more, namely semantic security (IND). This security
notion means the computational impossibility to distinguish between two messages, chosen by
the adversary, one of which has been encrypted, with a probability better than one half:
her advantage should be negligible.
• A later notion is non-malleability (NM). To break it, given a ciphertext, the adversary
tries to produce a new ciphertext such that the plaintexts are meaningfully related.
On the other hand, an attacker can mount many kinds of attacks, according to the available
information:
• Since we are considering asymmetric encryption, the adversary can encrypt any plaintext
of her choice, granted the public key; hence the chosen-plaintext attack (CPA).
• She may furthermore have access to more information, modeled by partial or full access to
some oracles:
– a plaintext-checking oracle which, on input (M, C), answers whether C encrypts the
message M. This attack has been named the plaintext-checking attack (PCA);
– or the decryption oracle itself, which on any ciphertext, except the challenge cipher-
text, answers the corresponding plaintext. The scenario which allows adaptively cho-
sen ciphertexts as queries to the decryption oracle is the strongest attack, and is named
the chosen-ciphertext attack (CCA).
In public-key encryption it is difficult to reach CCA security. Maybe it is possible, but with
inefficient schemes. However, inefficient schemes are not useful in practice: everybody wants
security, but only if it is transparent.
RSA problem as well. Because of this determinism, it cannot be semantically secure: given the
encryption C of either M_0 or M_1, the adversary simply computes C_0 = M_0^e mod N and checks
whether C_0 = C or not to make the decision.
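The distinguishing attack just described can be written out in a few lines, with a toy key and a hypothetical message pair:

```python
# Distinguishing attack on deterministic ("textbook") RSA encryption:
# the adversary re-encrypts M0 and compares with the challenge.
import random

N, e = 3233, 17                  # toy public key (N = 61 * 53)
M0, M1 = 42, 1000                # two messages chosen by the adversary

secret_bit = random.randrange(2)
C = pow([M0, M1][secret_bit], e, N)     # challenge ciphertext

guess = 0 if pow(M0, e, N) == C else 1  # deterministic E => exact comparison
assert guess == secret_bit              # the attack always succeeds
```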
The El Gamal Encryption Scheme. In 1985, El Gamal also designed a DL-based public-
key encryption scheme, inspired by the Diffie-Hellman key exchange protocol: given a cyclic group
G of prime order q and a generator g, the key generation algorithm produces a random element x ∈ Z^∗_q
as private key, and a public key y = g^x. The encryption of a message M, encoded as an element
of G, is a pair (c = g^a, d = y^a · M) for a random a. This ciphertext can be easily decrypted thanks to
the knowledge of x, since y^a = g^{xa} = c^x, and thus M = d · (c^x)^{−1}. This encryption scheme is
well known to be OW-CPA relative to the CDH problem. It is also IND-CPA relative to the
DDH problem. Its OW-PCA security relies on the newer GDH problem. However, it does not
prevent adaptive chosen-ciphertext attacks, because of its homomorphic property. As we have
seen above, the expected security level is IND-CCA. We wonder if we can achieve this strong
security with practical encryption schemes.
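A sketch of El Gamal with toy parameters; the subgroup and the message encoding are my own illustrative choices.

```python
# Toy El Gamal in the order-11 subgroup of Z_23^* (multiplicative notation).
import random

q, g, order = 23, 2, 11
x = 6                            # private key
y = pow(g, x, q)                 # public key y = g^x

def encrypt(M):
    # M must be encoded as a subgroup element, e.g. a power of g.
    a = random.randrange(1, order)
    return pow(g, a, q), (pow(y, a, q) * M) % q

def decrypt(c, d):
    # y^a = g^(x*a) = c^x, so M = d * (c^x)^(-1).
    return (d * pow(pow(c, x, q), -1, q)) % q

M = 8                            # 8 = g^3 lies in the subgroup
c, d = encrypt(M)
assert decrypt(c, d) == M
```

Note the homomorphic property mentioned above: multiplying the second components of two ciphertexts yields a ciphertext of the product of the messages, which is exactly what CCA adversaries exploit.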
As already remarked, one often has to make some assumptions about the adversary's behaviour.
I will present a classical model, the random oracle model. This model was the first to be
introduced in the cryptographic community: the hash function is formalized by an oracle which
produces a truly random value for each new query. Of course, if the same query is asked twice,
identical answers are obtained.
This model has been widely accepted by the community, and it is considered a good one, in
which proofs of security give a good taste of the actual security level. Even if it does not provide
a formal proof of security, it is argued that proofs in this model ensure security of the overall
design of the scheme, provided that the hash function has no weakness.
More formally, this model can also be seen as a restriction on the adversary's capabilities.
Indeed, it simply means that the attack is generic, without considering any particular instantiation
of the hash function. Theorem 2.16 is an example of a proof in the random oracle model.
The limits of provable security are the following. First, provable security does not provide
proofs in the mathematical sense: proofs are relative (to computational
assumptions), proofs often use ideal models, and proofs are not formal objects
(time is needed for acceptance). But still, provable security has provided a good guarantee
that a scheme is not flawed.
Chapter 4
Factorization
Written by Marco Streng
The RSA algorithm is based on the assumption that factoring large integers is impossible to do in
a reasonable amount of time. In this chapter, we will start with an introduction to factorization.
After that, we will turn our attention to the two most important factoring algorithms for RSA
numbers, the quadratic sieve and the number field sieve. Due to the technical nature of the
number field sieve, we will focus mostly on the quadratic sieve.
4.1 Factorization
A factor of an integer N is a number d which divides N . We call such a factor trivial if it is
either 1 or N itself.
In the RSA algorithm, we are dealing with an integer N of the form N = pq for two primes p
and q. A factoring algorithm is exactly what we need in order to find p and q, because they are
the nontrivial factors of N .
A factoring algorithm is also sufficient for finding the prime factorization of any integer,
because we can use it together with a prime test and repeat it on the found factor m and on
N/m until we only have prime factors left. We refer to [Beu03] and [Tel02, Sec. 5.2] for more
about prime testing.
Now suppose we are given a number N which we want to factor. As a first attempt, we could look
at the numbers 2, 3, 4, 5, . . . and try to divide N by them. This method is called trial division.
In the worst case, N is the product of two big primes and we would have to try to divide N by
all numbers up to √N to find those primes. This costs √N division steps, which is exponential
in the length log(N) of our input N. Because of this huge runtime, this method cannot be used
to factor big numbers.
Still, trial division can be very useful to find small factors of N . It takes only B divisions
to find all factors up to B, so it takes only polynomial time to find all factors up to log(N ).
Therefore, when designing a factoring algorithm, it is safe to assume that the input has no prime
factors smaller than its logarithm.
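Trial division up to a bound B, as described above, can be sketched as follows; the function name and return convention are my own.

```python
# Trial division by 2, 3, ..., B: returns the factors found (with
# multiplicity) together with the unfactored cofactor.
def trial_division(N, B):
    factors = []
    d = 2
    while d <= B:
        while N % d == 0:
            factors.append(d)
            N //= d
        d += 1
    return factors, N

print(trial_division(2**4 * 3 * 9973, 100))   # ([2, 2, 2, 2, 3], 9973)
```

Here B divisions suffice to strip out all prime factors up to B, leaving the large cofactor (9973, a prime) for a real factoring algorithm.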
Instead of using division, we could try using the greatest common divisor (GCD). Given two
integers a and b, the greatest common divisor gcd(a, b) of a and b is defined to be the greatest
integer which divides both a and b. The greatest common divisor can be calculated in quadratic
time with Stein’s algorithm [Tel02, Sec. 5.2].
For any integer d, gcd(d, N ) is a factor of N . Now if d has at least one prime factor in common
with N , then we know that gcd(d, N ) is greater than one. Also, if N does not divide d, then we
know that gcd(d, N ) is not equal to N (because gcd(d, N )|d). So the trick to factoring a number
N now seems to be: finding a number d which is not a multiple of N , but does have at least one
prime factor in common with N .
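This trick is easy to check on a small example; the value of d here is chosen by hand purely for illustration.

```python
# d shares the prime factor 17 with N = 493 = 17 * 29 and is not a multiple
# of N, so gcd(d, N) is a nontrivial factor of N.
from math import gcd

N = 493
d = 2 * 3 * 17            # 102, chosen by hand for the illustration
f = gcd(d, N)
assert f == 17 and 1 < f < N and N % f == 0
```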
Suppose we know two integers a and b whose squares are congruent modulo N. We then have
N | (a^2 − b^2) = (a − b)(a + b), which means that each prime factor of N divides either a − b or
a + b, so the number a − b might just be what we were looking for in the previous section. There
were two conditions which that number should satisfy. Firstly, N should not divide it, which is
the same as saying that a and b should not be congruent mod N. Secondly, we need a − b to
have at least one prime factor in common with N. The last condition is satisfied as soon as N
does not divide a + b, so we want a not to be congruent to −b mod N.
Conclusion: in order to find a nontrivial factor of N, all we need is a pair of integers (a, b)
with a^2 ≡ b^2 mod N and a ≢ ±b mod N. From this conclusion we can derive the following
theorem.
Theorem 4.2 Suppose N is an odd integer with at least two different prime factors and a ≢
0 mod N. And suppose b is a random integer with a^2 ≡ b^2 mod N. Then (a, b) gives rise to a
nontrivial factor of N with probability at least 1/2.
Proof. If gcd(a, N ) is not equal to one, then it is itself a nontrivial factor of N and we are done,
so suppose gcd(a, N ) = 1. Then both a and b are in the unit group Z∗N of the ring of integers
modulo N . Write N as N = pk m with gcd(p, m) = 1 and p prime. From the conditions of the
theorem, we know that p exists and that p and m are at least 3. By the Chinese remainder
theorem, we know Z∗N ∼ = Z∗pk × Z∗m . Now any b with b2 ≡ a2 modulo both pk and m will have
b2 ≡ a2 mod N . There are the two solutions a and −a modulo pk and at least the two solutions
a and −a modulo m, so there are at least 4 solutions modulo N . Only two of those solutions are
the trivial ones b ≡ ±a mod N , so at least half of the solutions give rise to a nontrivial factor of
N. △
Now if we can generate “random” pairs of integers with congruent squares, then the expected
number of pairs needed for a nontrivial factor is two. Many factoring algorithms (including the
quadratic sieve and the number field sieve) are ways to generate such “random” pairs.
4.1.4 Powers
Before we spend time on finding such squares, let us first look at the conditions in the above
theorem. By trial division, we can assume N to be odd, but what if N has only one prime
factor? That would mean that it is of the form N = p^k, where k is at least two. To eliminate this
problem, we will check whether N is a power. For k = 2, 3, 4, . . ., calculate the integer nearest to N^{1/k}
and call it x. If N is a k-th power, then N = x^k, so x is a nontrivial factor of N and we are done.
If we find no nontrivial factors among the x's, then we know that N is not a power. We only have
to check this for k up to log2 N, because N = x^k implies 2^k ≤ x^k = N, so k ≤ log2 N. Because
we only check log2 N values of k and because we can use a fast numerical root algorithm
to calculate the integer-precision k-th roots, the complete power check takes only polynomial
time.
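The power check can be sketched as follows; the rounding-and-correction in `kth_root` stands in for the fast numerical root algorithm mentioned above, and the helper names are mine.

```python
# Power check: for k = 2, 3, ..., compute the integer nearest to N^(1/k)
# and test whether it is an exact k-th root of N.
def kth_root(N, k):
    x = round(N ** (1.0 / k))        # floating-point estimate ...
    while x ** k > N:                # ... corrected to the exact integer floor
        x -= 1
    while (x + 1) ** k <= N:
        x += 1
    return x

def power_check(N):
    k = 2
    while 2 ** k <= N:               # k can be at most log2(N)
        x = kth_root(N, k)
        if x ** k == N and x > 1:
            return x                 # nontrivial factor: N = x^k
        k += 1
    return None                      # N is not a proper power

assert power_check(3125) == 5        # 3125 = 5^5
assert power_check(943) is None      # 943 = 23 * 41 is not a power
```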
All we had to do to factor a number N was create an algorithm which generates random pairs
of congruent squares. So we take a number a, square it, and then all we need is a random square
root of (a^2 mod N) in Z_N.
For example, suppose we want to factor N = 493 and we have selected the integer a = 23.
Then a^2 = 529. We already know the square root a = 23 of 529, and taking the same square root
we started with isn't really that random. However, we also have 529 − 493 = 36 and √36 = 6,
which tells us 23^2 ≡ 6^2 mod 493, giving us the factorization 493 = 17 · 29. In general, when trying
to take the square root of a congruence class (y mod N) ∈ Z_N, we could look at representatives
y ∈ Z and take their square roots. This doesn't always work. If we square (24 mod 493) for
example, we get (83 mod 493) and √83 = 9.110 . . ., which is not an integer. What we want is a
good chance that the representative y we take is a square. For that, we use the following lemma.
Lemma 4.3 The smaller a number in Z is, the greater the chance that it is a square.
Proof. The squares below an integer m are exactly the squares of the integers below √m. Therefore, there are about √m squares below m, and because the derivative of the square root function decreases when m increases, so does the probability that m is a square. △
Now in order to have a good chance of finding a square root of (a^2 mod N), we want a^2 to be congruent to a small positive integer. The trick is now to take a number a, then take the smallest positive integer k with k ≡ a^2 mod N, and then try to extract the square root of k. When a is smaller than √N, we just get k = a^2 and √k = a, which is not useful, but when a is greater than √N, we get a number k which is a bit more "random". Therefore, what we will do is look for squares in the sequence a^2 − N for a at least √N.
Let's do an example. Suppose we are given the number N = 943 to factor. We extract the square root of 943 with some fast and imprecise floating point algorithm and get 30.7, so we calculate
31^2 − 943 = 18
32^2 − 943 = 81 = 9^2, a square!
Now we have 32^2 ≡ 9^2 mod N and it is clear that 32 ≢ ±9 mod N. Therefore we calculate gcd(N, 32 − 9) = gcd(943, 23) = 23 and conclude 943 = 23 · (943/23) = 23 · 41.
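This search for squares in the sequence a^2 − N can be written out in a few lines. The Python sketch below is only an illustration of the idea; it is practical only when a square appears early in the sequence:

```python
from math import gcd, isqrt

def congruent_squares_factor(N):
    """Search the sequence a^2 - N (a >= ceil(sqrt(N))) for a perfect
    square b^2; then a^2 = b^2 (mod N) and gcd(N, a - b) may split N.
    A sketch of the method in the text, not an efficient algorithm."""
    a = isqrt(N)
    if a * a < N:
        a += 1                          # a = ceil(sqrt(N))
    while True:
        b = isqrt(a * a - N)
        if b * b == a * a - N:          # a^2 - N is the square b^2
            f = gcd(N, a - b)
            if 1 < f < N:               # nontrivial: a is not +-b mod N
                return f
        a += 1
```

With N = 943 this finds 32^2 − 943 = 9^2 and returns gcd(943, 23) = 23; with N = 493 it finds 23^2 − 493 = 6^2 and returns 17.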
Let's try this again with another number to factor. Suppose that this time we are given N = 1649 (example from [Pom05]). Now we should look for squares in the sequence
41^2 − 1649 = 32
42^2 − 1649 = 115
43^2 − 1649 = 200
44^2 − 1649 = 287
...
None of the above numbers are squares, and this list will go on for a while before the first square appears. In fact, finding a square in this list may just take more time than doing trial division. However, since
41^2 ≡ 32 (mod N)
43^2 ≡ 200 (mod N),
we can conclude (41 · 43)^2 ≡ 32 · 200 = 80^2 mod N, which gives us a pair of congruent squares which we can use to factor N.
The last example shows that we do not have to go through the complete list of numbers of the form a^2 − N: we can multiply some numbers from the list with each other to create squares. To find subsets of the list whose product is a square, we use the following characterization of squares: A number y is a square if and only if every prime p has an even exponent in the factorization of y.
For example, factoring the above list gives
41^2 − 1649 = 32 = 2^5
42^2 − 1649 = 115 = 5 · 23
43^2 − 1649 = 200 = 2^3 · 5^2
44^2 − 1649 = 287 = 7 · 41
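In terms of exponent vectors, a product of list entries is a square precisely when the entrywise sum of their vectors is even in every coordinate. A small Python sketch of this bookkeeping (the prime list is chosen just large enough for this example):

```python
def exponent_vector(y, primes):
    """Exponents of the given primes in y, or None if y does not
    factor completely over them (i.e. y is not smooth)."""
    exps = []
    for p in primes:
        e = 0
        while y % p == 0:
            y //= p
            e += 1
        exps.append(e)
    return exps if y == 1 else None

primes = [2, 5, 7, 23, 41]
v32 = exponent_vector(32, primes)     # [5, 0, 0, 0, 0]
v200 = exponent_vector(200, primes)   # [3, 2, 0, 0, 0]
# All exponents of 32 * 200 are even, so the product 6400 = 80^2 is a square:
assert all((a + b) % 2 == 0 for a, b in zip(v32, v200))
```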
4.2.3 Sieving
We now know that the sequence of smooth numbers of the form a^2 − N that we generate should be π(B) + 1 integers long. But how do we recognize the smooth numbers? We could use trial division, because after B divisions, we know if a number is B-smooth, and if it is, then we also know the factorization (and the exponent vector).
A better way of recognizing smooth numbers is based on the sieve of Eratosthenes, which is
an algorithm for generating a list of primes. The sieve of Eratosthenes starts with a list of the
integers from 2 to some bound Y . In each step, we take the first unmarked number p and mark
every p-th number after p. The numbers we mark are all multiples of p, hence not prime. So we
start with p = 2 and mark every second number, starting with 4. Then we sieve with the number 3, marking every third number starting with 6. The number 4 is already marked, so we skip it and sieve with the next unmarked number, 5. When we have done this for every unmarked number up to √Y, all unmarked numbers must be prime.
Now, contrary to the sieve of Eratosthenes, we are not interested in the unmarked numbers.
We are interested in numbers with small prime factors, so we want the numbers which are marked
a lot of times in the early stage of the sieve. Instead of marking numbers, we will divide them
by the prime we are sieving with. In addition, we sieve by higher powers of these primes as well.
After sieving with all primes up to B, all B-smooth numbers will be turned into ones. As an
example, let us sieve for 5-smooth numbers up to 20. We start with the list
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
and sieve for even numbers including 2 itself, dividing them by 2, which gives
1, 1, 3, 2, 5, 3, 7, 4, 9, 5, 11, 6, 13, 7, 15, 8, 17, 9, 19, 10.
Now we sieve for multiples of 22 = 4, and divide every fourth number by 2 as well:
1, 1, 3, 1, 5, 3, 7, 2, 9, 5, 11, 3, 13, 7, 15, 4, 17, 9, 19, 5.
The next power of 2 is 8, so we divide every 8-th number by 2. The last power of 2 below 20 is
16, so we divide the 16-th number by 2 again.
1, 1, 3, 1, 5, 3, 7, 1, 9, 5, 11, 3, 13, 7, 15, 1, 17, 9, 19, 5.
Then we go to the next prime, which is 3 and do the same:
1, 1, 1, 1, 5, 1, 7, 1, 3, 5, 11, 1, 13, 7, 5, 1, 17, 3, 19, 5,
1, 1, 1, 1, 5, 1, 7, 1, 1, 5, 11, 1, 13, 7, 5, 1, 17, 1, 19, 5.
The last prime which is at most 5 is 5 itself, so
1, 1, 1, 1, 1, 1, 7, 1, 1, 1, 11, 1, 13, 7, 1, 1, 17, 1, 19, 1.
The numbers that are turned into ones are exactly the 5-smooth numbers up to 20. If we
remember the factors by which we divide the numbers, then we can create the exponent vectors
simultaneously without extra cost.
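The sieving procedure just described might be sketched in Python as follows. This is a toy version for small bounds; a real implementation would sieve with logarithms over a large interval:

```python
def smooth_sieve(Y, primes):
    """Sieve out the smooth numbers in 1..Y over the given primes, as in
    the worked example: for every prime power p^k <= Y, divide each
    p^k-th entry by p once more. Entries equal to 1 at the end are
    exactly the numbers that are smooth over `primes`."""
    vals = list(range(Y + 1))            # vals[i] == i (index 0 unused)
    for p in primes:
        pk = p
        while pk <= Y:
            for i in range(pk, Y + 1, pk):
                vals[i] //= p            # one division per power dividing i
            pk *= p
    return [i for i in range(1, Y + 1) if vals[i] == 1]

print(smooth_sieve(20, [2, 3, 5]))
# [1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20]
```

The output is the list of 5-smooth numbers up to 20, matching the final row of ones in the example above.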
So how much time does this cost? When sieving for multiples of p^k, we are going through the list of integers up to Y in steps of p^k. This takes Y/p^k steps, so the time it takes to find all B-smooth numbers up to Y is proportional to

Σ_{p ≤ B} Σ_{k=1}^{∞} Y/p^k,

which by a theorem of Mertens is roughly proportional to Y log log B.
There now remains one important optimization problem: How should we choose our smoothness
bound B? On the one hand, if B is chosen to be small, then we need to find only a few B-smooth
numbers and the matrix will be small. On the other hand, the greater B is, the greater the chance
for a random number to be B-smooth. Therefore, we need to choose B optimally.
We will now sketch the reasoning which leads to the optimal choice of B as well as an estimate
of the time needed for sieving. Details can be found in [Pom05].
If a runs in the interval [N^{1/2}, N^{1/2} + N^ε], then a^2 − N is smaller than (N^{1/2} + N^ε)^2 − N ≈ 2N^{1/2+ε}.
Call this bound X. Let ψ(X, B) be the number of B-smooth integers in the interval [1, X].
Then ψ(X, B)/X is the probability that a random integer in [1, X] is B-smooth, so we need
approximately X/ψ(X, B) random numbers for each B-smooth one. We needed a bit more than π(B) B-smooth numbers, so we should create about π(B)X/ψ(X, B) numbers of the form a^2 − N, assuming that they are random enough. According to the previous section, it then takes time π(B) log log B · X/ψ(X, B) to check all these integers for B-smoothness. Now suppose B = X^{1/u}
for some fixed u. Then we can use the following estimate from [Pom05]:

ψ(X, X^{1/u})/X ≈ u^{−u}.
If we use this estimate together with the rough approximation π(B) log log B ≈ B, we approximate the sieving time by

X^{1/u} u^u.
Now we need to choose u to minimize this expression. [Pom05] gives as optimal value u ≈ √(2 log X / log log X), for which the sieving time becomes

exp((1 + o(1)) √(2 log X log log X)).    (4.1)

Since X ≈ 2N^{1/2+ε} with ε small, we have log X ≈ (1/2) log N, so (4.1) is about exp(√(log N log log N)).
Now let's get back to the matrix step. We have a binary matrix with about π(B) rows and columns, in which most entries are zero. There are numerous algorithms from linear algebra for finding a dependency in such a matrix and we will not discuss them here. Instead, we refer to [Pom05] for more details and note that, asymptotically, this step is faster than the sieving.
4.3 Overview
4.3.1 Complexity
In order to express the asymptotic complexity of current factoring methods, one usually uses the function

L_x[v, λ] = exp(λ (log x)^v (log log x)^{1−v})

with parameters v and λ. For v = 0, this function is just L_x[0, λ] = exp(λ log log x) = (log x)^λ, which is polynomial in the length of x. But for v = 1, it is L_x[1, λ] = exp(λ log x) = x^λ, which is exponential in the length of x. It is common to write L_x[v, λ] instead of L_x[v, λ + o(1)]. Now for the quadratic sieve, from (4.1), we have a conjectured complexity of L_N[1/2, 1]; with v = 1/2, this lies somewhere in between the polynomial and the exponential. Many other factoring algorithms also have v = 1/2, and this seemed to be a lower bound for the runtime of factoring algorithms. That was until the introduction of the number field sieve.
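To get a feeling for these growth rates, one can simply evaluate L_x[v, λ] numerically. The sketch below compares the conjectured quadratic sieve cost L_N[1/2, 1] with the number field sieve cost L_N[1/3, 1.9019] (both up to the o(1) terms, so this is only indicative); the crossover it exhibits is consistent with the remark later in this chapter that the number field sieve wins for numbers of more than about 120 digits:

```python
from math import exp, log

def L(x, v, lam):
    """L_x[v, lam] = exp(lam * (log x)^v * (log log x)^(1 - v))."""
    return exp(lam * log(x) ** v * log(log(x)) ** (1 - v))

for digits in (100, 200):
    N = 10 ** digits
    print(digits, L(N, 1 / 2, 1), L(N, 1 / 3, 1.9019))
```

At 100 digits the two scales are comparable (with the quadratic sieve slightly ahead), while at 200 digits the number field sieve is roughly fifty times cheaper on this scale.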
Now before we get to the number field sieve, let us first look at the big picture so far. Many algorithms, including the quadratic sieve and the number field sieve, work in a similar way. By theorem 4.2, it suffices to be able to construct some solutions to x^2 ≡ 1 mod N in an apparently random manner. To construct solutions like that, many algorithms follow these three steps:
Step 1. Select a factor base. Select a finite set of non-zero elements a_1, a_2, . . . , a_I ∈ Z_N. We call this set the factor base. In the quadratic sieve algorithm, the factor base was formed by the congruence classes of the primes ≤ B.
Step 2. Collect relations. Collect relations between the elements of the factor base. By a relation, we mean a vector v = (v_1, . . . , v_I) ∈ Z^I for which

Π_{i=1}^{I} a_i^{v_i} = 1.    (4.2)

Stop as soon as the collection V of found relations contains a bit more than I elements.
Step 3. Find dependencies. Find dependencies modulo 2 among the elements of V. A dependency is a subset W of V such that Σ_{v∈W} v = 2w for some vector w ∈ Z^I. If we take x with (x mod N) = Π_{i=1}^{I} a_i^{w_i}, we get

(x^2 mod N) = Π_{i=1}^{I} a_i^{2w_i} = Π_{i=1}^{I} a_i^{Σ_{v∈W} v_i} = Π_{v∈W} Π_{i=1}^{I} a_i^{v_i} = Π_{v∈W} 1 = 1.
In the quadratic sieve, a relation comes from a number a with (a^2 mod N) = Π_i a_i^{v_i}, where the vector v is the exponent vector of a^2 − N. A dependency from step 3 then gives a solution to a^2 ≡ b^2 mod N instead of x^2 ≡ 1 mod N, but aside from that, the algorithm follows the same principle.
Because the number of elements of V is greater than I, we can always find a dependency in step 3. From the way the relations are generated in a good algorithm, we expect x to be random and thus to have probability 1/2 of yielding a nontrivial factor of N. This assumption is backed up by experiments for both algorithms mentioned in this text. Now if x does not give a nontrivial factor of N, we can repeat step 3, finding another dependency in the matrix, without repeating step 2. To make this possible, we created not just I + 1, but a few more relations in step 2.
Lots of factoring algorithms consist of choosing a factor base of I elements, followed by
collecting a bit more than I relations.
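Step 3 is plain linear algebra over GF(2). As an illustration (production sieves use sparse-matrix methods such as block Lanczos or block Wiedemann instead), here is an incremental Gaussian elimination that finds a dependency among exponent vectors, with rows packed into Python integers:

```python
def find_dependency(vectors):
    """Given 0/1 vectors (exponent vectors reduced mod 2), return the
    indices of a nonempty subset summing to zero mod 2, or None.
    Incremental Gaussian elimination over GF(2) with bit-packed rows;
    `hist` records which input vectors were combined into each row."""
    pivots = {}                          # lowest set bit -> (row, hist)
    for idx, vec in enumerate(vectors):
        row = sum(bit << i for i, bit in enumerate(vec))
        hist = 1 << idx
        while row:
            p = row & -row               # lowest set bit of row
            if p not in pivots:
                pivots[p] = (row, hist)  # row is independent so far
                break
            r, h = pivots[p]
            row ^= r                     # eliminate that pivot bit
            hist ^= h
        if row == 0:                     # combined to zero: dependency found
            return [i for i in range(len(vectors)) if hist >> i & 1]
    return None

# Mod-2 exponent vectors of 32, 115, 200, 287 over the primes 2, 5, 7, 23, 41:
print(find_dependency([[1,0,0,0,0], [0,1,0,1,0], [1,0,0,0,0], [0,0,1,0,1]]))
# [0, 2]  -- i.e. 32 * 200 is a square, as in the 1649 example
```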
In this section, we will try to give a basic notion of fields and number fields. For details or proofs,
we refer to any introduction to algebra or number theory.
By a ring, we mean a set with addition, subtraction and multiplication that satisfy some
axioms which we will not discuss here. Examples are the rings Z of integers, Q of rationals
and R of reals, and the ring Z_n of integers modulo a number n. A field is a ring in which every nonzero element has a multiplicative inverse, i.e. a ring in which division is possible. Examples are the fields Q and R and the finite field F_p = Z_p for p prime.
By a field extension of a field K, we mean a field L which contains K. We call the extension
finite when L is a finite-dimensional vector space over K.
Definition 4.4 A number field is a finite extension of the field of rational numbers, Q.
The number field sieve uses smooth numbers in number rings different from Z, an idea that
was proposed by Pollard in 1988. The complete algorithm was introduced in 1990 by Lenstra,
Lenstra, Manasse and Pollard. Since then, many people have contributed to theoretical and
practical improvements.
4.4 The Number Field Sieve 39
The idea of the number field sieve is to construct a number field K = Q(α) and a map φ from
the ring Z[α] to the ring ZN . This map should be a ring homomorphism, which means that it
preserves addition and multiplication: φ(x + y) = φ(x) + φ(y), φ(xy) = φ(x)φ(y), φ(0) = 0 and
φ(1) = 1.
If φ(α) = (m mod N ) for a small number m, then we have φ(a + bα) = (a + bm mod N ). The
idea is to find coprime a, b where both a + bα ∈ K and a + bm ∈ Z are smooth. (We will leave
a definition of smoothness in K to more advanced texts). Then we can factor a + bα in K and
a + bm in Z. If we apply φ to the factorization of a + bα, we get a product of classes in ZN . On
the other hand, if we reduce the factorization of a + bm modulo N , we get another product of
classes in ZN . Now we have two products in ZN that are the same. If we divide those products,
we get a relation like (4.2).
The above suggests that we take a factor base containing φ(π) for all “small” primes π of
Z[α] and (p mod N ) for all small primes p of Z.
The first step of the number field sieve is choosing a polynomial f. Then we need the function φ, for which there must be a class (m mod N) = φ(α) for which (f(m) mod N) = φ(f(α)) = φ(0) = (0 mod N), i.e. f(m) ≡ 0 mod N. This means that we should choose f in such a way that we know a small zero m of (f mod N). Now there are already some questions that need to be asked, like "how do we choose a polynomial?", "what if the number ring does not have unique factorization?" (case 4.2) and "what are the primes of the number ring?" The answers to those questions, as well as the actual sieving procedure, are too advanced for this text and a course in algebraic number theory is recommended.
The number field sieve has the advantage over the quadratic sieve that it generates much smaller numbers, which therefore have a much greater probability to be smooth. This means that it needs to generate fewer numbers and is much faster. The number field sieve has a conjectured asymptotic complexity of L_N[1/3, 1.9019] for arbitrary integers and L_N[1/3, 1.5263] for integers of the special form for which it was originally created. The fact that it has v = 1/3, where the quadratic sieve and other algorithms have v = 1/2, makes it the fastest algorithm for factoring integers of more than about 120 digits.
For an introduction to the number field sieve, we refer to [LHL93] and [Ste05], which are both very clear and provide different approaches to explaining the ideas behind the algorithm. The first also provides some enhancements of the sieve and contains the implementation that was used for the record factorization of 2^523 − 1.
The main goal of this chapter is to raise the reader’s awareness when he or she is confronted with
(theoretical) cryptographic improvements that claim to lead to a more reliable, secure and robust
cryptosystem. As we will see, these claims are not always correct as the suggested improvements
may even harm the overall system security in different ways. We will discuss several kinds of these
enhancements and present some methodological ways to analyse the security of cryptosystems.
5.1 Introduction
The notion of security in a public cryptosystem is usually based upon some mathematical and computational assumption. In order to verify the alleged amount of security that is offered by a specific implementation of a cryptographic scheme, it is necessary to investigate all the (computationally) possible attacks that can be applied to this system. Identifying these attacks allows us to suggest improvements that could seal off the security gaps they exploit. Since the introduction of public cryptographic systems, this process of detecting gaps and repairing them has been and always will be a neverending story. It is therefore needless to say that up till now countless improvements have been formulated for innumerable cryptographic mechanisms in order to obtain even better security, reliability and usability than before.
However, contrary to the common belief that improvements always provide a better system, some enhancements do not necessarily imply that the level of security has increased. It may even be the case that some of these improvements that appear profitable at first sight can damage the system in such a way that other attacks, more insidious and more difficult to identify, can be unleashed.
There are many kinds of improvements thinkable. Some might be based upon theoretical grounds, while others might not even be related to the cryptosystem itself but, for instance, applied to the hull of the system which envelops the specific cryptographic scheme. The article [LN02] is a good example of the second kind of improvement, where we see that a cryptosystem's security is based not only on the cryptographic scheme itself but also on its interaction with the working environment. Although there are many more such cases to think of, we will limit our scope to the first kind of improvements.
42 How Improving makes it Worse
This chapter is primarily based upon [JQYY02], and the first part (5.2) of this chapter deals with the content of this paper. The authors of this article confront us with several security paradoxes that show how a seemingly beneficial improvement to a cryptosystem may in fact weaken it when used in practice. Their research findings are based upon the well-known RSA system, due to its wide use and acceptance as a public key cryptosystem. We would also like to focus your attention on methodologies that can help us identify such paradoxes more easily, before they have any chance of being considered or implemented in public cryptosystems. We will discuss this topic in the second part (5.3) of this chapter, which is based upon [Nao03] and [SSSW98].
Our environment consists of different RSA versions, each being a universal improvement over the previous version, namely RSA in standard mode, RSA enhanced with CRT (Chinese Remainder Theorem) and RSA enhanced with CRT and a Fault Analysis Prevention method. To show that a specific improvement may allow attacks with weaker assumptions than before, we need to define the capabilities and resources of our adversary:
1. Induced Faults - Our adversary may inject faults into the system. To be more specific, she
is able to change bits to their complementary value.
2. Limited Feedback Channel - Our adversary can only receive limited feedback from the
system. Adversaries may thus receive indications about the kind of fault due to a change
of behaviour of the overall system but they don’t get actual outputs from the decryption
device.
Using this profile of our adversary, we will devise an attack on each of the three RSA systems. The goal of our attacker will naturally be to uncover the value of the secret decryption key. We will see that with each improvement, the attacker can use weaker assumptions to reach her goal.
Given the RSA system in standard mode, the encryption and decryption process are defined as c = m^e mod n and m = c^d mod n respectively, where c, m are the cipher- and plaintext; e, d are the public encryption key and the secret decryption key; n is the public RSA modulus whose factors are the secret primes p and q.
5.2 Paradoxical Improvements based on RSA 43
Suppose the adversary introduces an error during the decryption process. More specifically, the adversary flips a randomly chosen bit of the key, d_i for instance, to its complementary value, giving a corrupted key d′. The decryption process will then output the corrupted value m′ = c^{d′} mod n. Now, let us rewrite the quotient of the corrupted and the correct output: m′/m = c^{d′−d} = c^{±2^i} mod n. By testing whether m′/m ≡ c^{2^i} or m/m′ ≡ c^{2^i} (mod n), our adversary can retrieve the value of d_i, and this attack is easily generalised to the case where several bits of d have 'flipped'. Note that we do not need to know the original value of d_i. Of course, this attack is possible if and only if our adversary can get access to the value of m′.
An RSA system enhanced with CRT allows the decryption process to be sped up by a factor of 4, because we are able to spread the computations over the secret primes p and q. The decryption process is defined as m = m_p + p[p^{−1}(m_q − m_p) mod q], where m_p = c^{d_p} mod p, m_q = c^{d_q} mod q, d_p = d mod (p − 1) and d_q = d mod (q − 1).
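A toy Python sketch of RSA-CRT decryption (with artificially small primes, purely for illustration; the code follows the formula above, and `pow(p, -1, q)` for modular inverses needs Python 3.8+):

```python
p, q = 23, 41                      # toy secret primes (far too small for real use)
n = p * q                          # public modulus, 943
e = 7                              # public encryption exponent
d = pow(e, -1, (p - 1) * (q - 1))  # secret decryption exponent

def decrypt_crt(c):
    """m = mp + p * [p^{-1} (mq - mp) mod q], as in the text."""
    dp, dq = d % (p - 1), d % (q - 1)
    mp, mq = pow(c, dp, p), pow(c, dq, q)
    return mp + p * ((pow(p, -1, q) * (mq - mp)) % q)

m = 99
c = pow(m, e, n)                   # encryption: c = m^e mod n
assert decrypt_crt(c) == m         # CRT decryption recovers the plaintext
```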
Suppose we initiate the same kind of attack on the RSA-CRT scheme as we did in the previous section. This time, such an attack may have even worse effects than in the previous case! Let us assume that our adversary has corrupted a value in the computation of m_p but not in m_q. The CRT calculation will then yield m′ = m′_p + p[p^{−1}(m_q − m′_p) mod q] instead of m. Since m ≢ m′ (mod p) and m ≡ m′ (mod q), it follows that gcd((m′^e − c) mod n, n) gives us the secret prime factor q.
This attack can be considered stronger than the one in the previous section because there is no assumption about the kind of (induced) error: any error during the computation of the exponentiation modulo one prime factor will result in success for our adversary. Given the secret prime factors p and q, it is possible to derive the secret decryption key d. Note that the attack is still possible if and only if our adversary can get access to the value of m′.
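This fault attack on RSA-CRT (often called the Bellcore attack) can be demonstrated with the same toy numbers; here we corrupt the half of the computation modulo p by hand and watch the gcd reveal q. A sketch only:

```python
from math import gcd

p, q = 23, 41                          # toy secret primes
n, e = p * q, 7
d = pow(e, -1, (p - 1) * (q - 1))
dp, dq = d % (p - 1), d % (q - 1)

m = 99
c = pow(m, e, n)

# Faulty decryption: the exponentiation modulo p is corrupted (any error
# there will do), while the one modulo q stays correct.
mp_bad = (pow(c, dp, p) + 1) % p       # injected fault
mq = pow(c, dq, q)
m_bad = mp_bad + p * ((pow(p, -1, q) * (mq - mp_bad)) % q)

# m_bad = m (mod q) but m_bad != m (mod p), so m_bad^e - c is divisible
# by q and not by p: the gcd hands the adversary a secret prime.
assert gcd((pow(m_bad, e, n) - c) % n, n) == q
```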
In order to solve the problem of the previous subsection, Shamir has suggested an elegant method to protect against faults: given a random small integer r, let the decryption algorithm compute m_{rp} = c^{d_{rp}} mod rp and m_{rq} = c^{d_{rq}} mod rq, where d_{rp} = d mod φ(rp) and d_{rq} = d mod φ(rq). Now, if m_{rp} ≢ m_{rq} (mod r), then there is an error in the calculation, and instead of outputting the corrupted value m′, the system outputs an error message. If this is not the case, then the computation was faultless and the plaintext m is recovered by applying CRT to m_p = m_{rp} mod p and m_q = m_{rq} mod q. Note that the probability of an error being undetected is about 1/r, so if r is a 20-bit integer then the probability of this happening is smaller than 10^{−6}.
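Shamir's check might be sketched like this, continuing the toy example. For simplicity we take r prime, so that φ(rp) = (r − 1)(p − 1), and we assume r does not divide c:

```python
r = 65537                              # a small prime standing in for the random r

def decrypt_checked(c, p, q, d):
    """RSA-CRT decryption with Shamir's consistency check: compute
    modulo r*p and r*q, compare the two halves modulo r, and refuse
    to output anything if they disagree."""
    drp = d % ((r - 1) * (p - 1))      # d mod phi(r*p), with r and p prime
    drq = d % ((r - 1) * (q - 1))      # d mod phi(r*q)
    mrp = pow(c, drp, r * p)
    mrq = pow(c, drq, r * q)
    if mrp % r != mrq % r:
        raise RuntimeError("fault detected")   # error message instead of m'
    mp, mq = mrp % p, mrq % q
    return mp + p * ((pow(p, -1, q) * (mq - mp)) % q)

p, q = 23, 41
n, e = p * q, 7
d = pow(e, -1, (p - 1) * (q - 1))
c = pow(99, e, n)
assert decrypt_checked(c, p, q, d) == 99
```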
A ← 1; B ← c;
for i from 0 to (k − 1)
    if (d_i = 1) then A ← A · B mod n;   // the multiplication is the routine of 5.2
    B ← B^2 mod n;
end;
Given this suggestion, we could conclude that RSA-CRT enhanced with the Fault Analysis Prevention method of Shamir would be a logical choice. As we will see next, this is not the case at all. This time, our adversary is capable of inducing faults as before, but the feedback is limited to observing whether a ciphertext decrypts correctly or not. The attack is based on the fact that RSA exponentiation is usually implemented with the square-and-multiply technique, where multiplication and modular reduction are interleaved to fit the word size Ω = 2^ω. These two parts are depicted in 5.1 and 5.2.
Suppose an adversary wants to guess the value of bit d_i and that this value is 1. Also, let's assume that one or several bit errors have been introduced into some words A_j of register A, where j > j_τ and j_τ is the current value of the counter j when the faults are introduced. Because the words containing the errors are no longer required by the remaining iterations, the computation of R = A · B mod n in 5.2 will still produce a correct output. This result is then stored back into register A in 5.1, so the final output will be correct too. However, if d_i = 0, then the if-statement guarding the interleaved multiplication is bypassed, so A still contains the error(s) while the calculation of B continues. This will result in an incorrect value of the final result c^d mod n.
Given the assumption that our adversary knows when a ciphertext decrypts correctly or not, she will learn the value of d_i depending on whether the system returns an error message or not. Note that this attack does not require our adversary to know the value of m′, and it is therefore an attack that uses weaker assumptions than the attacks mentioned in earlier subsections.
R ← 0;
for j from (t − 1) downto 0
    R ← (R · 2^ω + A_j · B) mod n;
end;
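Putting 5.1 and 5.2 together in Python (a sketch with an 8-bit word, so ω = 8; the variable names follow the pseudocode, and the result is checked against ordinary modular exponentiation):

```python
def word_mul(A, B, n, omega=8):
    """The interleaved routine of 5.2: R = A*B mod n, consuming A one
    omega-bit word at a time, from the most significant word down."""
    t = max(1, (A.bit_length() + omega - 1) // omega)
    R = 0
    for j in range(t - 1, -1, -1):
        Aj = (A >> (j * omega)) & ((1 << omega) - 1)   # j-th word of A
        R = (R * (1 << omega) + Aj * B) % n
    return R

def modexp(c, d, n):
    """The right-to-left square-and-multiply of 5.1."""
    A, B = 1, c
    for i in range(d.bit_length()):        # i runs over the bits of d
        if (d >> i) & 1:
            A = word_mul(A, B, n)          # A <- A*B mod n   (via 5.2)
        B = word_mul(B, B, n)              # B <- B^2 mod n
    return A

assert modexp(123, 54321, 10007) == pow(123, 54321, 10007)
```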
We have seen in the previous subsection how improving the reliability of a system may weaken
it. In the paper of [JQYY02] we are confronted with two other examples comparable with the
one outlined above. These other two examples show us how:
1. a system (for example, OAEP) that is more secure against active attacks may, in fact, be weaker;
2. a system (for example, RSA for Paranoids) with increased length in its parameters (which possibly makes the underlying problem harder) may, in fact, be weaker.
As the authors state, these paradoxes are mostly the result of the fact that the observability
of the system has increased. The enhanced system with Fault Analysis Prevention acts as an
oracle in all the cases, from which an adversary may collect secret information. One might remark
that we could recompute the answer instead of returning an error message but this will also be
noticed by our adversary since the computation will take longer to complete. This fact can then
be revealed by a timing analysis, leading to the same sort of attack.
This subsection deals with the content of [Nao03]. The author proposes a methodology which allows us to classify computational assumptions based on the complexity of falsifying them. This is done by creating a certain challenge to their validity.
When we want to evaluate the validity of an assumption, we would like to be able to falsify it in a constructive manner. In this case, we should also consider the complexity of this task with respect to the assumption, in order to say something about the level of acceptability of the assumption. The methodology describes the use of a public challenge, published by a system designer, which is solvable in a time related to the time of breaking the assumption when the assumption is false. The idea underlying this public challenge is based upon the principle of public evaluation. Another important entity in this methodology is the falsifier, whose objective is to solve the challenges posed. What the adversary is to the assumption, the falsifier is to the challenge. One particular difference between the two is that if an assumption is false, the adversary's probability of success might still be rather small, whereas a falsifier needs a much higher probability of success, preferably close to 1.
Three levels of falsifiability are introduced in this paper, namely efficiently falsifiable, falsifiable and somewhat falsifiable. The differences between them are mainly in the time it takes to verify a solution to the challenge. An adversary α to assumption A with parameters (n, t, ε) means that α, working in time at most t, can break the assumption on instances of size n with probability better than ε. Another parameter we need is δ, which is an upper bound on the failure of the challenge (i.e. even if the assumption is false, there might be a chance, bounded by δ, that the challenge is not solvable).
- Sampling from D_n can be done efficiently, in time polynomial in n, log 1/ε and log 1/δ.
- If the assumption A is false, then there exists a falsifier β that for a d ∈_R D_n finds a solution y for which V(d, y) is 'accept' with probability at least 1 − δ, and the runtime of β is t′, which is polynomial in the runtime of α as well as in n, log 1/ε and log 1/δ.
As an example, we discuss one-way permutations and ask ourselves in what class the assumption "this scheme is a secure implementation of a one-way permutation" lies. Given a function f : {0, 1}^n → {0, 1}^n, is it a one-way permutation, i.e. is there no t-time algorithm that finds x for a random y = f(x) with probability better than ε? The author claims that the assumption "f is a one-way permutation" is efficiently falsifiable. This is true for similar reasons as for factoring: choose a set of challenges specified as the range of a function. Let H be a family of pairwise independent hash functions; for h ∈ H we have h : {0, 1}^n → {0, 1}^{n − log n + 2 log ε + log δ}. The challenge is specified by h ∈_R H and c ∈ {0, 1}^{n − log n + 2 log ε + log δ}. A solution is an x such that, for y = f(x), h(y) = c holds. Note that we are not checking the permutation part, i.e. that f is indeed 1-1 and length preserving.
There are still many open and challenging problems that, at the moment, are not handled by this methodology. An example of a challenging problem is physical security: we have dealt so far with the usual mathematical abstraction of a distributed system and ignored the physical implementation. However, as we have seen in 5.2, many adversaries will probably take advantage of aspects like power consumption, tamper resistance and observability, which make the system vulnerable. Ideally, we would like to tackle this problem in somewhat the same manner as we did with computational assumptions. The proposed classification scheme also has a few drawbacks, in the sense that it cannot yet handle some rather important issues. An example of such an issue is that implication is not preserved: if assumption A implies assumption B and assumption A is efficiently falsifiable, then we cannot conclude that B is also efficiently falsifiable, because we cannot use the challenges of assumption A for assumption B. What's more, this classification scheme does not yet highlight all major results in cryptography, like the result of HILL, stating that it is possible to construct a pseudo-random sequence generator from any one-way function.
A closer look at physical security is taken in [SSSW98]. In this paper, a software engineering analysis approach is taken towards discovering weaknesses in a system, prioritizing weaknesses in terms of relative dangers to the system, and determining cost-effective countermeasures. These cost-effective countermeasures can be technical or non-technical in nature (for instance, education). In order to do so, we need models which describe our adversaries and the vulnerability of the system during its lifespan.
Adversary Model
This model characterises adversaries in terms of resources, access, risk tolerance and objectives. In other words, this model describes what the means and tools of the adversary are, what the adversary is willing to do, and what the goal of the adversary is. This model is, of course, dependent on the system. There exist many adversary variants, but we assume that a rational adversary follows the path of least resistance, i.e. he chooses an attack that maximises his expected return given his budget constraints of resources, access and risk. Note that financial resources are not directly modelled in this representation of the adversary, since they are used to buy either capabilities or access.
Vulnerability Model
To map the vulnerabilities of the system, the authors use the metaphor of a vulnerability landscape, where the valleys and peaks represent the vulnerabilities and countermeasures respectively. To understand this vulnerability landscape, the designer maps out all things related to the target system, which includes the system itself, physical facilities, personnel and their responsibilities, equipment and procedures (protocols). As the system evolves over time, the landscape also evolves accordingly in response. The changes in this landscape are cumulative, in the sense that changes in the system do not lead to the recreation of the entire landscape but can be added to the existing landscape.
Any successful attack has three steps.
One might reason that in order to protect the system, one needs to block only one of these steps. The goal of the defender or the system designer is then to find the most cost-effective way to block any potential attack.
The vulnerability model sorts the weak points of the system by access. To identify all the opportunities for adversaries to attack, the designer must consider the entire life cycle of each component of the security environment. Assume, for instance, an adversary who decides to bug the telephones destined for a company's offices. There are many attack opportunities, which shows that office equipment is vulnerable during its entire lifespan: from the drawing board to the manufacturing plant and from the loading docks to the work place. Even after disposal we cannot guarantee that the vulnerability is gone with the system. This example extends to the virtual working environment: designers can leave exploitable bugs in the system, and adversaries can write viruses that are contained within electronic mails as executables.
There are two access categories, namely physical security and the trust model. Physical
security concerns the problem of enforcing the notion of ownership and protecting your assets,
in the physical as well as the virtual sense. The trust model represents how an organisation determines
whom to trust with its assets, for example employees. These two aspects are primarily handled by
identification and signature schemes.
Methodology Overview
Given the above models, we can characterise attacks and choose rational countermeasures. The
methodology is based upon an "attack tree model", a visualisation tool to enumerate
and weigh different attacks against the system, and consists of five steps:
1. Create the attack trees for the system. The system engineer creates an attack tree by
replicating the work of an adversary to find weak points in the system. The root of the
tree is the component that prompted the analysis. The child nodes are then created by
decomposing the parent node into its life cycle. Each phase in the life cycle breaks down
into the two access categories of physical security and the trust model. This process can
be repeated, and the analysis terminates in a series of vulnerability leaves.
2. Apply weights to the leaves. For each leaf in the tree, quantitative values are assigned for
risk, access and cost to the adversary. For example, a possible node for encrypted messages
would be passive collection combined with brute-force decryption of the ciphertext. Here, the risk and
access are low, and the cost depends on the cryptographic algorithm.
3. Prune the tree so that only exploitable leaves remain.
4. Execute the attack.
5. Deploy countermeasures: finally, the designer attempts to deploy the most cost-effective set of countermeasures.
An extensive example is discussed in [SSSW98]. Of course, this methodology only provides a
means for enhancing the creativity of the analyst in finding vulnerabilities and countermeasures.
Achieving optimal coverage is still very hard to do.
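The enumeration and pruning steps can be sketched in a few lines of Python; the component names, weight scales and budget figures below are invented purely for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)
    risk: int = 0      # hypothetical 1-5 scale: chance of detection
    access: int = 0    # hypothetical 1-5 scale: access required
    cost: int = 0      # hypothetical cost to the adversary, in dollars

# Step 1: decompose the component by life-cycle phase (step 2 adds the weights).
tree = Node("office telephone", [
    Node("manufacturing", [Node("bug at the plant", risk=2, access=4, cost=5000)]),
    Node("shipping", [Node("intercept at loading dock", risk=3, access=3, cost=1000)]),
    Node("in use", [Node("bribe an insider", risk=4, access=2, cost=2000)]),
])

# Step 3: prune, keeping only the leaves this adversary can exploit.
def exploitable(node, max_risk, max_cost):
    if not node.children:
        ok = node.risk <= max_risk and node.cost <= max_cost
        return [node.name] if ok else []
    found = []
    for child in node.children:
        found += exploitable(child, max_risk, max_cost)
    return found

print(exploitable(tree, max_risk=3, max_cost=1500))  # -> ['intercept at loading dock']
```

Varying the budget constraints shows how the set of exploitable leaves, and hence the countermeasures worth deploying, depends on the adversary model.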
E-Cash
Written by Maarten van Steenbergen
E-Cash is a collective name for all sorts of electronic money. A small amount of data can be used
to buy things on the internet, or be passed along to other consumers. Smart encoding is needed
to ensure privacy of the user and protection against fraud.
6.2 The Brands system for Off-Line e-cash
Definition 6.1 Finding the unique index log_g h ∈ Z_q with respect to g ∈ G_q \ {1} is the Discrete
Log problem. An algorithm is said to solve the Discrete Log problem if, for inputs g ≠ 1 and h generated
uniformly at random, it outputs log_g h with non-negligible probability of success. The Discrete
Log assumption states that there is no polynomial-time algorithm which solves the Discrete Log
problem.
Definition 6.3 Again, let q be prime and G_q a subgroup of Z_p^*. Let g_1, . . . , g_k, h ∈ G_q. The
representation problem is to find a representation of h with respect to (g_1, . . . , g_k).
As we said before, the first step in the Brands system is setting up the bank. The bank needs
to choose a large prime p, and a large but smaller prime q which divides p − 1. The bank also
chooses three generators g, g_1, g_2 from G_q. With q prime, every element of G_q except 1 is a generator,
so this shouldn't be a problem. The bank then selects a secret key x, and publishes its public
key h = g^x mod p. Next, the bank needs two collision-free hash functions H and H_0, which are
defined as follows:
The user also chooses a random number u_1 from Z_q, and shares the public key h_u = g_1^{u_1} mod p
with the bank. The bank stores h_u along with other user identification information, and both
the bank and the user will use a number z in future calculations:
Using Definition 6.2, the representation of z in terms of (g_1, g_2) is (u_1 x, x), because z = (g_1^{u_1} g_2)^x = g_1^{u_1 x} g_2^x.
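The setup can be checked with toy parameters; p = 23 and q = 11 are of course far too small to be secure, and all secret values below are hypothetical:

```python
# Toy parameters only: q = 11 divides p - 1 = 22; a real bank would use
# primes of hundreds of bits. g, g1, g2 generate the order-q subgroup of Z_p*.
p, q = 23, 11
g, g1, g2 = 2, 3, 4

x = 7                        # bank's secret key (hypothetical value)
h = pow(g, x, p)             # bank's public key h = g^x mod p

u1 = 5                       # user's secret (hypothetical value)
hu = pow(g1, u1, p)          # account public key hu = g1^u1 mod p
z = pow(hu * g2 % p, x, p)   # shared number z = (hu * g2)^x mod p

# the representation of z with respect to (g1, g2) is (u1*x, x):
assert z == pow(g1, u1 * x, p) * pow(g2, x, p) % p
```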
In the Brands system, money is transferred to the user in “coins” which are worth a fixed amount
of money. This is a problem, because a user should be able to pay any amount of money. Ideally,
the system would be altered to allow for arbitrary payments. Such a system does not yet exist
as far as I know, so we’ll have to live with this limitation at the moment.
The coin should contain information which only the bank can generate; no shop would accept
a coin which can be generated by anyone. The shop owner wants to be sure it can trade in the
electronic coin for real money at the bank, so the shop has to be able to prove that the bank generated
the coin.
First, the bank chooses a random number w from Z_q, and sends the following to the user:
a = g^w mod p (6.4)
b = (h_u g_2)^w mod p (6.5)
Then the user generates random numbers s, x_1 and x_2 from Z_q, together with blinding factors u and v, to calculate:
a′ = a^u g^v mod p (6.9)
b′ = b^{su} A^v mod p (6.10)
The user then computes c′ and c as follows:
c′ = H(A, B, z′, a′, b′) (6.11)
c = c′/u mod q (6.12)
The result c is sent to the bank, which responds with r:
r = w + cx mod q (6.13)
The user wants to check if she has received a valid e-cash coin, and checks if indeed:
When the user wants to buy something from the shop, she sends the coin {A, B, (z′, a′, b′, r′)} to
the shop. The shop returns a challenge d:
r_1 = du_1 s + x_1 mod q (6.17)
r_2 = ds + x_2 mod q (6.18)
Now the shop has to check two things: whether the coin is valid and signed by the bank, and
whether the user has signed it; the bank requires both before it pays real money in return for
the e-coin. The shop checks the following equations:
g_1^{r_1} g_2^{r_2} = A^d B mod p (6.19)
g^{r′} = a′ h^{c′} mod p (6.20)
A^{r′} = b′ z′^{c′} mod p (6.21)
These equations can be rewritten as:
A^d B = (h_u g_2)^{sd} g_1^{x_1} g_2^{x_2} mod p
      = g_1^{u_1 sd + x_1} g_2^{sd + x_2} mod p
      = g_1^{r_1} g_2^{r_2} mod p
g^{r′} = g^{v + ru} mod p
      = g^{v + (w + cx)u} mod p = g^{wu} g^{v} g^{xcu} mod p
      = a^u g^v h^{cu} mod p
      = a′ h^{c′} mod p
A^{r′} = (h_u g_2)^{s(v + ru)} mod p
      = (h_u g_2)^{sv + us(w + cx)} mod p
      = (h_u g_2)^{swu} (h_u g_2)^{sxcu + sv} mod p
      = b^{su} A^{xcu + v} mod p = b^{su} z′^{cu} A^v mod p
      = [b^{su} A^v] z′^{cu} mod p = b′ z′^{c′} mod p
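The whole withdrawal and payment flow can also be verified numerically with toy parameters (p = 23, q = 11; hypothetical secrets throughout). The hash outputs c′ and d are replaced by fixed stand-in values, since only the group arithmetic is being checked:

```python
p, q = 23, 11                     # toy parameters; not secure sizes
g, g1, g2 = 2, 3, 4               # generators of the order-q subgroup

x, u1 = 7, 5                      # bank and user secrets (hypothetical)
h = pow(g, x, p)
hu = pow(g1, u1, p)
z = pow(hu * g2 % p, x, p)

# --- withdrawal ---
w = 3                             # bank's nonce
a = pow(g, w, p)
b = pow(hu * g2 % p, w, p)

s, x1, x2, u, v = 4, 6, 2, 3, 8   # user's blinding values (hypothetical)
A  = pow(hu * g2 % p, s, p)
B  = pow(g1, x1, p) * pow(g2, x2, p) % p
z_ = pow(z, s, p)
a_ = pow(a, u, p) * pow(g, v, p) % p
b_ = pow(b, s * u, p) * pow(A, v, p) % p

c_ = 9                            # stand-in for H(A, B, z', a', b')
c  = c_ * pow(u, -1, q) % q       # modular inverse: Python >= 3.8
r  = (w + c * x) % q              # bank's signed response
assert pow(g, r, p) == a * pow(h, c, p) % p          # user's validity check
r_ = (r * u + v) % q

# --- payment: the shop's checks (equations 6.19-6.21) ---
d  = 5                            # stand-in for the shop's challenge
r1 = (d * u1 * s + x1) % q
r2 = (d * s + x2) % q
assert pow(g1, r1, p) * pow(g2, r2, p) % p == pow(A, d, p) * B % p   # 6.19
assert pow(g, r_, p) == a_ * pow(h, c_, p) % p                       # 6.20
assert pow(A, r_, p) == b_ * pow(z_, c_, p) % p                      # 6.21
```

All three equations hold, matching the derivation step by step.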
The shop now has enough information to return the coin to the bank, and receive real money
in return on its bank account. The coin cannot be used to pay another shop. However, an
extension to this system does exist that allows a coin to be transferred to another user.
The shop now has the coin {A, B, (z ′ , a′ , b′ , r′ )} and the user’s signature (r1 , r2 ) for this specific
purchase. The shop gives this information along with its shopid and the datetime of the purchase
to the bank. The bank can use equations 6.19, 6.20 and 6.21 to check if this is a valid coin. The
bank checks a database to see if this coin was previously spent. If it’s not in the database, the
shop gets paid and the coin is stored in the database as {A, B, datetime, r1 , r2 }.
If the coin is already in the database and the datetime is the same, then the shop is trying
to cash the same coin twice, and the deposit is rejected. If the datetime is different, then the
user has spent the same coin twice. The bank now has the following information about the two
deposits: {A, B, datetime, r_1, r_2} and {A, B, datetime′, r_1′, r_2′}. Now the bank can extract the
identity of the user as follows:
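The extraction works because r_1 − r_1′ = (d − d′)u_1 s and r_2 − r_2′ = (d − d′)s mod q, so their quotient reveals u_1, and g_1^{u_1} mod p then matches the account key h_u on record. A toy check (all values hypothetical):

```python
p, q, g1 = 23, 11, 3
u1, s, x1, x2 = 5, 4, 6, 2        # toy secrets, for illustration only

def respond(d):
    # user's response to a shop challenge d: r1 = d*u1*s + x1, r2 = d*s + x2
    return (d * u1 * s + x1) % q, (d * s + x2) % q

r1, r2 = respond(5)               # first spend
r1_, r2_ = respond(2)             # second spend, different challenge

# (r1 - r1') = (d - d')*u1*s and (r2 - r2') = (d - d')*s, so the quotient
# mod q is u1; the modular inverse via pow(..., -1, q) needs Python >= 3.8.
revealed = (r1 - r1_) * pow((r2 - r2_) % q, -1, q) % q
assert revealed == u1
```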
Cryptographically secure The system is built on the Discrete Log problem. And while there
is no proof that the Discrete Log problem is computationally hard, it is regarded as safe, as long as the
primes p and q are large enough.
The bank knows the identity of the user when the coin is given to her, because the bank
has to know what account to get the money from. But the bank doesn’t know the
identity of the user when the coin is returned, except when the user has double-spent
the coin. Also, the shop can’t link different payments by the same user.
The user doesn't need to know the shopid that the bank and shop agreed upon, so the
user can't pretend to be the shop and try to make the bank think the shop is trying
to cash the same coin twice.
The shop can't link different payments to each other or to the user. However, the shop
can find out the identity of the user if the user signs two different challenges d (see
formula 6.16). It would be better if only the bank were able to establish the identity of
the user in such a case.
Fraud prevention In every step, messages are signed either by the user, or by the bank. There-
fore, the bank can’t debit the account of the user without supplying a real coin; every
withdrawal must be signed by the user. Therefore a user cannot deny having made a with-
drawal; the bank has the user’s signature on the withdrawal. Furthermore, if for example
in a criminal investigation the police want to prove that the user has made a particular
payment, they can use the user’s private key to check if the signature matches the signature
that the shop received earlier. So the user cannot deny having made a particular purchase.
Of course, the user will have to be forced to give up her private key or sign the message with
her private key, but the police can check that the private key is authentic by comparing a
newly generated public key to the known public key of the user.
Multiple spending In the standard system, a coin can only be spent once. The system
can be extended to allow coins that can be spent k times, in such a way that
only after spending a coin k + 1 times is the identity of the user revealed. In this extension,
however, the k payments can be linked to each other.
Observer protocol The system can be extended with a fourth party, the observer O. In
practice, this could be a chip on the user's banking card or a device that is attached to
the user’s computer. Every coin is also signed by the Observer chip, and the Observer
stores spent coins in permanent memory. This chip can prevent double-spending. Of
course, if the chip’s secret key is somehow discovered by the user, the user can still be
tracked down by the bank after double spending.
Checks In the system explained in this chapter, it's only possible to pay with coins of
one fixed amount. The system can be extended to allow the use of checks, of which an
arbitrary amount can be used for one payment. However, different payments from the
same check can be linked together by the shop or the bank. But as long as the user
doesn't overspend the check, her identity remains concealed.
Transferability Another extension exists which allows a user to transfer money to another
user without intervention of the bank.
Analogues in RSA The system can be modified to make use of multiplicative groups, as in
the RSA system. Only the withdrawal protocol is not adaptable to the RSA setting.
6.4.1 Standards
The acceptance of standards is much more important for off-line cash than for online cash.
For example, when used in a standard web browsing environment, many online systems work with
standard HTTP techniques: the shop links to a "bank", for example the way2pay.nl website. The
user validates the payment on this more or less trusted site, and way2pay links the user back to
the shop website, so the shop knows the payment has been made.
With off-line cash, the transaction needs to take place between the user and the shop directly.
So the browser needs an extension which can do the calculations. Standard Java or JavaScript
solutions don't work: to do the calculations, these programs would have to be able to read the
user's private key, and it's not secure for an applet supplied by a shop to be able to read the user's
private key.
Installing an extra browser extension in order to make transactions more secure is
a hurdle most users won't take until off-line cash is widely accepted and the benefits are large
enough to overcome it. This presents a chicken-and-egg problem. Also, if additional
software needs to be installed, there has to be a worldwide standard on what this software should
do exactly.
6.4.2 Patents
As explained in the previous subsection, online payment systems are relatively easy to implement;
every CS student can come up with a system such as way2pay or PayPal. Most of the effort
for companies like these lies in marketing and client support. But marketing and
management techniques can't be patented. Software techniques can be patented in the US, and
maybe in the future also in the EU.
Patents may hinder the acceptance of off-line cash. Suppose the inventors of, for example, the
Brands system (which relies in turn on previous inventions) choose not to ask money for their
invention, and establish the FreE-Cash foundation. If they themselves are threatened by patent
claims from other companies, they have to be able to pay lawyers, even if the claims are incorrect.
So if a patent system on software exists, the inventors are almost forced to patent the system
and ask money for it, in order to be able to defend themselves legally. This in turn can hinder
acceptance by shops and software companies, because they have to pay the inventors of this
particular e-cash system.
Summary and Conclusions
Even if an off-line e-cash system is cryptographically safe, the public will still have serious doubts
about accepting some bits as a payment. Even though the Brands system is in many ways safer
than, for example, a credit card, the credit card is something people are used to and can understand.
And even if fraud can only occur when a user is sloppy with her private key u_1, stories will
emerge about the insecurity of the system and hinder its acceptance.
If a coin is double-spent, the user for whom the coin was made can be tracked. But it is quite
possible that the user's private key u_1 is stolen. Then the thief can theoretically use the same coin
thousands of times on the internet before the fraud is detected. And even then, a system should
be in place which distributes this information to all shops, so that shop owners can reject coins
that have already been double-spent.
The damage is much larger than with online systems: if credit card information is stolen, the
thief can max out the credit card, but she cannot spend a thousand times the card limit.
Also, if a thief tries to pay with a credit card, the payment information is passed directly to
the credit card company. They can track suspicious behaviour and trace a thief quite fast.
As said before, an off-line system, when used for payments on the internet, needs a software
addition on the client side, unlike many online payment systems. But this software addition is
not enough: when a computer which stores a private key u_1 gets cracked, the thief has full access
to the legitimate user's money.
A safer implementation needs a hardware device attached to the computer which stores the
private key u1 , and does the calculations without sharing u1 with the software on the computer.
This device could for example have a display and a button, to ensure no payments can be made
without consent of the user.
Every user of the system must first have such a device before being able to make use of the
system securely. This could be yet another hurdle for acceptance of off-line cash.
Design of Blockciphers
Written by M.R.T. Beye
Blockciphers have become a popular way to encrypt sensitive information and have remained so
for decades. A multitude of implementations has been developed in recent years, and some of
these algorithms have become widely used. However, a number of successful attacks have been
devised that exploit inherent weaknesses present in many of these ciphers. This paper will attempt
to give insight into the design of cryptographically strong blockciphers.
First of all, some basics regarding blockciphers will be introduced (and, for some readers, revisited).
The focus will be on ciphers that use the so-called Feistel structure, which will be decomposed
into its components. The most common and successful attacks and exploits will be identified
and related to weaknesses within these components.
A definition of desired properties, such as confusion and diffusion, will be given, after which
it will be made clear how these properties work to protect an algorithm against aforementioned
attacks. Finally, some requirements to guarantee the desired properties are given, and the CAST
approach will receive some attention as a good means of building robust blockciphers that satisfy
these requirements.
Blockciphers have been around for years and are a cheap and efficient way to encrypt data
for secure communication on networks. Back in 1949, Shannon [Sha49] introduced his notion of
secrecy and proved that the Vernam cipher, which uses very large keys and needs synchronization
between sender and receiver, is an algorithm that provides unconditional secrecy, although it
is not very practical. Shannon also showed that unconditional secrecy cannot be achieved in a
cheaper or more efficient fashion.
The implication of his results is that the blockciphers created since have been aiming for
empirical secrecy instead of the unconditional kind. This means that these ciphers are not
provably immune to any and all attacks, and that it has been a matter of waiting for cryptanalysts
to devise a way to attack and defeat the principles behind them. Still, we would like to continue
to use blockciphers, because they are much faster than asymmetric (i.e. public key)
algorithms and can be implemented in hardware with limited capabilities. A good example of a
blockcipher is DES, which has been very widely used and is well documented and studied.
In 1972 the National Bureau of Standards called for algorithms to be submitted, in order to
establish a new Data Encryption Standard. IBM responded with the winning algorithm in 1974,
after which the National Security Agency revised its design. This resulted in a version with a
reduced blocksize and keylength and possibly also a revision of the S-boxes that are used.
In 1975 the algorithm was publicly released, including details of its inner workings, to the
displeasure of the NSA. The fact that the NSA had made adjustments, made cryptanalysts
curious and gave rise to an increase in cryptographic research. In 1977, DES was officially taken
into use as a standard and has remained so for decades.
Criticism was mounting, however, as both the NSA and other parties claimed that DES was no
longer safe because of its limited keylength. Unfortunately, for years no safe replacements
were available and DES remained in use, albeit in the form of 3DES. The problem with the
other candidates seemed to be their vulnerability to so-called differential attacks, as introduced
by Biham and Shamir [BS91] in the early 90s. It turned out that the NSA and IBM had known
of this type of attack, but had prepared DES against it and kept the information secret.
Matsui [Mat94] was inspired by the idea of differential cryptanalysis and devised his own
attack: linear cryptanalysis. This attack is able to break the 16 rounds of DES faster than
exhaustive key search, in contrast to differential attacks. The methods of both of these attacks
will be explained in detail in section 7.3, as will some other weaknesses and attacks.
In this paper we will focus on ciphers that use the Feistel structure, because much relevant
research has been done in this field and many popular ciphers use this structure.
A Feistel network in general takes a piece of data with a length equal to half the blocksize
and combines it with another piece of data of the same length. This is usually done by means
of a simple XOR (⊕) operation, but other bitwise functions can be used equally well. The other
piece of data that serves as input to the XOR function is derived from the remaining half of the
data in the block and the round key.
Blockciphers iterate this type of function a number of rounds, in order to ensure good inter-
mixing of the plaintext and the keys. The number of iterations will prove to be an important
aspect in protection against attacks, as shown in sections 7.3 and 7.4.
A graphical representation of a single iteration of a Feistel network is given in the figure
below:
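In code, a single Feistel iteration and its inversion can be sketched as follows; the round function F is an arbitrary stand-in, since the structure is invertible for any F:

```python
MASK = 0xFFFFFFFF  # 32-bit halves, i.e. a 64-bit blocksize

def F(half, round_key):
    # stand-in round function; a real cipher derives this from S-boxes
    return ((half * 2654435761) ^ round_key) & MASK

def encrypt(left, right, round_keys):
    for k in round_keys:
        left, right = right, left ^ F(right, k)   # one Feistel iteration
    return left, right

def decrypt(left, right, round_keys):
    for k in reversed(round_keys):
        left, right = right ^ F(left, k), left    # same F, applied in reverse
    return left, right

keys = [0xA5A5A5A5, 0x3C3C3C3C, 0x0F0F0F0F, 0x12345678]
block = (0x01234567, 0x89ABCDEF)
assert decrypt(*encrypt(*block, keys), keys) == block
```

Note that F is never inverted: decryption simply replays the rounds with the keys in reverse order, which is why the round function may be arbitrarily complex.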
Ciphers of the Feistel type can be decomposed into four main components:
• S-boxes
• Framework
• Key Schedule
• Round Function
The function which mixes half of the data with the key is called the round function F . Usually,
this function uses S-boxes (S for substitution), which we would like to resemble a random function
as closely as possible, in order to correctly mask patterns in the plaintext. S-boxes are usually
implemented using lookup tables, indexed by parts of the input to produce the output. To make
sure that the S-boxes in round i are influenced by multiple S-boxes
in round i − 1, a P-box (P for permutation) is used to shuffle bits between rounds.
The key schedule determines what round keys are generated, based on the primary (randomly
chosen) key. The generation of round keys is usually implemented using shift registers.
The aforementioned components must fulfill some demands, separately and as a whole, in order to
make the resulting cipher cryptographically strong. Next we will show where the faults can lie
that make a cipher vulnerable to attacks.
Exhaustive key search is the most straightforward attack that can be launched on a blockcipher.
Although it can easily be discouraged by increasing the keylength, the existing DES algorithm
for instance is no longer safe against this type of brute forcing. If one needs to run a cipher on
a piece of hardware that is designed for DES, increasing the keylength is not an option. One solution
that was found is 3DES, where the plaintext is encrypted three times using normal DES. This
effectively increases the keylength, and the properties of DES make 3DES a good form of encryption,
without pitfalls. However, it is also quite expensive in terms of computational workload.
A good alternative to protect DES against exhaustive key search has been advocated by Kilian
and Rogaway [KR96] in the form of DESX.
The principle behind this approach is defined by the following equation:
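The defining equation is the standard DESX key-whitening construction, DESX_{k,k1,k2}(M) = k2 ⊕ DES_k(M ⊕ k1). A sketch, with a toy invertible permutation standing in for the DES box, since no DES implementation is assumed available:

```python
M64 = (1 << 64) - 1

def E(m, k):
    # invertible toy permutation standing in for the DES encryption box
    m ^= k
    return ((m << 13) | (m >> 51)) & M64

def D(c, k):
    # inverse of E: rotate back, then strip the key
    c = ((c >> 13) | (c << 51)) & M64
    return c ^ k

def desx_encrypt(m, k, k1, k2):
    # DESX: pre-whiten with k1, encrypt under k, post-whiten with k2
    return k2 ^ E(m ^ k1, k)

def desx_decrypt(c, k, k1, k2):
    return k1 ^ D(c ^ k2, k)

k, k1, k2 = 0x0123456789ABCDEF, 0xF0F0F0F0F0F0F0F0, 0x0F0F0F0F0F0F0F0F
m = 0xDEADBEEFCAFEF00D
assert desx_decrypt(desx_encrypt(m, k, k1, k2), k, k1, k2) == m
```

The two whitening XORs are essentially free, which is why the construction runs on unmodified DES hardware.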
Note that the overhead this adds to the original DES algorithm is minimal, and that DESX can
easily be implemented on existing DES hardware. Kilian and Rogaway suggest that this cipher
is harder to break by a 'generic key-search adversary', who attempts to decide whether a ciphertext was
output by the cipher under study; this is a generalized way of looking at key search. They
state that as this decision becomes harder, so does determining the specific key used for
encryption. Their result states that the effective key length of DESX would be k + n − 1 − log m,
where k is the keylength, n is the blocksize and m is the number of plaintext/ciphertext pairs
the adversary possesses.
However, DES is still vulnerable to linear attacks, and no mention is made as to whether
DESX makes any kind of improvement against linear attacks, as opposed to exhaustive
key search.
Linear attacks are known-plaintext attacks that look for special qualities in the behavior of the
cipher. The principle underlying linear attacks is to find a linear approximation that relates
subsets of plaintext, ciphertext and key bits in the following way:
P_{i1} ⊕ P_{i2} ⊕ · · · ⊕ P_{ia} ⊕ C_{j1} ⊕ C_{j2} ⊕ · · · ⊕ C_{jb} = K_{k1} ⊕ K_{k2} ⊕ · · · ⊕ K_{kc} (7.2)
where i1, i2, . . . , ia, etc. denote fixed bit positions in plaintext P, ciphertext C and key K.
Let P_l be the probability that equation 7.2 holds. The effectiveness of the linear approximation
is related to the magnitude of |P_l − 1/2|. If |P_l − 1/2| is large enough and sufficient plaintext-
ciphertext pairs are available, one can determine one key bit, equal to the XOR of the
key bits on the right-hand side of equation 7.2, as the value that most often satisfies this equation.
Such a linear approximation is made by combining a number of S-box linear approximations
in different rounds, so that all terms that do not involve plaintext, ciphertext or key bits
cancel out.
The number of plaintexts needed in a linear attack is inversely related to |P_l − 1/2|, so an increase
in the needed number of plaintexts can be attained by selecting S-boxes such that |P_l − 1/2| is
minimized.
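The search for biased approximations can be made concrete on a toy 4-bit S-box (the table below is an arbitrary permutation invented for illustration): for every pair of input and output masks, count how often the masked parities agree and take the deviation from 1/2:

```python
# a sample 4-bit S-box (an arbitrary permutation chosen for illustration)
SBOX = [6, 4, 12, 5, 0, 7, 2, 14, 1, 15, 3, 13, 8, 10, 9, 11]

def parity(x):
    return bin(x).count("1") & 1

def bias(in_mask, out_mask):
    # fraction of inputs where <in_mask, x> equals <out_mask, S(x)>, minus 1/2
    hits = sum(parity(x & in_mask) == parity(SBOX[x] & out_mask)
               for x in range(16))
    return hits / 16 - 0.5

# scan all non-trivial mask pairs for the most biased linear approximation
best = max((abs(bias(a, b)), a, b) for a in range(1, 16) for b in range(1, 16))
print(best)
```

An S-box designer runs exactly this scan in reverse: candidate tables whose largest |P_l − 1/2| is too big are rejected.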
The probability of finding a linear approximation is the product of the probabilities of finding
all the required S-box linear approximations. This probability can be decreased by either making
sure more S-box linear approximations are needed (by using more rounds in the algorithm) or by
increasing the non-linearity of the S-boxes, thereby decreasing the odds of finding the needed
S-box linear approximations. The latter approach is preferred: adding more rounds makes
the algorithm slower in both encryption and decryption while merely patching the problem,
whereas improving the design of the S-boxes solves the problem at the root (and leaves the addition
of further rounds as an extra option). In section 7.4, the concept of non-linearity will be further
explained and it will become apparent how to ensure the S-boxes are sufficiently non-linear. Also,
section 7.5 will provide a comparison of |P l − 1/2| between CAST and DES ciphers.
Differential attacks are chosen-plaintext attacks that exploit the inability of a cipher to map
input to output differences in a statistically uniform way. This fact leaks key bit information
into the ciphertext. An example of a differential attack (in this case against DES) is given by
Tel [Tel02]. Section 7.4 will show how a property called diffusion can prevent this type of attack
from being successful.
Related key attacks make use of 'related keys': round keys that are derived in a deterministic way,
based purely on the primary key and previous round keys, are 'related'. If one knows one round key,
one can use this information to find out information about other round keys, or even the
primary key.
We have seen how simple exhaustive key search can be made more difficult, but now we will
define the properties that provide a defense against the more refined attacks.
As we have seen, the attacks described in section 7.3 make clever use of intrinsic properties
(or faults) in the S-boxes or key schedule of the cipher. If we want to create a cipher that does
not possess these properties, we must have a definition of what the desired properties are. The
concepts of confusion and diffusion, as introduced by Shannon [Sha49], provide us with such
a definition, as does non-linearity. Other terms one comes across in literature are avalanche,
Strict Avalanche Criterion (SAC) and Bit Independence Criterion (BIC). We will discuss the
first three of these properties, which are related to the inner workings of the S-boxes and the
Round Function that employs them.
7.4.2 Confusion
Confusion measures the degree to which a cipher masks input bits in its output, via the
probability that inverting an input bit inverts a given output bit. In the ideal case,
this probability is 1/2 for any input/output bit pair. Another way of looking at it is
that the cipher is not biased towards ones or zeros in its mapping of input to output bits.
7.4.3 Diffusion
Diffusion is the property that a given plaintext bit can affect output bits; the higher
the diffusion, the more output bits have a chance to be affected by a certain input bit. Katos
[Kat05] presents a way of measuring diffusion by means of diffusion instances, which are matrices
that can be statistically analyzed, to reveal the degree of diffusion of the corresponding cipher.
The statistical tests include, among others, equality of the number of ones and zeros in the
matrices, as well as the randomness of their distribution.
7.4.4 Non-linearity
where s_i is the m-bit boolean function computing the ith output bit of S-box S. This means that the
non-linearity of an S-box is the minimum non-linearity over all non-zero linear combinations
of the m-bit boolean functions it employs.
Lee, Heys and Tavares [LHT97] support the claim that a non-linearity of 2^{m−2} is sufficient
for m × n S-boxes. The examples in the following sections will use 8 × 32 S-boxes, so this would
lead to a minimum required non-linearity of 64.
Lee, Heys and Tavares proceed to prove that if we require a non-linearity of at least 2^{m−2}, we
have a good chance of randomly selecting an acceptable S-box. They show that for 8 × 32 S-boxes
the chance of selecting a faulty S-box (one with non-linearity less than 64) is smaller than 2^{−11}.
By testing candidate S-boxes for non-linearity, certainty can be given that the selected S-boxes
really meet the criteria. Experiments have apparently not found any randomly generated S-boxes
with a non-linearity smaller than 72.
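Non-linearity can be computed directly with the Walsh-Hadamard transform, via the standard identity nl(f) = 2^{m−1} − max_a |W_f(a)| / 2; a small sketch:

```python
def walsh(f, m):
    # fast Walsh-Hadamard transform of the +-1 signal of f: {0,1}^m -> {0,1}
    n = 1 << m
    w = [1 - 2 * f(x) for x in range(n)]
    step = 1
    while step < n:
        for i in range(0, n, 2 * step):
            for j in range(i, i + step):
                w[j], w[j + step] = w[j] + w[j + step], w[j] - w[j + step]
        step *= 2
    return w

def nonlinearity(f, m):
    # distance to the nearest affine function: 2^(m-1) - max|W_f| / 2
    return (1 << (m - 1)) - max(abs(v) for v in walsh(f, m)) // 2

# a linear function has non-linearity 0 ...
assert nonlinearity(lambda x: bin(x & 0b0101).count("1") & 1, 4) == 0
# ... while x0 AND x1 (bent for m = 2) attains the maximum 2^(m-1) - 2^(m/2-1)
assert nonlinearity(lambda x: (x & 1) & (x >> 1), 2) == 1
```

Testing an 8 × 32 candidate S-box means running this check on every non-zero linear combination of its 32 output functions and taking the minimum.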
The CAST system, as explained by Adams [Ada97], shows a way to find S-boxes that possess
the required amount of non-linearity, in order to make differential cryptanalysis as hard as
exhaustive key search. The choice of S-boxes in CAST focuses on their being highly non-linear.
This is done by choosing linearly independent binary bent vectors that together make up the matrix
of an S-box. The Hamming weights and Hamming distances in this matrix are a measure for the
success of the chosen set of vectors. If they are highly non-linear and close to SAC-fulfilling, they
also guarantee the Output Bit Independence Criterion. Adams states that a good set can be found
in a few weeks of running time. When we compare the performance of S-boxes of this type to
those used by DES, we arrive at the following table:
The author would like to remind the reader that the procedure described in section 7.4.4 also
yields S-boxes with high degrees of non-linearity, and does so much faster. Random selection
and testing for non-linearity can be done fast, and with the chance of encountering a bad S-box
on the order of 2^{−11}, not much time is likely to be wasted. It is not clear to what extent these will
be SAC-fulfilling; if this does not follow from the high degree of non-linearity, the proof given
by Adams cannot be applied to randomly selected S-boxes. However, the work of Lee, Heys and
Tavares is also aimed at use in CAST-like ciphers, and was even published in the same journal as
Adams' work, suggesting good compatibility.
CAST uses the same type of framework as DES, namely the Substitution-Permutation Network
(SPN), as suggested by Shannon. CAST uses the Feistel structure to implement this scheme,
which is well studied and does not seem to possess inherent structural weaknesses, as the "tree
structure" implementation does seem to have. CAST leaves the parameters for blocksize and
number of rounds open to the preference of the developer of individual implementations. It must
be noted that any SPN system has its security increased by increasing the number of rounds
used. Adams also notes that experimentation has taught us that simply choosing S-boxes
that are resistant to differential attacks in isolation does not make for a resistant cipher, if the
framework is not chosen in such a way that it makes specific use of the properties of the S-boxes.
For instance, when putting S-boxes with relatively flat output XOR distributions (good confusion)
into DES, the resulting cipher becomes more vulnerable to differential cryptanalysis than
the original DES algorithm. To ensure equal treatment of both data halves, an even number of
rounds is needed. Regarding the number of rounds, a comparison with DES might be in order:
looking at both differential and linear attacks, DES would now seem to need 18-20 rounds in
order to be as strong as its keysize. Section 7.5.4 will show the advantages of CAST over, for
instance, DES when it comes to the number of rounds needed.
Although CAST uses a key schedule typical for Feistel networks (generating round keys from
a primary key), it pays attention to additional constraints that make for good resistance
to key clustering attacks. Without going into details, these are known as the key/ciphertext Strict
Avalanche Criterion and Bit Independence Criterion (BIC). It is necessary to use different subsets
of primary key bits to generate round keys i and i + 1. For attaining good avalanche it is also
wise to use all partial keys to generate the round key for round N/2 and to reuse them in the
remaining rounds. The novelty that CAST adds is that round keys are not generated
using a complex function, such as DES's shift registers. CAST makes use of a simple function
and a dependency on a set of key schedule S-boxes. This makes recovering round key bits by
means of linear attacks less useful, because there is no known clear way of using round key bits
to recover primary key bits, as can be done with other ciphers like DES. Adams gives the
following example for an 8 round cipher:
Let the primary key be KEY = k1 k2 k3 k4 k5 k6 k7 k8, where ki represents the i-th byte.
Partial keys Ki′ are selected as follows:
K1′ = k1 k2
K2′ = k3 k4
K3′ = k5 k6
K4′ = k7 k8
After round 4, KEY is transformed into KEY′ = k1′ k2′ k3′ k4′ k5′ k6′ k7′ k8′
by means of the following transformation step:
k1′ k2′ k3′ k4′ = k1 k2 k3 k4 ⊕ S1[k5] ⊕ S2[k7]
k5′ k6′ k7′ k8′ = k5 k6 k7 k8 ⊕ S1[k2′] ⊕ S2[k4′]
After this transformation, the remaining partial keys are generated:
K5′ = k4′ k3′
K6′ = k2′ k1′
K7′ = k8′ k7′
K8′ = k6′ k5′
One can verify that the partial keys are influenced by all primary key bits from round 4 and
up. Having constructed the partial keys, they are used as input to the key schedule S-boxes, in
order to create the round keys: Ki = S1[Ki.1′] ⊕ S2[Ki.2′], where Ki.j′ denotes the j-th byte of key Ki′.
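As a concrete sketch, Adams' 8-round example key schedule can be written out as follows. The two key-schedule S-boxes here are arbitrary placeholder tables (CAST does not fix them), so only the structure, not the values, matches the text:

```python
# Placeholder 8x32 key-schedule S-boxes (illustrative, NOT real CAST tables).
S1 = [((i * 2654435761) ^ 0x9E3779B9) & 0xFFFFFFFF for i in range(256)]
S2 = [((i * 40503) ^ 0x7F4A7C15) & 0xFFFFFFFF for i in range(256)]

def bytes_to_word(bs):          # 4 bytes -> 32-bit word
    return int.from_bytes(bytes(bs), "big")

def word_to_bytes(w):           # 32-bit word -> list of 4 bytes
    return list(w.to_bytes(4, "big"))

def partial_keys(key8):
    """key8: bytes k1..k8. Returns the eight 2-byte partial keys K1'..K8'."""
    k = list(key8)                                   # k[0] = k1, ..., k[7] = k8
    first = [k[0:2], k[2:4], k[4:6], k[6:8]]         # K1' = k1k2, ..., K4' = k7k8
    # transformation step after round 4:
    # k1'..k4' = k1..k4 XOR S1[k5] XOR S2[k7]
    kp = word_to_bytes(bytes_to_word(k[0:4]) ^ S1[k[4]] ^ S2[k[6]])
    # k5'..k8' = k5..k8 XOR S1[k2'] XOR S2[k4']
    kq = word_to_bytes(bytes_to_word(k[4:8]) ^ S1[kp[1]] ^ S2[kp[3]])
    new = kp + kq                                    # k1'..k8'
    rest = [[new[3], new[2]],                        # K5' = k4'k3'
            [new[1], new[0]],                        # K6' = k2'k1'
            [new[7], new[6]],                        # K7' = k8'k7'
            [new[5], new[4]]]                        # K8' = k6'k5'
    return first + rest

def round_keys(key8):
    # Ki = S1[first byte of Ki'] XOR S2[second byte of Ki']
    return [S1[a] ^ S2[b] for a, b in partial_keys(key8)]
```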
Key schedules for systems with a different number of rounds or a different blocksize can
be constructed in similar ways. Adams proves that this way of generating keys rules out the
existence of weak and semi-weak keys, because no key in this scheme can have an inverse key.
One step in his proof depends on the specific properties of the S-boxes used. Therefore, the proof
is not general and cannot be blindly applied to specific implementations, because the choice of
S-boxes is not fixed in CAST.
One data half of length n is combined with the round key by a function, usually XOR addition.
By splitting the result into pieces of size m and feeding these into separate S-boxes, thereafter combining them through another binary operation (again, usually ⊕), the input and output length of the round function will be equal, even though the m × n S-boxes themselves result in data expansion. This makes iterating the function over multiple rounds easy to implement. With the components specified so loosely, it is hard to formally prove confusion, diffusion and avalanche properties of the
round function. Adams acknowledges this, yet makes the claim that CAST ciphers seem to
possess good statistical properties after only 2-3 rounds, where DES needs 5-6 rounds to attain
similar properties. This supports the claims that diffusion, confusion and avalanche are better,
resulting in a better security / number of rounds ratio. What can be proven is the fact that the
round function, when chosen to specifications, exhibits both highest-order SAC and highest-order
BIC.
Adams states that the use of a simple XOR operation in the round function is already good, but that some improvements are possible. The most important demands on the operation of choice 'a' are that it maps input to output in a non-linear fashion and that it must hide information about the keys used, using an operation on a single input or pair of inputs. These demands defend the round function against differential attacks. The demand that the subset-sum operation must not be distributive over a(x, y) provides protection against linear attacks. A suitable candidate is suggested by Adams: if the round function uses
⊕ for all operations a, b, c and d, it can be seen as:
F(R, K) = S1(B(1)) ⊕ · · · ⊕ S4(B(4)), where B = R ⊕ K and B(j) is the j-th byte of B.
The alternative suggestion is similar, yet computes B as:
B = ((R ⊕ K1) ∗ K2) % n
This should require only a factor 2 more computation compared to the version that utilizes
⊕ for all its operations. Since it apparently provides intrinsic immunity against linear and differential attacks, the alternative method appears to be the right choice. However, there is evidence that any CAST blockcipher with a blocksize and keysize of 64 bits, four 8 × 32 S-boxes and 32-bit round keys would, when 12 rounds or more are used, require more chosen/known plaintexts for successful differential and linear attacks than exist for a 64-bit blocksize. If this turns out to be true, the choice offered above loses most of its importance.
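A minimal sketch of the all-XOR round function described above, with four randomly filled 8×32 S-boxes standing in for real CAST tables:

```python
import random

rng = random.Random(0)
# Four placeholder 8x32 S-boxes (random stand-ins, NOT the real CAST tables).
SBOXES = [[rng.getrandbits(32) for _ in range(256)] for _ in range(4)]

def F(R, K):
    """CAST-style round function with XOR used for all operations a..d:
    B = R ^ K, output = S1[B(1)] ^ S2[B(2)] ^ S3[B(3)] ^ S4[B(4)].
    Adams' alternative would instead compute B = ((R ^ K1) * K2) % n,
    at roughly twice the computational cost."""
    B = (R ^ K) & 0xFFFFFFFF
    out = 0
    for j, S in enumerate(SBOXES):
        byte = (B >> (24 - 8 * j)) & 0xFF   # B(1) is the most significant byte
        out ^= S[byte]
    return out
```

Note that the output depends on R and K only through B = R ⊕ K, which is exactly why the stronger alternative computation of B is of interest.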
DES modes
Written by Jules van Kempen
Block ciphers like DES, 3DES, AES, FEAL-8 and many more are well known. They encipher blocks of data of a particular size (DES: 64 bits, AES: 128 bits, FEAL-8: 64 bits) into blocks of equal size. However, one normally wants to encipher a text that is a lot larger than this block size, so more needs to be done to successfully encipher the data. This is where modes of operation come into play. They define a method to map large plaintext files into ciphertext by using the block cipher. Typically these modes achieve this by chopping the plaintext into blocks of the desired size and then applying the block cipher on each part. However, just applying the block cipher directly on these blocks gives serious security problems. Therefore, modes use interaction with the previous ciphertext and/or plaintext to garble the encryption.
In this chapter we shall mainly focus on modes of operation for DES, but these modes can actually be applied to any block cipher. First we will discuss the 4 standard modes of operation as proposed at the introduction of DES; these modes are still used often, and not only in combination with DES. Each of them has its own advantages and disadvantages. We shall then also discuss the so-called counter mode (standard in AES) and hybrid forms between these modes. Finally, we will end by showing how DES modes can be applied to 3DES, since DES of course is not secure anymore.
8.1.1 ECB
In Electronic Code Book mode every block is enciphered independently:
cipher(n) = DES(plain(n))
Equal plaintext blocks thus yield equal cipher blocks, and an attacker can insert or replace individual cipher blocks without garbling the rest of the message.
8.1.2 CBC
Cipher Block Chaining is one of the most used and popular modes. It solves the problem described above with the insertion of blocks by creating interaction with the previous ciphertext during encryption. More formally, before the plaintext is encrypted by DES it is first XOR-ed with the previous ciphertext. As a result an attacker cannot modify the ciphertext without garbling the decryption of the next plaintext, and therefore these attacks will be discovered. For the encryption of the first block an Initialization Vector (IV) is needed to act as the previous ciphertext. This vector is a random 64-bit block which is sent in the clear to the receiver. In formula this mode is described in the following way:
cipher(n) = DES(cipher(n − 1) ⊕ plain(n))
Using CBC, transmission errors will only corrupt 2 blocks, namely the block containing the error and the next block. This can easily be seen by looking at the formula of this mode: cipher(n) = DES(cipher(n − 1) ⊕ plain(n)). Here we see that the n-th plaintext is completely determined by the n-th and (n − 1)-th cipher blocks, and therefore only transmission errors in these blocks will cause corruption.
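The CBC chaining and its two-block error propagation can be sketched as follows; `toy_des` is a trivial stand-in permutation for DES (not secure, illustration only):

```python
BLOCK = 8  # DES block size in bytes

def toy_des(block, key):
    """Stand-in for DES: XOR with the key, then rotate left one byte.
    Any invertible 8-byte permutation would do for this illustration."""
    x = bytes(a ^ b for a, b in zip(block, key))
    return x[1:] + x[:1]

def toy_des_inv(block, key):
    """Inverse of toy_des: rotate right one byte, then XOR with the key."""
    x = block[-1:] + block[:-1]
    return bytes(a ^ b for a, b in zip(x, key))

def cbc_encrypt(plain, key, iv):
    """cipher(n) = DES(cipher(n-1) XOR plain(n)), with the IV as cipher(0)."""
    assert len(plain) % BLOCK == 0
    prev, out = iv, []
    for i in range(0, len(plain), BLOCK):
        block = bytes(a ^ b for a, b in zip(plain[i:i + BLOCK], prev))
        prev = toy_des(block, key)
        out.append(prev)
    return b"".join(out)

def cbc_decrypt(cipher, key, iv):
    """plain(n) = DES^-1(cipher(n)) XOR cipher(n-1)."""
    prev, out = iv, []
    for i in range(0, len(cipher), BLOCK):
        c = cipher[i:i + BLOCK]
        out.append(bytes(a ^ b for a, b in zip(toy_des_inv(c, key), prev)))
        prev = c
    return b"".join(out)
```

Flipping a bit in one cipher block corrupts exactly that plaintext block and the next one on decryption, matching the two-block propagation described above.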
8.1.3 OFB
Theorem 8.1 In OFB, if for a block the plaintext is p and the ciphertext is c, then for this block,
the plaintext p′ corresponds to the ciphertext c′ = c ⊕ (p ⊕ p′ ).
Proof. Clearly p = c ⊕ X for some fixed value X (the result of DES^n(IV)). If one feeds the decrypter c′ instead of c in this block, it will calculate the corresponding plaintext as c′ ⊕ X, which equals c ⊕ (p ⊕ p′) ⊕ X. By commutativity and associativity of XOR we rewrite this to (c ⊕ X) ⊕ p ⊕ p′ = p ⊕ p ⊕ p′ = p′. △
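Theorem 8.1 is easy to demonstrate: since each OFB block is just p ⊕ X for a fixed keystream block X, an attacker who knows the plaintext of a block can substitute any plaintext of his choosing. A sketch, where a random X stands in for the keystream block DES^n(IV):

```python
import os

X = os.urandom(8)                  # stand-in keystream block (DES^n(IV) in the text)
p = b"PAY BOB "                    # known plaintext of this block
c = bytes(a ^ b for a, b in zip(p, X))          # honest ciphertext c = p XOR X

p_target = b"PAY EVE "             # what the attacker wants decrypted instead
c_forged = bytes(a ^ b ^ d for a, b, d in zip(c, p, p_target))  # c' = c XOR (p XOR p')

decrypted = bytes(a ^ b for a, b in zip(c_forged, X))  # what the receiver computes
assert decrypted == p_target
```

The attacker never needs to know X; knowing p for that block is enough.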
Furthermore, changing the ciphertext of a block doesn’t influence the decryption of other
blocks, since there is no interaction between ciphertexts and plaintexts, just a keystream is used.
Another disadvantage of the OFB method is that there is a chance, albeit very small, that
one might accidentally choose a key and initialization vector that will cause the keystream to
repeat in a short loop.
8.1.4 CFB
Counter mode is actually a mode associated with AES; it is the proposed standard mode for it. But it is of course also applicable to other block ciphers, such as DES. Counter mode is quite similar to OFB: the DES block cipher is used to generate a keystream based on an initial value, and this keystream is then XOR-ed with the plaintext. However, in OFB the block was repeatedly encrypted with DES to generate the stream, whereas in this mode a counter function is used to generate a sequence of blocks and each of these blocks is then encrypted by DES just once. This makes sure that there is no risk of the keystream falling into a short loop, as there was in OFB. In formula:
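The construction just described can be sketched as follows, with `toy_des` a trivial stand-in for DES and counter(n) simply taken to be nonce + n (an assumption of this sketch):

```python
BLOCK = 8  # DES block size in bytes

def toy_des(block, key):
    """Stand-in permutation for DES (illustration only)."""
    x = bytes(a ^ b for a, b in zip(block, key))
    return x[1:] + x[:1]

def ctr_encrypt(plain, key, nonce):
    """cipher(n) = plain(n) XOR DES(counter(n)), with counter(n) = nonce + n.
    Each counter block is encrypted exactly once, so distinct counters give
    distinct keystream blocks and no short loop can occur."""
    out = []
    for i in range(0, len(plain), BLOCK):
        ctr_block = (nonce + i // BLOCK).to_bytes(BLOCK, "big")
        ks = toy_des(ctr_block, key)
        out.append(bytes(a ^ b for a, b in zip(plain[i:i + BLOCK], ks)))
    return b"".join(out)
```

Decryption is identical: XORing with the same keystream again recovers the plaintext.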
is necessary to increase the security against such attacks. But it would be better to make these
attacks impossible by proper construction of the mode of operation.
Another potential danger in counter mode is the possible reuse of a counter with the same key. This will result in two identical keystream segments, a so-called two-time pad situation. This is very dangerous, since a two-time pad reveals information about the plaintexts, as was shown in the NSA's VENONA project [Cro]. Therefore, it is necessary to guarantee counter uniqueness. As a result, the initialization vector used is not random as it was in the previous modes. A "random" IV is often not completely random, because system random generators tend to use pseudo-random methods, which could lead to the discussed two-time pad situation. So, to avoid this potential failure, counter mode does not use a random IV.
One might then wonder what the advantage of counter mode is. The advantage is its provable security, the so-called concrete security, in contrast with that of the 4 basic modes. Bellare et al. [BDJR00] have proven tighter bounds on the security of counter mode than on the security of the 4 basic modes. They do so by assuming that the block cipher used is indistinguishable from a random permutation, the so-called random oracle model. Then they look at the advantage an attacker can still achieve by examining results of the encipherment. They define this advantage as the capability of an attacker to distinguish the cipher from a random permutation. This is a standard definition of confidentiality in theoretical cryptography; it is used because an attacker cannot tell anything from a ciphertext that looks like random data to him. Consequently, an attacker needs much more encrypted data in counter mode before he could potentially gain the same advantage as in the basic modes.
8.2.2 CTR-OFB
To get the best of both worlds, hybrid modes of operation have been developed; they have the same concrete security as counter mode in the model of Bellare et al., and they do not have the problem of possibly creating a short cycle. The first of the two hybrid modes we shall discuss is CTR-OFB, for Counter - Output Feedback. During encryption the plaintext is XOR-ed with a keystream as in OFB, but in contrast with OFB, where a new keystream block was created by applying DES directly on the previous keystream block, here we let one part of the cipher input consist of a counter and the other part consist of part of the previous keystream block. This way, accidentally choosing the same counter does not have to lead to a two-time pad situation, since part of the block consists of a random value, which will create different keystreams. In formula:
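A sketch of the keystream generation just described; the half/half division of the cipher input between counter and previous keystream block is an assumption of this sketch, as is the toy stand-in for DES:

```python
BLOCK = 8       # DES block size in bytes
HALF = BLOCK // 2

def toy_des(block, key):
    """Stand-in permutation for DES (illustration only)."""
    x = bytes(a ^ b for a, b in zip(block, key))
    return x[1:] + x[:1]

def ctr_ofb_keystream(key, iv, nblocks):
    """Each cipher input is counter(n) (first half) || part of the previous
    keystream block (second half). Because the counter halves are all
    distinct, the inputs (and hence the keystream blocks) never repeat."""
    ks_blocks, prev = [], iv
    for n in range(nblocks):
        feed = n.to_bytes(HALF, "big") + prev[:HALF]
        prev = toy_des(feed, key)
        ks_blocks.append(prev)
    return ks_blocks
```

As in plain counter mode, encryption is then just an XOR of the plaintext with this keystream.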
8.2.3 CTR-CFB
The second hybrid mode that we will discuss is CTR-CFB, for Counter - Cipher Feedback mode. It is very similar to the CTR-OFB mode, but instead uses interaction with the previous ciphertext while encrypting a block. The difference between CTR-OFB and CTR-CFB is exactly analogous to the difference between OFB and CFB: here the DES block cipher is executed on the previous cipher to generate part of the new block. In formula:
that by applying multiple modes to 3DES the effective keysize could be increased. In 3DES the effective key is only 112 bits (out of 168 bits), as a result of the meet-in-the-middle attack. Therefore, it is justified to look further into this possibility. Applying multiple modes to 3DES is mostly done by defining three modes: the first mode is built around the first DES encryption, the second mode around the second DES encryption, and the third mode around the third DES encryption. As a result, it is believed by some that effective keysizes of up to 168 bits can be achieved, a significant increase! However, to date there is no proof of this 168-bit effective keysize being achieved; even worse, many of the multiple modes of operation for 3DES have been proven to weaken the key to the level of single DES, which means that they are not safe. We will give an example of such a multiple-modes system for 3DES and show how it can be broken under a Chosen Ciphertext Attack (CCA). The question which we would like to see answered in the end is how one could best apply modes to 3DES.
In this section we will give an example of how applying multiple modes of operation to 3DES can result in the key being broken. We will do this for the example where the three modes used are ECB, CBC and CBC, in that order. We will perform a Chosen Ciphertext Attack on the first key with the use of differential cryptanalysis. First, we explain this mode. A block of plaintext p(n) is first enciphered directly by DES, giving the intermediate ciphertext ci1(n). This ciphertext is then used as input for the CBC mode, so it is first XOR-ed with the previous ciphertext ci2(n − 1), after which the second DES cipher is applied to it. This gives the second intermediate ciphertext ci2(n); then CBC is applied to this cipher as well. This results in the final ciphertext c(n).
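The ECB|CBC|CBC pipeline just described can be sketched as follows; `toy_des` is a trivial stand-in permutation for DES under each of the three keys, so only the structure of the three layers is the point:

```python
BLOCK = 8  # DES block size in bytes

def toy_des(block, key):
    """Stand-in permutation for DES (illustration only)."""
    x = bytes(a ^ b for a, b in zip(block, key))
    return x[1:] + x[:1]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def ecb_cbc_cbc_encrypt(blocks, k1, k2, k3, iv2, iv3):
    """ci1(n) = DES_k1(p(n))                    (ECB layer)
       ci2(n) = DES_k2(ci1(n) ^ ci2(n-1))       (first CBC layer)
       c(n)   = DES_k3(ci2(n) ^ c(n-1))         (second CBC layer)"""
    prev2, prev3, out = iv2, iv3, []
    for p in blocks:
        ci1 = toy_des(p, k1)
        prev2 = toy_des(xor(ci1, prev2), k2)
        prev3 = toy_des(xor(prev2, prev3), k3)
        out.append(prev3)
    return out
```

The attack below works in the decryption direction of exactly this structure, tracing how a chosen difference in the first cipher block propagates.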
Under Chosen Ciphertext Attack the attacker will be able to choose the cipherblocks and
can get the plaintext corresponding to this ciphertext. He thus is able to choose tuples of blocks
(c0 , c1 , c2 ) and get the corresponding plaintexts (p0 , p1 , p2 ). He will use this ability to form pairs of
Figure 8.1: Flow of the difference between the 2 decryptions through the
system if (c0 , c1 , c2 ) and (c0 ⊕ ΩT , c1 , c2 ) are entered
these tuples (c0 , c1 , c2 ) and (c0 ⊕ ΩT , c1 , c2 ) and receive the corresponding plaintexts (p00 , p01 , p02 )
and (p10 , p11 , p12 ). One should be aware that the chosen ciphertexts thus only differ in their first
block, and the difference between these first blocks is ΩT . As we now look at how the data will
flow through the system in the decryption we see that the first intermediate value of block 3 will
differ by the ΩT we had chosen in the beginning. Also, we know the plaintexts corresponding to
block 3. With this information a differential cryptanalysis can be performed on the first key.
However, some details need to be taken care of in order to apply the standard differential
cryptanalysis to the full 16-round version of DES, since in this analysis it is assumed that both
the plaintexts and ciphertexts are known, whereas we only know the plaintexts and the difference
of the ciphertexts (we have no idea what the actual ciphertexts at the end of round 1 look like,
we only know their difference). With some additional calculation, however, Eli Biham [Bih98] claims that this can be done in complexity lower than that of single DES, and that it is practically achievable. So we can break the first key. Next we need to break another key, which can also be done in the complexity of single DES; we omit the details, since the most difficult and interesting task is finding the first key, as one can probably imagine. So, concluding: instead of giving us a larger effective key, this mode gives us a far smaller key than 3DES; even worse, the mode can be broken in practice!
In his paper [Bih98] Eli Biham further shows how the 3DES modes CBC—ECB—CBC,
CBC—CBC—ECB and CBC—CBC—CBC can be broken under Chosen Ciphertext Attack in
complexity of 2^58, which is far too close for comfort. Therefore these modes are not safe and
should not be used.
Summary and Conclusions
As we have shown, some multiple modes of operation for 3DES are not safe, and the other modes of operation are not known to be strong or weak (they have an effective keysize between 56 and 168 bits). In contrast, using 3DES with one mode of operation is known to give an effective keysize of 112 bits, which is secure. Therefore, there seems to be no reason at this moment to risk weakening the key by applying multiple modes of operation, since this might result in an unsafe encryption. As for the best mode to use with 3DES, we would say that it is the same as for single DES, since they are similar systems.
Steganography
Written by Teun Slijkerman
In the year 2000 the USA Today reported that terrorist groups were using big public websites
such as eBay to give orders, distribute plans etc. The idea was that the information was hidden
in images posted on these websites.
In ancient times, messages were written on the back of wax writing tablets or tattooed on the heads of slaves.
The preceding examples are two different forms of steganography. This chapter aims to give
an introduction to the field of steganography.
9.1 Introduction
The goal of steganography1 is to send a message to someone (over a public communication channel) without anyone else knowing that a message was sent. In contrast with cryptography, where the goal is to hide the content of a message from an outsider, steganography tries to hide the very existence of such a message.
One of the standard problems in steganography is the prisoners' problem. Suppose there are two prisoners, Alice and Bob, who want to plan an escape. For this escape to be successful they have to communicate. The only problem is that they can only communicate via the warden Ward, who will read all messages. If the warden gets suspicious, both Alice and Bob are placed in solitary confinement. So how can Alice and Bob send each other messages coordinating the escape without raising the suspicion of Ward? They can't use any cryptography, since Ward will see a message which he can't read and immediately know something is going on. The solution is to hide the actual messages within seemingly innocent messages.
This chapter is divided in two parts: sections 9.2 and 9.3 will explain some basic steganography methods, and section 9.4 will illustrate an attack on the two steganographic systems described in section 9.3.
1 στεγανός γράφειν: covered writing
9.2 Null Cyphers
These two messages were sent during the First World War by the German embassy in the United States. If you take the first letter of every word in the first message, or the second letter of every word in the second message, you can read the actual message, which is:
When you need to send an image to someone, it is possible to create a PowerPoint presentation and hide the image behind another image. It is also possible to hide bits in the source code of a webpage by encoding a 0 as a space and a 1 as a tab at the end of a line.
All of these methods are extremely simple, but can be very effective as long as nobody suspects anything. As long as you send enough messages/PowerPoint files to someone, one message with extra content will not raise any suspicion.
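The letter-picking trick behind such null ciphers can be sketched in a few lines (the function name is ours):

```python
def null_cipher_extract(cover_text, letter_index=0):
    """Read the hidden message by taking the letter at position `letter_index`
    of every word of the innocent-looking cover text (0 for the embassy's
    first message, 1 for the second)."""
    return "".join(w[letter_index] for w in cover_text.split()
                   if len(w) > letter_index)
```

For example, `null_cipher_extract("hello every lion plays")` yields `"help"`.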
9.3.1 Introduction
In this digital era almost all media are stored digitally. This has resulted in enormous amounts of images, audio files and videos being stored on the world wide web, and every day new files are added. This gives a gargantuan number of possible files to hide our message in. To be able to hide something in a file, it is imperative to know the structure of the file; otherwise you could end up with an image that won't work.
For the remainder of this chapter we will focus on the BMP and GIF file formats, for the simple reason that hiding something in these types of files is easy. In principle all file formats have potential for hiding a message in, but the algorithms can become very complicated. Take for instance the JPEG format: these files do not just describe the colors of every pixel; the image is stored as a sequence of parameters for discrete cosine transformations (DCTs).
9.3.2 S-Tools
In a BMP file every pixel is encoded by 8 (Greytone) or 24 (RGB color) bits. Let’s take a random
pixel with 24-bit encoding:
This pixel has a Red value of 10010110 (150), a Green value of 11010011 (211), and a Blue value of 00100101 (37). Suppose we want to hide a bit in the red value of this pixel while changing the value as little as possible. The obvious way to do it is to use the least significant bit (LSB) to encode our bit, as this will at most change the value to 10010111 (151).
S-Tools uses this technique to store up to three bits of data per pixel. The potential size of the secret message for a 1024 × 768 bitmap with 24-bit color encoding is 1024 × 768 × 3 = 2,359,296 bits (3 bits per pixel), which is 294,912 bytes.
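The LSB embedding and the capacity computation above can be sketched as follows (`embed_lsb` is a hypothetical helper name, not an S-Tools function):

```python
def embed_lsb(pixel_bytes, message_bits):
    """Overwrite the least significant bit of each colour byte with a message bit."""
    out = bytearray(pixel_bytes)
    for i, bit in enumerate(message_bits):
        out[i] = (out[i] & 0xFE) | bit   # clear the LSB, then set it to the bit
    return bytes(out)

# The example pixel from the text: R = 150, G = 211, B = 37.
pixel = bytes([0b10010110, 0b11010011, 0b00100101])
stego = embed_lsb(pixel, [1, 0, 1])      # hide one bit per colour byte

# Capacity of a 1024x768 24-bit bitmap at 3 bits per pixel.
capacity_bits = 1024 * 768 * 3
```

Each colour value changes by at most 1, which is invisible to the eye.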
To provide some extra security, S-Tools uses a semi-random generator to determine the order in which to hide the data. The only things needed by the receiver are the image with the message embedded and the password used by the semi-random generator. Figure 9.1 shows an example of blueprints of an airport hidden by S-Tools in an inconspicuous image.
This is one of the simplest ways to hide data in an image. When S-Tools has to hide data in a GIF image, things get a little more complicated. The algorithm used and the problems arising are described by Bailey and Curran [CB].
9.3.3 EzStego
When creating a GIF image, every color used is stored in a palette which can hold up to 256 different colors (Figure 9.2). Instead of storing the RGB information of every pixel, every pixel is described by a pointer into the palette. This method results in the use of 8 bits per pixel instead of the 24 needed in a BMP.
9.4 Steganalysis: Visual attack
The embedding function of EzStego uses a copy of the palette of the image, sorted by luminance. This results in a palette where the human eye can hardly distinguish between two adjacent colors. The embedding function works line by line, starting from the top left and ending at the bottom right of the picture.
For every pixel of the image the function matches the LSB of that pixel's index in the sorted palette with the bit to be embedded, and replaces the color by its neighbor in the sorted palette if necessary.
Figure 9.3 shows a simplified example of the embedding function of EzStego. Suppose we have a pixel with color 1. If we want to embed a 0, we change the color to color 4; if we want to embed a 1, we do nothing.
Extracting the hidden message from the image is pretty simple: all we have to do is take the LSB of the index in the sorted palette for every pixel.
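Embedding and extraction as just described can be sketched on a toy palette; the palette contents and the luminance weights are illustrative assumptions, not EzStego's actual values:

```python
# A toy 6-colour palette (illustrative only).
palette = [(0, 0, 0), (255, 255, 255), (200, 10, 10),
           (30, 30, 200), (120, 120, 120), (180, 180, 180)]

def luminance(rgb):
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b   # common luminance weights

# sorted_order[j] = original palette index of the j-th darkest colour
sorted_order = sorted(range(len(palette)), key=lambda i: luminance(palette[i]))
rank = {orig: j for j, orig in enumerate(sorted_order)}  # index in sorted palette

def embed(pixels, bits):
    """Force the LSB of each pixel's sorted-palette index to the message bit,
    moving to the adjacent colour in the sorted palette when necessary."""
    out = list(pixels)
    for i, bit in enumerate(bits):
        j = rank[out[i]]
        if j % 2 != bit:
            j ^= 1                   # swap with the paired neighbour
        out[i] = sorted_order[j]
    return out

def extract(pixels, nbits):
    """Read back the LSBs of the sorted-palette indices."""
    return [rank[p] % 2 for p in pixels[:nbits]]
```

Because adjacent colours in the luminance-sorted palette look nearly identical, the change is visually negligible.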
9.4.1 Introduction
The idea behind a visual attack on LSB steganography is that the LSBs do reflect the main features of the image to some extent. When data is hidden in a file, the first thing to do is
compress the data, because of the limited storage space. You may also want to encrypt the data if it is important to keep the contents hidden even if the steganography is broken.
Both of these processes (compressing and encrypting) result in an almost equal distribution of 0's and 1's, closely resembling random data. When embedding these bits into the carrier image, the main features represented in the LSBs get overwritten with 'random' data.
To execute a visual attack on an image possibly containing a hidden message, the first thing we have to do is create a filtered image. The filter should only keep the LSBs of the original image. Figure 9.4 shows the result of such a filter applied to a picture with no hidden data and to one with a message of maximum length embedded.
To execute a visual attack on EzStego the filter function has to be modified a bit. Figure 9.5 shows the way the filter function works. First the palette is sorted the same way EzStego does when embedding a message. Then the colors of the palette are changed to only black and white, in such a way that the LSB of the index of a color determines the color.
Figure 9.4 shows the result if EzStego has embedded a small message in an image. Because
EzStego stops embedding data after the message ends, there is a clear line indicating that there
is something embedded.
9.4.4 Conclusions
The visual attack seems to work quite nicely on the examples provided above. Unfortunately this is not always the case. Take figure 9.6 for example: can you determine whether there is something embedded in this picture by looking at the filtered image?
Summary and Conclusions
Figure 9.4: A BMP image (l), the same image with only the LSBs (m), the image with a maximum message embedded (r), and the image with a small message embedded by EzStego (bottom)
Niels Provos and Peter Honeyman from CITI [PH] have done some research on the use of
steganography on eBay. The method they used was a statistical attack. When used on the image
in figure 9.6 the result is that there is a message embedded in the image.
Inadequate communication between health workers is the cause of many medical mistakes. That was shown by research done by TNS NIPO, commissioned by NICTIZ (the Dutch National Health ICT Institute). The consequences of these mistakes can be very serious for the patient: lasting physical problems, emotional problems, and even inability to work. About 50,000 people can no longer work because of medical communication problems.
About 800,000 people over 18 have experienced medical communication problems: they received the wrong medicines, the wrong surgery or treatment, or could not be treated because of insufficient medical information. The research estimated the total financial cost at 1.4 billion euro per year.
An Electronic Health Record (EHR) can prevent many problems, and NICTIZ is trying to create such an EHR. Some hospitals are already experimenting with one. The goal is to have a national medication record by the end of 2005. The final goal is to record, and make available, all treatment-relevant medical history of all inhabitants of the Netherlands.
All current versions of electronic records of the care provided to patients are based on the traditional medical record, i.e. the paper version. The phrase "medical record" may be applied in differing ways depending on the actual healthcare practitioners providing the care, but most people understand the medical record to be a history of the care they received from various clinicians. An important characteristic to keep in mind is that each practitioner keeps its own medical record for each of its patients. There is no integration of the data from the various clinicians treating the patient. Therefore, as a patient, you most likely have many different medical records.
EHR: Electronic Health Record
The original objective for computerization of an individual’s health record can be found in the
Computerized Patient Record (CPR), which was defined as a computer-based record that includes
all clinical and administrative information about a patient’s care throughout his or her lifetime.
The documentation of any practitioner ever involved in a person’s healthcare would be included
in the CPR, extending from prenatal to postmortem information. One of the expectations for
the CPR included its role in decision support. Over a decade ago, the term CPR was used to distinguish the concept from the more traditional medical record by incorporating administrative and financial data often excluded from the patient medical record.
1. The Patient
2. The Health Worker (the person or institute that treats the patient)
3. The Health Insurance (with whom the patient is insured)
10.2.1 Patient
The patient is a person who is in need of medical treatment and visits the health worker. The patient wants to be treated as fast and as well as possible. Furthermore, he wants to be treated in confidentiality, with maximum security of his personal data. In accordance with the law he must have the right to view his personal file and have the opportunity to erase (parts of) it.
10.3 Relations and Demands on EHR
The health worker is the person or institute who treats the patient. To treat the patient he will often need access to the medical history of the patient (of course only the relevant parts). In case of emergency a doctor may need immediate access to the patient's history, without having the time to ask permission.
The health worker is responsible for documenting all treatments (dossier duty), and he is also responsible for the security and accuracy of this data.
10.2.3 Insurance
The Health Insurance wants information about the performed treatments, so they can verify that
the compensation the patient claims is covered in his insurance and needs to be paid out.
- A patient needs access to his own EHR and must be able to grant access to physicians.
- A health worker must be able to add new data to the EHR; this data must be accessible only to the health worker, the patient, and people authorized by one of them.
- Every action concerning the EHR must be logged and must be verifiable.
The data in the personal overview contains information that all parties often need access to. It contains administrative information about the person. This information is not medically related and is just needed to handle the administrative parts. Because everyone needs access to this information, it is not necessary to secure it per user. We can say that everyone who has access to some part of the EHR also has access to the personal overview. (Note: a patient must still have the opportunity to hide information in this part whenever he wants to.)
The Personal Overview is initialized when the patient visits a medical care institute for the first time, and is stored at that institute. Whenever the personal data changes in the future, the up-to-date version will be stored at the institute which made the changes (and which thereby becomes responsible for the data).
10.4.2 Patientpart
Every time a patient visits a health worker, a new patientpart is created. The patientpart contains all information about the treatment (kind of treatment, medicines, responsible health worker, date, etc.). It is, of course, very important to secure the data in the patientparts. Not all users need access to all the data in all the records, and if anyone can access the data, the health worker can't control the information the patient has given him in trust.
10.4.3 Storage
But where do we store the EHR? There are 4 concepts for a system:
1. Central Database
The first idea is to store all the data in a single database. The advantages are that we have all the data together and immediate access to a patient's medical history. The disadvantage is that if an attacker hacks into the database, he has access to all the data. Another disadvantage is that the health worker might forget to update his record and, as a result, we get
an incomplete record in the database.
2. Chipcard
The second idea is to store all data on a kind of chipcard. The advantage is that the patient has great control over who accesses his data. The disadvantage is that he can forget or lose his card, and then nobody has his medical information.
3. Online storage
The third idea is to store the data online and give the patient control over it. The advantage is that the data is always available. The disadvantages are that the internet is not very safe and that, because a patient can edit his own medical record, a health worker has no certainty that the data is correct.
4. Decentral storage
The fourth and final idea is to store the data where they are created: with the health workers. In that case the health worker can be sure of the security of the data, and whenever a health worker is hacked, the attacker only obtains the information of one treatment. The only thing we need is a referral table with information about where all the parts of an EHR are stored.
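The referral table of this fourth concept can be sketched as a simple mapping; the patient, institute and part names below are invented for illustration, and a real system would of course authenticate every lookup.

```python
# Hypothetical sketch of the referral table behind decentralized EHR storage:
# it only records *where* each part of a patient's EHR lives, not the data itself.
referral_table = {}  # patient id -> list of (institute, patient-part id)

def register_part(patient_id, institute, part_id):
    """Called by the institute that created a new patient part."""
    referral_table.setdefault(patient_id, []).append((institute, part_id))

def locate_parts(patient_id):
    """Return every (institute, part) pair that together forms the EHR."""
    return referral_table.get(patient_id, [])

# example: two visits leave parts at two different institutes
register_part("patient-42", "hospital-utrecht", "part-1")
register_part("patient-42", "gp-practice-b", "part-2")
```

A health worker who needs the full history asks the table where the parts are and then requests each part from the institute that holds it.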
For the identification we need a system which satisfies the following demands:
• Verification: After the identification of a user it is important to verify that identity. It is easy to tell a physician that your name is 'Alice', and if a physician believed everyone immediately, it would be very easy for Oscar to claim that his name is 'Alice'. It is also important to check the identity because you might accidentally give a wrong identification to the physician, and he might then give you the wrong treatment based on another person's EHR.
Biometrics comprises analyzing biometric data with mathematical and statistical methods. The system to verify biometric data consists of:
1. Template
The template T is created by measuring body characteristics B with a sensor. We first encrypt the sensor readings into digital data and store the data in a template.
There are two kinds of templates: the 'personalized template', which is unique for every person, and the 'anonymous template', which contains only the last k digits of Enc(B) (the same specification as used with bank cards).
A 'personalized template' can also be seen as a unique identifier for a certain person, and because privacy is very important in the EHR we prefer to use the 'anonymous template'. The idea behind the 'anonymous template' is that it is almost impossible to fake all biometric parts of a human (remember that we take random biometric data to verify). And because the template is not unique, it is also impossible for hackers to match an EHR with a person once they have hacked into the EHR system. So the chance of abuse with an 'anonymous template' is very small.
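As a rough sketch of the idea, the fragment below lets SHA-256 stand in for the unspecified encryption Enc and keeps only the last k hex digits; both choices are assumptions for illustration, not the scheme the text prescribes.

```python
import hashlib

def anonymous_template(biometric_reading: bytes, k: int = 4) -> str:
    """Keep only the last k digits of Enc(B), like the bank-card scheme.

    SHA-256 stands in here for the unspecified encryption of the sensor data."""
    digest = hashlib.sha256(biometric_reading).hexdigest()
    return digest[-k:]
```

Because many people share the same k trailing digits, a stolen anonymous template cannot be linked back to one person, which is exactly the privacy argument made above.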
3. Verification
In the operational stage we use the stored biometric template to verify the identity of a user. With a sensor we measure the body characteristics again and create a new template T∗. The verification application then checks whether there is a match: T ≈ T∗.
We use a margin in this process, because there is always some deviation between two measurements of the same person and an exact match is almost impossible.
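The match-with-a-margin T ≈ T∗ can be illustrated with a Hamming distance threshold; the bit-string templates and the margin of 8 bits below are invented for illustration.

```python
def hamming_distance(t1: bytes, t2: bytes) -> int:
    # number of bit positions in which the two templates differ
    return sum(bin(a ^ b).count("1") for a, b in zip(t1, t2))

def verify(stored: bytes, fresh: bytes, margin: int = 8) -> bool:
    # T ~ T*: accept when the deviation stays within the margin
    return hamming_distance(stored, fresh) <= margin
```

The margin trades the two failure modes discussed later against each other: a larger margin means fewer false rejections but more false acceptances.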
Human characteristics which can be used for biometrics can be divided into two categories: physical characteristics and behavioural characteristics. By physical characteristics we mean data like the fingerprint, the blood vessel pattern on the retina, and iris, finger, hand and face geometry. By behavioural characteristics we mean data from analyzing signature-drawing dynamics and voice signals. The choice of one or more human characteristics for biometric use involves a couple of assessments: we consider the reliability of the readings, the user-friendliness and public acceptance of the scanning procedure, the storage limitations of the Health Card, the implementation costs, and whether scanning a characteristic needs a special act of the user.
Biometric access security is generally reliable. Most of the systems in use are hard to mislead; for example, they can distinguish a living finger from a non-living one. One of the most trustworthy ways to identify and verify a person is iris recognition: the details of the eye are unique and stable. Another reliable way is fingerprint recognition. Other, less reliable, ways use signatures or palm patterns. Voice recognition is the least reliable (and some people also believe a signature can be faked easily).
Because we want maximum security for the EHR, we choose biometric identification by measuring the iris or the fingerprint of a user (or, even more preferably, a combination of the two).
Although manufacturers say the chance of physical damage is nil, people don't fancy looking into a scanning device for an iris scan. Contactless methods, like taking a picture from some distance and using it for face recognition, are preferable, but unfortunately not very reliable. That is one of the main reasons a complete system with biometric security will take a while.
90 EHR: Electronic Health Record
In the operational phase, the stored biometric template is used to verify the identity of a user. With a sensor we rescan the body characteristic and create a new template. The verification application compares the new template with the previously stored one; if the value of the new template matches the stored one within a certain margin, the user is allowed in. (We use a margin because there are always some deviations when measuring biometrics.)
We can store the template in two different ways:
Decentralized storage
If we store the template decentralized, the template is stored on a chip card (the Health Card), and we use the chip card to verify the identity of a person. In this case we can still verify a person when the rest of the system is down.
Central storage
In this case the templates are stored in a central database; to verify a person we query the database. There are two kinds of searches on the database: direct and indirect. With an indirect search we compare the template of the user with all templates stored in the database (personalized template). With a direct search we compare the template of the user with one specific template (anonymous template).
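The difference between the two searches can be sketched as follows; the in-memory dictionary and the exact-equality match are simplifications for illustration, where a real system would use the fuzzy template match described earlier.

```python
def direct_search(db, user_id, fresh, match):
    # direct search: compare against the one template stored for this user
    stored = db.get(user_id)
    return stored is not None and match(stored, fresh)

def indirect_search(db, fresh, match):
    # indirect search: compare against every template in the database
    return any(match(stored, fresh) for stored in db.values())

db = {"alice": b"\x01\x02", "bob": b"\x03\x04"}
same = lambda a, b: a == b   # stand-in for the fuzzy template match
```

The direct search is a single comparison, while the indirect search scans the whole database, which is one practical reason to prefer the anonymous template.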
Owing to certain circumstances, verification with biometrics might fail: the new encrypted template doesn't match the original stored template, even though it is the right person (false rejection). There might also be a match for a person who isn't in the database at all (false acceptance). False rejection or acceptance can be caused by certain physical parameters (temperature, dust) or by certain biological circumstances (sweat). If the chance of false rejection is low, the chance of false acceptance might be high. That is why we want to combine biometrics (identification based on 'being') with the use of a separate information carrier (identification based on 'having'): the chance that an attacker can create a card to access someone else's data is very small.
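The trade-off between the two error types can be made concrete by counting them over labelled comparison scores; the scores and the threshold below are made up for illustration.

```python
def error_rates(genuine, impostor, threshold):
    """genuine/impostor hold distances between template pairs; a comparison
    is accepted when its distance is at most the threshold."""
    frr = sum(d > threshold for d in genuine) / len(genuine)      # false rejection rate
    far = sum(d <= threshold for d in impostor) / len(impostor)   # false acceptance rate
    return frr, far

# lowering the threshold trades false acceptances for false rejections
frr, far = error_rates([1, 2, 9], [3, 8, 10], threshold=5)
```

Sweeping the threshold moves the system along exactly the trade-off described above: a stricter threshold rejects more genuine users, a looser one accepts more impostors.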
10.6 Resources
• http://www.nictiz.nl
• http://www.hl7.org/
• http://www.iom.edu
• http://www.hipaadvisory.com/action/ehealth/
• http://www.epddag.nl/
• http://www.health.tno.nl/
• http://www.memag.com/memag/article/articleDetail.jsp?id=143144
Chapter 11
C2000
Written by Jan-Pieter van den Heuvel
In this chapter the security and cryptographic aspects of the Dutch C2000 communication system
will be discussed.
11.1 Introduction
11.1.1 What is C2000?
Communicatie 2000 (C2000) is a mobile communications system used by the Dutch public safety
services. To serve the public better and to respond to serious incidents faster, emergency services
in more and more regions are sharing a single emergency centre. C2000 is used primarily by fire
brigades, ambulance services, the police and the military police. Other parties may use this network for public safety, but they first have to apply to the Ministry of the Interior. The network
replaced over 100 individual analog radio systems used by the public safety services.
The system was built by order of the Ministry of the Interior and Kingdom Relations and the Ministries of Defense and of Public Health, Welfare and Sports. The Ministry of the Interior is responsible for the construction and the management of the whole infrastructure (base stations, fibre-optic network, link stations, etc.). The users of the C2000 system are responsible for the purchase and the management of the user equipment (mobile stations, alarm receivers, etc.). The actual execution of the project is performed by the ICT Service Cooperation Police, Justice and Safety (ISC), an agency of the Ministry of the Interior and Kingdom Relations.
The C2000 network consists of 400 base stations, 15 link stations, a fibre-optic network and equipment in a central control station. The fibre-optic network used is owned by the Dutch Department of Defense: it is the only network that could guarantee the operation of the system during environmental disasters, war and terrorist attacks. A gateway is present to connect the C2000 system to the telephone network. Eventually 70,000 mobile stations will be connected to the network.
11.1.3 TETRA
C2000 is based on the European TETRA standard. As GSM is the standard for mobile tele-
phony, TETRA is the standard for mobile communication of the public safety services. TETRA
is the only official open European standard for these services. An open standard is developed in
cooperation with users and industry by an official institute, in this case The European Telecom-
munications Standards Institute (ETSI). Because TETRA is an open standard, all manufacturers
can produce equipment for a TETRA network. This is positive for the technical development of
the equipment and, because of competition, costs will be lower. Because TETRA is a European
standard, cross-country communication between public safety services is possible. The TETRA
standard describes the use of one of the encryption algorithms TEA1, TEA2, TEA3 and TEA4, which were developed by the Security Algorithms Group of Experts (SAGE) and are not public property. In the C2000 system TEA2 is used for the Air Interface encryption.
11.2 Air Interface Encryption 93
11.2.1 Authentication
Authentication is necessary to prevent illicit use of the system. The authentication is mutual,
which means that the Mobile Station (MS) will prove to the network that it has knowledge of a
shared secret key and the network will prove to the MS that it too has knowledge of the shared
secret key. Authentication and provision of keys for use at the air interface shall be linked by
a common set of algorithms. The algorithm set shall include a means of providing cipher keys
over the air interface. The algorithm set used in the C2000 system is TAA-1/Dimitra System
developed by Motorola. The secret key has a length of 128 bits. Authentication will be repeated
every 24 hours to change the Derived Cipher Key used for encryption.
The authentication procedure is shown in Figure 11.2. The Switching and Management Infrastructure (SwMI) shall use the shared secret key K and a random seed RS with algorithms TA11 and TA21 to generate the pair of session keys KS and KS'. It shall then send a random
challenge RAND1 to the MS together with random seed RS. The MS shall run TA11 to generate
session key KS, and because the authentication is mutual it shall also run algorithm TA21 to
generate a second session key KS’. Both MS and SwMI shall run algorithm TA12; the MS then
sends response RES1 back to the SwMI. The MS also sends its mutual challenge RAND2 to the
SwMI at the same time. The SwMI shall compare the response from the MS RES1 with its
expected response XRES1. The SwMI shall then run TA22 to produce its response to the MS’s
challenge RES2. RES2 is sent to the MS, which shall also run TA22 to produce expected response
XRES2. The MS shall compare RES2 with XRES2 and if the same, mutual authentication will
have been achieved.
Algorithms TA12 and TA22 produce DCK1 and DCK2 respectively; these shall be combined
in TB4 by both MS and SwMI to produce a DCK which has therefore been created as a result
of challenges by both parties.
If the first authentication (of the SwMI) fails, the second authentication shall be abandoned.
Because the shared secret key K is never transmitted and the complete process can be generated
by one party, this protocol is a Zero Knowledge Protocol.
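The TA11/TA12/TA21/TA22 algorithms are secret, so the sketch below substitutes HMAC-SHA256 for each of them and collapses the DCK1/DCK2 derivation into a single TB4 step; it mirrors only the message flow described above, not the real TAA-1 algorithm set.

```python
import hashlib, hmac, os

def prf(key: bytes, *parts: bytes) -> bytes:
    # HMAC-SHA256 stands in for each secret algorithm (TA11, TA12, ...)
    m = hmac.new(key, digestmod=hashlib.sha256)
    for p in parts:
        m.update(p)
    return m.digest()

K = os.urandom(16)                 # 128-bit shared secret key
RS = os.urandom(10)                # random seed chosen by the SwMI

KS = prf(K, b"TA11", RS)           # session key for the MS's response
KS2 = prf(K, b"TA21", RS)          # session key for the SwMI's response

RAND1 = os.urandom(10)             # SwMI's challenge to the MS
RAND2 = os.urandom(10)             # MS's mutual challenge to the SwMI

RES1 = prf(KS, b"TA12", RAND1)     # computed by the MS
XRES1 = prf(KS, b"TA12", RAND1)    # expected response, computed by the SwMI
RES2 = prf(KS2, b"TA22", RAND2)    # computed by the SwMI
XRES2 = prf(KS2, b"TA22", RAND2)   # expected response, computed by the MS

mutual_ok = RES1 == XRES1 and RES2 == XRES2

# results of both challenges feed the Derived Cipher Key (TB4)
DCK = prf(b"TB4", RES1, RES2)
```

Both responses depend on K through the session keys, so only a party that knows the shared secret can answer either challenge, and the resulting DCK binds contributions from both sides.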
11.2.2 Keys
Communication is encrypted with the Encryption Cipher Key (ECK) using a KeyStream Generator (KSG). The ECK shall be derived using algorithm TB5 from a selected Cipher Key
(CK). The CK shall be one of DCK, CCK, MGCK and SCK. TB5 combines CK with Carrier
Number (CN), Colour Code (CC) and Location Area (LA) identifier to produce ECK. This is to
prevent attacks on the encryption process by replaying cipher text to eliminate the keystream,
and to prevent keystream replay within the repeat period of the frame numbering system. The
whole procedure is shown in Figure 11.3.
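TB5 itself is secret; the fragment below imitates its role with SHA-256, which shows why mixing the Carrier Number, Colour Code and Location Area into the key defeats keystream replay: change any of them and a different ECK, hence a different keystream, results. All parameter values are invented for illustration.

```python
import hashlib

def derive_eck(ck: bytes, cn: int, cc: int, la: int) -> bytes:
    # SHA-256 stands in for the secret TB5 algorithm
    material = (ck
                + cn.to_bytes(2, "big")    # Carrier Number
                + cc.to_bytes(1, "big")    # Colour Code
                + la.to_bytes(2, "big"))   # Location Area identifier
    return hashlib.sha256(material).digest()

ck = bytes(16)   # placeholder Cipher Key (one of DCK, CCK, MGCK, SCK)
eck_here = derive_eck(ck, cn=301, cc=7, la=1024)
eck_elsewhere = derive_eck(ck, cn=301, cc=7, la=2048)
```

A keystream captured in one location area is useless in another, even under the same Cipher Key, because the derived ECKs differ.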
Four types of keys are used to encrypt data that is sent over the air interface:
End-to-End encryption is supported by TETRA and is mainly of interest to specialist users, for
example drug squads and the military, who have very demanding security requirements. End-to-
End encryption is used in combination with Air Interface encryption, because only the voice and
data are encrypted with End-to-End encryption and not the control information. TETRA signalling and user IDs remain protected to the AI encryption standard. To ensure that the decrypting
engine in the receiver maintains synchronism with the encrypting engine in the transmitter,
a synchronization vector is sent periodically. TETRA provides a mechanism known as frame
stealing to facilitate this.
11.3.2 Algorithms
National security and export considerations often restrict enhanced grade encryption algorithms
to a particular country. Therefore ETSI has not included a standard algorithm in the TETRA
specification, but they recommend the use of IDEA. An agreement has been made with Ascom Systec AG (Switzerland) to offer IDEA licences on reasonable terms. The algorithm can
simply be replaced by other algorithms (for example 3DES).
11.3.3 Keys
• Availability
• Integrity
11.4.1 Confidentiality
Confidentiality is the ability of the system to keep user data and traffic secret. The Air Inter-
face encryption prevents eavesdropping by hackers, criminals and terrorists and the End-to-End
encryption prevents eavesdropping by hostile government agencies and System Administrators
collaborating with an attacker.
11.4.2 Availability
Availability is the continuation of the service, reliably and without interruption. Denial of Service attacks (attempting to use up capacity to prevent the system from working) are stopped with the ability to disable Mobile Stations over the air. Jamming (using RF energy to swamp receiver sites) cannot be stopped immediately, but until the source is located the Mobile Stations are able to connect directly to other Mobile Stations, and traffic can be routed through a Mobile Station to another Base Station.
Summary and Conclusions 97
11.4.3 Integrity
Integrity is the system's strength in ensuring that user traffic and data are not altered. Masquerading as a legitimate user may occur if terminals can be cloned and the subscriber identity copied. To prevent this, the shared secret key may be placed on a smartcard that also contains a chip to perform the computations on the shared secret key. Manipulation of data may occur if an intermediary can capture a message and change it; this is prevented by encryption.
Chapter 12
The importance of being random
Written by Meindert Kamphuis
All encryption depends on the availability of random keys. The best example of this is the one-time pad encryption method [Tel02]: every bit of the information that needs to be encrypted is XOR-ed with a corresponding bit from the key. This means that the key is exactly as long as the information itself. The one-time pad is a strong encryption as long as the key is truly random: if patterns were to arise while investigating part of the key, an attack on the rest of the encrypted information would become less difficult. It is therefore of the utmost importance that no pattern can be found within keys, so that they are, if possible, truly random. This section tries to give a deeper understanding of randomness and of which types of generators and verifiers are interesting for gaining this understanding, for in most cases true random numbers are not at hand.
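The one-time pad just described fits in a few lines; the message below is of course only an example.

```python
import os

def one_time_pad(data: bytes, key: bytes) -> bytes:
    # the key must be truly random, used only once, and as long as the data
    assert len(key) == len(data)
    return bytes(d ^ k for d, k in zip(data, key))

message = b"attack at dawn"
key = os.urandom(len(message))
ciphertext = one_time_pad(message, key)
recovered = one_time_pad(ciphertext, key)   # XOR-ing twice restores the text
```

Note that the same function encrypts and decrypts, since XOR is its own inverse, and that the whole scheme stands or falls with the randomness of `key`.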
12.1.1 Incoherent
When one asks a program to return eight random bits and it outputs a byte containing only zeros, most people would be at least a little suspicious. It is never wrong to be suspicious, but why would any other string be more random? None of the strings shown in casus 12.1 would qualify as random, using incoherence as the definition.
Of course, eight fewer strings will not make that much of a difference, as there are 248 (= 2^8 − 8 = 256 − 8) strings left to choose from.
Truly satisfying this surely is not, for why would a seemingly random string like 01001011 be incoherent? Every random string can have a description that shows the coherence of its bits. For this particular string the coherence could be: 'the binary depiction of the number 75', which shows that the last statement stands.
The better definition for random would be:
12.1 Defining random 99
    string      coherence
1.  00000000    xn = 0
2.  11111111    inverse of 1.
3.  01010101    if n is odd then 0 else 1
4.  10101010    inverse of 3.
5.  00001111    if n < length(x) ÷ 2 then 0 else 1
6.  11110000    inverse of 5.
7.  11001100    ...
8.  00110011    inverse of 7.
12.1.2 Unpredictable
Without knowing the coherence, which is the case with true randomness, the problem of the former definition does not arise. Even if one knew the first seven bits of any of the strings in casus 12.1, the last bit could still be either zero or one. The unpredictability of every bit makes all 256 strings possible, and the chance for any one of the strings to be generated is the same (1/256).
As clear and as simple as this definition might be, in general it will only suffice as just that: a definition. For how can one label a string as random if all strings are possible? There is no clear way to check whether a string was generated randomly or not. The chance for a bit to be zero or one should be the same, for if either of the two had a higher chance of being generated, the unpredictability of the string would vanish.
To arrive at a more usable definition, the following one makes use of a feature of randomly generated information:
12.1.3 Uncompressible
Uncompressible means that there is no smaller way to represent the original in the same language as the original. All the strings in casus 12.1 are most probably uncompressible, as the binary description of even the easiest string to compress ('eight times zero', or '8 · 0') would probably exceed the eight bits. Taking a long enough string would erase this problem; see casus 12.2.
4. The compression would be: 1000000 1111111 0 (which is smaller than 64 bits)
(This is only a simple example: Real compression algorithms are far more complex)
100 The importance of being random
Thus by this definition eight times zero is random, whereas sixty-four times zero is not. This may even be desirable: the chance for eight zeros to be generated is far greater than that for a sixty-four-zero string.
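This can be observed with any off-the-shelf compressor; the sketch below uses zlib as an arbitrary stand-in for the 'perfect' compressor of the definition.

```python
import os, zlib

zeros = b"\x00" * 64     # maximally coherent: "sixty-four times zero"
noise = os.urandom(64)   # hopefully pattern-free bytes

# the patterned string shrinks, while random data gains only overhead
assert len(zlib.compress(zeros)) < len(zeros)
assert len(zlib.compress(noise)) >= len(noise)
```

The compressed size of a string is thus a practical, if rough, measure of how much coherence it contains.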
The set of unpredictable strings is formed by all possible strings, for every string has a probability of being generated. This shows that both the set of incoherent strings and the set of uncompressible strings are subsets of the set of unpredictable strings. The set of incoherent strings is a subset of the set of uncompressible strings, as the definition of incoherence can be made extremely tight, possibly even yielding the empty set.
Entropy is a concept in thermodynamics, but here we are more interested in entropy as a concept in information theory. Information comes in streams, as in books: there is information already read and information soon to be read. Entropy is about how much the given information predicts the upcoming information.
When one is reading a normal English book, every character read gives information about the upcoming character. For instance: when one reads a 'y', the probability of the next letter also being a 'y' is quite minimal. An upcoming character is of course not affected by only the last read character; for instance, after reading a complete sentence a '.' has a high probability.
Now what makes a 'source of entropy'? A source of entropy is a source of information for which the book example does not hold: any given information does not change the probability of any upcoming information. The best example of a source of entropy is a radioactive source. Radioactive decay is totally unpredictable, and sampling it gives pure random information. Not everybody has his or her own radioactive source sampler connected to a computer, which makes it all the better that there are online services which send streams of bits generated this way (encrypted if necessary). Another way to get a good source of entropy is to sample 'noise', as produced when devices like your television or radio are not tuned properly. The 'background' information sent through the unused frequencies is supposed to be totally random. Only one problem arises with this kind of sampling: these frequencies can easily be altered by anybody who knows the frequency and has any type of broadcaster. Still, there are ways to check for this type of alteration, and sampling noise can be done by anybody with a computer with a microphone, radio or web-cam connected.
One last source of entropy that should be explained is 'the user'. When working with computers, the user does many things that are supposedly totally unpredictable, like the movement of his or her mouse. A user is very unpredictable, and as there are a lot of users, one should always try to use the resources at hand as well as possible. Nevertheless, carelessly sampling for instance this mouse movement does not make a good random generator, for mouse movement is buffered and certain patterns might arise. One strong pattern for Windows OS users is to move the mouse to the bottom left corner of the screen, where the 'start' button is commonly situated. This does not mean that using the user as a source of entropy is a bad idea, for when the implementation is correct, real random numbers can be generated this way.
12.3 Mixing and skewness correction 101
Offset    0 1 2 3 4 5 6 7 ...
Seed 1    3 7 2 0 1 0 9 2 ...
Seed 2    6 4 4 0 2 0 8 4 ...
Seed ...
There should not be a direct correlation between the different seeds. Here, for instance, the different seeds are generated by taking the numbers of Seed 1 and multiplying them by the current seed number, modulo 10.
When all obvious patterns are avoided, the sampling generates pure randomness. Or is this world more discrete than most would think, and is every source of entropy just another pseudo random generator?
As said at the beginning of this section, there are discrete algorithms which try to generate random numbers. They try, for discrete algorithms can never create real randomness: as they are discrete, at some level, however deeply buried it may be, there is a pattern to be found. Numerous algorithms have been created to generate numbers as random as possible. The basis is this: the user (computer or person) gives a seed and a starting point (offset), and the algorithm generates as many numbers as needed from these two inputs (see 12.3).
Pseudo random generators can be strong or weak. What makes a pseudo random generator strong or weak is the degree of difficulty of recreating the starting input (seed, offset) from a given string of numbers generated by this generator. When it takes more than polynomial time to do this, the generator is truly strong; otherwise it is classified as weak.
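A classic example of such a seed-and-offset generator is the linear congruential generator; the 'minimal standard' parameters below are one common choice, used here purely for illustration. It falls firmly in the weak class, since a handful of outputs suffice to recover its state.

```python
def lcg(seed: int, offset: int = 0, a: int = 16807, m: int = 2**31 - 1):
    """Linear congruential generator x <- a*x mod m (a weak PRNG).

    The internal state is just the last output, so anyone who sees one
    output can predict the entire remainder of the stream."""
    x = seed
    for _ in range(offset):      # skip ahead to the requested offset
        x = (a * x) % m
    while True:
        x = (a * x) % m
        yield x

gen = lcg(seed=42)
sample = [next(gen) for _ in range(3)]
```

Calling `lcg(seed=42, offset=1)` reproduces the same stream starting one step later, which is exactly the seed-plus-offset interface described above.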
12.3.1 Mixing
By mixing different generators it is possible to create a stronger one, but not all generators can be mixed together to make a stronger generator. If two or more generators are correlated in some way, the mixed string might even be weaker, as the mixing can by accident enhance the correlation.
A simple way to mix different outputs is to XOR the different strings of data. When using this method, at least 50% of the original bits is lost.
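Mixing by XOR is one line of code; note that producing n mixed bytes consumes n bytes from each source, which is the 50% loss mentioned above. The two input streams here are arbitrary examples.

```python
def xor_mix(stream_a: bytes, stream_b: bytes) -> bytes:
    # combine two generator outputs bitwise; safe only when the
    # sources are independent, otherwise correlations may be amplified
    return bytes(a ^ b for a, b in zip(stream_a, stream_b))

mixed = xor_mix(b"\xff\x00\xaa", b"\x0f\x0f\x0f")
```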
The output of a real random generator should, if the string is long enough, have an equal distribution of zeros and ones in its binary representation. Any generator can have a small bias towards either of the two. There are several skewness-correction algorithms which try to overcome this problem. One of the simplest breaks the 'to be corrected' string into parts of two bits: every pair that contains two zeros or two ones is discarded, and of the remaining pairs only the first bit is used.
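This pair-discarding scheme is the classic von Neumann corrector, and it is short enough to write out in full:

```python
def skew_correct(bits):
    """Von Neumann skewness correction: read two bits at a time,
    discard 00 and 11 pairs, keep the first bit of 01 and 10 pairs."""
    out = []
    for i in range(0, len(bits) - 1, 2):
        a, b = bits[i], bits[i + 1]
        if a != b:         # keep only the mixed pairs
            out.append(a)
    return out

corrected = skew_correct([0, 0, 0, 1, 1, 0, 1, 1])
```

However biased the input (as long as its bits are independent), a 01 pair and a 10 pair are equally likely, so the output is unbiased; the price is that at least half of the input, and usually more, is thrown away.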
12.4.1 Entropy
As described before, entropy tells you to what degree information depends on other information within a string.
Another way to look at this is to say that entropy is information density. In an information-dense string every bit has a meaning and is not dependent on the other bits in the string: there would be no way to predict one bit, even if all other bits were known.
In Ent the entropy is given as the number of bits that are meaningful in the byte they together make up. For the JPEG file it is almost eight, which indicates that it is a highly dense file, making it almost uncompressible.
The chi-square distribution is calculated for the stream of bytes in the file and ex-
pressed as an absolute number and a percentage which indicates how frequently a
truly random sequence would exceed the value calculated. We interpret the percent-
age as the degree to which the sequence tested is suspected of being non-random. If
the percentage is greater than 99% or less than 1%, the sequence is almost certainly
not random. If the percentage is between 99% and 95% or between 1% and 5%, the
sequence is suspect. Percentages between 90% and 95% and 5% and 10% indicate
the sequence is “almost suspect”.
Note that our JPEG file, while very dense in information, is far from random as revealed by the
chi-square test.
The arithmetic mean is calculated by summing the decimal values of all the bytes and then dividing that sum by the total number of bytes in the file. The values run from zero to 255, so the arithmetic mean should be about 127.5 for a sufficiently long random string.
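As a quick check of the 127.5 figure:

```python
def arithmetic_mean(data: bytes) -> float:
    # sum the byte values (0..255) and divide by the number of bytes
    return sum(data) / len(data)

# a stream containing every byte value equally often hits 127.5 exactly
assert arithmetic_mean(bytes(range(256))) == 127.5
```

A mean far from 127.5 signals a bias towards low or high byte values, though a mean near 127.5 alone does not prove randomness.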
This algorithm takes three bytes as an X-coordinate and the next three as a Y-coordinate. The point (X, Y) is then mapped onto a two-dimensional square plane, and when this point lies within the circle enclosed by this square, it is called a 'hit'. This process repeats, taking the next six bytes as (X, Y), until no bytes are left. Using the ratio of hits to the total number of points, an approximation of pi can be calculated. The closer the calculation gets to pi, the less correlated the stream is. This algorithm is adaptable to bigger blocks as X and Y, and why not include a Z or even more dimensions? As for this two-dimensional test, the JPEG file looks quite uncorrelated.
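A direct transcription of this test might look as follows: six bytes per point, with a hit when the point falls inside the quarter circle of radius 1 around one corner of the unit square. Here os.urandom stands in for the byte stream under test.

```python
import os

def monte_carlo_pi(data: bytes) -> float:
    side = 256 ** 3 - 1              # largest 3-byte coordinate value
    hits = points = 0
    for i in range(0, len(data) - 5, 6):
        x = int.from_bytes(data[i:i + 3], "big") / side
        y = int.from_bytes(data[i + 3:i + 6], "big") / side
        points += 1
        if x * x + y * y <= 1.0:     # inside the circle enclosed by the square
            hits += 1
    # hits / points approximates the area ratio pi / 4
    return 4 * hits / points

estimate = monte_carlo_pi(os.urandom(60000))   # 10,000 points
```

For a truly random stream the estimate converges on pi; a strongly correlated stream clusters its points and pushes the estimate away from it.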
This is the simplest of all the tests and just shows how much a byte depends on the previous byte. For pure random bytes this value would be close to zero, and the bytes of the JPEG file also seem to be independent of their predecessors.
Further Reading
• http://en.wikipedia.org/
• http://www.faqs.org/
• http://www.fourmilab.ch/
• http://www.7-zip.org/
• http://www.cs.berkeley.edu/~daw/rnd/
• http://ei.cs.vt.edu/~history/VonNeumann.html
• http://www.cse.ohio-state.edu/cgi-bin/rfc/rfc1750.html
• http://www.ciphergoth.org/crypto/unbiasing/
• http://www.random.org/
• http://www.softpanorama.org/Algorithms/random_generators.shtml
• http://www.merrymeet.com/jon/usingrandom.html
• http://www.lavarnd.org/
• http://random.hd.org/
• http://www.practicalotp.com/trng.htm
Chapter 13
Off-line Karma
Written by Nelleke Steendam
Do you upload as much as you download? Many people don't. To oblige the users of a peer-to-peer system to share as much as they use, we can use micropayment schemes. I will briefly explain what micropayments are and what they are used for, based on the sources [Nie00], [Shi00] and [JK00]. Then I will discuss the implementation of micropayment schemes in Karma and Off-line Karma.
13.1 Micropayments
What are micropayments? The most common definition on the internet is: a micropayment is an electronic transaction consisting of the transfer of a very small sum of money. This is a very broad definition, because there are many systems that call themselves micropayment schemes, each with a different definition. Micropayments are used for the sale of tangible and non-tangible goods on the internet. The most important thing with micropayments is that the speed and the cost of processing the payment aren't too great, because the goods that are transferred through the internet are usually very small and cost little.
Micropayments are used in different sectors. They can be used in closed networks, for example in peer-to-peer systems, which can use their own currency. Micropayment is also used for systems based on dollars or euros, like telephone cards. These kinds of micropayments are mainly used in monopoly markets, because monopoly players can force the use of micropayments. In peer-to-peer systems (systems where users share resources with other users) it is in the benefit of the user to use such a system, because the system can monitor and control the transactions of users to ensure that every user shares at least as much as he downloads. Just as there are a lot of definitions, there is also a wide variety of implementations of micropayment schemes, each of which focuses on some of the properties of micropayments. There are many desirable properties of a micropayment scheme; I shall mention the basic ones first:
• Divisibility: the user should be able to pay any amount of cash that he wants, and it should be possible to receive change.
• Duration: we don't want the currency to be able to expire, as happened for example with the guilder.
• Portability: the system should be usable on all networks and all operating systems.
• Offline capability: the micropayment system should not force the user to contact a Trusted Third Party (TTP) during every transaction. This is a very important criterion considering the very hard latency requirements for micropayment transactions.
These were the most basic properties. Second are extended features:
• Transferability: micropayment cash should be transferable without any registration.
• Fault tolerance: failure in software or hardware shouldn't influence a transaction between two parties.
The following properties are there for security reasons:
• Authenticity: for all parties it should be possible to confirm the origin of a message.
• Integrity: integrity of payment means that it isn't possible to tamper with the payment data.
• Interruption tolerance: the transaction should be completed for both parties or not at all.
• Non-repudiation: it isn't possible for any party to deny a transaction history.
Third, there are two main desirable properties just for the system:
• Scalability: when the market grows, the system should be able to handle the growth.
• Universality: the system should not be restricted in some way, by a financial contract for example.
There are many different micropayment implementations, like CyberCoin, DigiCash, NetBill and PPay. They all have one big disadvantage: the system needs a central bank or broker, and such a central point can become a bottleneck as the system grows. The only system predating Off-line Karma that doesn't need a central bank or broker is Karma.
13.2 Karma
Karma is a system for a peer-to-peer network, designed by Vivek Vishnumurthy, Sangeeth Chandrakumar and Emin Gün Sirer ([VVS01]). Peer-to-peer systems are designed to share resources through the internet. Most systems assume that all peers will contribute resources to the system. In real life this isn't the case: we're dealing with freeloaders. Studies have shown that 20 to 40% of the users in the old Napster network, and 70% in the Gnutella network, were freeloaders. Freeloaders are people who consume many more resources than they contribute. Karma is a
framework for avoiding freeloaders in a peer-to-peer system. The main idea behind Karma is,
that it keeps track of what the participants contribute and consume. They give each participant
a scalar value, their karma, which tells the system, what the participant has contributed and con-
sumed. So if the participant consumes something, then it will decrease, and when he contributes
something to the system his karma value will increase. Usually a user receives a amount of karma
when he joins a system. A group of nodes, called the bankset, keep track of the karma belonging
to the users. A transaction can’t not be completed when the karma of the consumer is to low.
This forces the participant to upgrade their karma by sharing more resources. Karma is called
a economic framework, because it works by keeping track of the resource purchasing capability
of each peer. The system Karma possesses the properties of non-repudiation, certification and
atomicity. That means that Karma protects a peer-to-peer system against malicious provider
who promises to a resource but don’t deliver it completely. Also against malicious consumers that
receive a resource and later on claim that they didn’t received it. And finally against transient
states of the system where participants can observe intermediate states in process of transferring
karma from one account to another. Karma uses a atomic transaction scheme that provides the
consumer with the key to decrypt the resource at the same time that provider receives the certifi-
cate of the receipt. Beside that, karma also limits the effect of inflation and deflation, by making
karma corrections periodically. With these corrections the total amount of karma available in the
system is controlled. The design of the system Karma is based on achieving three fundamental
properties. First, the system is designed for peer-to-peer systems and therefore karma needs to
be completely distributed and requires no centralised functionality or trust. Second, because in
13.2 Karma 107
a loosely organised network of peers there are no failure-proof components, account data needs
to be replicated, possibly extensively, to insure against loss and tampering. Third, because the
system has to perform well, the coordination among the replicas has to be minimal.
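The bookkeeping idea above can be sketched in a few lines. This is a toy single-ledger model; the class name, the sign-up grant of 100 karma and the method names are invented for illustration, and in Karma itself this state is replicated over each participant's bank-set rather than held centrally.

```python
class KarmaLedger:
    # Toy, centralised stand-in for the replicated bank-sets of Karma.
    INITIAL_KARMA = 100  # assumed sign-up grant

    def __init__(self):
        self.balance = {}

    def join(self, node):
        # A new participant usually receives an initial amount of karma.
        self.balance.setdefault(node, self.INITIAL_KARMA)

    def transfer(self, consumer, provider, price):
        # A transaction is refused when the consumer's karma is too low.
        if self.balance.get(consumer, 0) < price:
            return False
        self.balance[consumer] -= price
        self.balance[provider] += price
        return True
```

The refusal branch is exactly the rule from the text: a transaction cannot complete when the consumer's karma is too low, which pushes freeloaders to contribute.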
Earlier, I briefly said that the bank-set is a group of nodes. By this I meant that bankA keeps
track of the karma of node A. BankA itself consists of several nodes, other participants, which
are chosen with a hash function. Participant A has a nodeId(A), and the nodes closest to
Hash(nodeId(A)) form the bank-set of A. The members of bankA are equally responsible for keeping
track of the required information on A. They store the amount of karma that A
possesses, the recent transactions with other nodes (these can be used as proof), a sequence number
(against replay attacks) and the current epoch number. An epoch is a fixed period of time, usually
several months.
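The bank-set selection could look roughly as follows. The hash function, the 32-bit ring, the `bank:` domain separation and the plain absolute-distance metric are all assumptions made for this sketch; the real system uses the DHT's own identifier space and ring distance.

```python
import hashlib

def node_point(name: str) -> int:
    # Map an identifier onto a 32-bit ring (illustrative only; the real
    # system uses the DHT's own identifier space).
    return int.from_bytes(hashlib.md5(name.encode()).digest()[:4], "big")

def bank_set(node_id: str, peers: list[str], k: int = 3) -> list[str]:
    # The bank-set of A: the k peers whose points lie closest to
    # Hash(nodeId(A)).
    target = node_point("bank:" + node_id)  # assumed domain separation
    return sorted(peers, key=lambda p: abs(node_point(p) - target))[:k]
```

Because membership is determined by a hash, no participant can choose who holds his account data.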
The files in the system are sorted by fileId; a fileId groups the nodes that possess the same
file. A DHT stores the nodes under the fileId, which is the MD5 hash of the file name. In the
file-set the actual file information is replicated at a set of nodes. The file-set consists of the k/2
nodes closest to the fileId in both directions (this is one possible definition). When node A joins the
network, it sends a message with all the files it contributes. The file-sets that receive such a message
add the node to the file-set. After a fixed time A will be dropped from the file-set again, so every
participant has to refresh his file information frequently. Arrival and departure of nodes can
easily be handled because of this frequent refreshment of the file-set. Epochs are important
for the system to handle inflation and deflation of per-capita karma (the total karma divided by
the number of active users). We speak of inflation when nodes leave the system after having
decreased their karma, and of deflation when nodes leave after having increased their karma.
Inflation and deflation are undesirable, because the framework should keep control of
the total amount of karma available in the framework. If there is no control of this total amount,
cheating is far easier for malicious people. To handle this the outstanding karma will
checks if node A possesses enough karma for the transfer. The second issue is to make sure that every
member of bankA receives the request of A. This is solved by letting every member who receives
a request forward it to a random other member of bankA. If this were not done, node A could
send the request to just a bare majority. (For example, send one majority a request for file 1 and
the other majority a request for file 2, with minimal overlap; after these requests are done,
the majority will grant a higher karma than they should.) This protocol depends on the
expectation that nodes participate in banks and work on behalf of other nodes.
There are several attacks that can be mounted on the system.
Replay attack: these are ruled out, as mentioned before, by sequence numbers and signatures.
Malicious provider: when the provider accepts a payment but does not deliver the file, the
money can be transferred back, thanks to the transfer information the bank keeps track of.
Malicious consumer: when a consumer claims it did not receive the file while it did, the
provider can simply prove that he delivered it, with the certificate that his bank-set gave when the
transfer was completed.
Corrupt bank-set: to prevent a malicious node from controlling a bank we need secure entry
algorithms.
Denial-of-service attack: when a malicious node tries to sabotage a transfer by sending
non-confirming messages (NACKs), it will not work, because besides the confirming message the
system also asks for an authentication.
In the ideal case we would like to prevent double spending; in the general case, however, that is not
possible. Therefore we want to detect it, and for a usable karma system for P2P and grid applications
we demand the following properties:
Scalability: transaction cost should be independent of the size of the network. No centralised
control: the system should not depend on one or several central banks, brokers or special nodes,
and there should not be a predetermined hierarchy in the system. Load balance: the load on a node
should depend only on the number of transactions that node is involved in. Availability: transactions
between users must not be interrupted when users go off- and on-line; even when a
group of users loses its connection, transactions will not be interrupted. Double-spending detection:
the system should detect double-spent coins and correct the fraudulent node afterwards.
Off-line Karma has three tasks: minting coins, transferring karma coins, and detecting double
spending.
There is no central bank, so the users have to mint the coins themselves. They can do that by finding a
collision on the hash function. Finding a collision costs much time, and therefore it is
easier to share resources. A minted coin carries the name of the minting user and a sequence number,
which makes each coin unique. The coin additionally gets a timestamp, which gives the time the
coin was minted. Every user is allowed to mint 2^q karma coins.
Minting means finding a collision y such that h1(x) = h2(y) and x ≠ y, where:
x = u||sn||ts
sn = serial number, |sn| ≤ q
ts = timestamp
New karma coin: k0 = (x, y)
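A toy version of this collision search might look as follows. The hash functions h1 and h2 are stood in by domain-separated SHA-256 truncated to a handful of bits so the search terminates quickly; real coins would use far larger outputs, which is exactly what makes minting expensive compared to sharing resources.

```python
import hashlib

BITS = 12  # toy difficulty so the search terminates quickly

def h1(data: bytes) -> int:
    # Domain-separated SHA-256, truncated to BITS bits (stand-in for h1).
    return int.from_bytes(hashlib.sha256(b"h1" + data).digest(), "big") >> (256 - BITS)

def h2(data: bytes) -> int:
    # A second, independent hash (stand-in for h2).
    return int.from_bytes(hashlib.sha256(b"h2" + data).digest(), "big") >> (256 - BITS)

def mint(user: str, sn: int, ts: int):
    # x = u || sn || ts identifies the coin; y is found by brute force
    # such that h1(x) = h2(y).
    x = f"{user}||{sn}||{ts}".encode()
    target = h1(x)
    y = 0
    while h2(str(y).encode()) != target:
        y += 1
    return (x, y)  # the new karma coin k0
```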
Transfer of a karma coin takes place when a user pays with karma for resources of another user.
When a user sends a coin, he has to sign it with his signature. The receiver verifies the
signature and stores the coin. At each transfer an extra signature is added to the coin. The
signatures on a coin can be seen as the history of the coin. This is used for detecting double-spent
coins: when two coins have the same identity, we look at the history to find the corrupt
user. He can be found by tracing where the history has split. To prevent a
corrupt node from spending the same coin twice at the same user, we need an extra step in the protocol.
The receiver m of the coin sends a random challenge z to the consumer s who wants to
spend coin ki. The consumer s then sends ki+1 = {ki, z, m, Cs}s to the receiver m, where
Cs is the certificate of user s. When a corrupt node spends the same coin twice at the same user,
the coin changes because of the challenge z.
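The challenge-response step can be sketched as follows, with a deliberately naive stand-in for real digital signatures; the `sign` helper and the certificate string are invented for illustration only.

```python
import os

def sign(signer: str, payload) -> tuple:
    # Stand-in for a real digital signature: wrap the payload with the
    # signer's name so the coin's history stays inspectable.
    return ("signed-by", signer, payload)

def spend(coin, consumer: str, receiver: str):
    z = os.urandom(8).hex()        # receiver m picks a random challenge z
    cert = f"cert({consumer})"     # stand-in for the certificate C_s
    # k_{i+1} = {k_i, z, m, C_s}_s : the coin grows by one signature layer
    return sign(consumer, (coin, z, receiver, cert))
```

Because z is fresh on every transfer, spending the same coin twice at the same receiver still yields two different coins, which is what makes the fraud detectable.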
A coin has to be reminted after a while, both for detecting double spending and for clearing the
history of the coin: if the history is never cleared, the size of the coin becomes very large. The
timestamp on a coin makes it impossible to avoid reminting, because
the timestamp plus a fixed period of time gives the expiry date of the coin. That every
coin will be reminted means that every double-spent coin will be detected. The reminting is
done by a set of users, which we call the reminters of the coin. The reminters are chosen such that at least
one of them is not a corrupted node, and honest reminters possess the history of previously reminted
coins with the same identity. With these saved identities the reminters can check whether a coin has already
been reminted, and trace back the cheater. The reminters are chosen based on the identity of the coin,
using a hash function: the remint set of coin k is Rk = ℵr (h(id(k))). This ensures that
nobody can predict or control the reminters. All users in the reminter set have to verify the
authenticity of the other users in the set. If the verification succeeds, they place their
multisignature on the coin: knew = {XY (k), ts, Rk , CRk , u}Rk
We said that it is possible to leave and join the system at any time, so it may happen that a
node in the remint set leaves. Therefore the remint set changes over time as users go off- and
on-line: the remint set consists of the r users whose identity is closest
to h(id(k)). This makes fraud possible, because you could construct your own remint set and claim
later that all other users were off-line at that moment. To prevent this we use the density of
the set Rk:
d(Rk ) = max over i ∈ Rk of |i − h(id(k))|
The density shows how much the remint set is spread out: the maximum deviation of a user in the set
from h(id(k)) of the coin k. It is not likely that the number of users will heavily fluctuate in a short
period of time. When we limit the change of the density of the remint set, it is no longer possible
to choose your own remint set completely. Let α be the maximum rate of change of
the density of the overlay network, d(t) the density at time t, and T the maximum time
before reminting; then for t′ between t and t + T we demand that d(t)/α ≤ d(t′ ) ≤ αd(t).
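The density and the bound on its rate of change can be written down directly; the helper names here are ours:

```python
def density(remint_set, h_id):
    # d(R_k) = max over i in R_k of |i - h(id(k))|
    return max(abs(i - h_id) for i in remint_set)

def plausible(d_then, d_now, alpha):
    # Reject remint sets whose density changed faster than the overlay
    # plausibly could: d(t)/alpha <= d(t') <= alpha * d(t).
    return d_then / alpha <= d_now <= alpha * d_then
```

A self-constructed remint set of colluding nodes would be far more spread out around h(id(k)) than the genuine closest-r set, so its density jump fails the check.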
To prevent loss of history, nodes update their database whenever a node leaves or joins the system.
Another advantage of the timestamp is that after a certain
time it is no longer possible to have duplicates of a coin in the system: the
timestamp makes sure that every coin will be reminted after a while. To keep the database within
bounds, the history of coins whose expiry date has passed is deleted.
13.4 Protocol
Minting:
For a user u:
Initially: Ku := ∅; snu := 0
snu := snu + 1
ts := now()
x := u||snu ||ts
Find y satisfying: h1 (x) = h2 (y)
k := (x, y)
Ku := Ku ∪ {k}
Spending:
User u spends a coin at user m:
m : pick nonce z
m→u:z
u : select k ∈ Ku
u → m : {k, z, m, Cu } → k ′
m : check(m, k ′ , z)
u : Ku := Ku \ {k}
m : Km := Km ∪ {k ′ }
Reminting:
User u remints a coin k = {k̃, z, u, Cs }s :
u : R = ℵr (h(id(k)))
u → ri : k, now(), R → k ′ , t′ , R′ ∀i : ri ∈ R
ri : t′ ≈ now()
. R′ = ℵr (h(id(k)))
. check(u, k ′ , ⊥)
. verifyHistory(k ′ )
. knew = {XY (k ′ ), t′ , R′ , CR′ , u}
R ↔ u : {knew }R → kR (this is a three-round protocol)
u : checkBase(u, kR )
With:
check(u, k, z):
if isBase(k) then checkBase(u, k)
else {k ′ , z ′ , u′ , Cs }s := k
. return (z ′ = z ∨ z = ⊥)
. ∧ u′ = u
. ∧ validSig(k, s, Cs )
. ∧ check(s, k ′ , ⊥)
checkBase(u, k):
if isReminted(k) then
. {k ′ , newts, R′ , CR , u′ }R := k
. return u′ = u
. ∧ R′ = R = ℵr (h(id(k ′ )))
. ∧ newts ∈ [now() − T, now()]
. ∧ validSig(k, R, CR )
else
. (x, y) := k
. u′ ||sn||ts := x
. return h1 (x) = h2 (y)
. ∧ u′ = u
. ∧ ts ∈ [now() − T, now()]
audit(k, k ′ ):
{· · · {k0 , z1 , u1 , C0 }u0 · · · , zm , um , Cm−1 }um−1 := k
{· · · {k0′ , z1′ , u′1 , C0′ }u′0 · · · , zm′ ′ , u′m′ , Cm′ ′ −1 }u′m′ −1 := k ′
for i = 1 to min(m, m′ ) do
. if (zi ≠ zi′ ∨ ui ≠ u′i ) then return ui−1
verifyHistory(k):
Hcoin := {k ′ ∈ H | id(k) = id(k ′ ) ∧ ts(k) = ts(k ′ )}
for each k ′ ∈ Hcoin do
. B := B ∪ {audit(k, k ′ )}
H := H ∪ {k}
return Hcoin = ∅
The only difference in the protocol for the off-line version is in checkBase:
checkBase(u, k):
if isReminted(k) then
. return u′ = u
. ∧ R′ = R
Digital Visual Interface (DVI) is a type of cable and connection created in 1999 by the
Digital Display Working Group. It delivers exceptionally high quality video, nearly
as good as the original content. This previously unattainable quality has raised concern among
Hollywood executives, who fear video could be mass-produced and illegally distributed, as
has happened within the music industry. In an effort to protect this video content, a method of
security encryption was developed.
A summary of the protocol is given in 14.2. A drawback of the HDCP authentication
protocol is a weakness that can be used to recover the master key. Once you know the master
key, you can decrypt any movie, and even create new HDCP devices that will work with the
'official' ones. If this master key is ever published, HDCP will no longer provide any protection.
In 14.3 the main weakness is described, together with a possible way to derive the master
key from 40 public keys. The most famous person to break HDCP was the Dutch cryptographer
Ferguson. The reason he did not publish his work is described in 14.1.
14.1 Freedom of speech: the Digital Millennium Copyright Act
Another victim of the DMCA is the Russian programmer Dmitry Sklyarov. He was arrested in the US
after illustrating, at a hackers' conference, a method to bypass the copyright protection of electronic
books (Adobe). He was charged with violating the DMCA (an American law) for work he performed
in Russia as an employee of a Russian firm; what he did was perfectly legal in Russia, and
in most other countries in the world. The law is counterproductive, as participation of critical
cryptography researchers from all over the world is a necessity in the challenge to develop safe
digital protection in the future. In the long run, the DMCA will make it much easier to create
illegal copies, because when research is forbidden, no one will develop good copyright protection
schemes.
14.2 The protocol
The complete specification can be found in [Int03]. The explanation of the protocol consists of three main
parts: Authentication, Data Encryption and the Cipher.
14.2.1 Terminology
Each HDCP device contains an array of 40 56-bit secret device keys, which make up its Device
Private Keys (DPK), and a corresponding identifier called the Key Selection Vector (KSV). The
KSV is a 40-bit binary value and can be seen as the public key. A KSV contains exactly 20
ones and 20 zeros in a ”random” sequence. The DPKs and KSVs of communicating devices
will hereafter be referred to as u and v, respectively. Furthermore, na is a pseudorandom nonce
sent by the transmitter A, which is used together with a one-way function h to verify the result
of the key agreement process.
14.2.2 Authentication
The communication exchange allows the receiver to demonstrate knowledge of a secret device
key. When devices A and B wish to communicate (in practice this could for example
be a DVD player and a flat-panel television), they exchange va and vb . Device A computes the
dot product K = ua · vb and B computes K ′ = ub · va , and they use this as their shared secret for
the rest of their interactions. The way ua is derived from va is unknown, but the trusted authority
uses some secret information so that the above K is equal to K ′ . To verify that the key agreement
process has been successful, A sends na , to which B replies by sending R′ = h(K ′ , na ). When
h(K, na ) = R′ holds, this strongly suggests valid keys.
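A small numeric sketch of this key agreement follows. We assume, purely for illustration, that the authority's secret is a symmetric 40×40 matrix of 56-bit values (a Blom-style construction), which automatically makes ua · vb equal ub · va; whether HDCP's authority uses exactly this construction is not stated in the text.

```python
import random

MOD = 2 ** 56  # device keys are 56-bit; arithmetic is mod 2^56
random.seed(1)

def make_ksv():
    bits = [1] * 20 + [0] * 20  # exactly 20 ones and 20 zeros
    random.shuffle(bits)
    return bits

# Hypothetical authority secret: a symmetric 40x40 matrix S, so that
# u_A . v_B == u_B . v_A automatically (a Blom-style construction).
S = [[0] * 40 for _ in range(40)]
for i in range(40):
    for j in range(i, 40):
        S[i][j] = S[j][i] = random.randrange(MOD)

def private_keys(ksv):
    # Device private keys u = S * v (mod 2^56)
    return [sum(S[i][j] * ksv[j] for j in range(40)) % MOD for i in range(40)]

def shared_key(own_private_keys, other_ksv):
    # Add the 56-bit private keys selected by the other side's KSV bits.
    return sum(k for k, b in zip(own_private_keys, other_ksv) if b) % MOD
```

Both sides end up with va · S · vb mod 2^56, which is why the two dot products agree.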
Now both HDCP devices have generated a shared secret value that cannot be determined by
eavesdroppers on this exchange. The shared secret can then also be used as a symmetric key to
encrypt HDCP content. Without encryption, the data exchange would go as follows: the digital
data stream is sent to the Transition Minimized Differential Signalling (TMDS) transmitter,
where it is encoded in a digital DVI signal. After it is sent over the DVI cable to the receiver,
the TMDS receiver decodes the data stream to the proper digital data to eventually display
it on the screen. By adding encryption to the protocol the signal sent over the DVI cable
becomes unrecognizable. HDCP encryption is applied at the input of the TMDS encoder and
decryption is applied at the output of the TMDS decoder. HDCP encryption consists of a
bit-wise exclusive-or (XOR) of the HDCP content with a pseudo-random data stream produced
by the HDCP cipher. As XOR is a reversible function, the receiver can perform the same steps to
decrypt.
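The reversibility of the XOR layer is easy to demonstrate; the keystream below is an arbitrary stand-in for the HDCP cipher output.

```python
def xor_stream(data: bytes, keystream: bytes) -> bytes:
    # HDCP encryption: bitwise XOR of content with the cipher's
    # pseudo-random stream; applying it twice restores the plaintext.
    return bytes(d ^ k for d, k in zip(data, keystream))
```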
The cipher is a generator producing an independent data stream to be used for stream encryption.
The structure of the HDCP cipher consists of three layers. The block module operates as a block
cipher during the authentication protocol. The block module and the output function are used
together to produce the 24-bit pseudo-random sequence that is used to encrypt the HDCP
content. Each layer is briefly described below.
1. A set of four Linear Feedback Shift Registers (LFSRs) is combined into one bit. This one
bit feeds into the middle layer.
2. The middle layer consists of two separate ”round function” components. One of these
components, Round Function K, provides a key stream for the other component, Round
Function B. Each of these two components operates on a corresponding set of three 28-bit
registers. For Round Function K, a bit of the register takes its input from the LFSR module
output. Both round functions produce three 28-bit registers.
3. The final layer takes four (out of six) 28-bit register outputs from the round functions
through a compression function to produce a 24-bit block of pseudo-random bits for every
clock pulse.
More about linear feedback shift registers and stream encryption can be found in [Tel02].
14.3 Weaknesses
The author travels regularly to the U.S.; therefore, he does not want to risk any sort of penalty
regarding the DMCA. As a chapter about a relatively weak encryption algorithm will not be
complete without touching the subject of a possible attack, only a brief summary is given in
this chapter. It needs to be emphasized that the following method is completely based on
other published information; comprehensive mathematical details of the attack can be found in
[CGJ+ 01] and [Irw01].
The designers made a well-known cryptographic design mistake that can easily be exploited
by attackers. As the shared secret generation is entirely linear, the attack only needs 40 public/private
key pairs such that the public keys span M, the module generated by all public
keys. After recovering the private keys of 40 devices, one can attack every other interoperable
HDCP device. The HDCP specification does not describe the key generation process but, based
solely on the properties of the generated keys, all possible key generation strategies can be
characterized. This implies that no secure implementation following from these
properties is possible.
The authority’s secret information can be recovered by an attacker. As the mapping from public
keys to private keys is linear, it can be represented by a matrix S. The secret can be captured in
a 40×40 matrix, and linear algebra techniques are sufficient to recover it.
Recovering S can be done with a collection of 40 key pairs (v, u) such that the vi span the
module spanned by all KSVs. Then the system of 40 equations with 40 unknowns, U = SV,
can be solved by applying for example the Gaussian elimination technique. One has to take into
account that computations in HDCP are done mod 2^56, so not all basic facts from linear algebra
hold.
Possessing the matrix S, the private key belonging to any KSV can be computed.
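That last step is a single matrix-vector product mod 2^56. A sketch, with a function name and toy dimensions of our own choosing (a real S is 40×40):

```python
MOD = 2 ** 56  # HDCP arithmetic is mod 2^56

def private_from_ksv(S, ksv_bits):
    # u = S * v (mod 2^56): once S is recovered, this yields the private
    # keys belonging to ANY key selection vector.
    return [sum(row[j] * ksv_bits[j] for j in range(len(ksv_bits))) % MOD
            for row in S]
```

This is exactly why 40 spanning key pairs suffice: they pin down every entry of S, and from then on no device's keys are secret.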
keys via reverse engineering. Another, more legal, approach is to buy a set of key pairs: device
manufacturers can buy 10000 key pairs for $16000. If there is a set of 40 keys that spans the total
module spanned by the KSVs, and anyone with bad intentions possesses such a set, the entire
system collapses. From eavesdropping on device communication to device cloning and producing
newly accepted device keys: everything is possible.
The U.S. legal system makes no effort to avoid future catastrophes. By making
it impossible to publish articles about weaknesses of cryptographic systems that are meant to
protect content, it infringes on fundamental rights such as freedom of speech and of publication.
Above all, it eventually digs its own grave by leaving holes in its own content protection.
Chapter 15
This chapter will look into the development of the successor of the DVD format, focusing on the
encryption techniques used for various purposes. At the moment two formats are being developed
and promoted by different companies. Both systems hope to match or even exceed the success of
the DVD format. The new standard also aims to provide better security features; security proved
to be a big flaw in the DVD format.
First the security flaws of the DVD will be discussed in order to gain a better understanding
of the demands that should be fulfilled by a successor. Then the specifications for the Advanced
Access Content System, as far as known to date, will be discussed. This chapter will conclude
with some thoughts on the success of this new content protection system.
• Authentication key (ka ). A secret present in both the drive and the player used to
negotiate the session key.
• Session or bus key (kb ). This key is used to encode the title and disk keys before sending
them over the bus between the drive and the player. An eavesdropper might otherwise
obtain these keys by reading from the bus.
• Player key (kp ). Player keys are issued by the DVD Copy Control Association (DVD
CCA) to manufacturers of hardware or software DVD players. Each player is issued a few
of the total of 409 available keys. A player key is required in the process of decrypting the
disk key.
• Disk key (kd ): The disk key is used to encrypt the title keys.
• Sector key (ks ): This key is present in the plaintext header of a sector for optional
additional encrypting of the sectors within each title.
• Title key (kt ): The title key is XORed with the session key and used to encrypt the data
within a sector.
Every DVD player is equipped with a number of player keys. Each DVD disk has encrypted
actual content (e.g. a movie) and a hidden area containing ed = Ekd (kd ), which is an encryption
of the disk key using the disk key itself, and ep1 , . . . , ep409 , which is a table of encrypted disk keys,
where each epi = Ekpi (kd ); i.e. an encryption of the disk key with the i-th player key.
The DVD player is able to decrypt the encrypted content on the disk after completion of the
following process:
1. The player and drive verify each other’s identity using the authentication key and a simple
symmetric authentication protocol. In this authentication process a session key is also
established.
2. After successful authentication the drive sends the hidden data on the disk to the player.
The player uses one of the player keys (kpi ) to decrypt the corresponding entry in the
table (epi ) and obtains the disk key, i.e. kd = Dkpi (epi ), and verifies this: kd = Dkd (ed ).
Remember that ed = Ekd (kd ) and thus decrypting ed using kd should yield kd again. In
case the chosen player key does not produce the correct result, another one of the several
possibilities can be used until the decryption of the disk key is successful. In case the
operation does not succeed the player has been revoked by the DVD CCA.
3. Now that the disk key is successfully decrypted the title key can be obtained using the disk
key.
4. Each sector that is sent from the drive to the player can be decoded by the player using
the title key and the sector key present in the header of each sector.
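The four steps above can be sketched with a toy cipher standing in for CSS. The XOR-with-hash cipher, the 8 player keys and the key values are all invented for illustration; real CSS uses its LFSR-based cipher and a table of 409 player keys.

```python
import hashlib

def E(key: bytes, data: bytes) -> bytes:
    # Toy cipher: XOR with a key-derived pad (stand-in for the CSS cipher).
    pad = hashlib.sha256(key).digest()[:len(data)]
    return bytes(a ^ b for a, b in zip(data, pad))

D = E  # for an XOR cipher, decryption is the same operation

disk_key = b"\x01" * 5                             # toy 40-bit disk key
player_keys = [bytes([i]) * 5 for i in range(8)]   # toy player keys (real: 409)
ed = E(disk_key, disk_key)                         # hidden area: E_kd(kd)
table = [E(pk, disk_key) for pk in player_keys]    # e_pi = E_kpi(kd)

def recover_disk_key(player_key: bytes, i: int):
    kd = D(player_key, table[i])   # step 2: decrypt the i-th table entry
    if D(kd, ed) == kd:            # verify: kd = D_kd(ed)
        return kd
    return None                    # wrong key, or player revoked
```

The self-referential entry ed = E_kd(kd) is what lets the player confirm it decrypted the right disk key before proceeding to the title keys.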
Obtaining the disk key is a big step towards the decryption of a DVD’s content. However, some
additional work needs to be done to get to the actual content. The content is encrypted on the
disk using a stream cipher. The stream cipher is implemented by Linear Feedback Shift Registers
(LFSRs), which produce a pseudo-random bit stream. This bit stream is then XORed with the
plaintext content to produce the ciphertext content.
A basic LFSR works as shown in figure 15.1. The LFSR is seeded with a number of bits,
in this case eight. At each clock the row of bits is shifted to the right and bit 1 is used as the
output value. A new bit is shifted in on the left, which is a XOR of the tap bits. In figure 15.1
the taps are on bits 1, 4 and 7. When designing an LFSR of length n the taps should be chosen to
get the maximum cycle length 2^n − 1. Furthermore, an LFSR should never be seeded with all bits
set to zero: XORing the zeros from the taps will always produce a zero, and hence the output
will only produce zeros as well.
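A minimal version of the LFSR in figure 15.1, with taps on bits 1, 4 and 7. Bit 1 is taken here as the rightmost, output end; the figure's exact bit ordering is an assumption of this sketch.

```python
def lfsr_stream(seed_bits, taps=(1, 4, 7), clocks=16):
    # Toy 8-bit LFSR per figure 15.1: state[0] is bit 1 (the output end).
    state = list(seed_bits)
    out = []
    for _ in range(clocks):
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]    # XOR of the tap bits
        out.append(state[0])            # bit 1 is the output value
        state = state[1:] + [feedback]  # shift right; new bit enters left
    return out
```

Seeding with all zeros reproduces the warning from the text: the feedback is always zero and the register never leaves the all-zero state.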
CSS uses two LFSRs to produce the pseudo-random bit stream: a 17-bit LFSR and a 25-bit
LFSR (shown in figure 15.2). The first LFSR is seeded with the first two bytes of a key, the
latter with the other three. Bit number 4 is extra and is always set to one to prevent
null-cycling. This is all quite normal, but the LFSRs implemented in CSS have a somewhat
peculiar design. The bits shifted out on the right, which are normally considered the output, are
treated as garbage. Instead, the output uses the same value as the bit being shifted in on the
left. After 8 clocks the 8 bits of both LFSRs are added using 8-bit addition and form an
output byte. This output byte is used for encryption by XORing it with a key or the data. For
a more detailed description of this procedure see casus 15.3.
Decryption is of course very straightforward, since (x ⊕ a) ⊕ a = x. Thus repeating the
encryption procedure on the ciphertext content or key will yield the plaintext content or key.
15.1.3 Weaknesses
• Key length. The key length of 40 bits was most likely chosen due to the export restrictions
on cryptographic material out of the United States. The DVD Forum probably hoped
that keeping the Content Scrambling System secret would prevent the system from being
broken. But even without the design flaws the 40-bit key was too short: DES, which uses a
56-bit key, was broken by brute force in the same year in which the first DVD players appeared
on the market.
• Security through obscurity. The DVD Forum has chosen to keep the Content Scrambling
System a secret for reasons mentioned above. This did not stop the system from
being broken by open source users, trying to make their DVDs play on their operating
systems. An anonymous posting of the source code for CSS sped up the process, but there
is no doubt that the system would have been broken anyway, see [Fel04]. Even the Digital
Millennium Copyright Act could not have prevented this. CSS can be considered another
clear example of the fact that security through obscurity simply does not work.
• LFSR implementation. The chosen implementation of the LFSRs is not the only defect
in the design of CSS, but certainly one of the most important ones. As described, the
chosen implementation enabled much easier prediction of the LFSRs’ output by guessing
the initial state and clocking forward or backward. This feature was exploited in the best
attack to break CSS completely: obtaining the disk key. Remember that a disk contains a
hidden area with ed = Ekd (kd ). [Ste99] describes a way to calculate the disk key using ed
with a complexity of 2^25. This attack has a significantly lower complexity than should be
expected from a 40-bit key and takes about 20 seconds on a Pentium III, which was a
typical system around that time.
In conclusion, CSS was far from being set up in the best way possible and breaking it was
bound to happen.
However, adopting AES as their encryption system certainly does not solve all problems. The
key management system used in AACS had to be designed by the AACS parties themselves. Key
management is of course always the most important security issue in an encryption system, and this
will prove not to be an easy task. The reason this is difficult is that the people who need the keys to
decrypt the content will potentially try to compromise the system.
The key management system in AACS will rely primarily on the ability to revoke compromised
device keys when necessary as stated in [Adm04]:
[. . . ] Each compliant device or application is given a set of secret Device Keys when
manufactured. The actual number of keys may be different in different media types.
These Device Keys, referred to as Kdi (i = 0, 1, . . . , n − 1), are provided by AACS LA,
and are used by the device or application to process the MKB to calculate Km . The
set of Device Keys may either be unique per device, or used commonly by multiple
devices. The license agreement describes details and requirements associated with
these two alternatives. A device shall treat its Device Keys as highly confidential, as
defined in the license agreement. [. . . ]
[. . . ] The Media Key Block (MKB) enables system renewability. The MKB is generated
by AACS LA, and allows all compliant devices, each using their set of secret
Device Keys, to calculate the same Km . If a set of Device Keys is compromised in a
way that threatens the integrity of the system, an updated MKB can be released that
causes a device with the compromised set of Device Keys to be unable to calculate
the correct Km . In this way, the compromised Device Keys are “revoked” by the new
MKB. [. . . ]
Each round of the AES cipher used by AACS consists of four steps:
1. Byte substitution. Each byte in the array is substituted using a lookup table.
2. Row shifting. Each row in the array is shifted a defined number of steps.
3. Column mixing. The four bytes in one column of the array are combined using a
linear transformation. This step is omitted in the final round.
4. Round key adding. Each byte in the array is XORed with the corresponding byte
of the round key.
CSS was also designed with the ability to revoke its player keys in a similar way, but the DVD Copy
Control Association never got the chance to put this to good use: the disk key could be calculated
from the information stored in the hidden area without using a player key, in effect making every
player compromised. Utilising AES for key encryption will probably not allow this to happen as
easily.
• Device key (kd ). Each licensed player contains several of these device keys. A device key
allows the disk key to be extracted from the Media Key Block by the player.
• Subsidiary device key (ks ) and processing key (kp ). Keys used in the process of
calculating the media key using the device key.
• Media key (km ). Each disk or medium contains a Media Key Block, which allows players to access the media key, given the correct device key. A hash of the media key, combined with the volume identifier and usage rules, is used to decrypt the title key.
• Title key (kt ). Each title on the disk is encrypted using the title key, which can be either
different or the same for each title on the disk.
The arrangement of a disk protected by AACS is very similar to the layout of a DVD disk.
The disk contains the MKB, the encrypted title keys and the encrypted content. The Media Key
block contains the following elements:
• Verify Media Key record. Contains cv = Ekd (v), with the first 64 bits of v equal to
0123456789ABCDEF (in hexadecimals) padded with a 64-bit random number to a total
of 128 bits. The device can verify whether it has obtained the right kd by calculating
v ′ = Dkd (cv ) and verifying that the first 64 bits of v ′ are the correct value.
• Explicit Subset-Difference record. This part of the MKB contains information on how
to perform the bit expansion used to obtain the processing key. This information is stored
in a number called a subset-difference.
• Media Key Data record. The Media Key Data record contains encrypted media keys
for each corresponding subset-difference.
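The Verify Media Key check (cv = Ekd (v)) can be sketched as follows. Since real AES is not available here, `enc`/`dec` are XOR stand-ins for AES-128 encryption and decryption, and all function names are ours rather than from the AACS specification:

```python
import secrets

MAGIC = bytes.fromhex("0123456789ABCDEF")  # the fixed first 64 bits of v

def enc(key: bytes, block: bytes) -> bytes:
    """Stand-in for AES-128 encryption: XOR with the key (NOT real AES)."""
    return bytes(b ^ k for b, k in zip(block, key))

dec = enc  # the XOR stand-in is its own inverse

def make_verify_record(kd: bytes) -> bytes:
    """cv = E_kd(v): the magic value padded with a 64-bit random number."""
    v = MAGIC + secrets.token_bytes(8)
    return enc(kd, v)

def device_key_is_right(kd: bytes, cv: bytes) -> bool:
    """v' = D_kd(cv); accept kd iff the first 64 bits of v' are the magic value."""
    return dec(kd, cv)[:8] == MAGIC
```

A device iterates this check over its set of device keys to confirm it has obtained the right kd before proceeding.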
The exact manner in which the media key is calculated is not discussed here, but can be read in [Adm04]. However, the following describes briefly how the key is obtained. The device key
is used in a bit expansion function, which expands the number of bits from 128 to 384 bits. The
subset-difference information from the MKB lists whether the bit expansion should repeat with
the first or the last 128 bits, or whether the middle 128 bits should be used as the processing
key. In the latter case the bit expansion is finished and the media key can be decrypted using
the processing key and the media key data corresponding to the subset-difference used in the bit
expansion.
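The walk just described can be sketched as follows. `expand` is a SHA-256 stand-in for the AES-based 128-to-384-bit expansion, and the 'F'/'L' path encoding is our own simplification of the subset-difference information:

```python
import hashlib

def expand(key128: bytes) -> bytes:
    """Stand-in bit expansion from 128 to 384 bits (AACS uses an AES-based one)."""
    out = hashlib.sha256(b"a" + key128).digest() + hashlib.sha256(b"b" + key128).digest()
    return out[:48]  # 384 bits

def processing_key(device_key: bytes, path: str) -> bytes:
    """Walk the subset-difference path: 'F'/'L' repeat the expansion with the
    first/last 128 bits; the walk ends with the middle 128 bits."""
    k = device_key
    for step in path:
        triple = expand(k)
        k = triple[:16] if step == "F" else triple[32:]
    return expand(k)[16:32]  # the middle 128 bits form the processing key
```

The resulting processing key then decrypts the media key, using the Media Key Data entry corresponding to the subset-difference that was used.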
The encryption and decryption of the keys and the content is done with a number of AES-based functions. Some of these are implementations of NIST approved AES uses described in [Dwo01], others appear to be designed by the AACS LA.
For encryption and decryption of keys the Electronic Codebook mode of AES is used, which is the “normal” mode of AES described in case study 15.4 and outlined in [Dwo01]. Every block
of plaintext is encrypted to ciphertext independent of the other plaintext blocks. Likewise,
decryption of a ciphertext block can occur independently of the other ciphertext blocks, if the
key is known.
Content is encrypted and decrypted using the Cipher Block Chaining mode of AES, which is
also covered in [Dwo01]. Encryption is done as follows: c1 = Ek (p1 ⊕ iv) and cj = Ek (pj ⊕ cj−1 )
for j = 2, . . . , n, where ci is the i-th ciphertext block, pi is the i-th plaintext block and iv is the
initialisation vector used for the first encryption. Decryption is straightforward: p1 = Dk (c1 ) ⊕ iv
and pj = Dk (cj ) ⊕ cj−1 .
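These CBC formulas translate directly into code. A minimal sketch, with a XOR stand-in block cipher in place of AES (the chaining structure, not the cipher, is the point):

```python
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def E(k: bytes, block: bytes) -> bytes:
    """Stand-in block cipher (XOR with the key); AACS uses AES-128 here."""
    return xor(k, block)

D = E  # the XOR stand-in is its own inverse

def cbc_encrypt(k: bytes, iv: bytes, blocks: list) -> list:
    """c1 = E_k(p1 XOR iv), cj = E_k(pj XOR c_{j-1})."""
    out, prev = [], iv
    for p in blocks:
        prev = E(k, xor(p, prev))
        out.append(prev)
    return out

def cbc_decrypt(k: bytes, iv: bytes, blocks: list) -> list:
    """p1 = D_k(c1) XOR iv, pj = D_k(cj) XOR c_{j-1}."""
    out, prev = [], iv
    for c in blocks:
        out.append(xor(D(k, c), prev))
        prev = c
    return out
```

Note that decryption of block j only needs cj and cj−1, which is why a damaged ciphertext block affects at most two plaintext blocks.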
AACS also uses a hashing function, shown in figure 15.5, which is based on the AES decryption
mode. The plaintext x is divided into i pieces of 128 bits. x1 and h0 are used as the first input,
where h0 is a given initial value and h1 = Dx1 (h0 ) ⊕ h0 is the hash of these two values. The
subsequent hashes, if necessary, of x2 , . . . , xi are performed with h1 , . . . , hi−1 respectively.
The hashing function described above is important in calculating the title key from the media
key. The media key, as h0 , is used in a hash function together with the volume identifier, as x1 .
The volume identifier is a unique random number for every instance (i.e. disk) of a specific title
or just for a certain title, depending on the implementation the content provider wishes to use.
The result of this hash h1 is used in a hash with the usage rules x2 resulting in h2 . The result
of this last hash is the key with which the encrypted title keys can be decrypted. The obtained
title keys can then be used to decrypt each corresponding title as needed. Note that hashing the
volume identifier and the usage rules can take more than one round of the hashing function, but
for simplicity it is assumed that one round is enough.
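The hash chain and the title-key derivation it supports can be sketched as follows. `D` is a SHA-256 stand-in for AES decryption keyed by the data block, and the media key, volume identifier, and usage rules values are made up, one 128-bit block each for simplicity:

```python
import hashlib

def D(key: bytes, block: bytes) -> bytes:
    """Stand-in for AES-128 decryption under `key` (NOT real AES)."""
    return hashlib.sha256(key + block).digest()[:16]

def aes_hash(h0: bytes, pieces: list) -> bytes:
    """h_j = D_{x_j}(h_{j-1}) XOR h_{j-1}, chained over the 128-bit pieces x_j."""
    h = h0
    for x in pieces:
        h = bytes(a ^ b for a, b in zip(D(x, h), h))
    return h

# Made-up example values:
media_key = bytes.fromhex("00112233445566778899aabbccddeeff")
volume_id = b"VOLUME-ID-000000"
usage_rules = b"USAGE-RULES-0000"

# h0 = media key, x1 = volume identifier, x2 = usage rules;
# h2 is the key that decrypts the encrypted title keys.
title_key_key = aes_hash(media_key, [volume_id, usage_rules])
```

In the real system the volume identifier and usage rules may span several 128-bit pieces; the chain simply continues over all of them.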
Despite the improvements of AACS over CSS, there are still factors which form potential weaknesses and can make AACS as great a failure for the new optical storage disks as CSS was for the DVD. See also [Per05].
• Key management. As stated before, key management is the biggest issue in content
protection. The devices have to be able to decrypt the encrypted content, which means
that they should be able to obtain the key for doing that in one way or another. In AACS
the media key is obtained using the device key. This device key is stored in the hardware
or software, depending on the type of implementation. However, anything that is stored in
a “hidden” area can be reverse engineered.
AACS anticipates this by being able to revoke certain device keys for newer media after a device key is compromised, which implies that certain devices will stop playing newer content after their device key is compromised, until an upgrade takes place. If device
keys are compromised at a high enough rate, the consumer will be stuck with a piece of
software or hardware that needs constant upgrading, which is annoying in both cases and
could be impossible in the latter case.
• Decoded content. Generally speaking, a consumer would buy a movie to view it. This
implies that the contents will have to be decoded at some point. No matter how late the
final decoding takes place, it is inevitable that the content will be decoded at some point
[Per05]. Even possible implementations of trusted computing cannot prevent this².
AACS might count on the Digital Millennium Copyright Act for protection, but this will probably not prevent unauthorised copying; it simply makes it illegal. In the United States CSS was protected under the same law, but the copying of DVDs is probably as common in the United States as it is in Europe, where such a law has not yet been enforced everywhere.
• Professional piracy. Consumer recorders implementing AACS are not allowed to write into the Media Key Block area of the disk, except for its extension, which should prevent bit-by-bit copying. Professional pirates, however, usually have access to the same machinery with which the official disks are made. The Blu-ray Disc Association and the HD DVD Forum will try to anticipate this by giving a very limited number of factories the right to make the disks. This sounds like security through obscurity, which has not proven to be a very successful concept in the past.
² Whether AACS will require a trusted computing platform is not yet clear, but see [Zag04].
Chapter 16
Bluetooth Security
Written by Gerben D. Oostra
Bluetooth is a wireless communication technology for short ranges, to eliminate wires between
nearby devices. Bluetooth allows for wireless data communications within 10 to 100 meters,
without requiring a line of sight. Because it is easier to attack wireless communication, Bluetooth
defines link level authentication and encryption. It is claimed that the frequency hopping spread spectrum used makes it more difficult to ’tap the wire’.
In this chapter it will be shown that the frequency hopping spread spectrum can be exploited,
that the authentication can be deceived to impersonate devices, and that the optional encryption
cannot prevent these attacks.
Bluetooth works in a broadband frequency range around 2.45 GHz, within the ISM (Industrial, Scientific, and Medical) band [LFE+ 01]. The ISM band allows multiple devices to communicate on the same frequency, and interference is avoided with a frequency hopping technology. The broadband is
divided into 23 or 79 (depending on the country code) narrow-band frequencies, through which
the devices switch in a pseudo-random manner. In the U.S. and most other countries 79 bands
have been assigned for use by Bluetooth devices, in Spain and France only 23 [JW01].
Bluetooth combines the frequency hopping with Gaussian Frequency Shift Keying (GFSK) modulation [LFE+ 01].
To successfully use the hopping scheme devices must transmit and receive at synchronized time
moments and time intervals. They also must transmit and receive on the right frequency. This
is achieved with the frequency selection module (FSM), which uses the country code, the device
address, and the clock as input. Which device’s address and clock are used as input depends on
the phase of the connection establishment.
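The interface of the frequency selection module can be sketched as follows. The real selection function is specified bit-by-bit in the Bluetooth specification; this toy version only mimics its inputs and outputs, and the hash-based mixing is our own stand-in:

```python
import hashlib

NUM_CHANNELS = 79  # 23 in countries with the smaller band plan

def hop_frequency(bd_addr: int, clk: int) -> int:
    """Toy frequency selection: a pseudo-random channel derived
    deterministically from a device address and a clock value."""
    seed = bd_addr.to_bytes(6, "big") + clk.to_bytes(4, "big")
    return hashlib.sha256(seed).digest()[0] % NUM_CHANNELS
```

The essential property is that two devices feeding the same address and clock into the module land on the same channel at the same time, without any negotiation per hop.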
Each device has a unique 48 bit address (BD ADDR), consisting of a 24 bit lower address part (LAP), an 8 bit upper address part (UAP) and a 16 bit non-significant address part (NAP) [LFE+ 01]. Besides this unique address, each device has a self determined unit key. This
is calculated with the E21 algorithm, using the BD ADDR and a random number as input. This
key is generated when the device is started for the first time. Each device also has a 28 bit clock (CLK) running at 3.2 kHz, which thus cycles once every 23.3 hours. In each communication, the master transmits in even time slots, and the slave may respond in the following odd slots [Kg03].
Besides avoiding signal interference, it is stated that the pseudo random frequency hopping sequence improves security because it should be harder to listen to a conversation. In
section 16.2.2 it will be explained how the frequency hopping sequence can be exploited using a
relay attack.
To establish a connection to another device, the device first has to be discovered. This is done us-
ing an inquiry [Kg03]. To create a communication link between two devices, the paging procedure
is executed.
Inquiry
The inquiry is used to discover unknown devices in the area [Kg03]. Each device has a general inquiry access code (GIAC), which is used to inquire that device. A device can be set to be discoverable only temporarily (limited discoverable) or not at all (non-discoverable). Depending on its state, a device listens for inquiry packets containing its GIAC on different frequencies. If the client receives such a packet, it responds with its BD ADDR and clock setting. Because the devices are not synchronized at this moment, the inquirer (master) does not know the exact clock setting. The FSM of the inquirer uses the GIAC as address and the clock of the inquiring device as inputs. The sequence consists of 32 frequencies and has a period of length 32.
Paging [Kg03]
When a device is discovered, paging is used to establish a connection. A device can be non-connectable; it will then never scan for page requests. The FSM uses the device address and clock of the slave as input, and its sequence consists of 32 frequencies and has a period of length 32.
The paging procedure starts with a page request. The master guesses the slave’s clock, and determines the corresponding frequency and the frequencies belonging to the previous eight and next seven clock ticks. This is called the A-train. The master quickly sends ID packets containing
the device access code (DAC) of the paged slave on these A-train frequencies. If no response
is received, the master sends the packet again on the remaining frequencies (the eight before
and after the A-train). This is repeated 128 or 256 times, until the slave’s response is received. The slave scans for ID packets containing its DAC on different frequencies. After reception of such an ID packet, the slave responds with the same ID packet. When the master
receives the slave response, both devices are synchronized because they both know on which
frequency to listen (and the clock from which it is derived). The master responds with an FHS
packet containing its own device address and its own actual clock setting. The slave responds to
this with another ID packet. When the master receives the second slave response, it is sure that
both know the master’s address and clock setting.
After paging, the master’s address and clock setting are used by both the master and the slave as input to the FSM, such that both use the same pseudo-random frequency sequence.
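The A-train guessing above can be sketched as follows. `hop` stands in for the slave's page hopping function (a sequence with a period of length 32), and all names are ours:

```python
def a_train(guessed_clk: int, hop) -> list:
    """Page frequencies for the previous eight and the next seven clock ticks
    around the guessed slave clock: 16 frequencies in total."""
    return [hop(guessed_clk + d) for d in range(-8, 8)]

# Toy stand-in for the page hopping sequence (period of length 32):
toy_hop = lambda clk: clk % 32
```

The remaining 16 of the 32 page frequencies are exactly those the master falls back to when no response arrives on the A-train.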
Then, just as if they have met before, the two devices authenticate each other with their (temporary) link key, and then force a change of link keys. During communication, all packets are encrypted by exclusive or’ing them with the (temporary) link key (KAB ). Authentication also yields an authenticated ciphering offset (ACO), which can be used in generating an encryption key.
If the devices have exchanged link keys, the connection is dropped such that a new authentication phase is started.
data^cipher_{A-B} = data_{A-B} ⊕ K_cipher ,   data^cipher_{B-A} = data_{B-A} ⊕ K_cipher    (16.14)
data_{A-B} = data^cipher_{A-B} ⊕ K_cipher ,   data_{B-A} = data^cipher_{B-A} ⊕ K_cipher    (16.15)
The E0 key stream generator consists of four linear feedback shift registers (LFSR), with a
total length of 128 bits, and a four bit finite state machine, the blender FSM [FL01]. For each
outputted bit, the LFSRs are clocked once. The two most significant bits (MSB) of their sum
are used as input to the blender FSM. The least significant bit of the sum is XOR’ed with the
blenders output. The BD ADDR, the clock and the encryption key are used to generate the
initial state. This is shown in figure 16.2.
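The E0 data flow can be sketched with toy parameters. The LFSR lengths below follow the real design, but the tap positions, initial states, and the blender update are simplified stand-ins, not the real E0:

```python
class LFSR:
    """A generic Fibonacci LFSR; taps are bit positions XORed into the feedback."""
    def __init__(self, length, taps, state):
        self.length, self.taps, self.state = length, taps, state

    def clock(self):
        out = self.state & 1
        fb = 0
        for t in self.taps:
            fb ^= (self.state >> t) & 1
        self.state = (self.state >> 1) | (fb << (self.length - 1))
        return out

def keystream(lfsrs, blender, n):
    """Per output bit: clock all LFSRs once and sum their outputs (0..4);
    the sum's two most significant bits feed the 4-bit blender state, and
    its least significant bit is XORed with one blender output bit."""
    bits = []
    for _ in range(n):
        s = sum(r.clock() for r in lfsrs)
        bits.append((s & 1) ^ (blender & 1))
        blender = ((blender << 2) | (s >> 1)) & 0xF  # simplified blender update
    return bits

# Register lengths 25, 31, 33, 39 as in E0; taps and states here are arbitrary toys.
regs = [LFSR(25, (0, 3), 0x1ABCDE), LFSR(31, (0, 6), 0x2345678),
        LFSR(33, (0, 13), 0x13579BDF), LFSR(39, (0, 4), 0x2468ACE02)]
```

The real blender's update and output functions, and the E0 tap polynomials, are given in [FL01]; this toy version only mirrors the data flow between the components.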
When an attacker wishes to minimize the chance of being detected, he can use a passive attack.
The attacker determines the frequency hopping sequence used for communicating. This sequence
depends on the clock and Bluetooth device address of the master. Jakobsson and Wetzel [JW01]
show how the master’s clock and address can be found.
In the inquiry response, the slave device reveals its clock and device address. These are the
seed for the paging hopping sequence. During paging the master reveals its clock and device
address.
Because transmitted messages are not necessarily confined to one time-slot, it is not sufficient to listen to all paging frequencies. However, by scanning through the 32 inquiry frequencies, and eavesdropping on the response messages, the paging hopping sequence seed can be determined. This can then be used to determine the master’s clock and address.
The attacker can then eavesdrop on any further transmitted message. This attack needs 32 parallel frequency listeners. If the two devices have not previously met, then the attacker is able to calculate the PIN, and all derived keys. These keys can then be used for further impersonation attacks, or to eavesdrop on encrypted communication.
PIN determination
If the initialization key is the first unknown key for the attacker, then the security relies on the PIN (which has no minimum length). If the PIN is of a regular size of 4 digits, then the key space is limited to 10,000 different values. This allows an easy brute force attack.
To verify off-line whether the PIN is correct, the attacker needs one challenge-response au-
thentication transcript and the random number generated for the initialization key [JW01].
These can be simply retrieved by the passive attack as described above.
Then an exhaustive search can be done on all PIN values up to a certain length. For each
PIN the possible initialization key can be generated, for each initialization key the SRES’ can
be generated using the BD ADDRESS and the random value, and this can be verified with
communicated SRES.
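The off-line search just described can be sketched as follows. `E22` and `E1` are SHA-256 stand-ins for the real Bluetooth algorithms of those names (which are SAFER+ based); the structure of the search, not the primitives, is the point:

```python
import hashlib

def E22(pin: str, rand: bytes, bd_addr: bytes) -> bytes:
    """Stand-in for Bluetooth's E22 initialization-key derivation."""
    return hashlib.sha256(b"E22" + pin.encode() + rand + bd_addr).digest()[:16]

def E1(key: bytes, challenge: bytes, bd_addr: bytes) -> bytes:
    """Stand-in for Bluetooth's E1 authentication function (returns SRES)."""
    return hashlib.sha256(b"E1" + key + challenge + bd_addr).digest()[:4]

def crack_pin(rand, bd_addr, challenge, observed_sres, max_len=4):
    """Exhaustively try PINs up to max_len digits: derive each candidate
    initialization key and compare the resulting SRES' with the SRES seen
    on the air."""
    for length in range(1, max_len + 1):
        for n in range(10 ** length):
            pin = str(n).zfill(length)
            k_init = E22(pin, rand, bd_addr)
            if E1(k_init, challenge, bd_addr) == observed_sres:
                return pin
    return None
```

With the transcript obtained by the passive attack, a 4-digit PIN falls after at most 10,000 trials, each costing only two cipher evaluations.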
An attacker can participate in the communication between two devices, giving him the power to
alter messages and calculate the PIN. To calculate the PIN code, the initialization key seed and a challenge-response tuple are required. In the previous section 16.2.1 it is shown how to perform a brute force attack on the PIN. It is also possible to initiate such an attack.
PIN determination
An attacker can start communication with the device to impersonate, and transmit a random
number as seed for the initialization key. The attacker sends another random number as challenge,
and retrieves the SRES’ response. The attacker then can perform a brute force attack on the
PIN, verifying that the resulting SRES value is equal to the received SRES’.
A device that uses its unit key as link key reveals it to every device it communicates with; an attacker who obtains it can impersonate that device and eavesdrop on all communication from the master [LFE+ 01], [JW01]. Lamm et al. [LFE+ 01] refer to this as unit key stealing.
Relay attack
Another active attack is the relay attack. The attacker relays every message to the other device,
impersonating both devices. This attack can be started by initiating or intercepting communi-
cation between the two devices to impersonate [JW01]. The general two-sided relay attack is
shown in figure 16.3.
Initiating relay attack The attacker initiates a relay attack by contacting each one of two
devices to impersonate or eavesdrop. The attacker either makes both of them slaves or both
masters. This is needed to let the two devices follow different frequency hop sequences. Therefore
they will not see the messages they transmit to each other. The attacker then can relay every
message to the other. Because the attacker initiates the communication, it will first become the
master. If one attacked device is already the master of a piconet, it prefers to have the attacker as a slave in the piconet to prevent jamming. The attack then requires that the other attacked device is also a master relative to the attacker. However, only slaves can request to become masters. Kügler [Kg03] therefore explains that both attacked devices must request a role switch while establishing a connection, which is an unlikely case. Relaying messages also fails if an encrypted link is used: the master’s address is used for the encryption stream, and both devices believe that the other is the master.
Kügler [Kg03] gives two improvements: Jakobsson and Wetzel [JW01] only initiate the communication, whereas Kügler shows that both intercepting and initiating are possible attacks, even when an encrypted link is used. The attacker establishes a connection to both devices, posing as the other device. The attacker is the master, and can therefore determine the clock setting to be used. The encryption key uses the master’s address and clock setting. It does, however, not depend on the most significant bit of the clock setting. The attacker therefore chooses the clock
of device A (CLKA ) XOR’ed with 227 (the clock is 27 bits). Next, the attacker switches roles
with device A (which A may already have requested). A then uses its own clock setting CLKA ,
as it is the master, while device B still uses CLKA ⊕ 227 . They use the same channel hopping
sequence, but at different offsets, as they both think that A is the master. The attacker now
impersonates both devices to each other, and has the power to change relayed packets.
An attacker can initiate the relay attack on two devices, but this does not make the two
devices transmit interesting files to each other. Kügler [Kg03] shows that it is also possible to
intercept a communication startup using a relay attack.
Intercepting relay attack The interception attack of Kügler [Kg03] makes use of the paging
procedure. The master sequentially sends a number of page requests until the slave responds.
The attacker responds faster than the slave to the page request. This can be achieved with two
methods.
The first method is ’jamming’. The attacker scans all 32 page hopping frequencies in parallel,
and responds as fast as possible. It will therefore not respond slower than the paged slave, but possibly at the same time. However, the channel will then be jammed, and the attacker gets another chance.
The second method is ’busy waiting’, and only requires one listening device. The attacker starts establishing a connection with the slave before the master does, but does not send the FHS packet. This leaves the slave waiting for an FHS packet. The attacker can now respond to the master’s page request (the ID packet), without having the slave as competitor. The slave waits for FHS packets until a timeout occurs; after this timeout it restarts listening for ID packets. If no ID packet is received, the slave returns to normal operation. If the FHS timeout occurs, the attacker can again initiate a connection, leaving the slave again in a ”half open” connection.
At this point, the attacker has a connection with the master. The attacker restarts the paging
procedure with the slave, and finally responds with a relayed FHS packet containing a different
clock setting. The master and slave will therefore use the same channel hopping sequence, but
at a different offset. If the slave requests a role switch, the attacker relays the packets. When the new master sends an FHS packet, the attacker can again change the clock setting, again leaving both devices at different offsets in the same new channel hopping sequence.
The two attacked devices thus have different clock settings, so that they do not hear each other. It is unfortunately possible to relay encrypted communication with a certain clock offset. The encryption sequence depends on only the first 26 clock bits, discarding the most significant bit. By flipping the most significant bit for one device at the connection establishment, both use the same encryption key but different frequency hopping spectra. The lower clock bits, however, do need to be equal. Therefore, the attacker has to relay a packet within the same transmission frame. Bluetooth has an uncertainty window of 10µs, because clocks do not drift by whole clock ticks. This gives the attacker 10µs to relay the packet.
Changing communication When the attacker relays all (encrypted) communication, he has
the power to flip certain bits. In the previous section it is shown that the attacker needs to relay
packets within 10µs if the communication is encrypted. It may take more time to determine
which bits to flip. The solution is to not relay the messages, and reply with NAK packets to the
transmitter. The attacker replies with NAK packets until he knows which bits to flip. In the
next resent message, the attacker can flip the bits and immediately bitwise relay the message.
The attacker can continue to reply with NAK packets until the other device gives the desired
response.
If only a communication is recorded, without any authentication messages, the cipher itself can be attacked. This can be done using a 128 bit known plain-text attack with 2^100 bit operations, or with a known birthday attack with time and memory complexity 2^66 [JW01].
Fluhrer and Lucks [FL01] derive the cipher key from a set of known key stream bits in time varying from O(2^84) if 132 bits are available to O(2^73) given 2^43 known key stream bits. They use a linear consistency attack, described in 1989 by Zeng, Yang, and Rao [ZYT89]. The E0 key stream generator used in Bluetooth uses 4 linear feedback shift registers (LFSR), and a 4-bit finite state machine (blender FSM). The LFSRs are added; the two most significant bits (MSB) are input to the blender, and the blender’s output is XOR’ed with the least significant bit. This is shown in figure 16.2.
The LFSRs are of bit length 25 (LFSR1), 31 (LFSR2), 33 (LFSR3), and 39 (LFSR4). Fluhrer and Lucks’ [FL01] base attack technique is guessing the initial states of part of the cipher, and checking for consistency. For each possible starting state a set λ of linear equations for LFSR3 and LFSR4 is maintained. They repeatedly examine a possible state n of LFSR1, LFSR2 and the blender FSM. The exclusive or of these with the known key stream bit Zn should be equal to the exclusive or of LFSR3 and LFSR4. If the exclusive or is zero, a branch is created for the λ rules (LFSR3_n = 0 ∧ LFSR4_n = 0) and (LFSR3_n = 1 ∧ LFSR4_n = 1). If the exclusive or is one, then the rule (LFSR3_n ≠ LFSR4_n) is added to λ. For n ≥ 33 the tap equations of LFSR3 are added to λ. For n ≥ 39 the tap equations of LFSR4 are added to λ. If λ becomes inconsistent, backtrack to the next state. If it is consistent, then the next blender FSM state can be computed using the current (known) state and the (known) number of LFSRs that output a one.
Fluhrer and Lucks [FL01] show that the base attack can be further improved. They note that the base attack is more efficient if the outputs of LFSR3 and LFSR4 exclusive or’ed together happen to have a high Hamming weight (the number of ’1’ bits in the sequence). Therefore, they assume that, at a specific point in the key stream, the next n+1 bits of (LFSR3 ⊕ LFSR4) are n ones followed by a zero. Here n will be less than the length of the LFSRs. The algorithm becomes the following. Select a position in the known key stream that is the start of more than 132 consecutive known bits. Examine each possible combination of the 4 bits of the blender FSM state, the 25 bits of LFSR1 state and the last 30−n bits of LFSR2 state. Compute the initial n + 1 bits of LFSR2 state that are consistent with the exclusive or of LFSR3 and LFSR4 consisting of n ones and then a zero. Run the base attack on that setting. Stop if it finds a consistent initial setting.
If multiple non-consecutive packets are received, the attack can be done independently on each one of them. However, the exclusive or of the clock values is known, and therefore the exclusive or of the first level LFSRs is known [FL01].
Because of this attack, the real security level of E0 is no more than 73–84 bits. The security depends on the amount of key stream available to the attacker, but not on the key length if it is above 64 bits. The Bluetooth specification requires that the length of keys can be changed, to easily increase the security; this attack shows that all keys of length more than 64 bits are equally secure.
Jakobsson and Wetzel [JW01] explain that a more practical solution is to determine the stream
cipher output using a known sent/received plain-text, with which the corresponding received/sent
cipher-text can be decrypted. This is possible because the stream cipher output is reused for
decryption [JW01]. Using this method, the number of known bits can be increased, lowering the
complexity of determining the full encryption key using the method of Fluhrer and Lucks [FL01].
I think the knowledge of absolute CLK (27 used bits) and BD ADDR (48 bits) values can lower
the complexity of the encryption key attack. The attacker needs to know these to be able to
eavesdrop on the communication, so he could easily use them to attack the encryption key.
[Ada97] Adams, C. Constructing symmetric ciphers using the cast design procedure. In proc.
Designs, Codes and Cryptography 12 (1997), Kluwer Academic Publishers, pp. 283–
316.
[AMT01] Miyaji, A., Nakabayashi, M., and Takano, S. New explicit conditions of elliptic curve traces for fr-reduction. IEICE Trans. Fundamentals (2001).
[BDJR00] Bellare, M., Desai, A., Jokipii, E., and Rogaway, P. A concrete security
treatment of symmetric encryption.
http://www-cse.ucsd.edu/users/mihir/papers/sym-enc.ps.gz.
[Ben92] Bennett, C. H. Quantum cryptography using any two nonorthogonal states. Phys.
Rev. Letters 68, 21 (1992), 3121–3124.
[Bra93] Brands, S. An efficient off-line cash system based on the representation problem.
Tech. rep., 1993.
http://ftp.cwi.nl/CWIreports/AA/CS-R9323.pdf.
[CGJ+ 01] Crosby, S., Goldberg, I., Johnson, R., Song, D., and Wagner, D. A
cryptanalysis of the high-bandwidth digital content protection system. LNCS 2320
Security and Privacy in Digital Rights Management (2001), 192–200.
[Cro] Crowell, W. Introduction to the venona project.
http://www.nsa.gov:8080/docs/venona/index.html.
[DBS04] Boneh, D., Lynn, B., and Shacham, H. Short signatures from the weil pairing. Journal of Cryptology 17 (2004), 297–319.
[FL01] Fluhrer, S. and Lucks, S. Analysis of the e0 encryption system. In proc. Lecture notes in Computer Science 2259 (2001), Springer-Verlag, pp. 38–48.
[FR94] Frey, G. and Rück, H. A remark concerning m-divisibility and the discrete logarithm in the divisor class group of curves. Math. Comput. 62 (1994), 865–874.
[GH04a] Garcia, F. D. and Hoepman, J.-H. Off-line karma: A decentralized currency for
peer-to-peer and grid applications.
[GH04b] Garcia, F. D. and Hoepman, J.-H. Off-line karma: Towards a decentralized
currency for peer-to-peer and grid applications (brief abstract).
[Hes] Hess, F. Exponent group signature schemes and efficient identity based signature
schemes based on pairings.
[Jou04] Joux, A. A one round protocol for tripartite diffie-hellman. Journal of Cryptology
17 (2004), 263–276.
[JQYY02] Joye, M., Quisquater, J., Yen, S., and Yung, M. Observability analysis
- detecting when improved cryptosystems fail. Lecture Notes in Computer Science
2271 (2002).
[Kat05] Katos, V. A randomness test for block ciphers. Applied Mathematics and Compu-
tation 162 (2005), 29–65.
[Kg03] Kügler, D. ”Man in the middle” attacks on bluetooth. In proc. Lecture notes in Computer Science 2742 (2003), Springer-Verlag, pp. 149–161.
[KR96] Kilian, J. and Rogaway, P. How to protect des against exhaustive key search.
In proc. Advances in cryptology-CRYPTO ’96 (1996), vol. 1109 of Lecture Notes in
Computer Science, Springer-Verlag, Berlin, pp. 252–267.
[LcA+ 04] Levi, A., Çetintas, E., Aydos, M., Kaya Koç Çetin, and Çaglayan, M. U.
Relay attacks on bluetooth authentication and solutions. In proc. Lecture notes in
Computer Science 3280 (2004), Springer-Verlag, pp. 278–288.
[LFE+ 01] Lamm, G., Falauto, G., Estrada, J., Gadiyaram, J., and Cockerham, D.
Bluetooth wireless networks security features. In proc. Proceedings of the 2001 IEEE
(2001), Society for Industrial and Applied Mathematics, pp. 265–272.
[LHL93] Lenstra, A. and H.W. Lenstra, J. (eds.). The development of the number field
sieve, vol. 1554 of Lecture Notes in Mathematics. Springer-Verlag, 1993.
[LHT97] Lee, J., Heys, H., and Tavares, S. Resistance of a CAST-like encryption algo-
rithm to linear and differential crytanalysis. In proc. Designs, Codes and Cryptography
12 (1997), Kluwer Academic Publishers, pp. 267–282.
[LN02] Lefranc, S. and Naccache, D. Cut-&-paste attacks with java. Lecture Notes in
Comnputer Science 2587 (2002).
[Mat94] Matsui, M. Linear cryptanalysis methods for des cipher. In proc. Advances in
Cryptology, EUROCRYPT ’93 (Berlin, 1994), vol. 765 of Lecture Notes in Computer
Science, Springer-Verlag, pp. 386–397.
[Mil04] Miller, V. S. The weil pairing, and its efficient calculation. Journal of Cryptology
17 (2004), 235–261.
[Nie00] Nielsen, J. The case for micropayments. Journal of Cryptology 13 (2000), 361–396.
[Pom05] Pomerance, C. Smooth numbers and the quadratic sieve. In Algorithmic Number
Theory: Lattices, Number Fields, Curves and Cryptography, J. Buhler and P. Steven-
hagen (eds.), Cambridge University Press, 2005.
[PS00] Pointcheval, D. and Stern, J. Security arguments for digital signatures and
blind signatures. Journal of Cryptology 13 (2000), 361–396.
[PSB04] Barreto, P. S. L. M., Lynn, B., and Scott, M. Efficient implementation of pairing-based cryptosystems. Journal of Cryptology 17 (2004), 321–334.
[Sha49] Shannon, C. Communication theory of secrecy systems. Bell System Technical
Journal 28 (1949), 656–715.
[Shi00] Shirky, C. The case against micropayments. Journal of Cryptology 13 (2000), 361–
396.
[Sho97] Shor, P. W. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM Journal on Computing 26 (1997), 1484–1509.
http://www.arxiv.org/abs/quant-ph/9508027.
[SLL+ ] Sung, J., Lee, S., Lim, J., Lee, W., and Yi, O. Concrete security analysis of
ctr-ofb and ctr-cfb modes of operation.
http://cist.korea.ac.kr/Tr/TR01_18.ps.
[SSSW98] Salter, C., Saydjari, O., Schneier, B., and Wallner, J. Towards a secure
system engineering methodology. Tech. rep., NSA, 1998.
[ST01] National Institute of Standards and Technology. Announcing the advanced encryption standard. Federal Information Processing Standards Publication FIPS-197, 2001.
[Ste99] Stevenson, F. A. Cryptanalysis of contents scrambling system. http://www-2.cs.cmu.edu/~dst/DeCSS/FrankStevenson/analysis.html, 1999.
[Ste05] Stevenhagen, P. The number field sieve. In Algorithmic Number Theory: Lat-
tices, Number Fields, Curves and Cryptography, J. Buhler and P. Stevenhagen (eds.),
Cambridge University Press, 2005.
[Tel02] Tel, G. Cryptografie: Bescherming van de Digitale Maatschappij. Addison-Wesley,
2002 (363 pp.).
[VVS01] Vivek Vishnumurthy, S. C. and Sirer, E. G. Karma: A secure economic
framework for peer-to-peer resource sharing.
[Wik05a] Wikipedia. Advanced encryption standard, 2005.
http://en.wikipedia.org/wiki/AES.
[Wik05b] Wikipedia. Dvd, 2005. http://en.wikipedia.org/wiki/DVD.
[XDT03] Xu, S., Doumen, J., and van Tilborg, H. On the security of digital signatures based on error-correcting codes. Designs, Codes and Cryptography 28 (2003).
[Zag04] Zagar. Son of css: Hd-dvd v. skylink..., 2004.
http://zagarsbrain.blogspot.com/2004/12/son-of-css-hd-dvd-v-skylink.html.
[Zei99] Zeidler, E. Applied Functional Analysis, corrected third printing edition. 1999.
[ZYT89] Zeng, K., Yang, C.-H., and Rao, T. On the linear consistency test (lct) in cryptanalysis with applications. In proc. Lecture notes in Computer Science 435 (1989), Springer-Verlag, pp. 164–176.
Index