Cryptographic Hash Functions
Article · May 2021
Wahome Macharia
University of London
All content following this page was uploaded by Wahome Macharia on 25 May 2021.
Cryptographic Hash Functions
Kelvin W. Macharia
kwm14@student.london.ac.uk
Abstract. Cryptography is the science and technique of securing information and
communications to guarantee that only those for whom the information is intended
can understand and process it. Hashing is the process through which input data of
any length is mapped to a fixed-length value known as a hash. A
function that does hashing is a hash function. All cryptographic hash functions are
hash functions but not every hash function is a cryptographic hash function. This
paper describes what cryptographic hash functions are, what security properties are
expected of them and what attacks can be performed against them.
1. Introduction
A hash function is a mathematical algorithm that takes data of arbitrary length as input and
maps it to a fixed-length value as output. This output is called a message digest, a
hash value, a hash code or simply a hash.
More formally, a hash function is a mathematical function H : D → R, where the domain D =
{0,1}* and the range R = {0,1}^n for some n >= 1, that maps an input value m of arbitrary
length into a condensed output value h of fixed length n [1]. That is: h = H(m).
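As a minimal illustration of h = H(m), the sketch below assumes Python's standard hashlib module and takes SHA-256 as the example H, so that n = 256. The choice of SHA-256 and the sample messages are assumptions made for illustration only.

```python
import hashlib

def H(m: bytes) -> bytes:
    """Example hash function H: {0,1}* -> {0,1}^256 (SHA-256 is an assumption)."""
    return hashlib.sha256(m).digest()

# Inputs of very different lengths, including the empty message.
messages = [b"", b"abc", b"a" * 1_000_000]
digests = [H(m) for m in messages]

for m, h in zip(messages, digests):
    print(f"input length {len(m):>7} bytes -> digest length {len(h) * 8} bits")
```

Whatever the input length, the digest length reported is the same, which is the fixed-length property stated in the definition.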
A hash function that satisfies some additional requirements so that it can be used for
cryptographic applications is known as a cryptographic hash function. These functions are
essential constructs that have a variety of use cases. The main fields of their application are
protection of stored passwords, message authentication, digital signatures, and therefore
certificates.
Cryptographic hash functions are broadly classified into two classes: unkeyed hash
functions, also known as Manipulation Detection Codes (MDCs), with a single
parameter – an input message – and keyed hash functions, also known as Message
Authentication Codes (MACs), with two distinct inputs – an input message and a secret
key. Generally, the term hash functions refers to unkeyed hash functions [8].
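The difference between the two classes can be sketched with Python's standard hashlib and hmac modules. HMAC is only one possible keyed construction and is used here as an assumption. The message and both keys are hypothetical values chosen for illustration.

```python
import hashlib
import hmac

message = b"transfer 100 to account 42"   # hypothetical message
key = b"hypothetical-shared-secret"        # hypothetical secret key

# Unkeyed (MDC): anyone can recompute the digest from the message alone.
mdc = hashlib.sha256(message).hexdigest()

# Keyed (MAC): the tag depends on both the message and the secret key.
mac = hmac.new(key, message, hashlib.sha256).hexdigest()
mac_other_key = hmac.new(b"another-key", message, hashlib.sha256).hexdigest()

print("MDC:", mdc)
print("MAC:", mac)
print("same tag under a different key?", hmac.compare_digest(mac, mac_other_key))
```

A party without the key can still compute the MDC, but cannot produce a valid MAC tag for a modified message.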
Some examples of cryptographic hash algorithms are:
▪ The SHA (Secure Hash Algorithm) family - published by the National Institute
of Standards and Technology (NIST) as a U.S. Federal Information Processing
Standard (FIPS) [2]. This family designates six different hash functions: SHA-0,
SHA-1, SHA-224, SHA-256, SHA-384, and SHA-512 [2, 3]. The first four operate on 512-bit
message blocks divided into 32-bit words and the last two on 1024-bit blocks divided
into 64-bit words. Bitcoin, the original and largest cryptocurrency (at the time of
writing), uses the SHA-256 hash function.
▪ The MD (Message Digest) family - comprises MD2, MD4, MD5 and MD6,
designed by Ronald Rivest. MD5 was published as RFC 1321 [4].
▪ RIPEMD (RACE Integrity Primitives Evaluation Message Digest) – a family
of cryptographic hash functions based upon the design principles used in MD4,
developed by Hans Dobbertin, Antoon Bosselaers, and Bart Preneel at the COSIC
research group at the Katholieke Universiteit Leuven. RIPEMD-160 produces a hash
digest of 160 bits (20 bytes).
▪ Whirlpool – designed by Vincent Rijmen and Paulo S. L. M. Barreto, this hash
function is based on a substantially modified version of the Advanced Encryption
Standard (AES). Whirlpool produces a hash digest of 512 bits (64 bytes).
▪ BLAKE – a hash function submitted to the NIST hash function competition by Jean-
Philippe Aumasson, Luca Henzen, Willi Meier, and Raphael C.-W. Phan. It is based
on Dan Bernstein's ChaCha stream cipher, but a permuted copy of the input block,
XORed with round constants, is added before each ChaCha round.
▪ Curl-P – a hash function formerly used in IOTA Signature Scheme (ISS). IOTA is a
cryptocurrency designed for use with the Internet of Things (IoT) and automotive
ecosystems. ISS is based on Winternitz One-Time Signatures but unlike traditional
Winternitz, in IOTA users sign the hash of a message. Thus, the security of ISS relies
on its cryptographic hash function, which was Curl-P-27.
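The block and digest sizes quoted above for the SHA family can be checked with a short sketch, assuming Python's standard hashlib module. Only the SHA variants that hashlib guarantees on every platform are queried. SHA-0 and the other families listed are left out because their availability depends on the local build.

```python
import hashlib

# SHA family members that Python's hashlib guarantees on all platforms.
names = ["sha1", "sha224", "sha256", "sha384", "sha512"]

sizes = {}
for name in names:
    h = hashlib.new(name)
    # block_size and digest_size are reported in bytes; convert to bits.
    sizes[name] = (h.block_size * 8, h.digest_size * 8)
    print(f"{name:>7}: block = {sizes[name][0]:>4} bits, "
          f"digest = {sizes[name][1]:>3} bits")
```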
2. Properties of cryptographic hash functions
A cryptographic hash function is expected to have the following properties that guarantee its
effectiveness and security:
• One-way function (pre-image resistance) – this property requires that for a
hash function H if given any hash value h, it is computationally infeasible to find an
input m such that H(m) = h. In other words, it must be easy to compute on every
input but extremely difficult to invert given the image of a random input.
• Target collision resistance (2nd pre-image resistance) – this property
requires that given a hash function H and any input m, it should be computationally
infeasible to find another input m’ such that m’ ≠ m and H(m) = H(m’).
• Collision resistance - this property requires that given a hash function H, it should
be computationally infeasible to find two inputs m and m’ such that m ≠ m’ and H(m)
= H(m’). Due to the fixed size of hash values compared to the much larger – and
arbitrary – size of inputs, collisions are expected to exist in hash functions. However,
they must be computationally intractable to find.
• Deterministic – this property requires that a hash function H should consistently
map a given input m to a hash value h. It should also be public and computable.
• Avalanche effect – this property requires that a change in just one bit of the input
data should result in a large change in the output. This “diffusion” ensures that any
inference about the input from the output is infeasible; this property is therefore
also sometimes referred to as unruliness.
• Hash speed – an ideal property of a cryptographic hash function is its ability to
operate at a reasonable speed. In many situations, a hashing algorithm should
compute hash values rather quickly. However, it’s worthwhile to note that faster is
not always better or more secure.
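The deterministic and avalanche properties listed above can be observed with a small experiment. This is a sketch that assumes SHA-256 from Python's hashlib as the hash function. The helper names and the sample message are hypothetical.

```python
import hashlib

def sha256_bits(data: bytes) -> int:
    """Return the SHA-256 digest of data as a 256-bit integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def flip_bit(data: bytes, index: int) -> bytes:
    """Return a copy of data with the bit at position index inverted."""
    out = bytearray(data)
    out[index // 8] ^= 1 << (index % 8)
    return bytes(out)

m = b"cryptographic hash functions"
m_flipped = flip_bit(m, 0)  # differs from m in exactly one bit

# Deterministic: hashing the same input twice gives the same value.
same_twice = sha256_bits(m) == sha256_bits(m)

# Avalanche: count how many of the 256 output bits differ.
changed = bin(sha256_bits(m) ^ sha256_bits(m_flipped)).count("1")
print(f"deterministic: {same_twice}; {changed} of 256 output bits changed")
```

For a hash function with good diffusion, the number of changed output bits is expected to be close to half of the output length.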
3. Attacks on cryptographic hash functions
Attacking a cryptographic hash function implies breaking one of its security properties. For
example, breaking pre-image resistance means that an attacker can create a message that
hashes to a specific hash [5]. Attacks on hash functions may focus on either the structure of
the hash function or on the algorithm of the compression function used to condense
arbitrary size input into a fixed size hash value.
Over the years, a significant number of cryptographic hash functions have been broken and
proven to be vulnerable to security attacks. The main target of these attacks is the collision
resistance of hash functions. For instance, in August 2004 collisions were found in several
then-popular hash functions, including MD5 [6].
A hash function can be termed “broken” when at least one of its properties can be
overcome using fewer evaluations than the brute force attack complexity estimated by its
designer, regardless of the computational feasibility of that effort. For instance, suppose
that it requires 2^90 evaluations to find a collision for a 256-bit hash function. The hash
function is broken because this is fewer than the 2^128 evaluations required by the
Birthday attack, even though 2^90 evaluations remain computationally intractable [7].
Attacks on hash functions can be classified into two broad categories – Brute Force attacks
and Cryptanalytical attacks. Figure 1 below illustrates this classification:
Fig. 1 Classification of attacks on Hash Functions
3.1 Brute Force attacks
A brute force attack, also known as an exhaustive search, is a trial-and-error based
approach in which an attacker uses a set of predetermined inputs (guesses) against
an algorithm while analysing the output for a possible match. It is the equivalent of
trying every key on a key ring and eventually finding the right one. Brute force attacks
work on all hash functions independent of their structure or any other working
details [8].
The resistance of a hash function to Brute Force attacks depends solely on the size
of its output hash value. For a hash output of length n bits, the effort required to
mount the various classical brute force attacks can be expressed as follows:
• One-way function inversion (pre-image resistance) attack – the
effort required to find an input m which maps to h by H given a challenge h
equals 2^n because, for a given n-bit hash h of the hash function H, an attacker
would evaluate H with every plausible input m until the desired output hash
value h is obtained.
• 2nd pre-image resistance attack – the effort required to find, for a given
input m, a second input m’ ≠ m that is mapped to the same output by H equals
2^n. In this variation of the brute force attack, an attacker would evaluate the
hash function H with every possible input m’ ≠ m until the hash value
h = H(m) is obtained.
• Collision attack – the aim is to find, for a given hash function H, two inputs
m and m’ such that m ≠ m’ and H(m) = H(m’). An attacker who fixes one
input and searches for another with a matching hash value would, on average,
have to try 2^(n-1) (i.e., 2^n / 2) inputs. However, in what is referred to as a
Birthday attack, which is based on the Birthday Paradox, the attacker is free
to choose both inputs, in which case the effort required to find a collision
equals approximately 2^(n/2) as opposed to 2^(n-1) [9].
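The gap between these efforts can be made concrete on a deliberately weakened hash. The sketch below assumes a toy function that truncates SHA-256 to 16 bits, so that both searches finish quickly. The names toy_hash, second_preimage and birthday_collision are hypothetical, and the trial counts it prints belong to this toy setting only.

```python
import hashlib
from itertools import count

N_BITS = 16  # toy output size; real hash functions use far larger n

def toy_hash(m: bytes) -> int:
    """SHA-256 truncated to N_BITS bits (an illustrative stand-in for H)."""
    digest = hashlib.sha256(m).digest()
    return int.from_bytes(digest, "big") >> (256 - N_BITS)

def second_preimage(m: bytes):
    """Try candidate inputs until one other than m hashes to H(m)."""
    target = toy_hash(m)
    for trials in count(1):
        candidate = b"guess-%d" % trials
        if candidate != m and toy_hash(candidate) == target:
            return candidate, trials

def birthday_collision():
    """Remember every hash value seen so far and stop at the first repeat."""
    seen = {}
    for trials in count(1):
        candidate = b"msg-%d" % trials
        h = toy_hash(candidate)
        if h in seen:
            return seen[h], candidate, trials
        seen[h] = candidate

m = b"original message"
m2, t_second = second_preimage(m)
a, b, t_birthday = birthday_collision()
print(f"second pre-image after {t_second} trials (order of 2^{N_BITS})")
print(f"collision after {t_birthday} trials (order of 2^{N_BITS // 2})")
```

The birthday search trades memory for time: it stores every hash value computed so far, which is what allows any pair of inputs to collide rather than one fixed target.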
Further extensions of these classic Brute Force attacks have been studied by various
authors. These include the K-way collision attack for K >= 2, whose aim is to
find K different inputs m_1, …, m_K such that H(m_1) = … = H(m_K) [5], and the K-way
2nd pre-image resistance attack for K >= 1 where, given an input m, a hash value h,
and a hash function H such that h = H(m), the aim is to find K different inputs m_i
with H(m_i) = h and m_i ≠ m.
3.2 Cryptanalytical attacks
Cryptanalysis is the study of ciphertext, ciphers and cryptosystems with the aim of
identifying weaknesses or leaks of hidden aspects of cryptosystems that are useful in
obtaining meaning of encrypted information without access to the secret key typically
required to do so. Cryptanalysis of hash functions focuses on their underlying
structure and/or the algorithm of the compression function.
Hash functions must be efficient to be used in information processing tasks such as
computation of digital signatures. Efficiency is achieved by designing them in the
iterative mode of operation where a function that accepts fixed length input is
iterated until an arbitrary length input is processed completely [7]. This iterative
structure was independently proposed by Ivan Damgård [10] and Ralph Merkle [11] at
Crypto ’89. It is referred to as the Merkle-Damgård construction and is applied in the
design of most hash functions in use today.
Fig. 2 General structure of most hash algorithms (IJERT [12])
The hash function takes an input message and partitions it into L fixed size blocks of
b bits each. The final block can be padded to b bits if necessary and may also include
the value of the total length of the input to the hash function which makes the job of
an attacker more difficult. They must either find two messages of equal length that
hash to the same value or two messages of differing lengths that, with each of their
respective lengths appended, hash to the same value [12].
In their respective papers, Damgård [10] and Merkle [11] provide theorems showing
that if there exists a fixed-length input collision resistant compression function
f : {0,1}^b x {0,1}^t → {0,1}^t then one can design a variable-length input collision
resistant hash function H : {0,1}* → {0,1}^t by iterating that compression function [7].
In other words, if the compression function is collision resistant, then so is the
resultant iterated hash function. Equivalently, a collision in the iterated hash
function implies a collision in its compression function, but the converse of this
result is not necessarily true in general.
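The iterative structure described above can be sketched in a few lines. This is a toy model under stated assumptions, not any standardised algorithm: the compression function is built from truncated SHA-256 purely for illustration, the block size b and chaining size t are small hypothetical values, and the initial value is assumed to be all zeros.

```python
import hashlib

BLOCK = 16          # b: block size in bytes (toy value)
STATE = 8           # t: chaining value and digest size in bytes (toy value)
IV = bytes(STATE)   # fixed initial chaining value (all zeros, an assumption)

def compress(chaining: bytes, block: bytes) -> bytes:
    """Toy compression function f: {0,1}^t x {0,1}^b -> {0,1}^t.

    Built from truncated SHA-256 purely for illustration.
    """
    assert len(chaining) == STATE and len(block) == BLOCK
    return hashlib.sha256(chaining + block).digest()[:STATE]

def pad(message: bytes) -> bytes:
    """Append 0x80, zero bytes, and the 8-byte message length in bits."""
    length = (8 * len(message)).to_bytes(8, "big")
    zeros = (-(len(message) + 1 + 8)) % BLOCK
    return message + b"\x80" + bytes(zeros) + length

def md_hash(message: bytes) -> bytes:
    """Iterate the compression function over the padded message blocks."""
    padded = pad(message)
    state = IV
    for i in range(0, len(padded), BLOCK):
        state = compress(state, padded[i:i + BLOCK])
    return state

print(md_hash(b"").hex())
print(md_hash(b"abc" * 20).hex())
```

Including the message length in the final block is the strengthening step mentioned in the text: two messages of different lengths always differ in their last padded block.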
Cryptanalytical collision finding algorithms and attacks may be classified as either
single or multi block attacks depending on whether the attack uses one compression
function (one block) or more than one iteration of the compression function (more
than one block) in finding collisions or pre-images [8]. In his thesis, Gauravaram [7]
further classifies cryptanalytical attacks on hash functions into generic and specific
attacks.
Generic attacks are those that work on a general hash function construction. These
attacks are applicable even if the underlying compression algorithm is replaced by
some abstract oracle [8]. For example, attacks on the Merkle-Damgård construction
that work on all hash functions designed using the approach are generic attacks.
Length extension attacks, Joux's multi-collision attacks, multi (2nd) pre-image attacks
such as the one based on fixed points, correcting block attacks, herding attacks, and
meet-in-the-middle attacks are examples of generic cryptanalytical attacks.
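A length extension attack illustrates why such attacks are called generic: it treats the compression function as a black box and exploits only the iterated structure. The sketch below runs against a toy Merkle-Damgård hash whose compression function, block size, padding rule and all-zero initial value are assumptions made for illustration. The secret, message and extension are hypothetical values.

```python
import hashlib

BLOCK, STATE = 16, 8  # toy block and chaining-value sizes in bytes

def compress(chaining: bytes, block: bytes) -> bytes:
    """Toy compression function (truncated SHA-256, illustrative only)."""
    return hashlib.sha256(chaining + block).digest()[:STATE]

def pad(message: bytes) -> bytes:
    """Merkle-Damgård style padding: 0x80, zeros, 8-byte bit length."""
    zeros = (-(len(message) + 9)) % BLOCK
    length = (8 * len(message)).to_bytes(8, "big")
    return message + b"\x80" + bytes(zeros) + length

def iterate(state: bytes, blocks: bytes) -> bytes:
    """Feed whole blocks through the compression function."""
    for i in range(0, len(blocks), BLOCK):
        state = compress(state, blocks[i:i + BLOCK])
    return state

def md_hash(message: bytes) -> bytes:
    """Toy iterated hash starting from an all-zero initial value."""
    return iterate(bytes(STATE), pad(message))

def extend(digest: bytes, hidden_len: int, extension: bytes):
    """Forge H(hidden || glue || extension) from H(hidden) and len(hidden)."""
    glue = pad(bytes(hidden_len))[hidden_len:]   # padding the hasher appended
    prefix_len = hidden_len + len(glue)          # a multiple of BLOCK
    tail = pad(bytes(prefix_len) + extension)[prefix_len:]
    return glue, iterate(digest, tail)           # resume from the known digest

secret = b"hypothetical-secret"          # unknown to the attacker
message = b"amount=10"
digest = md_hash(secret + message)       # what the attacker observes

extension = b"&amount=9999"
glue, forged = extend(digest, len(secret) + len(message), extension)
print(forged == md_hash(secret + message + glue + extension))
```

The attacker never evaluates the hash on the secret. Knowing the digest and the total input length is enough to resume the iteration, which is why the final chaining value of a plain Merkle-Damgård hash should not be used directly as a keyed tag.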
Specific attacks are those that apply differential [19], linear [20,21] or rotational [22]
cryptanalysis, or that work on a specific hash function or the algorithm underlying its
compression function, such as the collision attacks on MD4 [13], MD5 [14,15],
SHA-0 [16,17], and SHA-1 [16,18]. If the hash function’s underlying
compression function is implemented using an encryption algorithm, then
weaknesses in that encryption algorithm, such as a complementation property,
weak keys or fixed points, can be exploited to attack the hash function.
4. Conclusion
Cryptanalytical attacks on hash functions, just like with encryption algorithms, seek to
exploit some property of the algorithm to perform some attack other than an exhaustive
search.
Hash functions are however practically easier to attack than encryption algorithms because
the attacker does not need to assume any secrets and the maximum computational effort
required to attack the hash function is only upper bounded by the attacker's resources and
not by the user’s gullibility. This is not the case with block ciphers where the maximum practical
count of executions of the block algorithm is limited by how much computational effort the
attacker can get the user to do [7].
The measure of resistance of a hash algorithm to cryptanalysis is based upon a comparison of
its strength to the level of effort required for a brute-force attack. That is, an ideal hash
algorithm will require a cryptanalytic effort greater than or equal to the brute-force effort.
References
[1] Rompay, B. V. (2004) “Analysis and Design of Cryptographic Hash functions, MAC
algorithms and Block Ciphers”. Ph.D. thesis, Leuven, Belgium: Electrical Engineering
Department, Katholieke Universiteit.
[2] National Institute of Standards and Technology (NIST). (1995) “Secure Hash Standard”.
FIPS Publication 180-1.
[3] National Institute of Standards and Technology (NIST). (2002) “Secure Hash Standard”.
FIPS Publication 180-2.
[4] Rivest, R. (1992) “The MD5 Message-Digest Algorithm”. Request for Comments: 1321,
Network Working Group.
[5] Lucks, S. (2004) “Design Principles for Iterated Hash Functions”. Cryptology ePrint
Archive, Report 2004/253, pp. 253. https://eprint.iacr.org/2004/253
[6] Wang, X. Feng, D. Lai, X. Yu, H. (2004) “Collisions for Hash Functions MD4, MD5,
HAVAL-128, and RIPEMD”. Jinan 250100, China: The School of Mathematics and System
Science, Shandong University. https://eprint.iacr.org/2004/199.pdf
[7] Gauravaram, P. (2003) “Cryptographic Hash Functions: Cryptanalysis, design and
applications”. Ph.D. thesis, Brisbane, Australia: Faculty of Information Technology,
Queensland University of Technology.
[8] Rajeev, S. Geetha, G. (2012) “Cryptographic Hash Functions: A Review”. International
Journal of Computer Science Issues, ISSN (Online): 1694-0814. Vol 9. 461 - 479.
https://www.researchgate.net/publication/267422045_Cryptographic_Hash_Functions_A
_Review
[9] Bellare, M. Kohno, T. (2004) "Hash Function Balance and Its Impact on Birthday
Attacks". In EUROCRYPT, pp.401-418.
[10] Damgård, I. (1989) “A design principle for hash functions”. In Gilles Brassard, editor,
Advances in Cryptology: CRYPTO 89, volume 435 of Lecture Notes in Computer Science,
pages 416-427. Springer-Verlag.
[11] Merkle, R. (1989) “One way hash functions and DES”. In Gilles Brassard, editor,
Advances in Cryptology: CRYPTO 89, volume 435 of Lecture Notes in Computer Science,
pages 428-446. Springer-Verlag.
[12] Kumar, C. K. Suyambulingom, C. (2012) “Cryptographic of high Security Hash
Functions”. International Journal of Engineering Research & Technology (IJERT), ISSN:
2278-0181, Vol. 1 Issue 3. https://www.ijert.org/research/cryptographic-of-high-security-
hash-functions-IJERTV1IS3074.pdf
[13] Wang, X. Lai, X. Feng, D. Chen, H. Yu, X. (2005) "Cryptanalysis of the Hash Functions
MD4 and RIPEMD". In EUROCRYPT, pp.1-18.
[14] Wang, X. Lai, X. Feng, D. Yu, X. (2004) "Collisions for Hash Functions MD4, MD5,
HAVAL-128 and RIPEMD". IACR Cryptology ePrint Archive, pp. 199.
[15] Wang, X. Yu, X. (2005) "How to Break MD5 and Other Hash Functions”. In
EUROCRYPT, pp. 19-35.
[16] Biham, E. Chen, R. Joux, A. Carribault, P. Lemuet, C. Jalby, W. (2005) "Collisions of
SHA-0 and Reduced SHA-1". In EUROCRYPT, pp.36-57.
[17] Wang, X. Yu, H. Yin, Y. L. (2005) "Efficient Collision Search Attacks on SHA-0". In
CRYPTO, pp.1-16.
[18] Wang, X. Yin, Y. L. Yu, H. (2005) "Finding Collisions in the Full SHA-1". In CRYPTO,
pp.17-36.
[19] Biham, E. Shamir, A. (1991) “Differential Cryptanalysis of DES-like Cryptosystems”.
Journal of Cryptology, Vol. 4, No. 1, pp. 3-72.
[20] Bakhtiari, S. Safavi-Naini, R. Pieprzyk, J. (1995) “Cryptographic Hash Functions: A
Survey”. Technical Report 95-09, Department of Computer Science, University of
Wollongong
[21] Matsui, M. (1993) “Linear Cryptanalysis methods for DES Cipher”. In EUROCRYPT, pp.
386-397.
[22] Khovratovich, D. Nikolic, I. (2010) "Rotational Cryptanalysis of ARX". In FSE, pp.333-
346.
[23] Muller F. (2004) “The MD2 Hash Function Is Not One-Way”. In: Lee P.J. (eds)
Advances in Cryptology - ASIACRYPT 2004. ASIACRYPT 2004. Lecture Notes in Computer
Science, vol 3329. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30539-
2_16