An Introduction to Hashing: A basic understanding
Outline
• Project 1
• Hash functions and their applications in security
• Modern cryptographic hash functions and
message digest
– MD5
– SHA
GNU Privacy Guard
Yao Zhao
Introduction of GnuPG
• GnuPG Stands for GNU Privacy Guard
• A tool for secure communication and data
storage
• To encrypt data and create digital signatures
• Using public-key cryptography
• Distributed with almost every Linux distribution
• For T-lab machines --- gpg command
Functionality of GnuPG
• Generating a new keypair
– gpg --gen-key
• Key type
– (1) DSA and ElGamal (default)
– (2) DSA (sign only)
– (4) ElGamal (sign and encrypt)
• Key size
– DSA: between 512 and 1024 bits->1024 bits
– ElGamal: any size
• Expiration date: key does not expire
• User ID
• Passphrase
Functionality of GnuPG
• Generating a revocation certificate
– gpg --output revoke.asc --gen-revoke yourkey
• Exporting a public key
– gpg --output alice.gpg --export alice@cyb.org
– gpg --armor --export alice@cyb.org
• Importing a public key
– gpg --import blake.gpg
– gpg --list-keys
– gpg --edit-key blake@cyb.org
• fpr
• sign
• check
Functionality of GnuPG
• Encrypting and decrypting documents
– gpg --output doc.gpg --encrypt --recipient blake@cyb.org doc
– gpg --output doc --decrypt doc.gpg
• Making and verifying signatures
– gpg --output doc.sig --sign doc
– gpg --output doc --decrypt doc.sig
• Detached signatures
– gpg --output doc.sig --detach-sig doc
– gpg --verify doc.sig doc
Questions?
Outline
• Project 1
• Change of class time on 1/30: 4:30-5:50pm ?
• Hash functions and their applications in security
• Modern cryptographic hash functions and
message digest
– MD5
– SHA
Hash Functions
• Condenses arbitrary message to fixed size
h = H(M)
• Usually assume that the hash function is public
and not keyed
• Hash used to detect changes to message
• Can use in various ways with message
• Most often to create a digital signature
Hash Functions & Digital Signatures
Requirements for Hash Functions
1. Can be applied to any sized message M
2. Produces fixed-length output h
3. Is easy to compute h=H(M) for any message M
4. Given h, it is infeasible to find x s.t. H(x)=h
• One-way property
5. Given x, it is infeasible to find y ≠ x s.t. H(y)=H(x)
• Weak collision resistance
6. It is infeasible to find any pair x ≠ y s.t. H(y)=H(x)
• Strong collision resistance
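Requirements 1 to 3 can be observed directly; the sketch below is an illustration only, using SHA-256 from Python's hashlib as a stand-in hash (the slides do not prescribe a library, and the helper names are made up for this example). Requirements 4 to 6 are infeasibility claims and cannot be demonstrated by running code.

```python
import hashlib


def digest_hex(message: bytes) -> str:
    """Hash a message of any size to a fixed-length hex string (SHA-256)."""
    return hashlib.sha256(message).hexdigest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count how many of the 256 digest bits differ between two messages."""
    x = int.from_bytes(hashlib.sha256(a).digest(), "big")
    y = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(x ^ y).count("1")


if __name__ == "__main__":
    # Any input size gives the same output length (requirements 1 and 2).
    for msg in (b"", b"abc", b"x" * 100_000):
        print(len(msg), "bytes ->", len(digest_hex(msg)) * 4, "bits")
    # A one-character change alters many digest bits.
    print(differing_bits(b"hello", b"hellp"), "of 256 bits differ")
```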
Birthday Problem
• How many people do you need so that the probability of
having two of them share the same birthday is > 50% ?
• Random sample of n birthdays (inputs) taken from k possible values (k = 365, outputs)
• k^n total number of possibilities
• (k)_n = k(k-1)…(k-n+1) possibilities without a duplicate birthday
• Probability of no repetition:
– p = (k)_n / k^n ≈ 1 - n(n-1)/(2k)
• For k = 365, minimum n = 23
• n(n-1)/2 pairs, each pair has a probability 1/k of having the
same output
• n(n-1)/(2k) > 50% ⇒ n > k^(1/2)
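The exact product p = (k)_n / k^n is easy to evaluate numerically. The following sketch (function names are illustrative, not from the slides) finds the smallest n for which the probability of a repeated birthday exceeds 50%.

```python
def prob_no_repeat(n: int, k: int = 365) -> float:
    """Exact probability that n samples from k values are all distinct."""
    p = 1.0
    for i in range(n):
        p *= (k - i) / k
    return p


def min_people(k: int = 365, threshold: float = 0.5) -> int:
    """Smallest n whose probability of at least one repeat exceeds threshold."""
    n = 0
    while 1.0 - prob_no_repeat(n, k) <= threshold:
        n += 1
    return n


if __name__ == "__main__":
    n = min_people(365)
    print(n, round(1.0 - prob_no_repeat(n), 4))
```

For k = 365 this gives n = 23, which is of the same order as k^(1/2) ≈ 19.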
How Many Bits for Hash?
• m bits: takes about 2^(m/2) messages to find two with the same hash
• 64 bits: takes about 2^32 messages to search (doable)
• Need at least 128 bits
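The 2^(m/2) estimate can be checked on a deliberately weakened hash. The sketch below is an assumption-laden toy: it truncates SHA-256 to a small number of bits (the truncation and the counter-based messages are choices made for this example) and searches for two messages with the same truncated value.

```python
import hashlib


def truncated_hash(message: bytes, bits: int) -> int:
    """Keep only the top `bits` bits of SHA-256 (a toy m-bit hash)."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def find_collision(bits: int):
    """Hash the messages b"0", b"1", ... until two share a truncated hash."""
    seen = {}
    counter = 0
    while True:
        msg = str(counter).encode()
        h = truncated_hash(msg, bits)
        if h in seen:
            return seen[h], msg, counter + 1  # the pair and the number of tries
        seen[h] = msg
        counter += 1


if __name__ == "__main__":
    m1, m2, tries = find_collision(24)
    print(m1, m2, "collide after", tries, "messages; 2^(24/2) =", 2 ** 12)
```

With a 24-bit hash the number of tries is typically a few thousand, in line with 2^12; the same search against a 128-bit hash would need on the order of 2^64 messages.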
Using Hash for Authentication
• Alice to Bob: challenge r_A
• Bob to Alice: MD(K_AB | r_A)
• Bob to Alice: challenge r_B
• Alice to Bob: MD(K_AB | r_B)
• Only need to compare MD results
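One direction of this exchange could be sketched as follows. This is a minimal illustration, assuming SHA-256 as the message digest MD and plain concatenation for "|"; the function names are hypothetical. The verifier recomputes the digest and only compares results, so the key never crosses the wire.

```python
import hashlib
import hmac
import secrets


def md(data: bytes) -> bytes:
    """The message digest MD; SHA-256 is an assumption for this sketch."""
    return hashlib.sha256(data).digest()


def respond(shared_key: bytes, challenge: bytes) -> bytes:
    """Prover's answer MD(K_AB | r): hash of the key followed by the challenge."""
    return md(shared_key + challenge)


def verify(shared_key: bytes, challenge: bytes, response: bytes) -> bool:
    """Verifier recomputes MD(K_AB | r) and compares in constant time."""
    return hmac.compare_digest(respond(shared_key, challenge), response)


if __name__ == "__main__":
    k_ab = secrets.token_bytes(16)   # shared secret K_AB
    r_a = secrets.token_bytes(16)    # Alice's fresh random challenge r_A
    answer = respond(k_ab, r_a)      # computed by Bob
    print("Bob authenticated:", verify(k_ab, r_a, answer))
```

In practice a keyed construction such as HMAC is usually preferred over a bare MD(K | r), since prefix-keyed hashes built on iterated compression functions are open to length-extension attacks.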
Using Hash to Encrypt
• One-time pad with K_AB
– Compute bit streams using MD and K_AB
• b_1 = MD(K_AB), b_i = MD(K_AB | b_(i-1)), …
• ⊕ (XOR) the b_i with the message blocks
– Is this a real one-time pad?
– Add a random 64-bit number (aka IV): b_1 = MD(K_AB | IV),
b_i = MD(K_AB | b_(i-1)), …
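The keystream construction with an IV might look like the sketch below. It is illustrative only: SHA-256 stands in for MD, the block size therefore becomes 32 bytes, and the function names are invented for this example. Because encryption is an XOR with the keystream, the same function also decrypts.

```python
import hashlib
import secrets


def keystream_blocks(key: bytes, iv: bytes):
    """Yield b_1 = MD(K | IV), then b_i = MD(K | b_(i-1)) indefinitely."""
    block = hashlib.sha256(key + iv).digest()
    while True:
        yield block
        block = hashlib.sha256(key + block).digest()


def xor_with_keystream(key: bytes, iv: bytes, data: bytes) -> bytes:
    """XOR data with the hash-derived keystream (encrypts and decrypts)."""
    out = bytearray()
    stream = keystream_blocks(key, iv)
    for offset in range(0, len(data), 32):
        block = next(stream)
        chunk = data[offset:offset + 32]
        out.extend(c ^ k for c, k in zip(chunk, block))
    return bytes(out)


if __name__ == "__main__":
    key = secrets.token_bytes(16)
    iv = secrets.token_bytes(8)      # the random 64-bit IV from the slide
    ciphertext = xor_with_keystream(key, iv, b"attack at dawn")
    print(xor_with_keystream(key, iv, ciphertext))
```

Without the IV, two messages encrypted under the same K_AB would share a keystream, so XORing the two ciphertexts would reveal the XOR of the plaintexts; this is also why the scheme is not a true one-time pad.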
General Structure of Secure Hash Code
• Iterative compression function
– If each f is collision-resistant, then so is the resulting hash function
MD5: Message Digest Version 5
Input: message of arbitrary length
Output: 128-bit digest
• Until recently the most widely used hash algorithm
– More recently, both brute-force and cryptanalytic concerns have been raised
• Specified as Internet standard RFC 1321
MD5 Overview
MD5 Overview
1. Pad message so its length is 448 mod 512
2. Append a 64-bit original length value to message
3. Initialise 4-word (128-bit) MD buffer (A,B,C,D)
4. Process message in 16-word (512-bit) blocks:
– Using 4 rounds of 16 operations each on message block & buffer
– Add output to buffer input to form new buffer value
5. Output hash value is the final buffer value
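Processing of Block m_i: 4 Passes
ABCD = f_F(ABCD, m_i, T[1..16])
ABCD = f_G(ABCD, m_i, T[17..32])
ABCD = f_H(ABCD, m_i, T[33..48])
ABCD = f_I(ABCD, m_i, T[49..64])
[Figure: MD_i (words A, B, C, D) and block m_i go through the four passes; the result is added word by word to MD_i to give MD_(i+1)]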
Padding Twist
• Given original message M, add padding bits "10*" (a 1 bit followed by
as many 0 bits as needed) such that the resulting length is 64 bits less
than a multiple of 512 bits
• Append (original length in bits mod 2^64), represented in 64 bits,
to the padded message
• Final message is chopped into 512-bit blocks
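The padding rule can be sketched for byte-aligned messages as below. Two details are not on the slide and are taken from RFC 1321: the first padding byte is 0x80 (the 1 bit followed by seven 0 bits), and MD5 stores the 64-bit length low-order byte first. The function name is illustrative.

```python
import struct


def md5_pad(message: bytes) -> bytes:
    """Pad a byte string to a multiple of 512 bits, MD5 style."""
    bit_length = (8 * len(message)) % (1 << 64)   # original length mod 2^64
    padded = message + b"\x80"                    # the single 1 bit
    padded += b"\x00" * ((56 - len(padded)) % 64) # 0 bits up to 448 mod 512
    return padded + struct.pack("<Q", bit_length) # 64-bit length, little-endian


if __name__ == "__main__":
    for size in (0, 3, 55, 56, 64):
        print(size, "bytes ->", len(md5_pad(b"a" * size)) // 64, "block(s)")
```

Padding is always added, even when the message is already 448 bits mod 512, which is why a 56-byte message grows to two blocks.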
MD5 Process
• As many stages as the number of 512-bit
blocks in the final padded message
• Digest: 4 32-bit words: MD=A|B|C|D
• Every message block contains 16 32-bit words:
m0|m1|m2…|m15
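– Digest MD_0 initialized to (bytes listed low-order first):
A=01234567, B=89abcdef, C=fedcba98, D=76543210
(as 32-bit values: A=0x67452301, B=0xefcdab89, C=0x98badcfe, D=0x10325476)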
– Every stage consists of 4 passes over the message
block, each modifying MD
• Each block 4 rounds, each round 16 steps
Different Passes...
Each step i (1 <= i <= 64):
• Input:
– m_i – a 32-bit word from the message
With different shifts in every round
– T_i – int(2^32 * abs(sin(i))), with i in radians
Provides a "randomized" set of 32-bit patterns, which
eliminates any regularities in the input data
– ABCD: current MD
• Output:
– ABCD: new MD
MD5 Compression Function
• Each round has 16 steps of the form:
a = b+((a+g(b,c,d)+X[k]+T[i])<<<s)
• a,b,c,d refer to the 4 words of the buffer,
but used in varying permutations
– note this updates 1 word only of the buffer
– after 16 steps each word is updated 4 times
• where g(b,c,d) is a different nonlinear
function in each round (F,G,H,I)
MD5 Compression Function
Functions and Random Numbers
• F(x,y,z) == (x ∧ y) ∨ (~x ∧ z)
– selection function
• G(x,y,z) == (x ∧ z) ∨ (y ∧ ~z)
• H(x,y,z) == x ⊕ y ⊕ z
• I(x,y,z) == y ⊕ (x ∨ ~z)
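Putting the pieces together, the sketch below is a compact, unoptimized MD5 for byte strings. It follows the step form a = b+((a+g(b,c,d)+X[k]+T[i])<<<s) and the functions F, G, H, I above; the per-step shift amounts and the order in which message words X[k] are used are not given on the slides and are taken from RFC 1321. It is meant for study, not for production use.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF

# T[i] = int(2^32 * abs(sin(i))) for i = 1..64 (stored 0-based here).
T = [int(2 ** 32 * abs(math.sin(i + 1))) & MASK for i in range(64)]
# Left-rotation amounts per step, from RFC 1321.
S = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4


def F(x, y, z): return (x & y) | (~x & z)
def G(x, y, z): return (x & z) | (y & ~z)
def H(x, y, z): return x ^ y ^ z
def I(x, y, z): return y ^ (x | ~z)


def rotl(x: int, s: int) -> int:
    """Rotate a 32-bit word left by s bits."""
    return ((x << s) | (x >> (32 - s))) & MASK


def md5_hex(message: bytes) -> str:
    a0, b0, c0, d0 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476
    # Padding: 1 bit, 0 bits to 448 mod 512, then the 64-bit length.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack("<Q", (8 * len(message)) % (1 << 64))
    for offset in range(0, len(padded), 64):
        X = struct.unpack("<16I", padded[offset:offset + 64])
        a, b, c, d = a0, b0, c0, d0
        for i in range(64):
            if i < 16:
                g, k = F, i
            elif i < 32:
                g, k = G, (5 * i + 1) % 16
            elif i < 48:
                g, k = H, (3 * i + 5) % 16
            else:
                g, k = I, (7 * i) % 16
            total = (a + g(b, c, d) + X[k] + T[i]) & MASK
            # Update one word, then rotate the roles of a, b, c, d.
            a, b, c, d = d, (b + rotl(total, S[i])) & MASK, b, c
        # Add the block's output to the buffer input.
        a0, b0 = (a0 + a) & MASK, (b0 + b) & MASK
        c0, d0 = (c0 + c) & MASK, (d0 + d) & MASK
    return struct.pack("<4I", a0, b0, c0, d0).hex()


if __name__ == "__main__":
    sample = b"The quick brown fox jumps over the lazy dog"
    print(md5_hex(sample))
    print(hashlib.md5(sample).hexdigest())  # reference value from hashlib
```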
Secure Hash Algorithm
• Developed by NIST, specified in the Secure
Hash Standard (SHS, FIPS Pub 180), 1993
• SHA is specified as the hash algorithm in the
Digital Signature Standard (DSS), NIST
General Logic
• Input message must be < 2^64 bits
– not really a problem
• Message is processed in 512-bit blocks
sequentially
• Message digest is 160 bits
• SHA design is similar to MD5, a little slower,
but a lot stronger
Basic Steps
Step1: Padding
Step2: Appending length as 64 bit unsigned
Step3: Initialize MD buffer 5 32-bit words
Stored in big-endian format: most significant byte at the lowest address
A|B|C|D|E
A = 67452301
B = efcdab89
C = 98badcfe
D = 10325476
E = c3d2e1f0
Basic Steps...
Step 4: the 80-step processing of 512-bit blocks
– 4 rounds, 20 steps each.
Each step t (0 <= t <= 79):
– Input:
• Wt – a 32-bit word from the message
• Kt – a constant.
• ABCDE: current MD.
– Output:
• ABCDE: new MD.
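The four basic steps can be sketched end to end as follows. The slides do not list the per-round functions, the constants K_t, or the expansion of the 16 message words into W_0..W_79; those details are taken from FIPS 180 and should be read as additions for illustration. The function names are made up for this example.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF


def rotl(x: int, s: int) -> int:
    """Rotate a 32-bit word left by s bits."""
    return ((x << s) | (x >> (32 - s))) & MASK


def sha1_hex(message: bytes) -> str:
    # Step 3: the five-word buffer A|B|C|D|E.
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    # Steps 1 and 2: padding, then the length as a 64-bit big-endian value.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", (8 * len(message)) % (1 << 64))
    # Step 4: 80 steps per 512-bit block, in 4 rounds of 20.
    for offset in range(0, len(padded), 64):
        w = list(struct.unpack(">16I", padded[offset:offset + 64]))
        for t in range(16, 80):
            w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
        a, b, c, d, e = h
        for t in range(80):
            if t < 20:
                f, k = (b & c) | (~b & d), 0x5A827999
            elif t < 40:
                f, k = b ^ c ^ d, 0x6ED9EBA1
            elif t < 60:
                f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
            else:
                f, k = b ^ c ^ d, 0xCA62C1D6
            temp = (rotl(a, 5) + f + e + k + w[t]) & MASK
            a, b, c, d, e = temp, a, rotl(b, 30), c, d
        h = [(x + y) & MASK for x, y in zip(h, (a, b, c, d, e))]
    return struct.pack(">5I", *h).hex()


if __name__ == "__main__":
    print(sha1_hex(b"abc"))
    print(hashlib.sha1(b"abc").hexdigest())  # reference value from hashlib
```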
SHA-1 versus MD5
• Brute force attack is harder (160 vs 128 bits for
MD5)
• A little slower than MD5 (80 vs 64 steps)
– Both work well on a 32-bit architecture
• Both designed as simple and compact for
implementation
• Cryptanalytic attacks
– MD4/5: vulnerabilities discovered since their design
– SHA-1: none until 2005, when new results raised concerns about
its use in future applications
Revised Secure Hash Standard
• NIST issued a revision, FIPS 180-2, in 2002
• Adds 3 additional hash algorithms
• SHA-256, SHA-384, SHA-512
– Collectively called SHA-2
• Designed for compatibility with increased
security provided by the AES cipher
• Structure & detail are similar to SHA-1
• Hence analysis should be similar, but security
levels are rather higher
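The digest lengths of these algorithms can be compared with Python's hashlib, as in this small sketch (the helper name is illustrative). By the birthday bound, an m-bit digest gives roughly m/2 bits of collision resistance.

```python
import hashlib


def digest_bits(name: str) -> int:
    """Digest length in bits for a hashlib algorithm name."""
    return hashlib.new(name).digest_size * 8


if __name__ == "__main__":
    for name in ("sha1", "sha256", "sha384", "sha512"):
        bits = digest_bits(name)
        print(f"{name}: {bits}-bit digest, about 2^{bits // 2} work for a collision")
```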


Editor's Notes

  • #9 A variation on the message authentication code is the one-way hash function. As with the message authentication code, a hash function accepts a variable-size message M as input and produces a fixed-size output, referred to as a hash code H(M). Unlike a MAC, a hash code does not use a key but is a function only of the input message. The hash code is also referred to as a message digest or hash value. Slides from Stallings (first 3)
  • #10 Stallings Figure 11.5c “Basic Uses of Hash Functions” shows the hash being “signed” with the senders private key, thus forming a digital signature.
  • #11 The purpose of a hash function is to produce a “fingerprint” of a file, message, or other block of data. These are the specifications for good hash functions. Essentially it must be extremely difficult to find 2 messages with the same hash, and the hash should not be related to the message in any obvious way (i.e. it should be a complex non-linear function of the message). There are quite a few similarities in the evolution of hash functions & block ciphers, and in the evolution of the design requirements on both.
  • #18 Stallings Fig 12.1
  • #19 The padded message is broken into 512-bit blocks, processed along with the buffer value using 4 rounds, and the result added to the input buffer to make the new buffer value. Repeat till run out of message, and use final buffer value as hash. nb. due to padding always have a full final block (with length in it).
  • #24 Each round mixes the buffer input with the next "word" of the message in a complex, non-linear manner. A different non-linear function is used in each of the 4 rounds (but the same function for all 16 steps in a round). The 4 buffer words (a,b,c,d) are rotated from step to step so all are used and updated. g is one of the primitive functions F,G,H,I for the 4 rounds respectively. X[k] is the kth 32-bit word in the current message block. T[i] is the ith entry in the matrix of constants T. The addition of varying constants T and the use of different shifts helps ensure it is extremely difficult to compute collisions.
  • #31 Compare using the design goals listed earlier. In 2005, a research team described an attack in which two separate messages could be found that deliver the same SHA-1 hash using 2^69 operations, far fewer than the 2^80 operations previously thought needed to find a collision with an SHA-1 hash [WANG05]. This result should hasten the transition to newer, longer versions of SHA. In August 2005, an improved attack on SHA-1, discovered by Xiaoyun Wang, Andrew Yao and Frances Yao, was announced at the CRYPTO conference rump session. The time complexity of the new attack is claimed to be 2^63. But no collision on SHA-1 has been found yet.
  • #32 See Stallings Tables 12.3 and 12.4 for details.