Hash Length Extension: The padding that killed your secret key

Its been a while folks! but I'm back with another really interesting post, this time about how to abuse hashing algorithms or rather a certain style of hashing algorithms.

I'm going to try to teach you the analysis that gave birth to this very clever attack and to do that I need to talk a little about hashing algorithms and how they are constructed, I mean its not just coincidence that collisions in hashing functions are extremely rare.




Meet the Merkel-Damgard

Hashing algorithms like many other cryptographic constructions and implementations are designed according to models---at least I hope most of them are---. These models are designed to serve as a guideline for those daring minds who want to make use of there Pseudo random functions and key stream generators to facilitate things like protection, confidentiality, integrity and secrecy. Think about them like a blueprint that says stuff like "connect joint A of PRF 1.2 to joint B of PRG 1.3".

One of these awesome models is called the Merkle-Damgard construction for hashing functions. Its a road map for implementing collision resistant hashing functions---for a while after hearing about this I wasn't sure what the MD in MD5 stood for---. The Merkel Damgard construction looks like this:

This picture is a bit misleading, so I'll need to emphasis a few points of its operation before I can begin easing you into the exploitation of this construction.

Here's a short passage from Wikipedia about it, note the parts of the text I've highlighted :

The Merkle–Damgård hash function first applies an MD-compliant padding  function to create an output whose size is a multiple of a fixed number (e.g. 512 or 1024) — this is because compression functions cannot handle inputs of arbitrary size. The hash function then breaks the result into blocks of fixed size, and processes them one at a time with the compression function, each time combining a block of the input with the output of the previous round. [1] :146 In order to make the construction secure, Merkle and Damgård proposed that messages be padded with a padding that encodes the length of the original message. This is called length padding  or  Merkle–Damgård strengthening .

So what we learn from this passage is that each block of input is thrown into f one at a time. Practically because some hashing algorithms are only geared to work with fixed length input, some padding needs to take place, so what we are looking at is an algorithm that basically works like this:

  1. Pad input until its length is a multiple of the block size
  2. break the input into block lengths
  3. for each block compute f(block)
  4. output the hash'
This is severely simplified, you might ask how the bits of the hash actually get computed? Well each hashing function of this kind has something called a state vector which is just another name for a bunch of integers that will eventually become the hash you are computing. The way the algorithm works is it grabs a couple of bits from each block, does some crazy irreversible-collision resistant operation on it uses the result to augment the state vector---and then moves onto the next block and does the same thing. After every round of compression the state vector is augmented---each augmentation influences the next---,this happens until the last round when the "final station" is reached and the actual bits in the state vector are made presentable and are then returned as the result.

This is generally how most Merkle-Damgard construction hashing algorithms are implemented, I don't imagine anyone would try anything too far removed from this but just to be concrete and precise I know that MD5,SHA1 and SHA2 all follow this methodology.
 

Analysis

You may realize that the length of the message influences---or rather determines---the number of blocks---its easy to see that the more message bits you have the bigger number of blocks these will get broken up into and evidently then---pushed through the hashing function and this in turn determines how many times the compression function is computed which finally determines how many times the state vector is augmented. this is the critical realization you need to make if you want to understand the attack!

This means that if you stick a message M which is N blocks long through the compression function in succession, you'll get state vector V. Should you then stick a message M" which is the same N blocks plus an arbitrary number of extra blocks which I will call A through the compression function in succession you'll then get a different state vector at the end say V". If you've though about this carefully you would have realized that during computation of M" the state vector V" at some stage was equal to V until the remaining A blocks where pushed into the compression function. I'll prove this by ascii art

M = N1,N2,N3,..Nn
F(N1) -> V1 #V1 is the state vector after compression of the 1st block of M
F(N2) -> V2
...
F(Nn) -> Vn = V

now for M"
M" = N1,N2,N3,...Nn,Nn+1,Nn+2,...Na
Where Nn+1,...,Na = A

So if we now stick M" through the hashing algorithm we get,
F(N1) -> V1
F(N2) -> V2
...
F(Nn) -> Vn = V #the first n blocks of M" are equal to M thus the state vector will be the same
F(Nn+1) -> Vn+1
...
F(Na) -> Va = V"

Another thing you may find interesting is that---after looking at the source code of some Merkle Damgard ciphers---the hash delivered to you at the end is in fact just a representation of the state
vector, it actually is just the final state vector. 

Why is this so awesome? Well because we can actually decompose the hash back into a state vector---typically the final hash is basically the hexadecimal presentation of the integers in the final state vector.  And Why would we want to do that? 

Well, given H(M) should we want to compute the hash of M || X where '||' denotes concatenation all we need to do is augment the state vector from of the hash of M with the compression of X. So in summary:

H(M || X) = Final_Station( State_Vector(H(M)) + [the compression of all the bits in X]) 

Should a hash function of the Merkle-Damgard construction be used in this way to ensure message integrity then we can construct a valid forged message---or rather an existential forgery.

I'll follow this post up with some practical exploitation examples involving the hashes you all love to use and abuse ;)

Cheers!