Bashing Hashing

Recently there has been a lot of exciting work in cryptographic hash functions. Hash functions are functions which take some large (possibly infinite) domain, and map it to a smaller (usually of fixed size) range. Cryptographic hash functions are hash functions which are one-way and also collision free. One-way means that if H is the hash function, then, given H(x) it is computationally infeasible to determine x. Collision free means that if we are given H(x), then it is computationally infeasible to find a y not equal to x such that H(x)=H(y). Actually this is called weakly collision free. Stronly collision free is if it is computationally infeasible to find any two x and y not equal to each other such that H(x)=H(y). You may notice that there exist many collisions when the domain is larger than the range, but for a strongly collision free hash function, finding even a single collision is difficult.
One common use for cryptographic hash functions is in digital signatures. Instead of digitally signing the entire document, one almost always uses a hash function on this document and then digitally signs the resulting value. So if it is possible to construction documents which have the same hash value, then this whole scheme gets into trouble.
Recently two widely used cryptographic hash functions, MD5 and SHA, have been under attack. Work by a whole host of people, including papers by Wang and Yu, and Biham, Chen, Joux, Carribault, Lemuet, and Jalby has shown that it is possible to find collisions for these hash functions in computationally feasible times. But wait, you say, finding collisions doesn’t necessarily mean that we can’t use these attacks for the digital signature schemes we described above. You would think this because the results on breaking the cryptographic hash functions provide basically random collisions. For the specific task of creating a forged document with the same hash function, you might think these attacks don’t work. But you would be wrong! In fact, you can use these attacks to create forged documents with the same hash function. Magnus Duam and Stefan Lucks have a wonderful discussion and demonstration of this here. What is very cool is that they have two postscript files which both produce the same value after applying the MD5 hash function (a25f7f0b 29ee0b39 68c86073 8533a4b9, heh.) (Previously Kaminski and Mikle have also shown such capabilities.)
So…do you really trust MD5? I don’t.
Update: As Aram points out in the comments, this page has a cool list of hash functions and those that have been broken. Not a pretty picture.

One Reply to “Bashing Hashing”

Leave a Reply

Your email address will not be published. Required fields are marked *