Last Updated: 2012-04-02 16:50:25 UTC
by Johannes Ullrich (Version: 1)
NIST is currently in the final stages of defining a new "secure hashing algorithm" (SHA). The goal of the competition is to find a replacement for the current standard ("SHA-2", aka SHA-256 and SHA-512). NIST attempts to be somewhat proactive in defining crypto standards, realizing that new standards need to be implemented well before the old ones are considered broken.
The three popular hash functions, MD5, SHA-1 and SHA-2, all use variations of a particular construction known as the "Merkle-Damgård construction". Attacks have been developed for MD5 and SHA-1, and it is plausible that they will be extended to SHA-2 in the future. As a result, the candidates for SHA-3 use different algorithms that are hopefully safe from similar attacks.
A good cryptographic hash will make it hard to come up with two different documents that exhibit the same hash. There are a number of variations of this attack, depending for example on whether or not one of the documents (or hashes) is provided. One attack in particular affects the Merkle-Damgård based hashes: the "length extension attack". This attack is particularly relevant if a hash is used to verify the integrity of a document.
In this scenario, the hash is created by concatenating a secret and a file, then hashing the result. The hash and the file (without the secret) are then transmitted to a recipient. The recipient uses the same secret to recreate the hash and to verify that the file is authentic. However, for hash functions susceptible to the length extension attack, an attacker who knows the length of the original hashed data (secret plus document) can calculate a valid hash for an extended document without knowing the secret. The attacker can only append to the document, not change its existing content.
To make this more specific, let's say I am sending you a contract. We agreed on a pre-shared secret. I create the hash of "secret+contract" and send the hash to you with the document. An attacker now intercepts the message, adds a page to the contract, and calculates a new hash. All the attacker needs to know (guess?) is the length of the secret.
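The contract scenario can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions: toy_hash is a made-up Merkle-Damgård style hash (its compression function is just a stand-in built from SHA-256), not SHA-1 or SHA-2 themselves, and the secret, contract text, and all function names are hypothetical. The point is only to show why knowing the hash and the total length is enough to append data.

```python
import hashlib
import struct

BLOCK = 64          # block size in bytes (toy choice)
IV = b"\x00" * 32   # toy initial chaining value

def compress(state: bytes, block: bytes) -> bytes:
    # Stand-in compression function for this sketch only.
    return hashlib.sha256(state + block).digest()

def md_padding(total_len: int) -> bytes:
    # Merkle-Damgård style padding: 0x80, zero bytes, then the
    # message length in bits as an 8-byte big-endian integer.
    zeros = (BLOCK - (total_len + 1 + 8) % BLOCK) % BLOCK
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", total_len * 8)

def toy_hash(data: bytes, state: bytes = IV, prefix_len: int = 0) -> bytes:
    # prefix_len = number of bytes already absorbed to reach `state`
    # (0 for a normal hash; a multiple of BLOCK when resuming).
    padded = data + md_padding(prefix_len + len(data))
    for i in range(0, len(padded), BLOCK):
        state = compress(state, padded[i:i + BLOCK])
    return state  # the final chaining value is the hash

# Sender: naive "hash(secret + contract)" authentication.
secret = b"pre-shared secret"            # hypothetical secret
contract = b"Alice pays Bob 100 dollars."
tag = toy_hash(secret + contract)

# Attacker: knows contract, tag, and the secret's length, not the secret.
orig_len = len(secret) + len(contract)
glue = md_padding(orig_len)              # padding the sender's hash used
extension = b" Alice also pays Mallory 1000000 dollars."
forged_contract = contract + glue + extension
# Resume hashing from the published tag instead of from IV.
forged_tag = toy_hash(extension, state=tag, prefix_len=orig_len + len(glue))

# Recipient: recomputes the hash with the real secret and accepts the forgery.
print(toy_hash(secret + forged_contract) == forged_tag)
```

The forged contract contains the original padding bytes in the middle, which is why the attacker can only add to the document. The attack works because the published hash is exactly the internal state needed to keep hashing.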
Current hash functions can still be used safely to authenticate messages, but the construction has to be slightly more complex. Instead of just concatenating the key and the message, an HMAC algorithm has to be used. HMAC essentially mixes the key into two nested hash computations: the key is XOR'ed with fixed padding constants and hashed together with the message, and that result is hashed again with the key. (Just use HMAC. It's a bit more complex than that and has to be done right.)
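As a rough sketch of what "just use HMAC" looks like in practice, the following Python uses the standard library's hmac module. The manual_hmac_sha256 function is included only to illustrate the nested structure (it follows the usual HMAC definition with the 0x36 and 0x5c padding constants); the sign and verify helpers, the key, and the document are hypothetical names chosen for this example. In real code, rely on the library call rather than the hand-written version.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 block size in bytes

def manual_hmac_sha256(key: bytes, message: bytes) -> bytes:
    # Illustration only: shows the two nested hashes inside HMAC.
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()   # long keys are hashed first
    key = key.ljust(BLOCK_SIZE, b"\x00")     # then zero-padded to one block
    inner_key = bytes(b ^ 0x36 for b in key)
    outer_key = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(inner_key + message).digest()
    return hashlib.sha256(outer_key + inner).digest()

def sign(key: bytes, document: bytes) -> bytes:
    # What you should actually call.
    return hmac.new(key, document, hashlib.sha256).digest()

def verify(key: bytes, document: bytes, tag: bytes) -> bool:
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(sign(key, document), tag)

key = b"pre-shared secret"                   # hypothetical key
document = b"Alice pays Bob 100 dollars."
tag = sign(key, document)

print(verify(key, document, tag))
print(verify(key, document + b" And Mallory 1000000.", tag))
print(manual_hmac_sha256(key, document) == tag)
```

Because the outer hash covers only the fixed-size inner digest, an attacker cannot resume hashing from the published tag to append data the way they can with a plain hash of secret plus document.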
But back to SHA-3: NIST has narrowed the field of potential candidates down to 5. All of them are safe with respect to the length extension attack. However, they all take up more CPU cycles than SHA-2, except in some cases where HMAC is required. In addition, at a recent IETF conference, it was pointed out that during the competition, SHA-2 turned out to be more robust than expected, reducing the need for SHA-3 to replace SHA-2.
So why should you care? There are a number of reasons why you should be concerned about cryptographic standards. First of all, many compliance regimes require the use of specific hashing and encryption algorithms. Secondly, while there may be equivalently strong algorithms, developers of libraries usually spend more effort optimizing standard algorithms, and you may even find them "in silicon" in modern CPUs. I wouldn't be surprised to find a "SHA-3" opcode in a future mainstream CPU. At this point, SHA-256 or SHA-512 should be used if you are developing new software. However, if you find SHA-1, you shouldn't panic. Make sure you are using HMAC properly, and are not just concatenating secrets with documents in order to validate them.