Last Updated: 2011-06-28 13:15:45 UTC
by Johannes Ullrich (Version: 1)
After talking about SQL Injection, this is the second part of the mini series to help you protect yourself from simple persistent attacks as we have seen them in the last couple months. A common MO employed in these attacks is to steal passwords from a database via sql injection. Later, the attacker will try to use these passwords to break into other sites for which users may choose the same password. Of course, part of the problem is password reuse. But for now, we will focus on the hashing of passwords to make it harder for an attacker to retrieve a users plain text password.
First of all: What is hashing? According to NIST, "A hash algorithm (alternatively, hash "function") takes binary data, called the message, and produces a condensed representation, called the message digest. A cryptographic hash algorithm is a hash algorithm that is designed to achieve certain security properties."  A good cryptographic hash will make it hard to find two messages with the same hash, or find clear text for a specific hash value. Common hashing algorithms are MD-5 (old), or the "Secure Hashing Algorithms" (SHA) family of hashing algorithms (SHA-1, SHA-256...). Another common but less popular algorithm is RIPE-MD.
Storing a password as a hash will make it difficult to figure out the actual password a user used. In order to verify the password, it is first hashed, then compared to the stored hash in the database.
A hash isn't fool proof. All hashes are vulnerable to brute forcing. If I can get the hash, I can try various passwords, hash them and check if they match. I may actually end up with a different password then the correct one. Hashes do have collisions (same hash for two different plain text values). A good hash is slow enough to calculate to make brute forcing difficult. Brute forcing can be improved by using databases with pre-calculated hash values, or so called rainbow tables. These systems reduce the brute forcing to a simple database look-up, but require storage space. Rainbow tables are practical for strings up to 10 characters in length (lower case alpha numeric). Of course, the size of a rainbow table will increase fast as the length of the plain text increases or the complexity of the plain text increases .
Probably the most important defense against rainbow tables is the idea of introducing a "salt". First of all a "salt" will ensure that two users who happen to use the same password, will end up with a different hash. A salt can also be used to increase the length of the plain text beyond the point where rainbow tables become practical.
In order to use a "salt", the salt value and the users password are first concatenated, then the string is hashed.
Another trick to harden a hash is to just apply the same algorithm multiple times. For example, if we take the SHA-1 algorithm, and apply it 100 times, we will slow down a brute force attempt by a factor of 100. However, the delay in validating an individual password will be hard to notice.
When selecting an algorithm to hash passwords, it is important to select carefully as it is difficult to change the algorithm later. You will have to ask users to change their password if you do as you no longer know what password they picked.
Here a proposal to create difficult to reverse hashes with salt:
- as salt, do use a complex string that is different for each user. I like to use the username or email address (of course, this means the user will have to enter their password whenever they change their e-mail address, but that is usually a good idea anyway). You could just create a salt for each user and store it with the hash (similar to what the unix /etc/shadow file does).
- first, hash the salt (e-mail address) and the password by themselves. This way, we end up with simple fixed length strings. We no longer have to worry about odd characters in either.
- concatenate the two hashes, and hash them again.
So the complete formula to create our password hash would look like (using sha1 as an example, you could also mix and match hashing algorithms):
You could also add a secret, in addition to the salt. If the secret is not stored in the database, it would not be easily reachable via a SQL injection exploit (yes, you can use them to read files, but it requires sufficient privileges). The formula would now look like:
Introducing minimum password strength requirements may also help, but can also lead to annoyed users. The best defense against brute forcing a password hash is a long password using a diverse character set. Even if you do not require strong passwords, you should at least not restrict the length or the character set. Hashing the password as soon as received (for example as part of input validation) will help mitigate any risks due to odd characters a user may use. Passwords should never be echoed back to the user.
As a user, how do you know if your password is hashed? There isn't a bullet proof way to figure it out. But there are some indications that it is not hashed:
- the length or characters you are allowed to use is limited. If the password is hashed, it doesn't matter how long the original password was.
- As part of the password recovery, the site is returning your old password (very bad... and well, proof that the password was not hashed)
If you want to read more about this, Heise.de recently published a nice article about password hashing . This is of course also a topic we cover in our secure coding / defending web application classes like DEV522.
NIST is currently finalizing a competition to come up with a new hashing standard, that will be known as SHA-3. The "winner" should be announced in 2012. Until then, SHA-1, SHA-224, SHA-256, SHA-384, and SHA-512 are suggested. I usually recommend to stick to these standards. As programming languages change, it is likely that they will keep supporting these standard algorithms. Other less popular algorithms may on the other hand be dropped.