Leaked Passwords and Better Security Practices
Password security has been in the news a great deal. LinkedIn, eHarmony, and Last.fm all had their password databases leaked onto the public Internet in June. Many commentators opined—some more lucidly than others—on what was wrong and right with their password-handling practices. Brian Krebs, whose website is excellent reading for anyone interested in security, posted an insightful interview with security researcher Thomas H. Ptacek.
As testers, how do we assess whether or not our software is handling passwords securely? Let's start by reviewing the basics of password storage. The simplest way to store passwords is in cleartext, with no encryption or transformation of any kind. This approach is both straightforward and horribly insecure. Someone who gets access to the password database—either an administrator or a cracker—instantly knows the passwords of all users.
The next step up in security is to hash the passwords. A hash function takes an input (e.g., "password") and turns it into a hash value, a seemingly random fingerprint such as "b92d5869c21b0083". The hash function satisfies three important rules:
- The same input always generates the same hash value: "password" always produces "b92d5869c21b0083".
- Any change in the input, however small, produces an unpredictable change in the output.
- The hash function is one-way: the original input cannot feasibly be computed from the hash value.
When the user sets her password, the hash value of the password is stored instead of the password itself. When she tries to log in, the password she supplies is hashed and compared to the stored hash value. If they match, we know the correct password has been given.
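The set-and-check flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SHA-256 stands in for "a hash function" (the article does not name one), and the `stored_hashes` dictionary and the function names are hypothetical placeholders for a real user database. It is unsalted and therefore not suitable for production use.

```python
import hashlib
import hmac

# Hypothetical in-memory "database" mapping usernames to stored hash values.
stored_hashes = {}


def hash_password(password: str) -> str:
    """Return the hex SHA-256 digest of the password (unsalted, illustration only)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def set_password(username: str, password: str) -> None:
    """Store only the hash value, never the password itself."""
    stored_hashes[username] = hash_password(password)


def check_login(username: str, supplied_password: str) -> bool:
    """Hash the supplied password and compare it to the stored hash value."""
    stored = stored_hashes.get(username)
    if stored is None:
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(stored, hash_password(supplied_password))


set_password("alice", "correct horse")
print(check_login("alice", "correct horse"))  # True
print(check_login("alice", "wrong guess"))    # False
```

The same input always yields the same digest, which is what makes the login comparison work; it is also what makes the precomputed dictionaries discussed next possible.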
Hashing passwords is clearly an improvement. Passwords are not directly visible in the database, and an attacker who obtains it gets only the hashes. He can't determine the passwords from the hashes, so he is reduced to guessing passwords, hashing them, and comparing the resulting hash values in hopes of a match.
The problem with this approach is that an attacker with a precomputed dictionary that maps likely passwords to their hash values can crack a large number of passwords with simple lookups. Such a dictionary takes a long time to compile (a few days to a few years), but it needs to be compiled only once for any given hashing algorithm. And, sure enough, such dictionaries can be readily found on the Internet.
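To see why a precomputed dictionary is so effective against unsalted hashes, consider this small sketch. The word list, usernames, and function names are all made up for illustration, and SHA-256 is again an assumed stand-in for whatever hash the application uses. A real dictionary would hold millions of entries, but the lookup step is the same.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Hex SHA-256 digest of a string (assumed hash function for this sketch)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_lookup(candidates):
    """Precompute hash -> password once; reusable against any leak using this hash."""
    return {sha256_hex(candidate): candidate for candidate in candidates}


def crack(leaked_hashes, lookup):
    """Return the users whose leaked hash appears in the precomputed dictionary."""
    return {
        user: lookup[hash_value]
        for user, hash_value in leaked_hashes.items()
        if hash_value in lookup
    }


# Tiny illustrative word list; real lists are vastly larger.
common_passwords = ["password", "123456", "letmein", "qwerty"]
lookup = build_lookup(common_passwords)

# Hypothetical leaked database of unsalted hashes.
leaked = {
    "alice": sha256_hex("letmein"),
    "bob": sha256_hex("correct horse battery staple"),
}

print(crack(leaked, lookup))  # {'alice': 'letmein'}
```

Alice's common password falls to a single dictionary lookup, while Bob's uncommon passphrase is simply absent from the dictionary.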
Adding a salt (a fixed-length random value that is different for each password) to each user's password before hashing it helps with this problem. The salt is stored alongside the hash so that the login check can repeat the same computation. Now an attacker needs a separate dictionary for each possible salt value, thousands or more, which may be prohibitive in terms of effort. Additionally, two users with the same password will likely receive different salts and so have different hashes in the database, which prevents anyone from seeing that their passwords are the same.
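A minimal sketch of per-password salting might look like the following. The 16-byte salt length, the choice of SHA-256, and the function names are assumptions made for illustration. Note that the salt has to be kept with the hash so that verification can repeat the same computation.

```python
import hashlib
import hmac
import os

SALT_BYTES = 16  # Assumed salt length for this sketch.


def hash_with_salt(password: str, salt: bytes = None):
    """Return (salt, digest). A fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(SALT_BYTES)
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt, digest


def verify(password: str, salt: bytes, expected_digest: str) -> bool:
    """Recompute the digest with the stored salt and compare it to the stored digest."""
    _, digest = hash_with_salt(password, salt)
    return hmac.compare_digest(digest, expected_digest)


# Two users choose the same password but receive different salts.
salt_a, digest_a = hash_with_salt("password")
salt_b, digest_b = hash_with_salt("password")

print(digest_a == digest_b)                  # False
print(verify("password", salt_a, digest_a))  # True
```

Because the two stored digests differ, a single precomputed dictionary no longer matches both users, and nobody inspecting the database can tell that the passwords are identical.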
Now that we're armed with the basics of password storage, what do we do about testing it in our own applications?
First, passwords should never be stored in the clear. You shouldn't be able to see a cleartext password in the database or anywhere in the application. That includes password reminders: if the application can send you your existing password, it is storing that password in recoverable form. Instead, users should get a one-time token they can use to change their password.
Second, set the same password for two different users and compare the stored hashes. If the hashes are identical, salts are not being used, and the password database is vulnerable to a precomputed dictionary attack if someone gets hold of it.
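The same-password check can be automated. The sketch below assumes the tester has some way to create a user and to read that user's stored hash; here those are passed in as the hypothetical callables `create_user` and `get_stored_hash`. The two fake stores exist only so the example runs on its own. In a real test they would be replaced by calls to the application and queries against its database.

```python
import hashlib
import os


def salts_appear_to_be_used(create_user, get_stored_hash) -> bool:
    """Give two users the same password; distinct stored hashes suggest salting."""
    shared_password = "Same-Passw0rd-For-Both!"
    create_user("salt_probe_a", shared_password)
    create_user("salt_probe_b", shared_password)
    return get_stored_hash("salt_probe_a") != get_stored_hash("salt_probe_b")


class FakeUnsaltedStore:
    """Stand-in for an application that hashes passwords without a salt."""

    def __init__(self):
        self.rows = {}

    def create_user(self, username, password):
        self.rows[username] = hashlib.sha256(password.encode("utf-8")).hexdigest()

    def get_stored_hash(self, username):
        return self.rows[username]


class FakeSaltedStore(FakeUnsaltedStore):
    """Stand-in for an application that uses a random salt per password."""

    def create_user(self, username, password):
        salt = os.urandom(16)
        digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
        self.rows[username] = salt.hex() + "$" + digest


unsalted = FakeUnsaltedStore()
salted = FakeSaltedStore()
print(salts_appear_to_be_used(unsalted.create_user, unsalted.get_stored_hash))  # False
print(salts_appear_to_be_used(salted.create_user, salted.get_stored_hash))      # True
```

A `False` result is strong evidence that salts are missing. A `True` result is consistent with salting but does not by itself prove the scheme is sound.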
Finally, passwords should be hashed using a purpose-built password-hashing algorithm like bcrypt. Bcrypt lets you tune how much computing time is required to hash a password, so you can make guessing large quantities of passwords infeasible for an attacker while the relatively few hashing operations your application performs at login remain fast enough that users never notice.
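The idea of a tunable cost can be illustrated with Python's standard library. Note the substitution: this sketch uses PBKDF2 (`hashlib.pbkdf2_hmac`) rather than bcrypt, only because PBKDF2 ships with Python and bcrypt requires a third-party package. The iteration count plays the role of bcrypt's work factor. The default of 200,000 iterations and the function names are assumptions for illustration, not recommendations.

```python
import hashlib
import hmac
import os
import time

DEFAULT_ITERATIONS = 200_000  # Assumed work factor; tune it for your own hardware.


def hash_password(password: str, iterations: int = DEFAULT_ITERATIONS):
    """Return (salt, iterations, digest) using PBKDF2-HMAC-SHA256."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password: str, salt: bytes, iterations: int, digest: bytes) -> bool:
    """Repeat the derivation with the stored salt and cost, then compare."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, digest)


if __name__ == "__main__":
    # Raising the iteration count raises the cost of every single guess.
    for iterations in (1_000, 200_000):
        start = time.perf_counter()
        hash_password("hunter2", iterations)
        elapsed = time.perf_counter() - start
        print(f"{iterations} iterations: {elapsed:.4f}s")
```

A delay of a fraction of a second is invisible to a user logging in once, but it multiplies across the millions of guesses an attacker has to make. Storing the iteration count with each hash lets the cost be raised later without invalidating existing passwords.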