How to store salt?

  • If you expect to store user passwords securely, you need to do at least the following:

    $pwd=hash(hash($password) + salt)

    Then, you store $pwd in your system instead of the real password. I have seen some cases where $pwd contains the salt itself.

    I wonder whether the salt should be stored separately, or whether it is OK if an attacker gets the hashed value and the salt at the same time. Why?

    Well, one thing to consider with salts: if you do manage to keep them secret, and your password hashes get leaked, it becomes completely impossible (probably) to recover a password, even with 10,000 graphics cards brute forcing it.

    @Earlz That's **not** the purpose of a salt, and it's nearly impossible to keep salts hidden anyway!

    @Polynomial Well, something to consider. I have an authentication system which uses two salts for each password, one randomly generated and stored in the database, and one which is global to the website and stored in the code. My goal was to make it so that even if the database got leaked, it's still not possible to forge a login cookie or brute force the password.

    @Earlz The second salt is called a pepper, which protects you from attackers that only have access to the database. It's useful, but since you're using parameterised queries (you *are* using them, right?) SQL injection isn't a problem, so that model of attacker is much less likely.

    Heh I did not know there was a name for that, @Polynomial. And I'm pretty sure my database is secure, but just in case it's not. (MongoDB). I figure it can't hurt and it's not that hard to implement

    Forgive my ignorance, but is it really necessary to hash the password _before_ adding the salt? If you have a good hashing algorithm, it shouldn't make a difference, right?

    What about $pwd=hash(hash($password) + username)?

    @X-Zero There should be no difference at all, and it makes things more awkward when using an adaptive KDF.

    @SilverViper That's a bad idea. It gives an attacker the salt, allowing them to compute a rainbow table ahead of time, before overtly breaching your database. In such a situation, the attacker can log in immediately after the attack and you have no time to react. See my answer for a more detailed explanation of this problem.

    +1 for using salts, which is more than big websites seem to do usually.

    @JonasWielicki Yup. Some even store in plaintext. See Plaintext Offenders for examples. Another large company, not listed there, is Tesco.

    Why would you store a salt? If you could compute it from the credentials in some way, such as `$salt = base64_encode($username.$password);`, then you don't have to store it, and if the username and password are correct, it will match the salt too.

    @matejkramny That's exactly equivalent to just using the username as the salt, and bad for the same reason: if the attacker knows a username they want to break into, they can just precompute the hashes for that username. Including the password in the salt doesn't change a thing. To stop them from precomputing hashes, the salt should be stored in the database, so they can't get it before getting the hashes themselves, which means they need a lot of time to break the hashes *after* they compromise the database, which gives you a chance to change passwords before they get access.

    // , What containers are you using? Are you using Iodized or non-Iodized? You could wind up with a poisoned salt, so avoid metal containers, because salt leaches metals and/or elements out of the metal. Well regardless, ya don't need to use a food-saver, because salt will not go rancid even if it is exposed to air. For large amounts, I recommend scooping the salt into heavy duty gallon-sized plastic bags. Then you can put the bags in a food grade 5-gallon bucket. For all of you preppers out there, it can be used as a bartering item in time of need.

  • Polynomial

    Polynomial Correct answer

    9 years ago

    TL;DR - You can store the salt in plaintext without any form of obfuscation or encryption, but don't just give it out to anyone who wants it.

    The reason we use salts is to stop precomputation attacks, such as rainbow tables. These attacks involve creating a database of hashes and their plaintexts, so that hashes can be searched for and immediately reversed into plaintext.

    For example*:

    86f7e437faa5a7fce15d1ddcb9eaeaea377667b8 a
    e9d71f5ee7c92d6dc9e92ffdad17b8bd49418f98 b
    84a516841ba77a5b4648de2cd0dfcb30ea46dbb4 c
    948291f2d6da8e32b007d5270a0a5d094a455a02 ZZZZZX
    151bfc7ba4995bfa22c723ebe7921b6ddc6961bc ZZZZZY
    18f30f1ba4c62e2b460e693306b39a0de27d747c ZZZZZZ

    Most tables also include a list of common passwords:

    5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8 password
    e38ad214943daad1d64c102faec29de4afe9da3d password1
    b7a875fc1ea228b9061041b7cec4bd3c52ab3ce3 letmein
    5cec175b165e3d5e62c9e13ce848ef6feac81bff qwerty123

    *I'm using SHA-1 here as an example, but I'll explain why this is a bad idea later.

    So, if my password hash is 9272d183efd235a6803f595e19616c348c275055, it would be exceedingly easy to search for it in a database and find out that the plaintext is bacon4. So, instead of spending a few hours cracking the hash (ok, in this case it'd be a few minutes on a decent GPU, but we'll talk about this later) you get the result instantly.
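
    A minimal sketch of the lookup idea in Python, assuming unsalted SHA-1 as in the tables above. A real rainbow table uses hash chains to save space; the plain dictionary here only shows why a precomputed database makes reversal instant. The function names `build_lookup_table` and `reverse` are made up for illustration.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Unsalted SHA-1 of a string, as a hex digest."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def build_lookup_table(candidates):
    """Precompute hash -> plaintext for every candidate password."""
    return {sha1_hex(word): word for word in candidates}


def reverse(table, digest):
    """Return the plaintext for a digest, or None if it was never precomputed."""
    return table.get(digest)


# Done once by the attacker, long before any database is stolen.
table = build_lookup_table(["a", "b", "c", "password", "password1", "letmein"])

# After the breach, each stolen hash costs a single dictionary lookup.
stolen = "5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8"
print(reverse(table, stolen))  # password
```

    Anything that was not precomputed simply misses the table, which is the gap a salt exploits.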

    Obviously this is bad for security! So, we use a salt. A salt is a random unique token stored with each password. Let's say the salt is 5aP3v*4!1bN<x4i&3 and the hash is 9537340ced96de413e8534b542f38089c65edff3. Now the attacker's precomputed database is useless, because nobody has rainbow tables that include that hash. It's computationally infeasible to generate rainbow tables for every possible salt.

    So now we've forced the bad guys to start cracking the hashes again. In this case, it'd be pretty easy to crack since I used a bad password, but it's still better than him being able to look it up in a tenth of a second!

    Now, since the goal of the salt is only to prevent pre-generated databases from being created, it doesn't need to be encrypted or obscured in the database. You can store it in plaintext. The goal is to force the attacker to have to crack the hashes once he gets the database, instead of being able to just look them all up in a rainbow table.
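
    To make the storage question concrete, here is a hedged sketch of keeping the salt in plaintext right next to the hash, in the same record. It assumes PBKDF2-HMAC-SHA256 from Python's standard library as the hash; the record layout, the names `make_record` and `check_password`, and the iteration count are illustrative assumptions, not a recommendation.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; tune it for your own hardware


def make_record(password: str) -> dict:
    """Create what would be stored in the user's database row."""
    salt = os.urandom(16)  # random and unique per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # The salt is stored as-is, right next to the hash.
    return {"salt": salt.hex(), "hash": digest.hex()}


def check_password(record: dict, attempt: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    salt = bytes.fromhex(record["salt"])
    digest = hashlib.pbkdf2_hmac("sha256", attempt.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(digest.hex(), record["hash"])


record = make_record("bacon4")
print(check_password(record, "bacon4"), check_password(record, "bacon5"))
```

    The login check needs the salt to recompute the hash, which is why it has to be readable by the application and why encrypting it buys nothing.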

    However, there is one caveat. If the attacker can quietly access a salt before breaking into your database, e.g. through some script that offers the salt to anyone who asks for it, he can produce a rainbow table for that salt as easily as he could if there wasn't one. This means that he could silently take your admin account's salt and produce a nice big rainbow table, then hack into your database and immediately log in as an admin. This gives you no time to spot that a breach has occurred, and no time to take action to prevent damage, e.g. change the admin password / lock privileged accounts. This doesn't mean you should obscure your salts or attempt to encrypt them, it just means you should design your system such that the only way they can get at the salts is by breaking into the database.

    One other idea to consider is a pepper. A pepper is a second salt which is constant between individual passwords, but not stored in the database. We might implement it as H(salt + password + pepper), or KDF(password + pepper, salt) for a key-derivation function - we'll talk about those later. Such a value might be stored in the code. This means that the attacker has to have access to both the database and the sourcecode (or webapp binaries in the case of ASP.NET, CGI, etc.) in order to attempt to crack the hashes. This idea should only be used to supplement other security measures. A pepper is useful when you're worried about SQL injection attacks, where the attacker only has access to the database, but this model is (slowly) becoming less common as people move to parameterized queries. You are using parameterized queries, right? Some argue that a pepper constitutes security through obscurity, since you're only obscuring the pepper, which is somewhat true, but that's not to say that the idea is without merit.
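
    A rough sketch of the KDF(password + pepper, salt) form, assuming PBKDF2-HMAC-SHA256 as the KDF. The `PEPPER` constant and the function name `hash_with_pepper` are hypothetical; in a real deployment the pepper would come from code or configuration that lives outside the database.

```python
import hashlib
import hmac
import os

# Hypothetical pepper: the same for every user, never stored in the user table.
PEPPER = b"example-pepper-not-a-real-secret"
ITERATIONS = 100_000  # assumed work factor


def hash_with_pepper(password: str, salt: bytes, pepper: bytes = PEPPER) -> bytes:
    """KDF(password + pepper, salt): the pepper is mixed into the KDF input."""
    secret = password.encode("utf-8") + pepper
    return hashlib.pbkdf2_hmac("sha256", secret, salt, ITERATIONS)


salt = os.urandom(16)  # per-user value, stored in the database
stored = hash_with_pepper("bacon4", salt)

# With database access only (salt and hash, but no pepper), even the
# correct password guess does not reproduce the stored hash.
guess_without_pepper = hash_with_pepper("bacon4", salt, pepper=b"")
print(hmac.compare_digest(stored, guess_without_pepper))  # False
```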

    Now we're at a situation where the attacker can brute-force each individual password hash, but can no longer search for all the hashes in a rainbow table and recover plaintext passwords immediately. So, how do we prevent brute-force attacks now?

    Modern graphics cards include GPUs with hundreds of cores. Each core is very good at mathematics, but not very good at decision making. It can perform billions of calculations per second, but it's pretty awful at doing operations that require complex branching. Cryptographic hash algorithms fit into the first type of computation. As such, frameworks such as OpenCL and CUDA can be leveraged in order to massively accelerate the operation of hash algorithms. Run oclHashcat with a decent graphics card and you can compute in excess of 10,000,000,000 MD5 hashes per second. SHA-1 isn't much slower, either. There are people out there with dedicated GPU cracking rigs containing 6 or more top-end graphics cards, resulting in a cracking rate of over 50 billion hashes per second for MD5. Let me put that in context: at 50 billion guesses per second, such a system can exhaust every 8 character password made of lowercase letters and digits (36^8, about 2.8 trillion candidates) in under a minute, and every mixed-case alphanumeric one (62^8, about 218 trillion candidates) in a little over an hour.
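
    A quick back-of-the-envelope check of that kind of figure, assuming the 50 billion guesses per second quoted above and a worst-case exhaustive search over passwords of exactly 8 characters. The result depends heavily on which alphabet is assumed, so the sketch computes two; the function name is made up for illustration.

```python
RATE = 50_000_000_000  # guesses per second, the MD5 rig figure quoted above


def seconds_to_exhaust(alphabet_size: int, length: int, rate: int = RATE) -> float:
    """Worst-case time to try every password of exactly `length` characters."""
    return alphabet_size ** length / rate


lower_and_digits = seconds_to_exhaust(36, 8)  # a-z plus 0-9
mixed_case = seconds_to_exhaust(62, 8)        # a-z, A-Z plus 0-9

print(f"36 symbols: {lower_and_digits:.0f} s")    # 36 symbols: 56 s
print(f"62 symbols: {mixed_case / 60:.0f} min")   # 62 symbols: 73 min
```

    Either way the search finishes within an afternoon, which is the point being made about fast hashes.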

    Clearly hashes like MD5 and SHA-1 are way too fast for this kind of situation. One approach to this is to perform thousands of iterations of a cryptographic hash algorithm:

    hash = H(H(H(H(H(H(H(H(H(H(H(H(H(H(H(...H(password + salt) + salt) + salt) ... )
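
    Written out as a loop, the nested expression above might look like the following sketch. SHA-256 stands in for H purely as an assumption, and `rounds` is a hypothetical parameter name for the number of H applications.

```python
import hashlib


def iterated_hash(password: bytes, salt: bytes, rounds: int) -> bytes:
    """hash = H(...H(H(password + salt) + salt)... + salt), with `rounds` calls to H."""
    digest = hashlib.sha256(password + salt).digest()
    for _ in range(rounds - 1):
        # Each further round hashes the previous digest together with the salt.
        digest = hashlib.sha256(digest + salt).digest()
    return digest


slow = iterated_hash(b"bacon4", b"5aP3v*4!1bN<x4i&3", rounds=10_000)
print(slow.hex())
```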

    This slows down the hash computation, but isn't perfect. Some advocate using SHA-2 family hashes, but this doesn't provide much extra security. A more solid approach is to use a key derivation function with a work factor. These functions take a password, a salt and a work factor. The work factor is a way to scale the speed of the algorithm against your hardware and security requirements:

    hash = KDF(password, salt, workFactor)
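
    One way to read the work factor, sketched with PBKDF2-HMAC-SHA256 where it is simply the iteration count. Storing the work factor next to the salt and hash lets it be raised later; the record layout, the numbers and the `needs_rehash` helper are assumptions made for illustration.

```python
import hashlib
import os

CURRENT_WORK_FACTOR = 200_000  # assumed target; raise it as hardware gets faster


def kdf(password: str, salt: bytes, work_factor: int) -> bytes:
    """hash = KDF(password, salt, workFactor), here via PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, work_factor)


def needs_rehash(stored_work_factor: int) -> bool:
    """True if a record was created with a weaker work factor than today's target."""
    return stored_work_factor < CURRENT_WORK_FACTOR


salt = os.urandom(16)
old_record = {"salt": salt, "work_factor": 50_000, "hash": kdf("bacon4", salt, 50_000)}

# On the next successful login the plaintext is available again, so the
# record can be upgraded to the current work factor.
new_record = old_record
if needs_rehash(old_record["work_factor"]):
    new_record = {
        "salt": salt,
        "work_factor": CURRENT_WORK_FACTOR,
        "hash": kdf("bacon4", salt, CURRENT_WORK_FACTOR),
    }
```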

    The two most popular KDFs are PBKDF2 and bcrypt. PBKDF2 works by performing many iterations of a pseudorandom function, usually HMAC keyed with the password, and bcrypt works by running a deliberately expensive key setup for the Blowfish block cipher many times over. Both do roughly the same job. A newer KDF called scrypt works on the same principle, but adds a memory-hard operation that makes cracking on GPUs and FPGA farms far more expensive, due to memory size and bandwidth restrictions.
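
    For reference, scrypt is exposed in Python's standard library as `hashlib.scrypt`, provided the interpreter was built against OpenSSL 1.1 or newer. The sketch below assumes illustrative parameter values, not tuned recommendations.

```python
import hashlib
import os


def scrypt_hash(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte hash with scrypt; n, r and p are illustrative values."""
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=2**14,  # CPU/memory cost, must be a power of two
        r=8,      # block size; memory use is roughly 128 * n * r bytes (16 MiB here)
        p=1,      # parallelisation factor
        dklen=32,
    )


salt = os.urandom(16)
print(scrypt_hash("bacon4", salt).hex())
```

    The memory requirement set by `n` and `r` is what makes each guess expensive on hardware with many cores but little fast memory per core.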

    Update: As of January 2017, the state-of-the-art hashing algorithm of choice is Argon2, which won the Password Hashing Competition.

    Hopefully this gives you a nice overview of the problems we face when storing passwords, and answers your question about salt storage. I highly recommend checking out the "links of interest" at the bottom of Jacco's answer for further reading.

    +1, Amen. I wish we were able to create an index/directory that points to the good answers to common/popular questions such as this one. Even more so, because the common/popular questions regularly also attract quite insecure answers.

    I was reading about Scrypt the other day, which is a KDF which aims to be more computationally expensive than Bcrypt and PBKDF2. It may be another option worth considering.

    @GarrettAlbright I mentioned scrypt in the second-to-last paragraph. The only downside with Scrypt is that there aren't as many libraries available, and there hasn't been much study of it. Right now I'd suggest bcrypt, since FPGA-based cracking engines are not a security concern for most people, but when scrypt has been studied carefully by a few cryptographers I'd be happy to tell people to use it instead.

    +1 for linking to **Links of interest** section of @Jacco's answer.

    However, if you implement a second factor such as an SMS code, then you could even store your passwords in cleartext, because attackers still have to get through the SMS code.

    @matejkramny No. The primary issue is that users will ALWAYS re-use their passwords for other sites. You have a duty of care to protect any passwords given to you.

    What is the reason to always add the salt again when re-hashing the password?

    @cooky451 I'm not sure what you're asking, but I suggest creating a new question.

    bcrypt also happens to be the easiest to use from a programmer's perspective. You don't even have to come up with a salt yourself because the crypto library will do it for you, since the output includes the salt (it's in the form `$<version>$<cost>$<22 character salt><31 character hash>`). So all the information that is needed is stored in what you can think of as the hash. Doesn't get easier than this (while still being highly secure).
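
    To illustrate how the salt travels inside the bcrypt output, here is a small sketch that splits such a string into its parts. It assumes the common layout of a version tag, a cost, then 22 salt characters followed by 31 hash characters; the `split_bcrypt` helper is hypothetical and the example value is a placeholder with the right shape, not a real bcrypt hash.

```python
from typing import NamedTuple


class BcryptParts(NamedTuple):
    version: str
    cost: int
    salt: str
    digest: str


def split_bcrypt(encoded: str) -> BcryptParts:
    """Split a bcrypt string of the form $<version>$<cost>$<salt><hash>."""
    _, version, cost, rest = encoded.split("$")
    if len(rest) != 53:
        raise ValueError("expected 22 salt characters followed by 31 hash characters")
    return BcryptParts(version, int(cost), rest[:22], rest[22:])


# Placeholder with the right shape only; this is NOT a real bcrypt hash.
example = "$2b$12$" + "s" * 22 + "h" * 31
parts = split_bcrypt(example)
print(parts.version, parts.cost, len(parts.salt), len(parts.digest))  # 2b 12 22 31
```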

    Just for emphasis: Make sure you use a ***different*** salt for each password! If you use the same salt for every password, the computational complexity of the brute force algorithm is significantly reduced. Furthermore, an attacker could see which users use the same password just by comparing the hashes, which could have a deanonymizing effect on users who reuse passwords across multiple accounts.
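
    A small demonstration of the second point, assuming two hypothetical users who picked the same password. SHA-256 is used only to keep the demo fast; a real system would use a slow KDF.

```python
import hashlib
import os


def salted_hash(password: str, salt: bytes) -> str:
    # Fast hash for demonstration purposes only.
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# One salt shared by everyone: equal passwords give equal hashes.
shared_salt = os.urandom(16)
alice_shared = salted_hash("bacon4", shared_salt)
bob_shared = salted_hash("bacon4", shared_salt)

# A fresh salt per password: equal passwords are no longer visible.
alice_unique = salted_hash("bacon4", os.urandom(16))
bob_unique = salted_hash("bacon4", os.urandom(16))

print(alice_shared == bob_shared)  # True
print(alice_unique == bob_unique)  # False
```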

    Bouncycastle has a scrypt implementation.

    Really good answer. But you still didn't answer the question "where do I store the random salt"? You need that random salt to check if the entered password is correct... am I wrong?

    "This means that the attacker has to have access to both the database and the sourcecode" - not true for that last part. In many (if not most) cases strings are not encrypted. A hacker can easily load an executable in a hex editor and see most of the strings. They can also step through the assembly code to read values as well. In many cases source code is not required, and thus recovering any salt and pepper value stored or calculated on the client side (i.e. desktop app) is not impossible; just hard. ;) Don't store them in plain text. Obfuscate these types of strings to make finding them harder.

    @JamesWilkins Yes, I was thinking of PHP/Python web applications when I wrote this. I'll edit to reflect the more diverse case where the web app is a .NET or CGI binary.

    –1 — too much text, and doesn't actually answer the question on where to store salt (although I stopped reading midway due to the text being irrelevant to the actual question)

Licensed under CC-BY-SA with attribution

Content dated before 7/24/2021 11:53 AM