When to use HMAC alongside AES?
One of my clients wants to provide a URL to each of his customers to register in a system. This URL would contain a query string parameter with some data about the customer (e.g. code, email and name) encrypted using AES with CBC, similar to this (IV is in bold):
When a user opens this page, the system would decrypt the parameter and check whether the data is valid (e.g. whether the customer's code exists in an external database and is not already registered) before showing him the registration form.
I've seen some people using HMAC alongside AES to encrypt data, but I don't know if this is needed in a case like this. My question is: Is this secure enough? Should I use something like HMAC along with the plaintext before encrypting it with AES to authenticate the data?
AES is encryption; it is meant to maintain confidentiality. Encryption does not maintain integrity by itself: an attacker who can access encrypted data can modify the bytes, thereby impacting the cleartext data (though the encryption makes the task a bit harder for the attacker, it is not as infeasible as is often assumed).
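To make the malleability concrete, here is a minimal sketch of how a CBC ciphertext can be altered without the key. It relies only on the CBC decryption rule P1 = D_K(C1) XOR IV. No real block cipher is run: the value D_K(C1) is modeled directly, and the plaintext layout (`code=1001;name=A`) is a hypothetical example, not the format from the question.

```python
import os

BLOCK_SIZE = 16  # AES block size in bytes


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


original = b"code=1001;name=A"  # hypothetical 16-byte first plaintext block
target = b"code=1002;name=A"    # what the attacker wants the server to see

iv = os.urandom(BLOCK_SIZE)

# CBC encryption computes C1 = E_K(P1 XOR IV), so decryption recovers
# P1 = D_K(C1) XOR IV. D_K(C1) is modeled here as P1 XOR IV; the block
# cipher itself is an assumption left out of this sketch.
intermediate = xor_bytes(original, iv)

# The attacker does not know the key. Knowing (or guessing) his own
# plaintext, he only changes the IV that travels in the URL.
forged_iv = xor_bytes(iv, xor_bytes(original, target))

# What the server obtains for the first block after decrypting with forged_iv.
forged_plaintext = xor_bytes(intermediate, forged_iv)
```

Here the first ciphertext block is untouched, yet the server decrypts it to the attacker's chosen block. A customer knows his own code, email and name, so guessing the plaintext is realistic in this scenario.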
To get integrity, you need a MAC, and HMAC is a nice MAC algorithm.
In many situations where encryption is mandated, integrity must also be maintained, so, as a general rule, AES "alone" is not sufficient. In your case, the potential attackers are the customers themselves; each customer may try to alter the URL so as to access the data of somebody else, or to be able to register under another name, or whatnot. As I understand it, your "client" wants to remain the master of registration; he wants to decide which customer can register and under what name. Checked integrity is thus necessary.
There are several ways to combine AES-based encryption with HMAC; most of them are bad. See this question for some discussion on the subject. To make the story short:
If you can, use GCM or some other mode which does all the hard work of combining encryption and MAC safely.
If you cannot use GCM (for lack of support in your server-side programming framework), then you must do things old-style:
- Hash the key K with SHA-256 so as to get 256 bits of "key material". Split that into two halves: 128 bits for encryption (Ke), 128 bits for MAC (Km).
- Generate a random IV of 128 bits. You need a new one every time you encrypt, and you want to generate it with a strong PRNG (
- Pad the data (usual PKCS#5 padding) so that its length is a multiple of the AES block size (16 bytes).
- Encrypt the data with AES in CBC mode, using the IV generated just above, and Ke as key. Let's call C the resulting ciphertext.
- Compute HMAC/SHA-256 with key Km over the concatenation of IV and C, in that order. Call M the resulting value. It is crucial that the IV is part of the input to HMAC.
- Concatenate IV, C and M, in that order. This is your "registration key".
- When receiving a registration request, first verify the HMAC (by recomputing it), then (and only then) proceed to the decryption step.
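The MAC-related steps above can be sketched with the Python standard library alone. This is a rough illustration, not a complete implementation: the AES-CBC encryption and the padding are assumed to be done by a separate library and are not shown, so the ciphertext C is simply passed in as bytes. The function names (`derive_keys`, `seal`, `open_sealed`) are made up for this example.

```python
import hashlib
import hmac
import os
from typing import Tuple

IV_LEN = 16     # AES block size in bytes
BLOCK_LEN = 16  # ciphertext length must be a multiple of this
MAC_LEN = 32    # HMAC/SHA-256 output size in bytes


def derive_keys(master_key: bytes) -> Tuple[bytes, bytes]:
    """Hash K with SHA-256 and split into Ke (encryption) and Km (MAC)."""
    digest = hashlib.sha256(master_key).digest()
    return digest[:16], digest[16:]


def new_iv() -> bytes:
    """Fresh random IV from the operating system's CSPRNG."""
    return os.urandom(IV_LEN)


def seal(km: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Compute M = HMAC(Km, IV || C) and return IV || C || M."""
    tag = hmac.new(km, iv + ciphertext, hashlib.sha256).digest()
    return iv + ciphertext + tag


def open_sealed(km: bytes, token: bytes) -> Tuple[bytes, bytes]:
    """Verify the MAC first; only then hand back (IV, C) for decryption."""
    body_len = len(token) - IV_LEN - MAC_LEN
    if body_len < BLOCK_LEN or body_len % BLOCK_LEN != 0:
        raise ValueError("malformed registration key")
    iv = token[:IV_LEN]
    ciphertext = token[IV_LEN:-MAC_LEN]
    tag = token[-MAC_LEN:]
    expected = hmac.new(km, iv + ciphertext, hashlib.sha256).digest()
    # Constant-time comparison, so the check does not leak how many bytes match.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("MAC verification failed")
    return iv, ciphertext


# Usage: the 32 random bytes stand in for a real AES-CBC ciphertext under Ke.
ke, km = derive_keys(b"example master key K")
iv = new_iv()
fake_ciphertext = os.urandom(32)
token = seal(km, iv, fake_ciphertext)
recovered_iv, recovered_c = open_sealed(km, token)
```

Since the token ends up in a query string, it would also need a URL-safe encoding (for instance `base64.urlsafe_b64encode`) before being handed to customers.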
Of course, all of this assumes that there is a key K, that your client can use to generate the registration keys for the customers, and that your server also knows in order to verify and decrypt incoming registration requests. As with all keys, be careful where you store it.
It's a minor difference, but shouldn't that be PKCS#7 padding, rather than PKCS#5? Looking at the PKCS#5 spec, it seems to be specific to 8-byte blocks rather than AES's 16.
The _concept_ of that padding was described in PKCS#5 first, but PKCS#7 worded it out in a more generic way. At the level of this discussion, "PKCS#5" and "PKCS#7" can be used interchangeably.
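To illustrate why the two names are interchangeable here, a small sketch of the padding with the block size as a parameter: 8 gives the case described in PKCS#5, 16 the one needed for AES. The function names are invented for this example.

```python
def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    """Append n bytes of value n, where n brings the length to a full block."""
    if not 1 <= block_size <= 255:
        raise ValueError("block size must be between 1 and 255")
    n = block_size - (len(data) % block_size)  # always 1..block_size
    return data + bytes([n]) * n


def pkcs7_unpad(padded: bytes, block_size: int = 16) -> bytes:
    """Check and strip the padding; raise ValueError if it is malformed."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("invalid padded length")
    n = padded[-1]
    if n < 1 or n > block_size or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return padded[:-n]
```

Input that is already a multiple of the block size still gets a full extra block of padding, so unpadding is never ambiguous. In the scheme from the answer, unpadding should only happen after the HMAC has been verified.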
Can we use the hash of the key bytes directly, or should it go through a KDF? And could we instead use a SHA-512 hash to create 2 x 256-bit keys for Ke/Km?
Strictly speaking, what is needed here is a KDF, but a good hash function can work as a KDF as long as the total length you require is no more than the hash function output length. If you want 512 bits worth of derived keys, then SHA-512 can give you that. (Of course, 256-bit keys are no better than 128-bit keys, so this is moot.)
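As a sketch of the SHA-512 variant mentioned above (the function name is made up for this example): the 64-byte digest is cut into two 256-bit halves, one for encryption and one for the MAC.

```python
import hashlib
from typing import Tuple


def derive_keys_sha512(master_key: bytes) -> Tuple[bytes, bytes]:
    """Use SHA-512 as a simple KDF: first 32 bytes for Ke, last 32 for Km."""
    digest = hashlib.sha512(master_key).digest()  # 64 bytes
    return digest[:32], digest[32:]


ke256, km256 = derive_keys_sha512(b"example master key K")
```

This only works because the 512 bits requested do not exceed the hash output length; for more key material than that, a real KDF would be needed.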
Why is the order of concatenation important when computing the HMAC and creating the "registration key"?
@ThomasPornin I've implemented your 'old-style' method almost as outlined. Seems to work well. TY. I've added one extra step and am unsure whether it has any benefit: instead of AES-encrypting just the message, I encrypt the IV (string version) + data, then continue as normal with HMAC over IV + C. On decrypting I first verify the HMAC, then decrypt, then strip off the inner IV and verify that it matches the one at the front of the original message. Belt and braces, extra salt, unnecessary, or actually introducing a potential weakness? Thanks again!