How exactly does the OpenSSL TLS heartbeat (Heartbleed) exploit work?
I've been hearing more about the OpenSSL Heartbleed attack, which exploits some flaw in the heartbeat step of TLS. If you haven't heard of it, it allows people to:
- Steal OpenSSL private keys
- Steal OpenSSL secondary keys
- Retrieve up to 64 KB of memory per request from the affected server
- As a result, decrypt all traffic between the server and client(s)
The commit to OpenSSL which fixes this issue is here
I'm a bit unclear - everything I've read contains information about what one should do about it, but not how it works. So, how does this attack work?
Another useful blog article addressing the Heartbleed bug can be found here: http://cloudishvps.com/linux/openssl-heartbleed-bug-a-quick-explanation-on-the-recent-security-issue-and-the-fix/ It also gives the steps to upgrade OpenSSL on CentOS and Ubuntu.
Zulfikar Ramzan (CTO of cloud security firm Elastica) made this video, which does a great job of explaining the bug at a pretty high level. He also does a lot of videos for Khan Academy. Vimeo: "OpenSSL Heartbeat (Heartbleed) Vulnerability (CVE-2014-0160) and its High-Level Mechanics". Thanks to Greg Kumparak of TechCrunch for the link.
I was just curious about how the exploit works, and the video explains that perfectly. You should definitely check it out.
This is not a flaw in TLS; it is a simple memory safety bug in OpenSSL.
In short, Heartbeat allows one endpoint to go "I'm sending you some data, echo it back to me". You send both a length figure and the data itself. The length figure can be up to 64 KiB. Unfortunately, if you use the length figure to claim "I'm sending 64 KiB of data" (for example) and then only really send, say, one byte, OpenSSL would send you back your one byte -- and 64 KiB (minus one) of other data from RAM.
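The over-read can be illustrated with a toy simulation. This is a sketch in Python, not OpenSSL's code: the flat `memory` bytearray stands in for the process heap, and `handle_heartbeat_vulnerable`, the offsets, and the secret placed next to the request are all hypothetical, invented purely for illustration.

```python
# Toy model: one flat bytearray stands in for the server process's heap.
memory = bytearray(256)

# The attacker's heartbeat payload really is just one byte long...
request_payload = b"A"
request_offset = 16
memory[request_offset:request_offset + len(request_payload)] = request_payload

# ...and something sensitive happens to live right after it (hypothetical data).
secret = b"PRIVATE-KEY-MATERIAL"
secret_offset = request_offset + len(request_payload)
memory[secret_offset:secret_offset + len(secret)] = secret


def handle_heartbeat_vulnerable(offset: int, claimed_length: int) -> bytes:
    """Echo back `claimed_length` bytes, trusting the sender's length field."""
    return bytes(memory[offset:offset + claimed_length])


# Honest request: claims 1 byte, gets 1 byte back.
honest = handle_heartbeat_vulnerable(request_offset, 1)

# Malicious request: sent 1 byte but claims 64, so the reply spills into
# whatever happens to sit next to the request in memory.
leaked = handle_heartbeat_vulnerable(request_offset, 64)
print(honest)
print(leaked[:21])
```

The handler never compares the claimed length with the number of bytes that actually arrived, which is the whole bug in miniature.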
This allows the other endpoint to get random portions of memory from the process using OpenSSL. An attacker cannot choose which memory, but if they try enough times, their request's data structure is likely to wind up next to something interesting, such as your private keys, or users' cookies or passwords.
None of this activity will be logged anywhere, unless you record, like, all your raw TLS connection data.
The above xkcd comic does a nice job illustrating the issue.
Edit: I wrote in a comment below that the heartbeat messages are encrypted. This is not always true. You can send a heartbeat early in the TLS handshake, before encryption has been turned on (though you're not supposed to). In this case, both the request and response will be unencrypted. In normal usage, heartbeats ought to always be sent later, encrypted, but most exploit tools will probably not bother to complete the handshake and wait for encryption. (Thanks, RedBaron.)
`memcpy` with an unchecked length parameter supplied by the user: one of the most common security mistakes in C. It's a sad miracle that this went unnoticed for so long in software as widely deployed as OpenSSL.
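The missing check can be sketched roughly as follows. This is a Python illustration of the idea, not the actual C patch, and `parse_heartbeat` is a hypothetical helper. The message layout assumed here (1-byte type, 2-byte big-endian payload length, payload, at least 16 bytes of padding) is the one defined in RFC 6520.

```python
import struct

HEADER_LEN = 3      # 1-byte message type + 2-byte payload length
MIN_PADDING = 16    # RFC 6520 requires at least 16 bytes of padding


def parse_heartbeat(record: bytes):
    """Return (msg_type, payload), or None if the claimed length does not fit."""
    if len(record) < HEADER_LEN + MIN_PADDING:
        return None  # too short to be a valid heartbeat message
    msg_type, payload_length = struct.unpack("!BH", record[:HEADER_LEN])
    # The crucial check: the claimed payload must fit inside what was received.
    if HEADER_LEN + payload_length + MIN_PADDING > len(record):
        return None  # silently discard the message
    return msg_type, record[HEADER_LEN:HEADER_LEN + payload_length]


# Well-formed request: 5-byte payload plus 16 bytes of padding.
good = struct.pack("!BH", 1, 5) + b"hello" + bytes(16)
# Malicious request: claims 16384 bytes of payload but carries only 1.
bad = struct.pack("!BH", 1, 0x4000) + b"A" + bytes(16)

print(parse_heartbeat(good))
print(parse_heartbeat(bad))
```

Only after the claimed length has been validated against the received record is it safe to copy that many bytes into the response.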
@Philipp It wasn't noticed *and reported* until now. It's plausible that sharp-eyed intelligence agencies have been exploiting it for the last two years.
@MattNordhoff: Won't the 64 KB of data that is sent also be encrypted, in the case of a heartbeat response?
@ArunMu Yes, it will be encrypted. You can rest assured that attackers will be downloading your RAM *securely*. :-P Except from other attackers who have already stolen your keys.
@MattNordhoff: Yeah, this is what is confusing. How can one just steal the key using this attack?
@ArunMu It allows an attacker to connect to your TLS server and download the contents of much or all of the memory being used by the victim process, which will include things like private keys and HTTP cookies from other in-flight requests.
@MattNordhoff, why will users' cookies and passwords even be stored on RAM? Wouldn't that section of memory be wiped immediately after use?
@Pacerier Maybe, maybe not. But it sure won't be wiped *during* use. That stuff has to be in the server's memory for at least a moment.
@MattNordhoff, let's assume we have a server that wipes it straight after use. Since the attacker is basically getting random data, doesn't that make it almost impossible for an attacker to grab those passwords/cookies within that small timeframe?
This is actually a really great answer that is both easy to understand and technically accurate. To mangle an old saying: "Never hold out for a long answer when a diminutive one will do".
While it is not a flaw in the TLS extension or the TLS protocol, the TLS specification is still somewhat responsible. The layering of messages inside records, and the fact that you typically have multiple length specifications inside those records, is a very fragile protocol design and asks for trouble. It is even worse when implementations do not abstract the segmentation and parsing away with safe helper methods (so all extension parsers need to reinvent the wheel).
@Philipp: Why doesn't security software routinely zero buffers as a matter of course? Performance impact should be pretty minor compared with other costs associated with encryption, etc.
@supercat I don't think the actual buffers are the problem here. The network buffers get overwritten, so when you extract memory it's only the current stuff in memory. And you cannot overwrite stuff you need later on (like cached sessions). In fact, zeroing is quite a high overhead for web servers. I think, however, that more static allocation of SSL record buffers would not only improve performance but also reduce leaking of random system memory (especially initially allocated "system parameter" memory like the key).
@eckes: Are you saying incoming message gets stored in a buffer of its actual length, and the system sends out that buffer plus whatever follows in memory, meaning it's an array-bounds error rather than an uninitialized memory error? If so, that just goes to show array bounds should be checked too.
@supercat No, the OpenSSL layer which processes records will allocate and store the SSL record based on the (first) record length. This stored record is handed to the extension processing, where another length identifier inside the record is used, and the latter one happens to have a larger number. So the `memcpy` starts at the beginning of the valid record and includes data after it. I suspect it's a malloc'ed buffer (even though that's a weird thing for high-performance network code). Check the diagnosis link above.
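To make the two length fields concrete, here is a sketch of what such a record could look like on the wire. The values are illustrative and `build_heartbeat_record` is a hypothetical helper, not taken from any real exploit tool. The field layout assumed here follows the TLS record format and RFC 6520 (content type 24 for heartbeat, message type 1 for a request).

```python
import struct

CONTENT_TYPE_HEARTBEAT = 24   # TLS record content type for heartbeat
HEARTBEAT_REQUEST = 1         # heartbeat message type: request
TLS_1_1 = 0x0302              # protocol version field, chosen for illustration


def build_heartbeat_record(claimed_length: int, payload: bytes) -> bytes:
    """Build a record whose inner payload length may disagree with reality."""
    message = struct.pack("!BH", HEARTBEAT_REQUEST, claimed_length) + payload
    # The outer (record) length is honest: it counts the bytes actually sent.
    header = struct.pack("!BHH", CONTENT_TYPE_HEARTBEAT, TLS_1_1, len(message))
    return header + message


# No payload at all, yet the inner length field claims 16384 bytes.
record = build_heartbeat_record(claimed_length=0x4000, payload=b"")
outer_length = struct.unpack("!H", record[3:5])[0]
inner_length = struct.unpack("!H", record[6:8])[0]
print(record.hex(), outer_length, inner_length)
```

The record layer is told the message is 3 bytes long, while the heartbeat parser is told to expect a 16384-byte payload; the vulnerable code believed the second number without comparing it to the first.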
Great explanation, but I'm confused about one thing. Many bug PoCs or test programs send the heartbeat packet in unencrypted form (this and exploit-db), but you mention that this communication is encrypted. It looks like the online test tool actually encrypts this communication. I am confused about how OpenSSL knows whether the HB will be encrypted or unencrypted.
@RedBaron I glanced at the second tool, and I think it's sending the heartbeat very early in the connection, during the handshake, before the cryptographic parameters have been established and the connection has started to be encrypted. The RFC says "a HeartbeatRequest message SHOULD NOT be sent during handshakes", but it is valid. So I was somewhat wrong: the heartbeat may or may not be encrypted. OpenSSL "knows" whether it's encrypted because it knows the current status of the TLS connection -- it has to, or TLS wouldn't work at all.
Actually, now that I've thought about it, it has become clearer. The PoCs never complete the TLS handshake (ChangeCipherSpec et al.), so any further TLS communication has to be plain text, while the online test tool completes the entire handshake, so the HB is encrypted. However, the Heartbeat RFC explicitly states that HB messages SHOULD NOT be sent during the handshake. Isn't OpenSSL actually not complying with the RFC by accepting HB messages before the entire handshake has been completed? Not that it makes any difference for the bug, but it sure makes exploitation a lot easier.
@matt Heh, telepathic timing. But the RFC also states that any implementation should discard such HB packets received during the handshake.
@RedBaron Mid-air collision! The RFC says a heartbeat "SHOULD NOT" be sent during the handshake and that the receiver "SHOULD" drop it. An implementation is allowed to disobey a "SHOULD" -- otherwise it would be a "MUST".
So because the data is in an allocated block, the data is not data currently in use. I am surprised there is no wiping of that data at the time it's freed, like a .NET `SecureString` does.