In the past couple of years, large swaths of the security community have been getting excited about trusted execution environments (TEEs), including secure enclaves and confidential virtual machines like AMD’s Secure Encrypted Virtualization (SEV). Compared to other privacy-enhancing technologies like homomorphic encryption and zero-knowledge proofs, TEEs represent a much faster, cheaper, and more efficient way to secure data. This makes TEEs a viable solution for securing large, complex workloads. People have proposed using TEEs to secure AI workloads, to accelerate blockchain rollups, and to enable data cleanrooms. But TEEs aren’t as secure as we might want to believe.
Unlike cryptographic privacy-enhancing technologies, TEEs like SEV aren’t mathematically guaranteed to be secure. They’re only secure because a bunch of hardware security engineers worked hard to find every vulnerability and protect against it… but sometimes they miss things. Earlier this year, we talked about BadRAM, which tampers with DRAM modules to enable unauthorized access to secure memory. There are a large number of speculative execution vulnerabilities, which exploit performance-enhancing processor features to leak data access patterns. And that’s not even mentioning physical side-channel attacks like power analysis, which require physical access to a chip but can leak cryptographic keys.
Well, there’s a new TEE vulnerability in town. The Heracles attack on AMD’s SEV technology enables a malicious hypervisor to read secure memory from a protected guest virtual machine. Essentially, this means that a malicious cloud provider could access secret customer data running in TEEs on their hardware. Today, we’re going to walk through the specifics of the attack, how to stop it, and what it means for TEE design in the future.
How do TEEs encrypt memory?
Trusted execution environments let user processes define different memory regions as enclaves, which can only be accessed by that specific process. This prevents other processes on the system from accessing or tampering with a user’s secret data. Most TEEs also combine hardware and software isolation to prevent processes from accessing each other’s memory regions. But encrypting large amounts of active memory is a difficult problem.
To make a TEE work efficiently, large blocks of data need to be securely encrypted without significantly impacting the latency of accessing and using that data. At the same time, a user needs to be able to read and write any chunk of data anywhere in memory without too much overhead, which is the same random-access problem that disk encryption faces. Many common block cipher modes, like cipher block chaining (CBC), encrypt data such that each ciphertext block depends on the previous one. This means that changing a single block forces everything after it to be re-encrypted. There are workarounds, like encrypting each sector of a disk independently, but they require storing a kind of metadata called an initialization vector for each sector.
AMD SEV uses a block cipher mode called Xor-Encrypt-Xor (XEX) for memory encryption. XEX is well suited to encrypting large chunks of randomly accessed data. Both XEX encryption and decryption process each block independently, which means that both can be parallelized to keep latency low. XEX also avoids stored initialization vectors by deriving a tweak value from the address of the data in memory and XORing it into each block before and after encryption. That way, a user can read any chunk of memory they want, independently of other blocks.
This address-based tweak is important, as it prevents an adversary from copying a user’s encrypted data to a different chunk of memory and then decrypting it. Because the copied data is at a different address, it will decrypt with a different tweak, and turn into gibberish. But the Heracles attack breaks that assumption using some clever memory-swapping tricks.
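To make the address-based tweak concrete, here’s a minimal Python sketch of XEX-style encryption. It is not AMD’s implementation: a toy Feistel cipher built from HMAC-SHA256 stands in for AES, and the `tweak` function, the key values, and the addresses are all made up for illustration. It shows that a block decrypts correctly at its own address, but turns into gibberish if the ciphertext is copied to a different address.

```python
import hashlib
import hmac

BLOCK = 16  # bytes, matching the AES block size


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher (a stand-in for AES).
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[: BLOCK // 2]


def toy_encrypt(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def toy_decrypt(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def tweak(tweak_key: bytes, addr: int) -> bytes:
    # Hypothetical tweak derivation: any keyed function of the address
    # works for this sketch. The real hardware uses its own function.
    return hmac.new(tweak_key, addr.to_bytes(8, "little"), hashlib.sha256).digest()[:BLOCK]


def xex_encrypt(key: bytes, tweak_key: bytes, addr: int, block: bytes) -> bytes:
    t = tweak(tweak_key, addr)
    return _xor(toy_encrypt(key, _xor(block, t)), t)  # xor, encrypt, xor


def xex_decrypt(key: bytes, tweak_key: bytes, addr: int, block: bytes) -> bytes:
    t = tweak(tweak_key, addr)
    return _xor(toy_decrypt(key, _xor(block, t)), t)


KEY, TWEAK_KEY = b"k" * 16, b"t" * 16  # illustrative keys only
secret = b"sixteen byte msg"           # exactly one 16-byte block

ct = xex_encrypt(KEY, TWEAK_KEY, 0x1000, secret)
same_addr = xex_decrypt(KEY, TWEAK_KEY, 0x1000, ct)  # decrypt in place
moved = xex_decrypt(KEY, TWEAK_KEY, 0x2000, ct)      # decrypt a copy elsewhere

print(same_addr == secret)  # the block decrypts at its own address
print(moved == secret)      # the relocated copy does not
```

Notice that nothing in `xex_encrypt` depends on time or on a counter. Encrypting the same block at the same address always gives the same ciphertext, which is exactly the property that matters for the rest of this post.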
How does Heracles work?
Heracles relies on the fact that the hypervisor, which is the piece of software that manages virtualization on a computer, is capable of moving user data around in memory. When the hypervisor does this, the memory controller re-encrypts the data with a new tweak, based on the new memory address. But while the tweaks vary from address to address, they don’t change over time, so the same plaintext at the same address always produces the same ciphertext.
To perform the attack, a malicious hypervisor sets up two chunks (pages) of memory: one controlled by the victim, and one controlled by the hypervisor. The attacker observes the encrypted value stored in the user’s memory. Then, they write a guess at the plaintext into their own memory, and swap it with the user’s. Finally, the attacker observes the encrypted value of their guess, which is now located at the same address where the user’s memory just was. If the encrypted value of the attacker’s memory matches the old encrypted value of the user’s memory, the two must have had the same plaintext. If the attacker continues on this way block by block, they can reconstruct the entire chunk of secure user memory.
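The steps above can be sketched as a small simulation. This is a toy model, not the real attack code: the `EncryptedMemory` class, its method names, and the addresses are hypothetical, and a truncated HMAC stands in for the hardware’s encryption. The only property it models is the one the attack needs, namely that the ciphertext is a fixed function of the address and the plaintext.

```python
import hashlib
import hmac
import os


class EncryptedMemory:
    """Toy model of deterministic, address-tweaked memory encryption."""

    def __init__(self) -> None:
        self._key = os.urandom(16)  # never visible to the hypervisor
        self._plain = {}            # addr -> 16-byte plaintext block

    def write(self, addr: int, block: bytes) -> None:
        assert len(block) == 16
        self._plain[addr] = block

    def read_ciphertext(self, addr: int) -> bytes:
        # The hypervisor can read raw memory, but only sees ciphertext.
        msg = addr.to_bytes(8, "little") + self._plain[addr]
        return hmac.new(self._key, msg, hashlib.sha256).digest()[:16]

    def swap(self, a: int, b: int) -> None:
        # Moving data re-encrypts it for its new address.
        self._plain[a], self._plain[b] = self._plain[b], self._plain[a]


def guess_matches(mem: EncryptedMemory, victim: int, attacker: int, guess: bytes) -> bool:
    target = mem.read_ciphertext(victim)    # 1. observe the victim's ciphertext
    mem.write(attacker, guess)              # 2. write a guess into attacker memory
    mem.swap(victim, attacker)              # 3. swap the two pages
    observed = mem.read_ciphertext(victim)  # 4. observe the guess at the victim's address
    mem.swap(victim, attacker)              # put everything back
    return observed == target


VICTIM, ATTACKER = 0x1000, 0x2000  # illustrative addresses
mem = EncryptedMemory()
mem.write(VICTIM, b"PIN=4921;padding")  # the victim's secret block

right = guess_matches(mem, VICTIM, ATTACKER, b"PIN=4921;padding")
wrong = guess_matches(mem, VICTIM, ATTACKER, b"PIN=0000;padding")
print(right, wrong)
```

The attacker never learns the key and never decrypts anything. All they get is a yes-or-no answer to “is this what’s stored there?”, one full block at a time.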
Unfortunately, this is very slow. AES uses 16-byte blocks, so an adversary would need to guess up to 2^128 different plaintexts for every block. Luckily, the Heracles team has a clever workaround.
Specifically, the Heracles team tailors their attack to work in situations where the attacker only needs to guess a single byte. This narrows down the search space from 2^128 possibilities to 2^8, which is about 36 orders of magnitude smaller. And it turns out that there are a lot of situations where such a byte-by-byte attack makes sense. For example, when a user is inputting their password while running the sudo command, that password is written to memory one byte at a time, so Heracles can easily leak it.
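Here’s a rough sketch of why single-byte guessing is so much cheaper. It makes several assumptions that aren’t from the Heracles paper: the password lives in a zero-filled 16-byte buffer, the attacker captures one ciphertext snapshot after each keystroke, and the `ciphertext_at` function (a truncated HMAC) stands in for the swap-and-compare trick described earlier. The names and the example password are made up.

```python
import hashlib
import hmac
import os

KEY = os.urandom(16)  # hidden from the attacker in the real setting
ADDR = 0x1000         # illustrative address of the password buffer


def ciphertext_at(addr: int, block: bytes) -> bytes:
    # Stand-in for deterministic, address-tweaked encryption of one block.
    msg = addr.to_bytes(8, "little") + block
    return hmac.new(KEY, msg, hashlib.sha256).digest()[:16]


def recover(snapshots: list, addr: int) -> tuple:
    """snapshots[i] is the ciphertext seen after byte i+1 was written."""
    known = b""
    guesses = 0
    for snap in snapshots:
        for candidate in range(256):  # only 2^8 options per new byte
            guesses += 1
            trial = (known + bytes([candidate])).ljust(16, b"\x00")
            if ciphertext_at(addr, trial) == snap:
                known += bytes([candidate])
                break
        else:
            raise ValueError("no candidate byte matched this snapshot")
    return known, guesses


# Victim side: the buffer fills up one byte per keystroke.
password = b"hunter2"
snapshots = [
    ciphertext_at(ADDR, password[:i].ljust(16, b"\x00"))
    for i in range(1, len(password) + 1)
]

recovered, guesses = recover(snapshots, ADDR)
print(recovered, guesses)
print(guesses <= 256 * len(password))
```

Even in the worst case, recovering a full 16-byte block this way takes 16 × 256 = 4,096 guesses, compared to up to 2^128 guesses if the whole block had to be guessed at once.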
Using similar tricks to restrict an attack to single-byte guessing, the Heracles team can hijack a user’s session on a webserver, and they can even leak a user’s cryptographic keys. This makes Heracles a powerful and flexible attack, even with its performance limitations.
So what do we do now?
Luckily, there are mitigations to the Heracles attack. AMD has already added support for a policy that prevents the hypervisor from swapping guest pages, but this may hurt memory utilization and overall system performance. And as far as we know, the ETH Zurich researchers behind Heracles were the first to discover this attack, and they disclosed it responsibly, so AMD was able to deploy a fix before anybody could exploit Heracles in the wild.
More generally, though, modern TEEs have some fundamental problems. First, they’re proprietary, poorly documented, and often have their security features partially obfuscated by the companies that design them. Because the security community isn’t able to audit these TEEs, there could always be another vulnerability lurking around the corner, waiting to be discovered. That’s part of the reason I’m excited about open-source TEEs and their ability to restore trust in the world of secure hardware.
But also, building secure TEEs faces some fundamental challenges. Modern TEEs like AMD SEV and Intel SGX often act as a set of additional features crudely bolted onto the side of an existing, insecure processor. Building a truly secure processor might require a fundamentally different architecture. For example, many modern CPUs run deep out-of-order pipelines to increase performance, but these often open the door to side-channel attacks. A security-oriented high-performance processor might instead use a large number of simple parallel compute cores, without out-of-order or speculative execution, to prevent these sorts of attacks. But that sort of design wouldn’t be possible within the current paradigm of adding security instructions to conventional processor cores. I hope that one day, the demand for secure processing grows to the point where engineers can build hardware that’s actually secure from the ground up.