Before we jump into today’s post, I have a quick shout-out. If you’re reading this before January 21, 2025, and you’re in NYC, you should come to the event I’m putting on with Tower Research Capital about AI by & for Semiconductors! We’ll do a bit of discussion, a bit of Q&A, and then there’ll be a mixer. Stop by!
For high-performance, secure cloud computing, Trusted Execution Environments (TEEs) like Intel’s Software Guard Extensions (SGX) and AMD’s Secure Encrypted Virtualization (SEV) are the gold standard. TEEs power some of the most valuable, secure cloud services out there. They’re used in the cryptocurrency industry, like in Flashbots’ Rollup-Boost. They’re used for secure AI inference, like in Google’s A3 confidential AI VMs. And they’re used for all sorts of other secure applications in health care, DRM, and payments, too.
But TEEs aren’t nearly as secure as Intel and AMD like to claim that they are. Not only do they rely on security by obscurity, which makes them vulnerable to supply chain attacks, but new vulnerabilities in them are regularly discovered. Recently, a group of researchers at KU Leuven, the University of Lübeck, and the University of Birmingham discovered a new vulnerability that can compromise AMD’s SEV TEE technology for only $10. It’s called BadRAM, and today we’re going to walk through how it works, as well as discuss how attacks like BadRAM should change how you think about trusted hardware.
How do TEEs work?
The term “trusted execution environment” refers to a large number of different technologies. Some, like Apple’s Secure Enclave, feature a dedicated, isolated, secure processor. This is the most secure way to implement a TEE, but it offers limited performance, as the secondary processor is often small and underpowered. Instead, both Intel SGX and AMD SEV work by adding extra software and hardware isolation features to the main, high-performance processor, to provide extra performance for secure applications at the cost of some security guarantees.
SGX and SEV work by letting users define certain memory regions as enclaves. The memory in these enclaves is encrypted by the TEE hardware, and decrypted on-the-fly whenever the CPU needs to access it. These systems also include hardware isolation features to block processes from accessing each other’s memory regions. This means that not only is access to secret memory blocked by hardware, but even if an attacker manages to get around that, the memory they read is encrypted.
How does BadRAM work?
BadRAM is a physical attack that involves tampering with the DRAM modules, called DIMMs, that are installed in a server. However, as far as physical tampering attacks go, it’s fairly cheap and easy to carry out. All an attacker needs to do is replace one of the DIMMs in a server with a compromised DIMM, and then the server becomes vulnerable. An attacker with brief physical access to a cloud system, such as a rogue employee at a cloud provider, could leverage BadRAM to compromise it.
To compromise a DIMM, an attacker needs to modify the contents of the Serial Presence Detect (SPD) chip on the DIMM. With physical access to the DIMM, this is easy; the SPD chip isn’t designed to be tamper-proof. Specifically, an attacker modifies the SPD chip to lie to the CPU and report that the module has more memory than it actually does.
By properly modifying the SPD data, the attacker can cause memory aliasing inside of the DIMM. This means that two different addresses correspond to the same physical location in the DRAM chip. For example, let’s say that address 0x0000DADA is storing some secure data, protected by an AMD SEV TEE. If a BadRAM DIMM is installed in the system, an adversary could access this data by reading from or writing to the unsecured memory address 0x1000DADA.
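To make the aliasing idea concrete, here’s a minimal Python sketch of a DIMM that claims more address bits than its chips actually decode. The `AliasedDimm` class, the bit widths, and the stored values are all made up for illustration; real DRAM address mapping is more involved than masking off one high bit.

```python
class AliasedDimm:
    """Toy model of a DIMM whose tampered SPD over-reports its size."""

    def __init__(self, real_bits: int, claimed_bits: int) -> None:
        self.real_mask = (1 << real_bits) - 1   # bits the DRAM chips actually decode
        self.claimed_size = 1 << claimed_bits   # size the (tampered) SPD reports
        self.cells = {}                         # physical location -> stored value

    def _physical(self, addr: int) -> int:
        if not 0 <= addr < self.claimed_size:
            raise ValueError("address outside the size reported by the SPD")
        # Address bits above real_bits are silently ignored, which creates aliases.
        return addr & self.real_mask

    def write(self, addr: int, value: int) -> None:
        self.cells[self._physical(addr)] = value

    def read(self, addr: int) -> int:
        return self.cells.get(self._physical(addr), 0)


SECURE_ADDR = 0x0000DADA  # address the TEE protects
ALIAS_ADDR = 0x1000DADA   # unprotected address that differs only in bit 28

# Hypothetical sizes: the chips decode 28 address bits, the SPD claims 29.
dimm = AliasedDimm(real_bits=28, claimed_bits=29)

dimm.write(SECURE_ADDR, 0xAB)      # the protected guest stores a value
leaked = dimm.read(ALIAS_ADDR)     # the attacker reads it through the alias
dimm.write(ALIAS_ADDR, 0xCD)       # ...and can overwrite it the same way
tampered = dimm.read(SECURE_ADDR)  # the guest now sees the attacker's value
```

In this model, the access controls that guard `SECURE_ADDR` never come into play, because the attacker only ever touches `ALIAS_ADDR`.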
Now, as you may be saying, that memory is still encrypted, so what’s the issue? Well, encrypting and decrypting data on-the-fly, like a TEE does, requires extremely fast encryption performance. Because of that, AMD SEV uses static encryption, so the same plaintext written to the same address always maps to the same ciphertext. This means that a BadRAM attacker can capture a ciphertext and replay it later, even without knowing the plaintext behind it.
Also, through page table remapping, BadRAM can entirely decrypt regions of memory. The specifics of how that works are fairly complex, though, so I’ll direct interested readers to the BadRAM site and paper.
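Here’s a rough Python sketch of why static encryption makes replay possible. The cipher below is a toy stand-in (a hash-derived XOR keystream keyed on the address), not the cipher SEV actually uses, and the key, address, and “balance” record are all hypothetical. The only property it shares with the real thing is the one that matters here: the ciphertext depends only on the key, the address, and the plaintext.

```python
import hashlib

KEY = b"hypothetical-vm-key"  # stand-in for the per-VM memory encryption key
ADDR = 0x0000DADA             # protected address from the example above


def keystream(addr: int, length: int) -> bytes:
    # Toy construction: depends only on key and address, with no counter or
    # nonce, so encryption is deterministic. Supports up to 32 bytes.
    return hashlib.sha256(KEY + addr.to_bytes(8, "little")).digest()[:length]


def encrypt(addr: int, data: bytes) -> bytes:
    return bytes(d ^ k for d, k in zip(data, keystream(addr, len(data))))


decrypt = encrypt  # XOR with the same keystream undoes itself

dram = {}  # physical address -> ciphertext, i.e. what actually sits in the DIMM


def guest_write(addr: int, plaintext: bytes) -> None:
    dram[addr] = encrypt(addr, plaintext)


def guest_read(addr: int) -> bytes:
    return decrypt(addr, dram[addr])


guest_write(ADDR, b"balance=00000100")
snapshot = dram[ADDR]                   # attacker copies the ciphertext via an alias
guest_write(ADDR, b"balance=00000000")  # guest later updates the value
dram[ADDR] = snapshot                   # attacker writes the stale ciphertext back
replayed = guest_read(ADDR)             # guest decrypts the old balance again
```

The attacker never learns the key or decrypts anything; restoring old bytes is enough to roll the guest’s state back.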
Are TEEs still secure?
Before BadRAM was publicly announced, AMD released a firmware patch that checks memory for aliasing when the system boots up. So, for systems with up-to-date firmware, BadRAM no longer poses a threat. Also, for Intel SGX-based TEEs, the BadRAM attack didn’t even work in the first place. So if BadRAM is mitigated, does that mean TEEs are secure now?
Well, BadRAM isn’t the first attack on TEEs that fundamentally undermined their foundational security assumptions. Before BadRAM, there was MemBuster, another memory attack that was more expensive, but also even more effective. Meltdown and Spectre were classic attacks on speculative execution that could compromise the security of TEEs in both Intel and AMD processors. There are large, exhaustive lists of vulnerabilities for different TEE technologies, some of which remain unresolved.
So does this mean that TEEs are insecure? Well, it’s complicated.
The real problem is that we’ll never entirely know how secure or insecure TEEs really are, because AMD and Intel keep their implementations closed-source. The public doesn’t get to know exactly how these TEEs work at a micro-architectural level, which means that there could still be vulnerabilities hiding. TEEs are good at securing software against most straightforward attacks, but because of their closed-source nature, I think users can only put a limited amount of trust in TEEs. Hopefully, as new, open-source hardware starts to be developed, open-source TEEs, audited for these kinds of vulnerabilities by the open-source security community, can start to restore trust in TEEs.