Earlier this week, Israeli security company CTS posted a description of 13 security vulnerabilities in AMD’s Ryzen and EPYC chips on the AMDFlaws website. The vulnerabilities affect AMD's secure coprocessor, which handles security functions for the main chip such as storing cryptographic keys and validating the operating system to ensure it hasn’t been tampered with by malware. There was a “whitepaper” with some explosive claims, but no technical details to back them up. The vulnerabilities were grouped into four distinct classes--RyzenFall, MasterKey, Fallout, and Chimera--but the descriptions were in general terms.
Everyone--security researchers, end-users, enterprise IT teams--was stuck. No one knew what was going on, and there was no way to find out. There was nothing end-users could do about their existing hardware, or indeed, any way to know whether anything needed to be done. They could stop buying computers with AMD chips, but that did nothing for their current machines. Enterprise IT teams had no guidance on how to reduce the impact on their organizations in case of an exploit, or even what such an attack would look like. It was the perfect setup for confusion and panic.
Publicize, but don’t tell
The backlash was inevitable. CTS was criticized for its glossy brochure-website, slick video, and lack of technical details. CTS also didn’t follow common practices for vulnerability disclosure, which generally give vendors ample time to fix an issue instead of the 24 hours AMD reportedly got. The company’s CTO responded to the backlash in a letter, with some bemusement at the level of anger the announcement had stirred up, arguing the current way the industry handled vulnerabilities gave control to the vendor.
“[It is] extremely rare that the vendor will come out ahead of time notifying the customers – 'We have problems that put you at risk, we’re working on it.’ Almost always it’s post-factum – 'We had problems, here’s the patch – no need to worry,’” Ilia Luk-Zilberman, CTO of CTS, wrote in the letter.
Instead of vendors working together to fix the problems in a timely fashion and then releasing the patch--or releasing the technical details if the vendor wasn’t fixing the issue--Luk-Zilberman said it would be better to create public pressure to force a fix.
"[A] better way, would be to notify the public on day 0 that there are vulnerabilities and what is the impact. To notify the public and the vendor together. And not to disclose the actual technical details ever unless it’s already fixed. To put the full public pressure on the vendor from the get go, but to never put customers at risk,” he wrote.
The problem with this approach, which Luk-Zilberman didn’t address in his letter, is that there is no way for anyone to confirm the validity of the research or to verify that the issues are critical enough to put this kind of pressure on the vendor.
CTS assumed AMD was going to act in bad faith, even though the chipmaker had done nothing to suggest it would ignore the report. CTS may believe the mistakes leading to the vulnerabilities should never have happened (“the Ryzen and Ryzen Pro chipsets, currently shipping with exploitable backdoors, could not have passed even the most rudimentary white-box security review,” the CTO wrote), but there was no reason to think AMD would not have fixed the issues. It might have taken a long time--and that’s a legitimate concern--but Luk-Zilberman never showed that AMD couldn’t have fixed the issues if it had been given the time.
Bugs still need fixing
After the initial disclosure--the marketing blitz, if you will--and the backlash, several security experts defended the technical findings on social media. “I have seen the technical details and there are legit design & implementation issues worth discussing as part of a coordinated disclosure effort,” Alex Ionescu, a well-respected security researcher familiar with chip design, said on Twitter. Focusing on the way the issues were disclosed is “sadly distracting from a real conversation around security boundaries,” Ionescu wrote.
Fair enough, but there is no way to have a “real conversation” if one side has all the information and is just saying “Trust us, we know what we are talking about.”
Dan Guido, CEO of New York City-based security consultancy Trail of Bits, who reviewed the technical paper before the site was publicized, posted a technical explanation on the Trail of Bits blog without giving away the details of the vulnerabilities. Simply put, CTS claimed attackers could manipulate the secure coprocessor and access its secure data. MasterKey bypasses the signature checks performed by the AMD Platform Security Processor (PSP) and would let attackers update the BIOS with malicious firmware. Fallout and RyzenFall exploit the PSP’s APIs to gain code execution. The Chimera bug abuses the exposed interfaces of the Promontory chipset to gain code execution.
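To see why a signature-check bypass like MasterKey matters, here is a minimal, purely illustrative sketch of firmware validation before an update is applied. This is not AMD’s actual design--real secure-boot chains use asymmetric signatures (e.g., RSA against a public key burned into the chip), not a shared HMAC secret--but the gatekeeping logic is the same: if the check can be bypassed, malicious firmware gets flashed.

```python
import hashlib
import hmac

# Illustrative stand-in for the vendor's signing key. In a real secure-boot
# chain, signing uses a private key the vendor keeps offline, and the chip
# verifies with a corresponding public key.
VENDOR_KEY = b"vendor-signing-key"

def sign_firmware(image: bytes) -> bytes:
    """Produce a signature over a firmware image (vendor side)."""
    return hmac.new(VENDOR_KEY, image, hashlib.sha256).digest()

def verify_and_flash(image: bytes, signature: bytes) -> bool:
    """Verify the signature before accepting a firmware update (device side)."""
    expected = hmac.new(VENDOR_KEY, image, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        return False  # reject tampered or re-signed firmware
    # flash(image)  # only now would the update actually be written
    return True

official = b"official firmware image"
sig = sign_firmware(official)
print(verify_and_flash(official, sig))                  # accepted
print(verify_and_flash(b"malicious firmware", sig))     # rejected
```

A vulnerability of the MasterKey type is equivalent to making the `verify_and_flash` gate pass regardless of the signature, which is why the claimed impact was so severe.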
Guido’s summary was necessary because it helped explain why vulnerabilities in AMD’s security coprocessor could wind up compromising, or bypassing, security features such as Trusted Platform Module (TPM) or Secure Encrypted Virtualization. Until Guido published his post, there was no way to verify the claim that the issues should be considered “critical.”
While Guido vouched for CTS on the validity of the vulnerabilities, his summary undermined Luk-Zilberman’s core point that the vulnerabilities were bad enough to justify pseudo-public disclosure. “There is no immediate risk of exploitation of these vulnerabilities for most users. Even if the full details were published today, attackers would need to invest significant development efforts to build attack tools that utilize these vulnerabilities,” Guido wrote.
Luk-Zilberman said his approach was necessary in order to ensure AMD acted promptly. “I honestly think it’s hard to believe we’re the only group in the world who has these vulnerabilities, considering who are the actors in the world today, and us being a small group of 6 researchers,” he wrote.
Contrast that with what Guido wrote, “This level of effort is beyond the reach of most attackers.”
Know where to look
One problem with saying there are issues but not discussing them is that it encourages others to go digging to find the issues themselves. Luk-Zilberman certainly made it sound as if the issues would be easy to find when he wrote, “[About] once a week we found a new vulnerability, not in one specific section, but across different sections and regions of the chips. It’s just filled with so many vulnerabilities that you just have to point, research, and you’ll find something.”
If Guido is right, the vulnerabilities may not be easy for someone else to discover. Luk-Zilberman seems to think anyone would stumble over them. If the truth is somewhere in the middle, then independent discovery is inevitable, and the technical details will eventually come out. At that point, if AMD hasn’t finished its patching process, users will be at risk of attack. In this case, the complexity and difficulty of the flaws (per Guido) may be the only thing keeping someone else from finding them independently.
Attackers can also try to figure out where the issues are by reverse engineering. Just because you aren’t saying what the issue is doesn’t mean someone else can’t find it.
The fact is, with so many people focused on research and hunting for vulnerabilities, no one disclosure model is going to prevail. There will always be reasons why the standard model doesn't apply. That's why trust is so important.
Fear is lucrative
Based on the language in the whitepaper, many in the industry accused CTS of trying to short AMD stock. Short-sellers take a different approach to the stock market than most investors: while most want the share price to go up so they can make money when they sell the stock, short-sellers bet against the stock and profit when the share price falls. It’s a financial bet that pays off only if the company has bad news.
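The arithmetic of a short sale is simple. With hypothetical numbers (none drawn from this case), the short-seller borrows shares, sells them, and buys them back later at a lower price:

```python
# Hypothetical figures, purely illustrative -- not AMD's actual share price.
shares = 1000
sell_price = 12.00   # price at which the borrowed shares are sold
cover_price = 9.00   # price paid to buy them back after bad news drops

# Profit is the price drop times the number of shares (ignoring fees
# and the cost of borrowing the shares).
profit = (sell_price - cover_price) * shares
print(profit)  # 3000.0
```

If the price rises instead of falling, `sell_price - cover_price` goes negative and the short-seller loses money, which is why bad news about the company is exactly what the bet needs.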
Back in 2016, St. Jude accused hedge fund Muddy Waters of partnering with security company MedSec to manipulate St. Jude stock by publicly disclosing vulnerabilities in St. Jude pacemakers. MedSec’s chief told Bloomberg Television at the time that MedSec didn’t go to St. Jude with its findings because they were “worried that they would sweep this under the rug.”
The focus now, as then, is on the vendor when it should be on the end user. The goal shouldn’t be what to do for AMD, but rather, what helps the end user. The current situation doesn’t help the end user at all, because they are left knowing something is wrong without being able to do anything about it.
The whole situation feels similar to what is happening within the Android ecosystem. It’s great that Google is fixing the issues in Android, but because the fixes never make it to many user handsets (because of how US mobile carriers handle software updates), users tune out any discussion of Android vulnerabilities.
If the response to every report of a critical vulnerability is “there is nothing I can do,” then users will feel justified in not caring and ignoring security. That is exactly the opposite of the reaction we need.