A few days ago, Let’s Encrypt issued its one billionth security certificate, and now the free certificate authority is preparing to revoke many of them because of a bug in the server software the project uses.
The bug lies in the way that Let’s Encrypt’s server-side software performs the routine check of domain ownership before issuing a certificate. One of the ways that attackers have tried to take over domains or engender trust in malicious domains is by obtaining valid certificates for those domains. With a certificate in hand, an attacker can make his malicious site appear legitimate and a victim’s browser will trust that site. There are a number of protections in place at various levels of the CA and domain name infrastructure to help prevent this from happening, one of which is domain validation, a relatively simple process that enables the CA to verify that the party requesting a certificate owns the domain in question.
That validation usually involves the domain owner placing a specific file in a specific place on a server and the CA then retrieving it. But that only proves that the requesting party controls the target domain, and not that the controlling party is the legitimate owner. At the same time that it performs the domain validation, Let’s Encrypt also checks the certificate authority authorization (CAA) record for the domain. The CAA record is a separate record, published in the DNS, that defines which CA or CAs are allowed to issue certificates for the domain. The bug that Let’s Encrypt discovered in its Boulder server software is tied to the fact that Let’s Encrypt considers domain validations to be good for 30 days.
“That means in some cases we need to check CAA records a second time, just before issuance. Specifically, we have to check CAA within 8 hours prior to issuance, so any domain name that was validated more than 8 hours ago requires rechecking,” Jacob Hoffman-Andrews, a senior staff technologist at the Electronic Frontier Foundation, a founding partner of Let’s Encrypt, said in a post explaining the bug. Let’s Encrypt itself is operated by the nonprofit Internet Security Research Group.
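The timing rule in that statement can be sketched in a few lines of Python. This is only an illustration built from the two figures quoted in this article (30 days and 8 hours); the function and constant names are hypothetical and are not taken from Boulder.

```python
from datetime import datetime, timedelta

# Figures quoted in the article; the names are hypothetical, not Boulder's.
VALIDATION_LIFETIME = timedelta(days=30)  # how long a domain validation is reused
CAA_RECHECK_WINDOW = timedelta(hours=8)   # CAA must be checked this close to issuance


def validation_is_usable(validated_at: datetime, now: datetime) -> bool:
    """True while an earlier domain validation can still be reused."""
    return now - validated_at <= VALIDATION_LIFETIME


def needs_caa_recheck(validated_at: datetime, now: datetime) -> bool:
    """True when the CAA check done at validation time is too old to rely on."""
    return now - validated_at > CAA_RECHECK_WINDOW


validated = datetime(2020, 2, 1, 12, 0)
issuance = validated + timedelta(days=3)
print(validation_is_usable(validated, issuance))  # validation still within 30 days
print(needs_caa_recheck(validated, issuance))     # CAA check is older than 8 hours
```

In this example the validation is three days old, so it can still be reused, but the CAA records have to be looked up again before the certificate is issued.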
“When a certificate request contained N domain names that needed CAA rechecking, Boulder would pick one domain name and check it N times. What this means in practice is that if a subscriber validated a domain name at time X, and the CAA records for that domain at time X allowed Let’s Encrypt issuance, that subscriber would be able to issue a certificate containing that domain name until X+30 days, even if someone later installed CAA records on that domain name that prohibit issuance by Let’s Encrypt.”
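A toy model may make the effect of that behavior clearer. The sketch below is not Boulder's code and does not reproduce the actual defect. It only mimics the outcome described in the post, where one name gets checked N times. The function names, the choice of the first name as the one that is picked, and the sample CAA data are all assumptions made for illustration.

```python
def recheck_buggy(names, caa_allows):
    """Mimic the described behavior: one name is checked once per requested name."""
    if not names:
        return True
    picked = names[0]  # assumption: the post does not say which name was picked
    return all(caa_allows(picked) for _ in names)


def recheck_intended(names, caa_allows):
    """Check every name in the request against its own current CAA records."""
    return all(caa_allows(name) for name in names)


# Hypothetical state at issuance time: b.example has since added CAA records
# that prohibit issuance by this CA.
current_caa = {"a.example": True, "b.example": False}


def allows(name):
    return current_caa[name]


request = ["a.example", "b.example"]
print(recheck_buggy(request, allows))     # True: b.example is never looked at
print(recheck_intended(request, allows))  # False: b.example now forbids issuance
```

The buggy version approves the request because the only name it ever examines still permits issuance, which is how a certificate could go out for a domain whose CAA records had changed.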
The kind of transparency and clear communication Let's Encrypt displayed in this incident is fairly unusual in the CA world, which tends to be pretty opaque about such things. After discovering the bug on Feb. 29, Let’s Encrypt engineers disabled certificate issuance for more than two hours while they fixed it. The bug apparently was introduced into the Boulder software on July 25, 2019, and Let’s Encrypt is still in the process of investigating what happened and what the full effects are.
But one of the immediate effects is that the CA will revoke more than three million certificates tomorrow. The owners of the domains covered by those certificates can renew them before the revocations start, and Let’s Encrypt has been reaching out to the affected owners ahead of time to let them know how to proceed. Three million certificates is a significant number, but in an FAQ on the bug, Let’s Encrypt said that about one million of them are duplicates.
“Because of the way this bug operated, the most commonly affected certificates were those that are reissued very frequently, which is why so many affected certificates are duplicates,” the FAQ says.