Data belonging to an estimated 150,000 to 200,000 patients were exposed in at least nine GitHub repositories, the result of improper access controls and hardcoded credentials in source code, according to a report from DataBreaches.net.
Jelle Ursem, a security researcher from the Netherlands, worked with DataBreaches.net after finding credentials to databases and services containing healthcare data in public GitHub repositories. The impacted organizations included medical clinics and hospitals, billing services companies, and other service providers.
Ursem searched GitHub and found repositories with hardcoded credentials for systems such as databases, Microsoft Office 365, and SFTP (Secure File Transfer Protocol) servers. Ursem was able to use those credentials to log directly in to the systems and view the patient data.
“Once logged in to a Microsoft Office365 or Google G Suite environment, Ursem is often able to see everything an employee sees: contracts, user data, internal agendas, internal documents, emails, address books, team chats, and more,” the report said.
The title of the report is an apt summary of what Ursem did: No Need to Hack When It's Leaking. Ursem simply used valid credentials to log in to the services. A handful of common mistakes gave Ursem access to the medical data. Developers had embedded hardcoded login credentials in code rather than keeping them in a separate configuration file on the server. For email accounts and other online services, two-factor authentication was not enabled. At least one case involved an abandoned repository: the organization no longer needed the data but had kept the repository around instead of deleting it.
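The configuration mistake described above, credentials embedded directly in source, is typically avoided by reading secrets from the environment (or a server-side config file excluded from version control) at runtime. A minimal sketch in Python; the variable name and error handling here are illustrative, not from the report:

```python
import os

# Anti-pattern described in the report: a hardcoded secret that ships
# with every clone of the repository and lives forever in git history.
# DB_PASSWORD = "hunter2"  # never do this

def get_db_password() -> str:
    """Read the database password from the environment at runtime.

    The value lives only in the server's environment (or in a .env file
    listed in .gitignore), so it never appears in the repository.
    """
    password = os.environ.get("DB_PASSWORD")
    if password is None:
        raise RuntimeError("DB_PASSWORD is not set; refusing to start")
    return password
```

Failing fast when the variable is missing is deliberate: a service that silently falls back to a default credential is another of the weaknesses this class of leak exploits.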
One database from a major regional clinic contained 1.3 million records. That database was exposed because Ursem was able to find the URL to the admin console of the electronic health record system being used.
For a software and services consulting company, a developer had included system credentials in code committed to a public repository. Ursem was able to eventually gain access to the vendor’s billing back offices, including data associated with nearly 7,000 patients and over 11,000 health insurance claims. It is unclear whether this data leak was ever reported to the Department of Health and Human Services.
“[Hackers] can find a large number of records in just a few hours of work, and this data can be used to make money in a variety of ways,” the report said.
Developers need to be reminded, and trained, not to embed credentials or access tokens in code that gets pushed to public repositories. GitHub and other code-sharing platforms have introduced built-in scans and checks to help detect when credentials are being committed, but it remains an ongoing problem. Regular security audits of all code should catch the instances where mistakes slip through. Organizations also don't have to default to public repositories, where anyone can view the code, if there is no business need for publicly accessible code.
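A code audit for committed secrets can be as simple as pattern matching over the repository's files. The sketch below shows the idea; the patterns are illustrative examples only, while production scanners such as GitHub secret scanning or gitleaks ship hundreds of provider-specific rules:

```python
import re
from pathlib import Path

# Illustrative patterns only -- real scanners use far more, and more
# precise, provider-specific rules.
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "generic_password": re.compile(
        r"""password\s*=\s*['"][^'"]{4,}['"]""", re.IGNORECASE
    ),
    "private_key": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def scan_file(path: Path) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) pairs for likely secrets in a file."""
    hits = []
    lines = path.read_text(errors="ignore").splitlines()
    for lineno, line in enumerate(lines, start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits
```

Running a scan like this in a pre-commit hook or CI pipeline catches the mistake before the commit reaches a public repository, which is far cheaper than rotating credentials after a leak.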
The report has more details about the kinds of mistakes the developers made, as well as the gaps in security audits. It is a starting point for healthcare organizations to understand what processes they need to change and the types of mistakes to look for.
According to the report, Ursem struggled to notify the nine impacted organizations because there was no way to contact them or because they didn't respond. Organizations need to make sure there is a clear reporting path so that they can find out when code is improperly exposed. That could mean posting a public email address that is monitored regularly, or giving customer support teams a clear escalation path for when these reports come in. Organizations also need to make sure their partners and contractors know how to handle these notification attempts.
“[At] least three of the nine entities intentionally did not respond to early notification attempts and would later claim that they had been fearful the notifications were a social engineering attack. Their failure to respond left PHI exposed even longer,” the report said.
Data leaks caused by data being stored online with insufficient access controls are increasingly common. In some cases, the organization did not configure its cloud servers properly or did not realize access controls were missing. In many cases, the leaks were the result of organizations not realizing how their third-party suppliers and contractors handled their data. Organizations should be asking their suppliers and contractors for audits to make sure those partners are also properly locking down how they use code repositories.
Administrators should routinely search GitHub for their firm’s name and domain names to see what comes up.
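That routine search can be partially automated against GitHub's public REST search API. A minimal sketch; note that unauthenticated repository search is rate-limited and much coarser than the authenticated code search an organization would ideally use:

```python
import json
import urllib.parse
import urllib.request

def build_search_url(term: str) -> str:
    """Build a GitHub repository-search URL for a firm name or domain."""
    query = urllib.parse.quote(term)
    return f"https://api.github.com/search/repositories?q={query}"

def search_github(term: str) -> list[str]:
    """Return the full names of public repositories matching the term."""
    with urllib.request.urlopen(build_search_url(term)) as resp:
        data = json.load(resp)
    return [item["full_name"] for item in data.get("items", [])]
```

Scheduling a search like this for the organization's name and domains, and reviewing anything unexpected that appears, is a cheap early-warning check for exactly the kind of exposure the report describes.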
“[Even] if you do not use a developer, one of your business associates or vendors might,” the report said.