Detecting malware in package manager repositories
Author: Tomislav Pericin, Chief Software Architect & Co-Founder at ReversingLabs
Software developers are increasingly being targeted by supply chain attacks. The popularity of package managers and their integration into development pipelines have made this ecosystem an attractive target for various threat actors. A testament to how successful these actors have become is the frequency with which such stories appear in the media. Recently it feels like new supply chain attacks are being discovered and talked about on a daily basis.
Developer account compromise and build environment compromise are the two main ways attackers subvert the development infrastructure. Once in, they have an almost infinite number of ways to plant malware and introduce backdoors into the software package.
Backdoors are particularly tricky to detect because of this. A single line of code that changes the logic of the program can be a backdoor that allows unauthorized access to the system. Similarly, a single line of code is all it takes for a script to reach out to a remote server and download instructions to execute on the infected machine. As a result, such supply chain attacks are usually detected post-infection, by the developers themselves, once they realize that something odd is going on in the system.
Malware is what usually follows a successful backdoor installation. The second-stage payload can be anything, from credential-stealing malware to ransomware or crypto-mining software. In today’s world, most attackers are financially motivated and will use their position for any sort of gain.
The complexity of finding these supply chain attacks cannot be overstated. A backdoor that leads to an infection can be a single line of code, one that’s hidden away, surrounded by hundreds of others written by the original software authors. And that backdoored package is just one of the million others hosted in the package repository. That makes checking all this code a daunting task, to say the least.
We processed the entire NPM repository with our Titanium Platform static analysis solution running on a single server with one 32-core AMD EPYC processor, 256 GB of RAM, and two 3.5 TB NVMe SSDs. The analysis took slightly under 6 days and 16 hours to complete, during which we collected and indexed around 1.6 TB of metadata. This metadata, generated by our static analysis engine, is what allows us to hunt for malicious files and answer complex questions, such as which packages contain portable executable files that are similar to previously known malicious code. And, as the title of this blog implies, we did find a few of those.
Fig: Breakdown of filetypes and subtypes across the NPM repository
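To give a feel for the simplest form of file type identification, here is a minimal Python sketch that walks an unpacked package directory and flags files that look like Windows PE executables. It is not how the Titanium Platform works internally; it only checks the DOS "MZ" magic and the "PE\0\0" signature that the DOS header points to. The function names and the directory layout are assumptions made for illustration.

```python
import struct
from pathlib import Path


def is_pe_file(path: Path) -> bool:
    """Return True if the file has a DOS header that points to a PE signature."""
    with path.open("rb") as fh:
        dos_header = fh.read(0x40)
        # A DOS header is 64 bytes long and starts with the "MZ" magic.
        if len(dos_header) < 0x40 or dos_header[:2] != b"MZ":
            return False
        # e_lfanew (offset 0x3C) is the little-endian file offset of the PE header.
        (pe_offset,) = struct.unpack_from("<I", dos_header, 0x3C)
        fh.seek(pe_offset)
        return fh.read(4) == b"PE\0\0"


def find_pe_files(package_root: Path) -> list[Path]:
    """List every file under package_root that passes the PE signature check."""
    return [
        path
        for path in sorted(package_root.rglob("*"))
        if path.is_file() and is_pe_file(path)
    ]
```

Calling find_pe_files(Path("unpacked/some-package")) on a hypothetical unpacked tarball returns the paths worth a closer look. A signature check like this says nothing about whether a file is malicious; it only narrows down where to dig.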
Intrigued by the presence of executable code for Windows (PE files) on the graph, we decided to dig in and take a closer look at the packages containing this file type. Among the plethora of Mono .NET applications and common third-party libraries, one stood out: a pretty well-known potentially unwanted application, a password recovery tool. We call these kinds of applications potentially unwanted because their use, or misuse, has to be put in context. A password recovery tool used to refresh your memory when you forget a website credential is OK. The same tool found inside a package in the NPM repository is probably not.
The password recovery tool in question is WebBrowserPassView. It is used to recover website login information stored by Internet Explorer, Mozilla Firefox, Google Chrome, Safari, and Opera browsers.
Its interface and usage are pretty straightforward, but that’s not the reason why threat actors use it. The appeal of this tool is in the fact that it can be used programmatically from the command line, without even showing the graphical interface, to save the recovered credentials list to a file. That makes it perfect for threat actors. With all the complexity of password recovery taken away, they can focus on what really matters: easy credentials.
We found this password recovery tool in a package called bb-builder, and its story is certainly an interesting one. It was uploaded to the NPM repository by a pretty active open source developer, someone with hundreds of projects on their personal GitHub page. But this bb-builder project isn’t one of them. Furthermore, that is the most recently published and updated project on their NPM page, telling us that the account probably isn’t actively used anymore.
Inspection of the code within the different versions of bb-builder paints a picture of how this code was developed. And with the help of the metadata NPM registry tracks about these packages, the timeline of this supply chain attack can be perfectly reconstructed.
bb-builder project is created and its initial version 1.0.0 is uploaded to the NPM repository. The package is empty and contains a very basic placeholder manifest. This is likely done to test if the compromised developer credentials are still valid.
bb-builder project is updated to version 1.0.2. A dependency on the Axios package, an HTTP client that can post web requests, is added. This dependency is used to submit the password recovery output to a web server hosted at http://1[.]host[.]jwte[.]ch:1337/pwn
bb-builder project is updated to version 1.0.3. The attacker realizes that storing the password recovery output in the current working directory probably isn’t the greatest idea and decides to change that to a more discreet location.
bb-builder project is updated to version 1.0.4. The attacker realizes that the server that’s set up to receive the passwords isn’t getting any traffic from his test machine. There’s a bug in the code that incorrectly tries to locate the password recovery output. It is fixed.
bb-builder project is updated to version 1.0.5. The attacker refactors the code a little bit to make it easier to maintain down the road.
bb-builder project is updated to version 1.0.6. The attacker decides that leaving the evidence in the form of the password recovery output also isn’t a great idea and decides to add code that deletes it after submission to the previously mentioned web server. This is the final version of bb-builder project.
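A timeline like the one above can be rebuilt from registry metadata. The NPM registry document for a package carries a "time" object that maps each published version to an ISO 8601 timestamp, alongside "created" and "modified" entries. The Python sketch below sorts those entries into publication order. The function name and the sample document are hypothetical, and the timestamps are placeholders rather than the actual bb-builder publish dates.

```python
from datetime import datetime


def publish_timeline(packument: dict) -> list[tuple[datetime, str]]:
    """Return (publish time, version) pairs sorted from oldest to newest."""
    events = []
    for key, stamp in packument.get("time", {}).items():
        # "created" and "modified" describe the package as a whole, not a version.
        if key in ("created", "modified"):
            continue
        # Registry timestamps end in "Z"; convert it to an explicit UTC offset.
        published = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
        events.append((published, key))
    return sorted(events)


# Hypothetical registry document with placeholder timestamps.
sample = {
    "name": "example-package",
    "time": {
        "created": "2020-01-01T00:00:00.000Z",
        "modified": "2020-01-03T00:00:00.000Z",
        "1.0.1": "2020-01-02T08:30:00.000Z",
        "1.0.0": "2020-01-01T00:00:00.000Z",
    },
}

for published, version in publish_timeline(sample):
    print(published.isoformat(), version)
```

Several versions published within a short window, each with a small functional change, is the kind of pattern that makes an attacker's trial-and-error development visible.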
Download statistics for this NPM package give a glimpse into its possible prevalence in the field.
Fig: Download statistics for NPM project bb-builder
This package isn’t included in any other as a dependency, so its installation prevalence can only be attributed to mistaken installation by developers who were looking for a different package with a similar name. The NPM repository hosts a few dozen packages that start with the term bb. It is very likely that those who installed bb-builder by mistake were looking for the package bb-build instead.
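One rough way to spot this kind of name confusion is to compare a package name against the names of known packages and flag the close matches. The Python sketch below uses plain Levenshtein edit distance for that. The function names, the distance threshold of 2, and the list of known names are assumptions for illustration, not a description of how the NPM registry or our platform screens package names.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if char_a == char_b else 1)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def lookalikes(name: str, known_names: list[str], max_distance: int = 2) -> list[str]:
    """Return known names within max_distance edits of name, excluding name itself."""
    return [
        known
        for known in known_names
        if known != name and edit_distance(name, known) <= max_distance
    ]


# Illustrative list of known package names; only bb-build comes from the text above.
known = ["bb-build", "express", "lodash"]
print(lookalikes("bb-builder", known))
```

Here bb-builder is two insertions away from bb-build, so it is flagged. A real check would need a smarter threshold, since short names produce many harmless near matches.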
The domain that collected the stolen credentials is gone at this point, and the damage bb-builder package can cause without it is practically zero. But its existence is a cautionary tale of how easy it is for these kinds of attacks to hide in plain sight, and how easy it is to remain undetected for a very long time.
Titanium platform is the ultimate static analysis solution capable of taking on the challenge that is large dataset analysis. But its power isn’t only in the malware detection rules we’ve put into it ourselves, it is also in the flexibility that it gives its users to extend it by adding their own. This allows for rapid threat hunting iterations. Regardless of the dataset being processed, be it a public one like NPM or a private one that you own, hunting for threats has never been this fast.
Preventing software supply chain attacks that target your developers is only possible with a platform that can inspect every single package they use.
Affected packages and SHA256:
08/20/2019 - Contacted NPM security team
08/21/2019 - NPM security team removes the packages (advisory)
To learn more about how static analysis can find malicious code within a package manager repository, listen to our podcast hosted by Cyberwire: Detecting Malware in Package Manager Repositories