XZ Utils is a free and open source software suite that provides highly efficient file compression and decompression using the LZMA2 compression algorithm. LZMA2 typically achieves better compression than the older gzip and bzip2 algorithms, so many open source projects include XZ Utils components to work with smaller compressed files and reduce bandwidth and storage costs.
On March 29th, 2024, Microsoft engineer Andres Freund publicly described an attack in which malicious code was added to releases of XZ Utils to allow backdoor access to SSH servers. Investigation into the source of the malicious code showed that a trusted maintainer of the XZ Utils project had deliberately introduced the backdoor.
The XZ Utils backdoor attack on SSH was shockingly sophisticated. It exploited OpenSSH's indirect use of liblzma from XZ Utils, and the attack code was committed to XZ Utils right in the open, disguised as innocuous changes.
In his email to the Openwall mailing list, Andres Freund describes the roundabout circumstances that led the SSH daemon to load the liblzma library from XZ Utils: "openssh does not directly use liblzma. However debian and several other distributions patch openssh to support systemd notification, and libsystemd does depend on lzma." As a core piece of security infrastructure, OpenSSH has its design and source code carefully scrutinized to minimize attack surface. Threat modeling based on the OpenSSH source code on GitHub would find no relationship to liblzma. Even in Debian’s patched version of OpenSSH, sshd did not reference liblzma in its source code or link to it directly; the dependency arrived transitively through libsystemd. A binary SBOM analysis, however, would detect the linkage from sshd to liblzma.
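The transitive linkage described above can be checked on a Linux host by inspecting the shared libraries a binary resolves at load time. The following is a minimal sketch, assuming the standard `ldd` tool is available; the helper names (`parse_ldd_output`, `find_library`, `ldd`) and the sample output are hypothetical illustrations, not captured from a real system, and this is far simpler than a full binary SBOM tool.

```python
import re
import subprocess

# Matches ldd lines such as "libfoo.so.1 => /path/libfoo.so.1 (0x...)".
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)")


def parse_ldd_output(text):
    """Return {soname: resolved_path} for each 'name => path' line of ldd output."""
    libs = {}
    for line in text.splitlines():
        match = _LDD_LINE.match(line)
        if match:
            libs[match.group(1)] = match.group(2)
    return libs


def find_library(text, prefix):
    """Return the sonames in the ldd output that start with the given prefix."""
    return sorted(name for name in parse_ldd_output(text) if name.startswith(prefix))


def ldd(binary_path):
    """Run ldd on a binary (Linux only) and return its raw output."""
    result = subprocess.run(
        ["ldd", binary_path], capture_output=True, text=True, check=True
    )
    return result.stdout


# Illustrative sample only (assumption): the shape of ldd output for a
# binary that pulls in liblzma through libsystemd.
SAMPLE_LDD = """\
\tlinux-vdso.so.1 (0x00007ffd...)
\tlibsystemd.so.0 => /lib/x86_64-linux-gnu/libsystemd.so.0 (0x00007f...)
\tliblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f...)
\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f...)
"""
```

On a real host, something like `find_library(ldd("/usr/sbin/sshd"), "liblzma")` would report whether the installed sshd resolves liblzma, with the path to sshd being an assumption about the local system. Because `ldd` lists transitively loaded libraries, an indirect dependency shows up here even though it never appears in the application's own source code.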
One of the most surprising aspects of the XZ Utils backdoor attack is that the malicious code was added to the open source XZ Utils project via seemingly innocuous git commits. Rather than modifying the source code of XZ Utils, the attacker hid the malware payload as x86_64 object code embedded within binary test files, ostensibly committed to exercise edge cases in XZ decompression behavior. This object code reached the built library only through the release tarballs: they shipped Autotools scripts, absent from the git repository, with hidden commands that decoded the malicious object code from the test files and injected it into liblzma at build time.
As the head of engineering at Corsha, I find that this attack vector hits close to home: I review dozens of Git pull requests each week, and the method by which this malicious code was committed is tricky to catch. The tell-tale sign was that no test actually exercised the new test data. In code review, coming across a binary file (.zip, .gpg, etc.) will always slow down the review process while the opaque contents are checked for correctness and necessity.
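The tell-tale sign mentioned above, binary test data that nothing exercises, can be turned into a rough review aid. The following is a minimal sketch under stated assumptions: the function names are hypothetical, "binary" is approximated by a NUL byte in the first few kilobytes of a file, and "exercised" is approximated by the file name appearing somewhere in a text file of the same tree.

```python
import os


def is_binary(path, sniff_bytes=8192):
    """Heuristic: treat a file as binary if its first bytes contain a NUL."""
    with open(path, "rb") as handle:
        return b"\0" in handle.read(sniff_bytes)


def unreferenced_binaries(root):
    """Return relative paths of binary files whose names appear in no text file."""
    binaries, text_blobs = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip version-control metadata so packed objects are not reported.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if is_binary(path):
                binaries.append(path)
            else:
                with open(path, "rb") as handle:
                    text_blobs.append(handle.read())
    return sorted(
        os.path.relpath(path, root)
        for path in binaries
        if not any(os.path.basename(path).encode() in blob for blob in text_blobs)
    )
```

This heuristic is deliberately crude. Test runners that load data by glob pattern or by directory never name individual files, so every fixture in such a project would be flagged, and the result is only a prompt for a human reviewer to ask why a given binary file is in the repository.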
In the end, the XZ Utils backdoor was fortunately found before it reached the production releases of any Linux distribution, the latest example of Linus’s Law that “given enough eyeballs, all bugs are shallow”. Much like 2014’s Heartbleed vulnerability in OpenSSL, the XZ Utils backdoor was able to go undetected because the project was understaffed, having dwindled from half a dozen contributors to just one. With only a single maintainer to win over, the attacker gained that maintainer’s trust and, after two years, was elevated to co-owner of the XZ Utils project.
by David Mazary
About David Mazary
David serves as Corsha's Head of Engineering. With a decade of experience crafting mission-critical systems for defense customers, David is a seasoned professional in the field and a frequent contributor to the Corsha blog and other online forums.