It's pretty interesting that they didn't just introduce an RCE anyone can exploit; it requires the attacker's private key. It's ironically a very security-conscious vulnerability.
I suspect the original rationale is about preserving the longevity of the backdoor. If you blow a hole wide open that anyone can enter, it’s going to be found and shut down quickly.
If this hadn’t had the performance impact that brought it quickly to the surface, it’s possible that this would have lived quietly for a long time exactly because it’s not widely exploitable.
I agree that this is probably about persistence. Initially I thought the developer was playing the long-con to dump some crypto exchange and make off with literally a billion dollars or more.
But if that was the case they wouldn't bother with the key. It'd be a one-and-done situation. It would be a stop-the-world event.
If you think of it as a state-sponsored attack, it makes a lot of sense to have a "secure" vulnerability in a system that your own citizens might use.
It looks like the whole contribution to xz was an effort just to inject that backdoor. For example, the author created the whole test framework in which he could hide the malicious payload.
Before he started work on xz, he made a contribution to libarchive in BSD which introduced a vulnerability.
For real, it's almost like a state-sponsored exploit. It's crafted and executed incredibly well; it feels like pure luck that the performance issue got it found.
I like the theory that actually, it wasn’t luck but was picked up on by detection tools of a large entity (Google / Microsoft / NSA / whatever), and they’re just presenting the story like this to keep their detection methods a secret. It’s what I would do.
I don't think it was executed incredibly well. There were definitely very clever aspects but they made multiple mistakes - triggering Valgrind, the performance issue, using a `.` to break the Landlock test, not giving the author a proper background identity.
I guess you could also include the fact that they made it a very obvious back door rather than an exploitable bug, but that has the advantage that only you can exploit it, so it was probably an intentional trade-off.
Just think how many back doors / intentional bugs there are that we don't know about because they didn't make any of these mistakes.
One question I still have: what exactly was the performance issue? I heard it might be related to enumeration of shared libraries, decoding of the scrambled strings[1], etc. Anyone know for sure yet?
One other point for investigation: is the code similar to any other known implants? The way it obfuscates strings, the way it detects debuggers, the way it sets up a vtable; there might be code fragments shared across projects, which could give clues about its origin.
[1] https://gist.github.com/q3k/af3d93b6a1f399de28fe194add452d01
I read someone speculating that the performance issue was intentional, so infected machines could be easily identified by an internet-wide scan without arousing further suspicion.
If this is or becomes a widespread method, then anti-malware groups should perhaps conduct these scans themselves.
I'm not sure why everyone is 100% sure this was a state-sponsored security breach. I agree that it's more likely than not state-sponsored, but I can imagine all sorts of other groups who would have an interest in something like this, organized crime in particular. Imagine how many banks or crypto wallets they could break into with an RCE this pervasive.
Was the performance issue pure luck? Or was it a subtle bit of sabotage by someone inside the attacking group worried about the implications of the capability?
If it had been successfully and secretly deployed, this is the sort of thing that could make your leaders much more comfortable with starting a "limited war".
This is not that concept. That concept is that no one but us can technically complete the exploit: technical infeasibility, as in you need a supercomputer to do it, not protecting a backdoor with the normal CIA triad.
Am I reading it correctly that the payload signature includes the target SSH host key? So you can't just spray one payload around to servers; it has to be crafted and signed for each specific host.
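If so, that also explains why a captured payload can't just be replayed against a different server. As a toy sketch only (the function name, message layout, and use of OpenSSL's Ed448 API here are my assumptions, not the backdoor's actual wire format), a check that binds the attacker's signature to both the host key and the command could look roughly like this:

```c
/* Toy illustration, not the real format: verify an Ed448 signature that
 * covers the target's host public key together with the command, so a
 * payload captured from one host can't be replayed against another.
 * Build with: cc demo.c -lcrypto */
#include <openssl/evp.h>
#include <stdlib.h>
#include <string.h>

int verify_bound_payload(const unsigned char attacker_pub[57],
                         const unsigned char *host_pubkey, size_t host_len,
                         const unsigned char *cmd, size_t cmd_len,
                         const unsigned char sig[114])
{
    int ok = 0;
    /* Message = host key || command: the host key acts as a domain separator. */
    size_t msg_len = host_len + cmd_len;
    unsigned char *msg = malloc(msg_len);
    if (msg == NULL)
        return 0;
    memcpy(msg, host_pubkey, host_len);
    memcpy(msg + host_len, cmd, cmd_len);

    EVP_PKEY *pkey = EVP_PKEY_new_raw_public_key(EVP_PKEY_ED448, NULL,
                                                 attacker_pub, 57);
    EVP_MD_CTX *ctx = EVP_MD_CTX_new();
    if (pkey != NULL && ctx != NULL &&
        EVP_DigestVerifyInit(ctx, NULL, NULL, NULL, pkey) == 1 &&
        EVP_DigestVerify(ctx, sig, 114, msg, msg_len) == 1)
        ok = 1;   /* only the holder of the matching private key gets here */

    EVP_MD_CTX_free(ctx);
    EVP_PKEY_free(pkey);
    free(msg);
    return ok;
}
```

The verifying public key is baked into the implant, so only whoever holds the matching private key can produce a payload that any given host will accept.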
It's a (failed) case study in "what if we backdoor it in a way only good guys can use but bad guys can't?"
Computers don't know who or what is good or bad. They're deterministic machines that respond to commands.
I don't know whether there'll be any clues about who did this, but this will be the poster child of the "you can't have backdoors and security in a single system" argument (which I strongly support).
IMHO it's not that surprising; asymmetric crypto has been common in ransomware for a long time, and of course ransomware in general is based on securing data from its owner.
This whole thing has been consuming me over the whole weekend. The mechanisms are interesting and a collection of great obfuscations, the social engineering is a story that’s shamefully all too familiar for open source maintainers.
I find most interesting how they chose their attack vector of using BAD test data. It makes the rest of the steps so much easier: take a good archive, manipulate it in a structured way (this should show up on a graph of the binary pattern, btw, for future reference), then use it for a fuzzing-style bad-data test. It's great.
The rest of the techniques are banal enough, except that the most brilliant move seems to be that they could have added "patches" or even whole new back doors using the same pattern on a different test file, without being noticed.
Really, really interesting. GitHub shouldn't have hidden and removed the repo though; that's not helpful at all for working through this whole drama.
Edit: I don’t mean to say this is banal in any way, but once the payload was decided and achieved through a super clever idea, the rest was just great obfuscation.
It's got me suspicious of a build-time dependency we have in an open source tool, where the dependency goes out of its way to prefer xz, and we even discovered that it installs xz on the host machine if it isn't already installed, as a convenience. Kinda weird, because it didn't do that for any other dependencies.
These long-games are kinda scary and until whatever “evil” is actually done you have no idea what is actually malicious or just weird.
> It's got me suspicious of a build-time dependency we have in an open source tool, where the dependency goes out of its way to prefer xz, and we even discovered that it installs xz on the host machine if it isn't already installed, as a convenience. Kinda weird, because it didn't do that for any other dependencies.
Have you considered reaching out to the maintainers of that project and (politely) asking them to explain? In light of recent events I don't think anyone would blame you; in fact you might even suggest they explain such an oddly specific side effect in a README or such.
A main culprit seems to be the addition of binary files to the repo, to be used as test inputs. Especially if these files are “binary garbage” to prove a test fails. Seems like an obvious place to hide malicious stuff.
It is an obvious place for sure, but it also would have been picked up if the builds were a bit more transparent. That build script should have been questioned before approval.
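That extraction step is exactly what a reviewer of the build machinery would have had to notice. Schematically (the marker, input path, and output name below are invented for illustration; the real attack reportedly did the equivalent with a shell/awk/xz pipeline driven from the modified build scripts), carving a hidden blob out of a "corrupt" test file needs nothing more than:

```c
/* Schematic only: how a build step could carve a hidden blob out of a
 * "corrupt" binary test file.  The marker and file names are made up. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    static const unsigned char marker[] = { 0xde, 0xad, 0xbe, 0xef }; /* hypothetical */
    FILE *in = fopen("tests/files/bad-test-input.bin", "rb");         /* hypothetical */
    if (!in)
        return 1;

    /* Slurp the whole test file into memory. */
    fseek(in, 0, SEEK_END);
    long size = ftell(in);
    rewind(in);
    unsigned char *buf = malloc(size);
    if (!buf || fread(buf, 1, size, in) != (size_t)size) { fclose(in); return 1; }
    fclose(in);

    /* Look for the marker; everything after it is the hidden payload. */
    for (long i = 0; i + (long)sizeof(marker) <= size; i++) {
        if (memcmp(buf + i, marker, sizeof(marker)) == 0) {
            FILE *out = fopen("hidden-payload.o", "wb");               /* hypothetical */
            if (out) {
                fwrite(buf + i + sizeof(marker), 1,
                       size - i - sizeof(marker), out);
                fclose(out);
            }
            break;
        }
    }
    free(buf);
    return 0;
}
```

To a casual reader of a build log, that kind of step looks like ordinary test-fixture handling, which is why the "garbage" test files were such effective cover.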
Super impressed how quickly the community, and in particular amlweems, were able to implement and document a POC. If the cryptographic and payload-loading functionality has no further vulnerabilities, then at least this wouldn't have opened a security hole for every other attacker, until the key is broken or leaked.
Edit: I think what's next is to figure out a way to probe for vulnerable deployments (which seems non-trivial), and perhaps also upstreaming a way to monitor whether someone actively probes ssh servers with the hardcoded key.
Well, it's a POC against a re-keyed version of the exploit; a POC against the original version would require the attacker's private key, which is undisclosed.
Probing for vulnerable deployments over the network (without the attacker's private key) seems impossible, not non-trivial.
The best one could do is more micro-benchmarking, but for an arbitrary Internet host you aren't going to know whether it's slow because it's vulnerable, or because it's far away, or because the computer's slow in general -- you don't have access to how long connection attempts to that host took historically. (And of course, there are also routing fluctuations.)
Should be able to do it by having the scanner take multiple samples. As long as you don't need a valid login and the performance issue is still observable, you should be able to scan for it with minimal cost.
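A rough sketch of what that repeated sampling might look like, with the big caveat that it only times connect-to-banner, and the reported regression may only show up later in the handshake, so whether it actually separates patched from unpatched hosts is exactly the open question above (host, port, and sample count are illustrative):

```c
/* Rough sketch: sample the time from TCP connect to receiving the SSH
 * banner several times and keep the median, to smooth out routing noise.
 * Failed samples come back as -1; a real tool would retry or discard them. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

static double sample_banner_ms(const char *ip, int port)
{
    struct sockaddr_in addr = { .sin_family = AF_INET, .sin_port = htons(port) };
    inet_pton(AF_INET, ip, &addr.sin_addr);

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    struct timespec t0, t1;
    char banner[256];
    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0 ||
        read(fd, banner, sizeof(banner)) <= 0) {   /* expect "SSH-2.0-..." */
        close(fd);
        return -1;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    close(fd);
    return (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6;
}

static int cmp(const void *a, const void *b)
{
    double d = *(const double *)a - *(const double *)b;
    return (d > 0) - (d < 0);
}

int main(int argc, char **argv)
{
    if (argc < 2)
        return 1;
    enum { N = 9 };                    /* number of samples per host */
    double samples[N];
    for (int i = 0; i < N; i++)
        samples[i] = sample_banner_ms(argv[1], 22);
    qsort(samples, N, sizeof(double), cmp);
    printf("median banner latency: %.1f ms\n", samples[N / 2]);
    return 0;
}
```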
Has anyone tried the PoC against one of the anomalous process behavior tools? (Carbon Black, AWS GuardDuty, SysDig, etc.) I’m curious how likely it is that someone would have noticed relatively quickly had this rolled forward and this seems like a perfect test case for that product category.
Depends how closely the exploit mirrors and/or masks itself within normal compression behavior imo.
I don't think GuardDuty would catch it, as it doesn't look at processes the way an EDR does (CrowdStrike, Carbon Black), and I don't think Sysdig would catch it, as it looks at containers and cloud infra. Handwaving some complexity here, as GD and Sysdig could probably catch something odd via the privileges gained and the threat actor's follow-on activity after using this exploit.
So imo that means only EDRs (monitoring processes on endpoints) or software supply chain evaluations (monitoring security problems in upstream FOSS) are likely to catch the exploit itself.
Interestingly, this leads into another fairly large security theme: dev teams can dislike putting EDRs on boxes because of the hit on compute and the UX issues if a containment happens, and can dislike policies and limits around FOSS use. So this exploit hits at the heart of an org-driven "vulnerability" where there's a lot of logic both for staying exposed and for fixing it, depending on where you sit. The security industry's problem set in a nutshell.
The main thing I was thinking is that the audit hooking, and especially the runtime patching across modules (liblzma5 patching functions in the main sshd code), seems like the kind of thing a generic behavioral profile could catch, especially one driven by the fact that sshd does not do any of that normally.
And, yes, performance and reliability issues are a big problem here. When CarbonBlack takes down production again, you probably end up with a bunch of exclusions which mean an actual attacker might be missed.
Sysdig released a blog on Friday: "For runtime detection, one way to go about it is to watch for the loading of the malicious library by SSHD. These shared libraries often include the version in their filename."
The blog has the actual rule content, which I haven't seen from other security vendors:
https://sysdig.com/blog/cve-2024-3094-detecting-the-sshd-bac...
That relies on knowing what to look for. I.e. "the malicious library". The question is whether any of these solutions could catch it without knowing about it beforehand and having a detection rule specifically made for it.
Thanks! That’s a little disappointing since I would have thought that the way it hooked those functions could’ve been caught by a generic heuristic but perhaps that’s more common than I thought.
Edit: I misunderstood what I was reading in the link below, my original comment is here for posterity. :)
> From down in the same mail thread: it looks like the individual who committed the backdoor has made some recent contributions to the kernel as well... Ouch.
https://www.openwall.com/lists/oss-security/2024/03/29/10
No, that patch series is from Lasse. He said himself that it's not urgent in any way and it won't be merged this merge window, but nobody (sane) is accusing Lasse of being the bad actor.
1) No-one has been proven to "be" anyone in this case. Reputation in OSS is built upon behaviour only, not identity. "Jia Tan" managed to tip the scales by also being helpful. That identity is 99% likely to be a confection.
2) People can do terrible things when strongly encouraged or, worse, coerced. Including dissolving identity boundaries.
The first problem can be 'solved' by using real identities and web of trust but that will NEVER fly in OSS for a multitude of technical and social reasons. The second problem will simply never be solved in any context, OSS or otherwise. Bad actors be bad, yo.
The parallels in this one to the audacity event a couple years back are ridiculous.
Cookie guy claimed that he got stabbed and that the federal police were involved in the case, which kind of hints that the events were connected to much bigger actors than just 4chan. At the time a lot of people thought it was just Muse Group that was involved, but maybe it was a (Russian) state actor?
Because before that he claimed that audacity had lots of telemetry/backdoors, which were the reason he forked it and which he removed in his first commits. Maybe audacity is backdoored after all?
When loading liblzma, it patches the ELF GOT (global offset table) with the address of the malicious code. If it's loaded before libcrypto, it registers a symbol audit handler (a glibc-specific feature, IIUC) to get notified when libcrypto's symbols are resolved, so it can defer patching the GOT.
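For anyone who hasn't run into that glibc feature: the documented face of it is the rtld-audit interface, where an audit module gets a callback for every symbol binding and may return a substitute address. The backdoor reportedly wires itself into the equivalent machinery from inside the process rather than via LD_AUDIT, but a minimal (benign) audit module shows the kind of leverage involved:

```c
/* Minimal rtld-audit module (documented glibc interface, see rtld-audit(7)).
 * Build: cc -shared -fPIC audit.c -o audit.so
 * Run:   LD_AUDIT=./audit.so some-program */
#define _GNU_SOURCE
#include <link.h>
#include <stdio.h>
#include <string.h>

unsigned int la_version(unsigned int version)
{
    return LAV_CURRENT;   /* handshake with the dynamic linker */
}

unsigned int la_objopen(struct link_map *map, Lmid_t lmid, uintptr_t *cookie)
{
    /* Ask to audit symbol bindings to and from every loaded object. */
    return LA_FLG_BINDTO | LA_FLG_BINDFROM;
}

uintptr_t la_symbind64(Elf64_Sym *sym, unsigned int ndx,
                       uintptr_t *refcook, uintptr_t *defcook,
                       unsigned int *flags, const char *symname)
{
    /* Called when a PLT/GOT entry is bound.  A malicious module could
     * return the address of its own function here instead of
     * sym->st_value, e.g. for a symbol like RSA_public_decrypt. */
    if (strstr(symname, "RSA") != NULL)
        fprintf(stderr, "binding %s -> %#lx\n", symname,
                (unsigned long)sym->st_value);
    return sym->st_value;   /* benign: keep the original address */
}
```

The backdoor doesn't need the LD_AUDIT environment variable because it is already loaded inside the target process; the point is just that "get told when libcrypto's symbols are resolved, then swap the GOT entry" is a capability the runtime linker genuinely provides.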
Do we know if this exploit only did something if an SSH connection was made? There's a list of strings from it on Github that includes "DISPLAY" and "WAYLAND_DISPLAY":
https://gist.github.com/q3k/af3d93b6a1f399de28fe194add452d01
These don't have any obvious connection to SSH, so maybe it did things even if there was no connection. This could be important to people who ran the code but never exposed their SSH server to the Internet, which some people seem to be assuming was safe.
Those are probably kill switches to prevent the exploit from working if there is a terminal open or if it runs in a GUI session. In other words, someone trying to detect, reproduce or debug it.
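If that speculation is right, the check itself would be trivial; a purely illustrative guess (not lifted from the actual binary) at what such a bail-out might look like:

```c
/* Purely illustrative guess at an environment-based kill switch; the real
 * backdoor's conditions have to be recovered from the binary, not from this. */
#include <stdlib.h>

static int looks_like_interactive_session(void)
{
    /* A graphical session suggests a human (or a researcher) is watching;
     * a real sshd child would not normally have these set. */
    return getenv("DISPLAY") != NULL || getenv("WAYLAND_DISPLAY") != NULL;
}

/* ... somewhere in the payload's init path ... */
/* if (looks_like_interactive_session()) return;   stay dormant */
```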
Could that be related to X11 session forwarding? (A common security hole on the connecting side if it isn't turned off when connecting to an untrusted machine.)
> If this hadn’t had the performance impact that brought it quickly to the surface, it’s possible that this would have lived quietly for a long time exactly because it’s not widely exploitable.
The tradeoff is that, once you find it, it's very clearly a backdoor. No way you can pretend this was an innocent bug.
> But if that was the case they wouldn't bother with the key. It'd be a one-and-done situation.
Now it looks more like nation-state spycraft.
> If it had been successfully and secretly deployed, this is the sort of thing that could make your leaders much more comfortable with starting a "limited war".
There are shades of "Setec Astronomy" here.
It's practically a good backdoor then: cryptographically protected and safe against replay attacks.
> I don't know whether there'll be any clues about who did this, but this will be the poster child of the "you can't have backdoors and security in a single system" argument.
"It's not only the good guys who have guns."
Kudos!
The OP is such great analysis, I love reading this kind of stuff!
> No-one has been proven to "be" anyone in this case. Reputation in OSS is built upon behaviour only, not identity.
This style of fake doubt is really not appropriate anywhere.
> Maybe audacity is backdoored after all?
Have to check the audacity source code now.
How did the exploit do its hooking at runtime?
I know the chain was:
opensshd -> libsystemd for notifications -> xz/liblzma pulled in as a transitive dependency
How did liblzma.so.5.6.1 hook/patch all the way back to openssh_RSA_verify when it was loaded into memory?
How was this part obfuscated/undetected?
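The hooking half of that question is what the GOT-patching comment further up describes. The first half, how liblzma ends up inside sshd at all, is easy to see for yourself: distro-patched sshd links libsystemd for sd_notify, and (at least in the libsystemd versions current at the time; newer ones dlopen liblzma lazily) that pulls liblzma into the process. A small sketch demonstrating the transitive load (the file name and build line are mine):

```c
/* Sketch: enumerate the shared objects mapped into this process and flag
 * liblzma.  Merely calling into libsystemd is enough to pull it in
 * transitively on systems where libsystemd links liblzma directly.
 * Build: cc list-libs.c -lsystemd   (assumes libsystemd dev headers) */
#define _GNU_SOURCE
#include <link.h>
#include <stdio.h>
#include <string.h>
#include <systemd/sd-daemon.h>

static int show(struct dl_phdr_info *info, size_t size, void *data)
{
    if (info->dlpi_name && *info->dlpi_name) {
        int is_lzma = strstr(info->dlpi_name, "liblzma") != NULL;
        printf("%s%s\n", info->dlpi_name,
               is_lzma ? "   <-- transitive xz dependency" : "");
    }
    return 0;   /* keep iterating over loaded objects */
}

int main(void)
{
    sd_notify(0, "STATUS=demo");   /* the kind of call that drags libsystemd in */
    dl_iterate_phdr(show, NULL);
    return 0;
}
```

A quicker check on a running box is just inspecting the sshd binary's library closure and seeing whether liblzma shows up in it.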