The affected repo has now been taken down, so I am writing this partly from memory, but I believe the scenario is:
1. An attacker had write access to the tj-actions/changed-files repo
2. The attacker chose to spoof a Renovate commit, in fact they spoofed the most recent commit in the same repo, which came from Renovate
3. Important: this spoofing of commits wasn't done to "trick" a maintainer into accepting any PR, instead it was just to obfuscate it a little. It was an orphan commit and not on top of main or any other branch
4. As you'd expect, the commit showed up as Unverified, although if we're being realistic, most people don't look at that or enforce signed commits only (the real bot signs its commits)
5. Kind of unrelated, but the "real" Renovate Bot - just like Dependabot presumably - then started proposing PRs to update the action, like it does any other outdated dependency
6. Some people had automerging of such updates enabled, but this is not Renovate's default behavior. Even without automerging, an action like this might be able to achieve its aim with just a PR, if it's run as part of PR builds
7. This incident has reminded us that many people mistakenly assume that git tags are immutable, especially if they are in semver format. Although it's rare for such tags to be changed, they are not immutable by design
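To make point 7 concrete: assuming push access and no tag protection rules, repointing an existing semver tag is two commands (tag name and commit hash below are placeholders):

    # point the existing tag at any commit you control
    git tag -f v1.0.0 0123456789abcdef0123456789abcdef01234567
    # overwrite it on the remote; anyone resolving v1.0.0 from now on gets the new commit
    git push --force origin refs/tags/v1.0.0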
> 7. This incident has reminded us that many people mistakenly assume that git tags are immutable, especially if they are in semver format. Although it's rare for such tags to be changed, they are not immutable by design
IME, this will be more "learned" than "reminded". Many, many people set up pipelines to build artefacts based on tags (e.g. a common practice being "on tag with some pattern, then build artefact:$tag") and are just surprised if you call out the flaws.
It's one of many practices adopted because everyone does it, but without basic awareness of the tradeoffs. Semver is another case of inherited practice, where surprisingly many people seem to believe that labelling software with a particular string magically translates into hard guarantees about its behaviour.
I theorized about this vulnerability a while back when I noticed new commits didn't disable automerging. This is an insane default from GH.
EDIT: seems GitHub has finally noticed (or started to care); just went to test this and auto merge has been seemingly disabled sitewide. Even though the setting is enabled, no option to automerge PRs shows up.
Seems I was right to worry!
EDIT2: We just tested this on GitLab's CI since they also have an auto-merge function and it appears they've done things correctly. Auto-merge enablement is only valid for the commit for which it was enabled; new pushes disable auto-merge. Much more sensible and secure.
Tags can be signed, and the signature can be verified. It's about as easy as signing / verifying commits. One can even make signing the default option when creating tags.
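For reference, the mechanics are short (this assumes a signing key is already configured for git):

    git config --global tag.gpgSign true    # sign annotated tags by default
    git tag -s v1.2.3 -m "release v1.2.3"   # create a signed tag explicitly
    git verify-tag v1.2.3                   # check the signature before trusting the tag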
This won't help in this case though, because a legitimate bot was tricked into working with a rogue commit; a tricked bot could just as well sign a tag with a legitimate key.
"Immutable tags" of course exist, they are commit hashes, but they are uninformative :(
> 6. Some people had automerging of such updates enabled, but this is not Renovate's default behavior. Even without automerging, an action like this might be able to achieve its aim with just a PR, if it's run as part of PR builds
I'm not sure how this could be exploited by just making a PR, unless you for some reason have secrets enabled for builds by unknown contributors, which obviously would be a mistake. Usually, builds using secrets only run on certain branches, which have a known contributor approving the code before it gets there.
> people mistakenly assume that git tags are immutable
If you're distributing a library on GitHub used by many other people/projects, then you really need to set up `protected branches` and `protected tags`, where you can prevent changes somewhat.
> I'm not sure how this could be exploited by just making a PR, unless you for some reason have secrets enabled for builds by unknown contributors
In this context the renovate bot would be making the PR to a repo it had been installed on, making it a trusted contributor able to trigger CI builds on its PRs.
Neither Branch Protection nor the newer Rulesets allow you to protect secrets from someone with push access to the repo. From what I understand, only environment secrets provide this feature (and have the drawback that you can't share them among multiple repos in the same org without copying them everywhere, although you can script the copying with the GitHub API).
Thanks for taking the time to comment. Not that it wasn't there before this, but this incident highlights a lot to take into consideration with respect to securing one's supply chain going forward.
Thanks for this writeup! It seems like #1 was the real weakness. Have you identified how the attacker was able to get write access to tj-actions/changed-files? Did this discovery result in any changes to how people can contribute to the project?
In recent years, it's started to feel like you can't trust third-party dependencies and extensions at all anymore. I no longer install npm packages that have more than a few transitive dependencies, and I've started to refrain from installing vscode or chrome extensions altogether.
Time and time again, they either get hijacked and malicious code added, or the dev themselves suddenly decides to betray everyone's trust and inject malicious code (see: Moq), or they sell out to some company that changes the license to one where you have to pay hundreds of dollars to keep using it (e.g. the recent FluentAssertions debacle), or one of those happens to any of the packages' hundreds of dependencies.
Just take a look at eslint's dependency tree: https://npmgraph.js.org/?q=eslint
Can you really say you trust all of these?
And if you turn on devDependencies (top right), it goes from 85 to 1263.
Was it really a recent thing?
> Just take a look at eslint's dependency tree
Npm / node has always been extra problematic though. Where's the governance / validation on these packages? It's free for all.
We need better capabilities.
E.g. when I run `fd`, `rg` or similar such tool, why should it have Internet access?
IMHO, just eliminating Internet access for all tools (e.g. in a power mode), might fix this.
The second problem is that we have merged CI and CD.
The production/release tokens should ideally not be on the same system as the ones doing regular CI.
More users need access to CI (especially in the public case) than CD.
For example, a similar one from a few months back https://blog.yossarian.net/2024/12/06/zizmor-ultralytics-inj...
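Back on the capabilities point about `fd`/`rg`: you can approximate "no network for this tool" today, though it's a blunt per-invocation workaround rather than real per-library capabilities. A rough sketch, assuming util-linux unshare or firejail is available:

    # run the tool in a fresh network namespace (loopback only, no internet)
    unshare -r -n rg "some pattern" .
    # or with firejail
    firejail --net=none fd --extension js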
> We need better capabilities. E.g. when I run `fd`, `rg` or similar such tool, why should it have Internet access?
Yeah!! We really need to auto sandbox everything by default, like mobile OSes. Or the web.
People browse the web (well, except Richard Stallman) all the time, and run tons of wildly untrusted code, much of it malicious. And apart from zero-days here and there, people don't pay much attention to it, and will happily visit any random website on the same machine where they also store sensitive data.
At the same time, when I open a random project from Github on VSCode, it asks whether the project is "trusted". If not, it doesn't run the majority of features like LSP server. And why not? Because the OS doesn't sandbox stuff by default. It's maddening.
Computers are fast enough where the overhead doesn't feel like it's there for what I do.
For development, I think Vagrant should make a comeback as one of the first things to set up in a repo/group of repos.
OpenBSD's pledge[0] system call is aimed at helping with this, although it's more of a defense-in-depth measure on the maintainer's part than the user's.
> The pledge() system call forces the current process into a restricted-service operating mode. A few subsets are available, roughly described as computation, memory management, read-write operations on file descriptors, opening of files, networking (and notably separate, DNS resolution). In general, these modes were selected by studying the operation of many programs using libc and other such interfaces, and setting promises or execpromises.
[0]: https://man.openbsd.org/pledge.2
For CI/CD, using something like ArgoCD lets you avoid giving CI direct access to prod - it still needs write access to a git repo, and ideally some read access to Argo to check if deployment succeeded, but it limits the surface area.
FreeBSD has Capsicum [0] for this. Once a process enters capability mode, it can't do anything except by using already opened file descriptors. It can't spawn subprocesses, connect to the network, load kernel modules or anything else.
To help with things that can't be done in the sandbox, e.g. DNS lookups and opening new files, it provides the libcasper library which implements them using helper processes.
Not all utilities are sandboxed, but some are and hopefully more will be.
Linux recently added Landlock [1] which seems sort of similar, although it has rulesets and doesn't seem to block everything by default, as far as I can tell from quickly skimming the docs.
[0] https://wiki.freebsd.org/Capsicum
[1] https://docs.kernel.org/userspace-api/landlock.html
I'd love to say "just use Kubernetes and run Nexus as a service inside" but unfortunately Network Policies are seriously limited [1]...
[1] https://kubernetes.io/docs/concepts/services-networking/netw...
You also need to block write access, so they can’t encrypt all your files with an embedded public key. And read access so they can’t use a timing side channel to read a sensitive file and pass that info to another process with internet privileges to report the secret info back to the bad guy. You get the picture, I’m sure.
But that's what firejail and docker/podman are for. I never run any build pipeline on my host system, and neither should you. Build containers are pretty good for these kinds of security-risk mitigations.
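A sketch of that split, assuming dependencies are fetched in a separate step you review and lock, so the untrusted build/test step itself gets no egress (image and commands are placeholders):

    # dependency fetch: network allowed, inputs pinned by a lockfile
    docker run --rm -v "$PWD:/src" -w /src node:20 npm ci
    # build/test: no network at all, so exfiltration attempts like this one fail loudly
    docker run --rm --network=none -v "$PWD:/src" -w /src node:20 npm test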
This is the death of fun. Like when you had to use SSL for buying things online.
Adding SSL was not bad, don't get me wrong. It's good that it's the default now. However. At one point it was sorta risky, and then it became required.
Like when your city becomes crime ridden enough that you have to lock your car when you go into the grocery store. Yeah you probably should have been locking it the whole time. what would it have really cost? But now you have to, because if you don't your car gets jacked. And that's not a great feeling.
In the era of the key fob it's pretty automatic to lock the car every time. Some cars even literally do it for you. I hardly think of this, let alone get not great feelings about it.
Yes. Same with browser plugins. I've heard multiple free-plugin authors say they're receiving regular offers to purchase their projects. I'm sure some must take up the offer.
I have long since stopped using any extension that doesn’t belong to an actual company (password managers for example). Even if they aren’t malware when you installed them, they will be after they get sold.
I'd also emphasize that there's nothing safe about it being "only dev", given how many attacks use employee computers (non-prod) as a springboard elsewhere.
The original .NET (and, I think, Java) had the idea of library-level capability permissions baked in.
That sort of idea seems increasingly like what we need because reputation based systems can be gamed too easily: i.e. there's no reason an action like this ever needed network access.
It was only recently removed in Java and there was a related concept (adopted from OSGi) designed to only export certain symbols -- not for security but for managing the surface area that a library vendor had to support
But I mentioned both of those things because [IMHO] they both fell prey to the same "humanity bug": specifying permissions for anything (source code, cloud security, databases, Kubernetes, ...) is a lot of trial and error, whereas {Effect: Allow, Action: ["*:*"]} always works and so they just drop a "TODO: tighten permissions" and go on to the next Jira
I had high hopes for the AWS feature "Make me an IAM Policy based on actual CloudTrail events" but it talks a bigger game than it walks
Are there examples of these types of actions in other circles outside of the .NET ecosystem? I knew about the FluentAssertions ordeal, but the Moq thing was news to me. I guess I've just missed it all.
node-ipc is a recent example from the Node ecosystem. The author released an update with some code that made a request to a geolocation webservice to decide whether to wipe the local filesystem.
Stealing crypto is so lucrative. So there is a huge 'market' for this stuff now that wasn't there before. Security is more important now than ever. I started sandboxing Emacs and python because I can't trust all the packages.
Abnormal behavior was to trust by default.
I hope the irony is not completely lost on the fine folks at semgrep that the admittedly "overkill" suggested semgrep solution is exactly the type of pattern that leads to this sort of vulnerability: that of executing arbitrary code that is modifiable completely outside of one's own control.
You should have never trusted them. That ecosystem is fine for hobbyists but for professional usage you can't just grab something random from the Internet and assume it's fine. Security or quality wise.
Yeah, I’ve moved off vscode entirely, back to fully featured out of the box ides for me. Jetbrains make some excellent tools and I don’t need to install 25 (dubious) plugins for them to be excellent
The alternative would be to find a sustainable funding model for open source; most of these betrayals come down to maintainers having to sell their projects to make a living in the first place.
The problem you're describing is an economic and a social one.
Currently, companies exploit maintainers of open source projects. Only rarely does a project, like webpack, make it when it comes to funding thanks to its popularity...but the actual state is that none of the projects webpack depends on got a single buck for it, which is unfair, don't you think?
On top of sustainable funding, we need to change our workflows to reproducible build ecosystems that can also revert independent of git repositories. GitHub has become the almost single source of code for the planet, which is insane to even bet on from a risk assessment standpoint. But it's almost impossible to maintain your own registry or mirror of code in most ecosystems due to the sheer amount of transitive dependencies.
Take go mod vendor, for example. It's great for pinning down your dependencies, but it comes with a lot of overhead work in case something like OP's scenario happens to its supply chain. And we need to account for that in our workflows.
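The Go mechanics themselves are at least simple; a minimal sketch:

    go mod vendor                 # copy every dependency into ./vendor and commit it
    go mod verify                 # check the module cache against the go.sum hashes
    go build -mod=vendor ./...    # build strictly from the vendored copies

The overhead is the review burden: every upgrade followed by `go mod vendor` lands the full third-party diff in your repo, and someone has to actually read it.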
It's not going to happen. If buying a forever license of unlimited usage for an open source library cost $1, I'd skip it. Not because I don't want to give money to people who deserve it, but because of the absolutely monstrous bureaucratic nightmare that comes from trying to purchase anything at a company larger than 10 people.
Don't even talk about when the company gets a lawyer who knows what a software license is.
Open source has a very sustainable funding model as evidenced by 50 years of continuous, quality software being developed and maintained by a diverse set of maintainers.
I say sustainable because it has been sustained, is increasing in quantity and quality, and reasonably seems to be continuing.
> companies exploit maintainers of open source projects
Me giving something away and others taking what I give is not exploitation. Please don’t speak for others and claim people are exploited. One of the main tenets of gnu is to prevent exploitation.
> But Lewis Ardern on our team wrote a Semgrep rule to find usages of tj-actions, which you can run locally (without sending code to the cloud) via: semgrep --config r/10Uz5qo/semgrep.tj-actions-compromised.
So "remote code you download from a repo automatically and run locally has been compromised, here run this remote code you download from a repo automatically and run locally to find it"
I think the conventional approach of checking for vulnerabilities in 3rd party dependencies by querying CVE or some other database has set the current behaviour, i.e. if it's not vulnerable it must be safe. This implicit trust in vulnerability databases has been exploited in the wild to push malicious code to downstream users.
I think we will see security tools shifting towards "code" as the source of truth when making safety and security decisions about 3rd party packages, instead of relying only on known vulnerability databases.
Take a look at vet, we are working on active code analysis of OSS packages (+ transitive dependencies) to look for malicious code: https://github.com/safedep/vet
npm supply chain attacks are the lone thing that keeps me up at night, so to speak. I shudder thinking about the attack surface.
I go out of my way to advocate for removing dependencies and pushing against small dependency introductions in a large ruby codebase. Some dependencies that suck and impose all sorts of costs, from funky ass idiosyncratic behavior or absurd file sizes (looking at you any google produced ruby library, especially the protocol buffer dependent libraries) are unavoidable, but I try to keep fellow engineers honest about introducing libraries that do things like determine the underlying os or whatever and push towards them just figuring that out themselves or, at the least, taking "inspiration" from the code in those libraries and reproducing behavior.
A nice side effect of AI agents and copilots is they can sometimes write "organic" code that does the same thing as third party libraries. Whether that's ethical, I don't know, but it works for me.
Did you turn off updates on your phone as well? Because 99.999% of people have app auto-updates and every update could include an exploit.
I'm not saying you're wrong not to trust package managers and extensions, but your life is likely full of the same thing. The majority of apps are made from 3rd party libraries which are made of 3rd party libraries, etc.... At least on phones they update constantly, and every update is a chance to install more exploits.
The same is true for any devices that get updates like a Smart TV, router, printer, etc.... I mostly trust Apple, Microsoft, and Google to check their 3rd party dependencies, (mostly), but don't trust any other company - and yet I can't worry about it. Don't update and I don't get security vulnerabilities fixed. Do update and I take the chance that this latest update has a 3rd party exploit buried in a 3rd party library.
I don't trust apps. I trust Apple (enough) that they engineered iOS to have a secure enough sandbox that a random calculator app can't just compromise my phone.
Most developer packages have much higher permission levels because they integrate with your code without a clear separation of boundaries. This is why attackers now like to attack GitHub Actions: if you get access to secrets you can do a lot of damage.
This is why I have begun to prefer languages with comprehensive, batteries-included standard libraries, so that you need very few dependencies. Dependency management has become a full-time headache nowadays, with significant effort going into CVE analysis.
I think library/runtime makers aren't saying "let's make an official/blessed take on this thing that a large number of users are doing" as much as they should.
Popular libraries for a given runtime/language should be funded/bought/cloned by the runtime makers (e.g. MS for .NET, IBM/Oracle for Java) more than they are now.
I know someone will inevitably mention concerns about monopolies/anti-trust/"stifling innovation" but I don't really care. Sometimes you have to standardize some things to unlock new opportunities.
Instead of bloating the base language for this, a trusted entity could simply fork those libraries, vet them, and repackage into some "blessed lib" that people like you can use in peace. In fact, the level of trust needed to develop safe libraries is less than developing language features.
If I see a useful extension I want to use and it is on GitHub, I fork it. Sometimes I make a bookmarklet with the code instead.
I keep most extensions off until I need to use them. Then, I enable them, use them, and turn them off again. I try to even keep Mac apps to a minimum.
49 modules with only one maintainer, and over 600 if devDependencies are included. It's only a matter of time until the next module becomes compromised.
You can trust (in time), but you can't blindly upgrade. Vendor or choose to "lock" with a cryptographic hash over the files your build depends on. You then need to rebuild that trust when you upgrade (wait until everyone else does; read the diffs yourself).
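"Read the diffs yourself" can be fairly mechanical; a sketch, with placeholder SHAs standing in for the currently pinned commit and the candidate upgrade:

    git clone https://github.com/tj-actions/changed-files
    cd changed-files
    # review exactly what changed before moving your pin
    git diff <old-pinned-sha> <new-release-sha>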
There is something to be said for the Go proverb "a little copying is better than a little dependency", as well. If you want a simple function from a complicated library, you can probably copy it into your own codebase.
What a nice way to put it! Thanks for the mention and thanks for making me discover https://go-proverbs.github.io/ .
When using NuGet packages I usually won't even consider ones with non-Microsoft dependencies, and I like to avoid third-party dependencies altogether. I used to feel like this made me a weird conspiracy theorist but it's holding up well!
It also has led to some bad but fun choices, like implementing POP3 and IMAP directly in my code, neither of which worked well but taught me a lot?
This isn't new - Thompson warned us 40 years ago (and I believe others before him) in his Reflections on Trusting Trust paper.
It's something I've been thinking about lately because I was diving into a lot of discussion from the early 90s regarding safe execution of (what was, at the time, called) "mobile code" - code that a possibly untrustworthy client would send to have executed on a remote server.
There's actually a lot of discussion still available from w3 thankfully, even though most of the papers are filled with references to dead links from various companies and universities.
It's weirdly something that a lot of smart people seemed to have thought about at the start of the World Wide Web which just fell off. Deno's permissions are the most interesting modern implementation of some of the ideas, but I think it still falls flat a bit. There's always the problem of "click yes to accept the terms" fatigue as well, especially when working in web development. It's quite reasonable for many packages one interacts with in web development to need network access, for example, so it's easy to imagine someone just saying "yup, makes sense" when a web-related package requests network access.
Also none of this even touches on the reality of so much code which exists to brutally impact a business need (or perceived need). Try telling your boss you need a week or two to audit every one of the thousands of packages for the report generator app.
Trusting Trust is not about this at all. It's about the compiler being compromised, and making it impossible to catch malicious code by inspecting the source code.
The problem here is that people don't even bother to check the source code and run it blindly.
I'm just going to say this out loud: It's mostly a Javascript thing.
Not that every other platform in the world isn't theoretically vulnerable to the same sort of attack, but there's some deep-rooted culture in the javascript community that makes it especially vulnerable.
The charitable interpretation is "javascript evolves so fast!". The uncharitable interpretation is "they are still figuring it out!"
Either way, I deliberately keep my javascript on the client side.
If this were “the solution”, then the many, many smart individuals and teams tasked with solving these problems throughout the software industry would’ve been out of work for some time now.
It’s obviously more complicated than that.
Signed public builds don’t inherently mean jack. It highly depends on the underlying trust model.
—
Malicious actor: “we want to buy your browser extension, and your signing credentials”.
Plugin author: “Well, OK”.
—
Malicious actor: hijacks npm package and signs new release with new credentials
The vast majority of dependent project authors: at best, they see a "new releaser" warning from their tooling, which is far from unusual for many dependencies, and ignore it. After all, what are they going to do?
—
Hacker News, as usual, loves to pretend it has all the answers to life’s problems, and the issue is that nobody has listened to them.
StepSecurity Harden-Runner detected this security incident by continuously monitoring outbound network calls from GitHub Actions workflows and generating a baseline of expected behaviors. When the compromised tj-actions/changed-files Action was executed, Harden-Runner flagged it due to an unexpected endpoint appearing in the network traffic—an anomaly that deviated from the established baseline. You can check out the project here: https://github.com/step-security/harden-runner
The advertising in this article is making it actively difficult to figure out how to remediate this issue. The "recovery steps" section just says "start our 14 day free trial".
The security industry tolerates self-promotion only to the extent that the threat research benefits everyone.
Thank you, cyrnel, for the feedback! We are trying our best to help serve the community. Now, we have separate recovery steps for general users and our enterprise customers.
It's always been shocking to me that the way people run CI/CD is just listing a random repository on GitHub. I know they're auditable and you pin versions, but it's crazy to me that the recommended way to ssh to a server is to just give a random package from a random GitHub user your ssh keys, for example.
This is especially problematic with the rise of LLMs, I think. It's the kind of common task which is annoying enough, unique enough, and important enough that I'm sure there are a ton of GitHub actions that are generated from "I need to build and deploy this project from GitHub actions to production". I know to, and do, handle the important things in actions related to ssh, keys, etc. manually, but not everyone does.
Let's have a look at a random official GH-provided action:
https://github.com/actions/checkout
It lists the following snippet:
`uses: actions/checkout@v4`
Almost everyone will just copy-paste this snippet and call it a day. Most people don't think twice about the fact that v4 is a movable target that can be compromised.
In case of npm/yarn deps, one would often do the same and copy-paste `yarn install foobar`, but then when installing, npm/yarn would create a lockfile and pin the version. Whereas there's no "installer" CLI for GH actions that would pin the version for you; you just copy-paste and git push.
To make things better, ideally the owners of actions would update the workflows which release a new version of the GH action so that they also update the README snippet with the full commit SHA of the most recent release, so that it looks like
`uses: actions/checkout@abcdef9876543210` # v4.5.6
Since GitHub doesn't promote good defaults, it's not surprising that third-party maintainers do the same.
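In the meantime, resolving the commit to pin is a one-liner; a sketch (tag name is just an example):

    # list the tag and, for annotated tags, the peeled commit it points to (the ^{} line);
    # the 40-character SHA is what you pin in `uses:`
    git ls-remote --tags https://github.com/actions/checkout v4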
Aren't GitHub action "packages" designated by a single major version? Something like checkout@v4, for example. I thought that designated a single release as v4, which would not be updated?
I'm quite possibly wrong, since I try to avoid them as much as I can, but I mean.. wow I hope I'm not.
The crazier part is, people typically don't even pin versions! It's possible to list a commit hash, but usually people just use a tag or branch name, and those can easily be changed (and often are, e.g. `v3` being updated from `v3.5.1` to `v3.5.2`).
Fuck. Insecure defaults again. I argue that a version specifier should be only a hash. Nothing else is acceptable. Forget semantic versions. (Have some other method for determining upgrade compatibility you do out of band. You need to security audit every upgrade anyway). Process: old hash, new hash, diff code, security audit, compatibility audit (semver can be metadata), run tests, upgrade to new hash.
You and someone else pointed this out. I only use GitHub-org actions, and I just thought that surely there would be a "one version to rule them all" type rule.. how else can you audit things?
I've never seen anything recommending specifying a specific commit hash or anything for GitHub actions. It's always just v1, v2, etc.
Having to use actions for ssh/rsync always rubbed me the wrong way. I’ve recently taken the time to remove those in favor of using the commands directly (which is fairly straightforward, but a bit awkward).
I think it’s a failure of GitHub Actions that these third party actions are so widespread. If you search “GitHub actions how to ssh” the first result should be a page in the official documentation, instead you’ll find tens of examples using third party actions.
So much this. I recently looked into using GitHub Actions but ended up using GitLab instead since it had official tools and good docs for my needs. My needs are simple. Even just little scripting would be better than having to use and audit some 3rd party repo with a lot more code and deps.
And if you're new, and the repo aptly named, you may not realize that the action is just some random repo
> It's always been shocking to me that the way people run CI/CD is just listing a random repository on GitHub.
Right? My mental model of CI has always been "an automated sequence of commands in a well-defined environment". More or less an orchestrated series of bash scripts with extra sugar for reproducibility and parallelism.
Turning these into a web of vendor-locked, black-box "Actions" that someone else controls ... I dunno, it feels like a very pale imitation of the actual value of CI
I am surprised nobody here has mentioned the immutable GitHub Actions that are coming [1]. I've been waiting for them since the issue was opened in 2022. This would have significantly reduced the impact, and hopefully GitHub will get it over the finish line.
[1] https://github.com/features/preview/immutable-actions
I always fork my actions or at least use a commit hash.
I thought actions were already immutable and published to a registry, not fetched directly from their repo. TIL.
Go also uses tags for module versioning, and while go.mod or package-lock.json stop this attack from reaching existing consumers, allowing remapping of all versions to the compromised one still expands the impact surface a lot. GitHub should offer an "immutable tags" setting for repos like these.
Doing a bit of investigation with github_events in ClickHouse, it is quite clear that the account used to perform the attack was "2ft2dKo28UazTZ"; "mmvojwip" also seems suspicious:
https://play.clickhouse.com/play?user=play#c2VsZWN0ICogZnJvb...
Actions taken by the threat actor at the time can be seen here:
https://play.clickhouse.com/play?user=play#c2VsZWN0ICogZnJvb...
It seems I forgot to cater for the quota applied to free "play" users in ClickHouse in my previous query... In fact, the threat actor did a lot more... this should give a better list of the actions that were performed - it clearly shows he was testing his payload:
https://play.clickhouse.com/play?user=play#c2VsZWN0ICogZnJvb...
Nice find. It's a bit strange that the PRs listed there are not present at all in the coinbase repo. Seems like the attack was directed there, but I also didn't hear anything from Coinbase on this.
eg. Target their NPM and PYPI tokens, so they can push compromised packages.
Note that these accounts seem to be deleted now - 2ft2dKo28UazTZ clearly did more than just changed-files and seems to have targeted coinbase/agentkit as well (actually, that repo may have been the threat actor's real target).
The attacker was trying to compromise agentkit and found changed-files used in the repo so looked around. Found that it was using a bot with a PAT to release.
Totally possible the bot account had a weak password, and the maintainer said it didn't have 2FA.
They got the release bot PAT, so they tried what was possibly quite an obvious vector. They didn't need anything sophisticated, or to exfil the credentials, because agentkit is public.
It just so happened that it was detected before agentkit updated dependencies.
It's possible that if they had checked the dependabot config they could've timed it a bit better, so that it would be picked up in agentkit before being detected.
edit: Although, I don't think PATs are visible after they're generated?
GitHub Actions should use a lockfile for dependencies. Without it, compromised Actions propagate instantly. While it'd still be an issue even with locking, it would slow down the rollout and reduce the impact.
Semver notation rather than branches or tags is a great solution to this problem. Specify the version you want, let the package manager resolve it, and then periodically update all of your packages. It would also improve build stability.
Also, don't let GH Actions do anything other than build and upload artifacts somewhere. Ideally with a write-only role. Network-level security too: no open internet.
Use a separate system for deployments. That system must be hygienic.
This isn't foolproof but would make secrets dumping not too useful. Obviously an attack could still inject crap into your artefact. But you have more time and they need to target you. A general purpose exploit probably won't hurt as much.
The version numbers aren't immutable, so an attacker can just update the versions to point to the compromised code, which is what happened here. Commit hashes are a great idea, but you still need to be careful: lots of people use bots like Renovate to update your pinned hashes whenever a new version is published, which runs into the same problem.
Your build should always use hashes and not version tags of GHAs.
There is some latent concern that most git installations use SHA-1 hashes, as opposed to SHA-256. [0]
Also the trick of creating a branch that happens to be named the same as a revision, which then takes precedence for certain commands.
[0] https://git-scm.com/docs/hash-function-transition
Since they edited old tags here … maybe GitHub should have some kind of security setting a repo owner can enable that locks down things like old tags so that after a certain time they can't be changed.
All the tags point to commit `^0e58ed8` https://github.com/tj-actions/changed-files/commit/0e58ed867...
This is hilarious: the maven-lockfile project ("Lockfiles for Maven. Pin your dependencies. Build with integrity") appears to have auto-merged a PR for the compromised action commit. So the real Renovate bot immediately took the exfiltration commit from the fake Renovate bot and started auto-merging it into other projects:
https://github.com/chains-project/maven-lockfile/pull/1111
> After some cleanup the changed-files (https://github.com/tj-actions/changed-files) action seems to be more work to remove. It would be awesome if it could be added to the allowlist
> Done. Allowed all versions of this action. Should I pin it to one version in the allowlist (won't be convenient if renovate updates this dependency)?