chrismorgan · 2 years ago
> What’s interesting is that this problem is largely solved for C and C++: Linux distributions like Debian package such a wide range of libraries that for many things that you want to develop or install, you don’t need any third-party libraries at all.

This does not reflect my observations at all. It works if you’re an end user and the software you want has been packaged. But if either of those is not the case, you’re left in the mud. Rather, you end up with builds that are difficult to set up or reproduce except on specific platforms, require a lot more maintenance over time because bitrot occurs naturally (your code depends on libexample 1, but Debian only package version 2 now, so good luck dusting off that project you made five years ago and getting it to build again), vendoring upon vendoring, more diverse and generally lower-quality dependencies for a particular task, and other problems like these. You’ve solved one problem, but at a very significant cost that must be acknowledged, even if you decide it’s still worth it.

Take an app written five years ago that depends on majorish C++ libraries, and there’s a fair chance compiling it will be a nightmare and require invasive surgery due to version incompatibilities. It’s not improbable that the most pragmatic solution may be to find and rebuild the transitive package tree for some older version of some package, and that’s generally quite painful (and can be outright incompatible).

Take an app written five years ago in pure Rust, and it’ll almost certainly (p>0.99) still work fine. Take an app written five years ago in Rust that binds to some C/C++ library, and there’s a fair chance you’ll run into trouble, but it will almost certainly be a good deal easier to resolve, much more likely to work with only a version bump and no other changes than in C/C++.

As a developer, I know which general mode of operation I prefer. Debian are welcome to twiddle the dependencies in the way that suits their chosen requirements, but the canonical sources are still crates.io, and I know that when they’ve stopped packaging example-1.0.0, I won’t be beholden to their packaging but can just use `cargo build` and it’ll still work (contingent upon extern bindings).

I’m dubious that any meaningful curation is happening, and curious if there’s even cursory review of crate contents. But most of all, I’d be interested in a statistical comparison of the crates that Debian ships with the most-downloaded crates from crates.io. I suspect the sets would look very similar.

josephg · 2 years ago
Yeah. The part that gives me goosebumps with that proposal is: as an author of some not very popular packages, what do I do? Is it my job to package and promote my packages on Debian, redhat, homebrew, and all the others? Do I need usage before they’ll pick it up? Sounds like a chicken and egg problem. And doing all that packaging work sounds like a boring, time consuming job that I don’t want.

It’s also quite common for me to pull out bits of code I’ve written that I want to share as a dependency between different programs I write. I have dozens of packages in npm along these lines - and it’s quite elegant. I factor out the code, package it independently, write a readme and some tests and publish it. My other packages can then just depend on the brand new package directly. In this brave new world the author is proposing - what? I make a .deb, argue for it on some mailing lists I’m not on, wait who knows how long, eventually gain permission to depend on my own code, then realise I can’t install it on my Mac laptop and do the same dance in homebrew? Oh - there’s a bug. Better manually update all the package repositories? Sounds like a complete nightmare to me.

I’m sympathetic to the problem. Supply chain attacks are way too easy. But maintaining up to date packages on every single Linux distribution, homebrew, ports, and whatever windows does sounds like a full time job. A boring full time job. And it would still not work as well for my users, because when a new feature is released in one of my upstream dependencies I can’t use it until all of those package repositories ship the new version of my upstream dependency too. I’d rather eat sand.

“It works for C/C++” by making developers working in those languages pay in blood for every dependency they pull in. You can tell by looking at just how few dependencies the average C program uses. It’s like that because of how painful it is. Anything under a certain size tends to get either reinvented every time, badly (hash table libraries in C) or vendored (eg json parsers). Common libraries like rust’s rand don’t exist in C because it’s just too much effort to pull in dependencies. So everyone rolls their own crappy version instead. Ask GTA5 how well that went for them. And that’s not the developer ecosystem I want to work in. It’s frankly a terrible proposal.

no_wizard · 2 years ago
>I’m sympathetic to the problem. Supply chain attacks are way too easy

This is the crux of why crates.io is great: you have one official place for packages that can be continuously audited and monitored. This is the way to do security, in my opinion, because you are always able to look at incoming and outgoing packages.

Now, you can argue whether we are doing enough with this, and that is definitely up for debate. However, I contend that the premise of having one canonical place for package origination and hosting makes it easier to monitor for security purposes.

mid-kid · 2 years ago
When you make a new library for C/C++, you generally just install it locally at first (pkg-config and PKG_CONFIG_PATH make this trivial). It doesn't (need to) get packaged by a distribution until a program using it is packaged. You don't need to do this yourself, unless you yourself use a distribution and want it packaged there.
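
For what it's worth, a rough sketch of what I mean, with made-up paths and a made-up library name:

  # build and install your new library into a local prefix
  cd libexample
  ./configure --prefix="$HOME/.local"   # or the meson/cmake equivalent
  make && make install

  # point pkg-config at that prefix so dependent builds can find it
  export PKG_CONFIG_PATH="$HOME/.local/lib/pkgconfig:$PKG_CONFIG_PATH"
  pkg-config --cflags --libs libexample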

As for the amount of dependencies, you have to wonder whether it's better to have tonnes of small dependencies or a few bigger ones. When I need something in a library, I don't make a new library for it; I first try contributing to it, and failing that I vendor and patch it. I feel this is less common in the brave new world of everyone just doing whatever and pushing it to a largely unmoderated repository...

galangalalgol · 2 years ago
C++ apps don't pull in that few dependencies. Take something modern that needs to talk over flatbuffs or protobuffs and maybe has a web ui based on emscripten? Then run ldd on it and recoil in horror. C++ dependency trees now often have a hundred or more items. Many of those are packaged with the OS, but that really shouldn't comfort you.
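
A quick, hedged way to see that for yourself (the binary name here is made up):

  # count the shared libraries a binary pulls in at runtime
  ldd ./my_cpp_app | wc -l
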
zozbot234 · 2 years ago
> In this brave new world the author is proposing - what?

In practice, you just vendor your code as part of the source package. It's the distro maintainers' job to factor it out as a separate dependency.

zajio1am · 2 years ago
> as an author of some not very popular packages, what do I do?

Depends. Do you want your software to be used by users? For me as a user, software that is not in the distribution repository almost does not exist. It is too much hassle to install, manage and uninstall it compared to installing software from the Linux distribution. If I really, really need it, I will make an exception, but those are individual cases.

mattpallissard · 2 years ago
> “It works for C/C++” by making developers working in those languages pay in blood for every dependency they pull in.

I think the industry may have thrown the baby out with the bath water. Now there's a complete aversion to RYO anything, no matter how simple. Personally, I'm waiting for the pendulum to swing back towards the middle a bit more.

IAmNotACellist · 2 years ago
>Yeah. The part that gives me goosebumps with that proposal is: as an author of some not very popular packages, what do I do? Is it my job to package and promote my packages on Debian, redhat, homebrew, and all the others? Do I need usage before they’ll pick it up? Sounds like a chicken and egg problem. And doing all that packaging work sounds like a boring, time consuming job that I don’t want.

Github premium (or similar) will have an army of AI builder bots that spin up VMs and modify your codebase until it works, then send you PRs to approve. People would pay a ton for that service. All your projects, automatically maintained and packaged.

paulddraper · 2 years ago
You create apt, rpm repos.

And users install from there.

Not saying that's a nice answer, but that is the answer.

mid-kid · 2 years ago
As a distro contributor, I don't share these views at all. Yes, it's sometimes a bit of a pain when some application needs an older, incompatible library, but in 99% of the cases, the fallout is minor, or there's some compatibility library or shim I can install. The advantage of doing this being that the old application will now support all modern audio/video formats (e.g. ffmpeg) and the new graphical and audio subsystems on linux (e.g. sdl), as well as whatever security fixes.

If I really need an old library, it's generally not hard to install that in a temporary prefix and set the relevant PATH variables for it, though I do wish this was easier sometimes. It gets better as more projects just use pkg-config.

That all said, Meson[1] solves all of these issues in a way that keeps both developers and distributions/users happy. But as with all things C, getting everyone, and especially Windows users, to adopt it is gonna take a good while.

[1]: https://mesonbuild.org/

lolinder · 2 years ago
> As a distro contributor, I don't share these views at all.

That's because you're in the very small group of people that knows how your distro's packaging process works in detail. The majority of developers don't use Linux at all, and those that do typically know just enough to install a few blessed packages, not enough to hunt down and install the correct shim in a temporary directory.

Multiply that lack of training by the dozens of Linux distros being used by users of a language, and it's no wonder most modern languages have a standard language-specific package manager. The alternative is for them to be constantly fielding impossible-to-answer questions from developers who (unlike you) don't understand their distro well enough to make compilation work.

It's far easier for the language to have a single cross-platform package management system where the answer is usually as simple as "X, or Y if you're on Windows".

saidinesh5 · 2 years ago
> The advantage of doing this being that the old application will now support all modern audio/video formats (e.g. ffmpeg) and the new graphical and audio subsystems on linux (e.g. sdl), as well as whatever security fixes.

That's the theory. In practice, it just means small app developers putting up with:

1) #ifdefs for different versions of same libraries in different distros

2) distro specific bug reports for an operating system with less than 2% user base

3) binary releases that target just 'Linux' being out of the question for most proprietary applications / apps that can't/won't be included in distro repositories for various reasons.

It's not as if distro maintainers will test your random little application every time they update ffmpeg anyway. It's the users, for whom the application breaks.

Gigachad · 2 years ago
I just don't attempt to compile C/C++ software these days because the process is always an absolute nightmare where you have to attempt to compile, then google the error to see what package it's related to, try to work out what the distro name for this package is, install, recompile, repeat 10 times. And hope you don't get completely stuck where your distro just doesn't have the package or version you need.

Something in rust is always just `cargo run` and you're done.

chrismorgan · 2 years ago
Well, clearly what I wrote has resonated with people, as this is now my second-highest-voted comment, currently at +101.

(My >4000 comments range in score from −4 to +76 plus one outlier at +287 because people really hate stale bots on issue trackers. See https://news.ycombinator.com/item?id=32541391 for methodology. This comment has also clearly evoked a strong emotional reaction.)

All I can say is that I’ve had a lot of trouble over the years building older C/C++ software, often seeking to consume it from Python but sometimes just from C/C++. Maybe if I lived and breathed those languages I’d cope better, but at the very least they consistently take extra effort that requires at least mildly specialised knowledge, compared with my experiences of pure-Rust or even mostly-Rust software.

dahhowl · 2 years ago
(wrong link, you mean .com)
Aerbil313 · 2 years ago
Take a look at nix-shell scripts. Much simpler, lightweight and faster alternative to Docker and friends for being able to build a project forever. Start of a build script from a C project of mine:

  #!/usr/bin/env nix-shell
  #! nix-shell -i fish --pure
  #! nix-shell -I https://github.com/NixOS/nixpkgs/archive/4ecab3273592f27479a583fb6d975d4aba3486fe.tar.gz
  #! nix-shell --packages fish gcc gnumake gmp git cacert
  ...build script goes here...
-i fish: I am using fish shell.

--pure: Environment is as pure as it can reasonably get.

--packages: Packages from nixpkgs to be made available in the environment.

-I <github_commit_hash>: pinned to a specific git commit of nixpkgs. This script will give you the exact same environment with the exact same packages with exact same dependencies forever.

You can drop into a shell with the exact same configuration by using the cli command nix-shell with the same options.

Admittedly this is not as declarative as using flakes, since it's a script, but hey, I'm lazy, still didn't sit down to learn them once and for all.

Reference: https://nixos.org/manual/nix/stable/command-ref/nix-shell

cge · 2 years ago
> Take a look at nix-shell scripts. Much simpler, lightweight and faster alternative to Docker and friends for being able to build a project forever.

As researchers using nix derivations to preserve usability of code meant to be archived with our papers, we found that it was not particularly effective for this task. Even with nixpkgs pinning, our code failed to compile after only a few years. Outside of nixos itself, it does not appear that the build environment is the same; aspects of compilation change, and we ended up needing to change compiler flags a few years later, while expecting that we might need to again at some point.

Overall nix has been quite disappointing for us from an archival standpoint.

oneshtein · 2 years ago
RPM .spec is more readable, IMHO.

Archive of system repo + mock + SRPM does the same thing, but offline.

NobodyNada · 2 years ago
It's even worse when you consider cross-platform or even cross-distro builds. Take a C++ application developed on Linux and try to build it on Mac, or vice versa. This can be really painful even without waiting 5 years because oftentimes two different package managers have incompatible versions of packages, or even if the versions are the same they're built with different settings/features. Not to mention the pain of search paths and filenames differing across platforms and breaking assumptions in build systems.
j1elo · 2 years ago
As a C++ dev myself, with lots of past experience in code archaeology, nowadays I wouldn't even bat an eye before reaching for a Docker container of the appropriate system version in which to build an old and unmaintained C or C++ codebase. Otherwise, pain ensues...

If we add loading binary libs built for an older system, then there is no question. Who wants to deal with incompatible versions of the GLIBC?

JohnFen · 2 years ago
> Take an app written five years ago that depends on majorish C++ libraries, and there’s a fair chance compiling it will be a nightmare

Interesting. My experience doesn't back this up at all. I may have issues with 20+ year old code, but I don't remember ever having anything but minor issues otherwise.

I wonder what the difference between us is?

bfrog · 2 years ago
C++ is horrendous when it comes to backwards compatibility and builds.

If you believe that you won't run into a compiler bug in a large enough sampling of C++ code to build against, you are misrepresenting reality or haven't ever looked at the gcc or clang bug trackers.

Rust actually does a really nice job along these lines by continuously building and running tests for some subset of crates on crates.io.

The only thing I've seen that comes even close to that with gcc/clang involved is NixOS's Hydra.

tikkabhuna · 2 years ago
> If crates.io goes down or access is otherwise disrupted then the Rust community will stop work.

As someone who has worked in a big corporate their entire career, this statement is bizarre. This has been solved for Java for over a decade.

Builds don't access the internet directly. You host a proxy on-premise (Nexus/Artifactory/etc) and access that. If the internet location is down, you continue as normal.

> You need only one author in your maybe-hundreds-of-dependencies tree to be hacked, coerced or in a malicious mood for you to have a really bad day.

There are products (Nexus Firewall) that can check dependencies for vulnerabilities and either block them from entering the network or fail CI pipelines.

brabel · 2 years ago
The main problem the author is talking about is actually about version updates, which in Maven as well as crates.io is up to each lib's author, and is not curated in any way.

There's no technical solution to that, really. Do you think Nexus Firewall can pick up every exploit, or even most? How confident of that are you, and what data do you have to back that up? I don't have any myself, but would not be surprised at all if "hackers" can easily work around their scanning.

However, I don't have a better approach than using scanning tools like Nexus, or, as the author proposes, using a curated library repository as Debian is doing (which hopefully gets enough eyeballs to remain secure), or the https://github.com/crev-dev/cargo-crev project (manually reviewed code), also mentioned. It's interesting that they mention C/C++ just relying on distros providing dynamic libs instead, which means you don't even control your dependencies' versions; some distro does (how reliable is the distro?)... I wonder if that could work for other languages or if it's just as painful as it looks in the C world.

oblio · 2 years ago
Everyone is focusing on Nexus Firewall. Nexus/Artifactory themselves are private repos/proxies/caches that are in widespread use in the Java world so up to a point most companies prune their own repos.

This is NOT standard practice in the JS/Python/PHP/whatever world because these products do not exist or are not widely available, everyone "drinks" directly from the open internet source.

Yes, again, solutions exist, technically, but how many companies do you know, especially smaller ones, that do this?

For Java almost every company bigger than 5 devs tends to set up their own Maven repo at some point, many do it on day 1 because at least 1 dev is familiar with this practice from their former BigCo and it's easy to set up Nexus/Artifactory.

mid-kid · 2 years ago
It works in the C world because everyone is super fixated on library compatibility, and the programs are expected to be rebuilt by the distribution when necessary. When an API is broken, this entails a new major version and installing both versions side by side until all maintained software is ported. This doesn't happen often enough to be an issue - that's the perk of having a really mature ecosystem.
jjgreen · 2 years ago
Coding mainly in C for 20 years, I've never had a program break on a minor-version conflict of a system shared library. Occasionally I need to recompile on an OS version update when the major version increments.
amelius · 2 years ago
> There's no technical solution to that, really.

There is. Theoretically speaking, you could confine libraries to their own sandbox.

But Rust can't do that.

wongarsu · 2 years ago
And this is easy to do with Rust as well. The article mentions both major options: store your dependencies in your repo using the built-in `cargo vendor` command, or run a mirror like panamax that just mirrors all of crates.io (or just the packages you need, but then you run into issues with updating if crates.io is ever down).

Mirroring all of crates.io takes about a terabyte of storage, not much bandwidth and is really easy to set up, but judging by the download numbers of obscure packages there are only about 20 places in the world doing that. Maybe people haven't bothered because crates.io is so stable, maybe everyone goes the vendoring route instead.
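
For anyone who wants to try the vendoring route, a minimal sketch (the config snippet is roughly what `cargo vendor` itself prints out):

  # copy every dependency from Cargo.lock into ./vendor
  cargo vendor

  # cargo vendor prints a snippet like this to put in .cargo/config.toml,
  # after which builds stop talking to crates.io entirely:
  #   [source.crates-io]
  #   replace-with = "vendored-sources"
  #   [source.vendored-sources]
  #   directory = "vendor"

  # build offline against the vendored copies
  cargo build --offline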

galangalalgol · 2 years ago
Why would you ask panamax to mirror everything though? You are maintaining a manifest of all your organization's dependencies, right? Why would I mirror ancient versions of clap and that crate that checks your AOL mail? Also, Nexus doesn't have a functioning cargo plugin; it is community made and stagnant. Artifactory does support it. And I'm guessing Artifactory is what most people are using for a private registry too, given the state of the others.
api · 2 years ago
These days some laptops have enough storage to comfortably mirror several language package repositories in full, especially if you use a compressed volume for them.
sgu999 · 2 years ago
> `cargo vendor`

I had no idea \o/ Thanks!

KronisLV · 2 years ago
> As someone who has worked in a big corporate their entire career this statement is bizarre. This has been solved for Java for over a decade.

> Builds don't access the internet directly. You host a proxy on-premise (Nexus/Artifactory/etc) and access that. If the internet location is down, you continue as normal.

Right now I'm using Sonatype Nexus (the free version) for most of my personal needs: https://www.sonatype.com/products/sonatype-nexus-repository

And I do mean most things: apt packages, Docker images, Helm charts, as well as packages for Go, Java, .NET, Node, Python and Ruby, maybe even more in the future. I'm very glad that it is possible to do something like that, but honestly it's hard to do and I can easily see why most wouldn't bother: even though the configuration UI is okay and you can create users with appropriate permissions, you'll still need to have plenty of storage, manage updates and backups, setup automated cleanup tasks (especially for Docker images), in addition to the whole process on the developer's side sometimes being a nuisance.

With npm you get --registry, which is one of the simpler approaches; RubyGems needed some CLI commands along the lines of "bundle config mirror" and other stuff; Maven needs an XML file with the configuration; and publishing your own packages to those needs even more configuration. Sadly, it seems like this fragmentation of approaches is unavoidable, and complexity increases with the number of languages you use.
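
To give a flavour of that fragmentation, roughly what the client-side configuration looks like for a couple of them (the URLs are placeholders for my own Nexus instance):

  # npm: point installs at the internal proxy repository
  npm install --registry https://nexus.example.com/repository/npm-proxy/

  # RubyGems/Bundler: mirror rubygems.org through the proxy
  bundle config mirror.https://rubygems.org https://nexus.example.com/repository/gems-proxy/

  # Maven has no flag for this; you add a <mirror> entry to ~/.m2/settings.xml instead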

On the bright side, I keep control over my data, as well as have really good download/upload speeds, even on a cheap VPS. Plus you learn some vaguely useful things the more you explore if you want to go further, like how I also start with Debian/Ubuntu base container images and provision whatever tools or runtimes I need for more specialized images (e.g. my own JDK, .NET, Node etc. images).

oblio · 2 years ago
I don't know if they ever implemented it; can it offload the storage to S3? The cost is peanuts and you stop bothering about the annoying recurrent storage issues.
tikkabhuna · 2 years ago
The UI does leave a lot to be desired, definitely! I wish they'd add specific UIs for the different formats.

I agree with the pain that some languages/package managers add as well. It's a shame package managers aren't necessarily designed to easily support writing your own mirror or switching repository.

Fair play for running your own Nexus for personal use!

hun3 · 2 years ago
> There are products (Nexus Firewall) that can check dependencies for vulnerabilities

Products can only do best-effort scanning for known patterns of vulnerabilities. Those are only added after someone (or something) discovers a vulnerability and verifies that it's not a false alarm. In that time gap, scanners are ineffective and anything in your infrastructure can run the malicious code and get hacked. (Google supply chain incidents on npm or PyPI.)

In general, there is always a way to bypass it, since the language is Turing-complete and unsandboxed. Any claim that goes beyond that is a lie, or simply impractical. Systems with the latest software and antivirus get hacked all the time. Nothing can really stop them. Why?

- https://web.archive.org/web/20160424133617/http://www.pcworl...

- https://security.stackexchange.com/questions/201992/has-it-b...

- https://en.wikipedia.org/wiki/Rice%27s_theorem

> and either block them from entering the network or fail CI pipelines.

Also, relying solely on an endpoint security product for protection is dangerous, since antiviruses themselves get hacked all the time. Sonatype, for example:

- https://nvd.nist.gov/vuln/search/results?form_type=Basic&res...

hoosieree · 2 years ago
My research is about detecting semantically similar executable code inside obfuscated and stripped programs. I don't know what commercial antivirus or vulnerability scanners use internally, but it's possible to generate similarity scores between an obfuscated/stripped unknown binary and a bunch of known binaries. I suspect commercial scanners use a lot of heuristics. I know IDA Pro has a plugin for "fingerprinting" but it's based on hashes of byte sequences and can be spoofed.

My approach is basically: train a model on a large obfuscated dataset seeded with "known" examples. While you can't say with certainty what an unknown sample contains, you can determine how similar it is to a known sample, so you can spend more of your time analyzing the really weird stuff.

The hardest part in my opinion is generating the training data. You need a good source code obfuscator for your language. I've seen a lot of papers that use obfuscator-llvm[1] to obfuscate the IR during compilation. I use Tigress[2] to obfuscate the source code because it provides more diversity, but it only supports C.

[1]: https://github.com/obfuscator-llvm/obfuscator/wiki/Installat...

[2]: https://tigress.wtf/

tgv · 2 years ago
> There are products (Nexus Firewall) that can check dependencies for vulnerabilities and either block them from entering the network or fail CI pipelines.

Once they're known. I suppose Nexus Firewall let log4j pass.

ashishbijlani · 2 years ago
I’ve been building Packj [1] to detect publicly UNKNOWN dummy, malicious, abandoned, typo-squatting, and other "risky" PyPI/NPM/Ruby/PHP/Maven/Rust packages. It carries out static/dynamic/metadata analysis and scans for 40+ attributes such as num funcs/files, spawning of shell, use of SSH keys, network communication, use of decode+eval, etc. to flag risky packages. Packj Github action [2] can alert if a risky dependency is pulled into your build.

1. https://github.com/ossillate-inc/packj

2. https://github.com/ossillate-inc/packj-github-action

livealight · 2 years ago
It let log4j pass for as long as it was known to be good. Within hours of the CVE opening, the tool was blocking it. The purpose of dependency firewalls is to avoid two things: known badly vulnerable packages AND known malicious packages that serve no other purpose than to steal data or drop a trojan. No security is 100% bulletproof, but it's really surprising how much of the damage is done by 7-year-old CVEs. Firewalls can be useful for exactly that.
oblio · 2 years ago
> Once they're known.

Isn't that how life works? Unknown unknowns? :-)

> I suppose nexus firewall let log4j pass.

It should be reasonable to assume that a product dedicated to something will, on average, do this job better/faster than a random developer, though. Since that's their job.

So Nexus Firewall presumably blocks these vulnerabilities faster than the average app/system developer notices them. There's value in reducing the delta between the vulnerability exploit day and an entire ecosystem being patched/fixed.

vilunov · 2 years ago
Debian devs also let log4j pass.
tikkabhuna · 2 years ago
Yes, but nothing can prevent you from consuming unknown vulnerabilities.

Once log4shell was discovered it was trivial to eliminate it from the organisation. We knew exactly which projects were using it and updated them accordingly. Access logs showed us that it was no longer used, and we deleted it.

richardwhiuk · 2 years ago
Every serious enterprise runs a Rust proxy - and indeed Artifactory / Azure Artifacts and others support this.
jrpelkonen · 2 years ago
Sure, Artifactory, etc. have their places. But I am objecting to your "no true scotsman"[1] type of argument. Furthermore, the most bizarre setups I have seen have been in "serious" enterprises. Hardly an endorsement.

1: https://en.wikipedia.org/wiki/No_true_Scotsman

brightball · 2 years ago
Gitlab has this built in as well.

EDIT: Has a proxy for packages.

https://docs.gitlab.com/ee/user/packages/package_registry/

basedrum · 2 years ago
That isn't a proxy, it's a registry for packages you build
basedrum · 2 years ago
Has which built in?
keep_reading · 2 years ago
I have CI pipelines fail when GitHub goes down, because crates.io becomes inaccessible.

Does anyone have a proxy for Rust? I'd love to run it. Please. Someone... help

edit: just learned Panamax exists for this... will try it

duped · 2 years ago
You are fundamentally limiting your ability to build Rust projects if you rely on the system's package manager.

A Rust program `p` can depend on `q` and `v` which depend on `u`, but `q` requires `u` @ 1.0.0 and `v` requires `u` @ 2.0.0. In the Debian world, you would have to change `q` and `v` to support a mutually compatible version of `u`. If those changes introduce bugs, they're Debian's bugs, but the author of `p` is the one who gets the reports.

Cargo naturally handles this case: it installs multiple incompatible versions at unique locations in its cache, and invokes the system linker so that the compiled objects are linked correctly. If `p` were a C program, we could not handle this case, because `u` would be a shared library and you cannot link two different versions of the same shared object into a single address space at runtime (if they were static libraries, a sufficiently smart build system could deal with it, just like cargo - but there is no unified build system and package manager for C, so good luck).
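
As a hedged illustration with made-up crate names, this is the situation and a quick way to see Cargo's handling of it:

  # Hypothetical Cargo.toml for `p`; q and v internally require
  # incompatible majors of u (u 1.x and u 2.x):
  #   [dependencies]
  #   q = "1"
  #   v = "1"

  # Cargo resolves and compiles both majors of u side by side;
  # this lists every crate that appears in more than one version:
  cargo tree --duplicates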

Further, Cargo creates lock files for your projects, and authors commit and push them for executables. If you build against a lockfile, this prevents most of the cases that the author is bemoaning. People don't run `cargo update` willy-nilly.

Frankly, the "problems" the author is claiming don't exist in practice and the "solution" makes life worse as a Rust developer.

I'm getting tired of hearing people praise broken package managers like Debian. They make life worse for everyone for no benefit. How many times do people need to read testimony that "C/C++ package management sucked and Rust is great just for Cargo alone!" before they get it?

TeeMassive · 2 years ago
Lock files are only one part of the solution. Even with a lock file, dependencies need to be fetched from somewhere, and if a particular version of a package was deleted by its author then you're back to case one.

"People don't run Cargo update willy nilly."

We don't know the same people then. Most devs like to keep their stuff up to date. Auditing the dependency tree before updating a lock file is like reading a ToS: most people don't, even if they should. The problem of securing the supply chain is not solved with a simple lock file.

estebank · 2 years ago
> Even with a lock file dependencies need to be fetched from somewhere and if a particular version of a package was deleted by its author then you're back to case one.

That's not quite correct. Even if a library version is yanked (which hides it on crates.io and makes cargo skip that version during resolution), if it appears in a lock file, cargo will dutifully find it and use it. This is explicitly to avoid the left-pad scenario where a disappearing library breaks your build.

duped · 2 years ago
They get fetched from your local cargo cache, unless you purged that too. You're also not back to case 1: you get a build error that tells you the package doesn't exist, and you'll need to audit a replacement. This is a lot better than trusting someone to drop a replacement in automatically. But if you want that, there's cargo update - and like you said, it has issues (because updating dependencies always has issues).

> The problem of securing the chain of production is not solved with a simple lock file.

I didn't say it was. I said the issues the author describes almost all are. Relying on distro maintainers doesn't solve them either.

nindalf · 2 years ago
The author makes one big assumption - that the packaged code in Debian is better because it has been reviewed and vetted by someone competent. If this was true, it would make this approach worthwhile, but I don't think it is. The author could have pointed to some data that demonstrates this, but didn't. I remain unconvinced.

As far as I can tell, this is more about the feeling of security because of familiarity. I can completely understand why they would feel more comfortable using something they've successfully used for many years, but that feeling doesn't actually translate to improvements in security. There is a feeble attempt to suggest that slowing down the rate of change improves security, but I would prefer to have the latest versions of the language toolchain and libraries available immediately, so I can make a decision on what to use myself.

If security actually needs to be addressed, it's going to be by paying maintainers and paying people to review dependencies. And that would likely be in the framework of PyPI/crates.io/npm, because none of them place constraints on where the code can be used, unlike using a single Linux distro's package manager.

Arnavion · 2 years ago
>The author makes one big assumption - that the packaged code in Debian is better because it has been reviewed and vetted by someone competent.

They do not make this assumption, and in fact state the opposite.

nindalf · 2 years ago
They state that

> They might decide to give the diff at least a cursory look, which is better than nothing.

So yes, they do value the cursory review, which isn’t that useful. And they talk up the security benefit of only pulling in a subset of published updates, which is even more dubious.

Contrast this with the approach that Google is taking here - https://opensource.googleblog.com/2023/05/open-sourcing-our-.... Actual, careful review of the code that anyone can take advantage of. And no limitation of having to choose one flavour of one OS. This review helps you audit your supply chain regardless of OS you’re developing and deploying on.

hobofan · 2 years ago
> There is no mediation of any kind between when a new library/version is published and when it is consumed.

> A simple time delay will allow egregious malware like malicious build.rs scripts to be caught, whether that’s the super-long Debian stable cycle or even the several days required to migrate from unstable to testing.

Normal cargo + crates.io usage has the same property via the existence of lockfiles. For most people who care about stability, this provides a natural delay (which you can also see in crates.io download graphs) before the latest version is consumed.

-----

The other points brought up in the article are definitely valid, but if that's the main strategy of defense you are looking for, you might already have it.

dathinab · 2 years ago
> if that's the main strategy of defense

Mostly, though there can be a small attack window when adding new dependencies to the project and at the specific point in time when you run `cargo update`.

Though you can also pin versions in `Cargo.toml` and then review any updates, except maybe for a few highly trusted sources (it's a bit annoying and costly time-wise, but viable).
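
A rough sketch of what that pinning looks like (crate and version picked arbitrarily):

  # exact-pin a direct dependency in Cargo.toml with an "=" requirement:
  #   [dependencies]
  #   serde = "=1.0.0"

  # preview what an update would change before it rewrites Cargo.lock
  cargo update --dry-run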

Though trying to vendor things, especially with Debian, seems like a horrible solution. And there is a lot of precedent for this causing tons of headaches, wrong bug reports and similar for developers (== time loss == time is money, so we could probably be speaking about multiple millions in monetary damages).

Though I have been thinking for a while that it could be interesting to have a partial crates.io "proxy" which only contains and updates crates that have gotten a "basic screening" of all their code and dependencies, probably with some tool assistance. This might not find bugs, but it should catch many forms of supply chain attacks. Though given that this is costly to provide, it probably would not be a free service.

hobofan · 2 years ago
> there can be a small attack gap when adding new dependencies to the project

Most package managers will keep the versions of transitive dependencies as unchanged as possible when adding a new direct dependency.

Of course if the only solution to satisfy the dependencies of the new direct dependency is to upgrade a transitive dependency, that will be done.

(I've seen a lot of people treat dependency additions as completely unpredictable operations that regenerate the whole lockfile in the past, which is why I wanted to clear this up.)

frankjr · 2 years ago
> Normal cargo + crates.io usage has the same property via the existence of lockfiles. For most people that care about stability, this will provide a natural delay (which you can also see in crates.io download graphs) to consuming the latest version.

Be aware however that lock files are ignored by default. You need to specify "--locked" if you want Cargo to use the "locked" versions from Cargo.lock.

> By default, the Cargo.lock file that is included with the package will be ignored. This means that Cargo will recompute which versions of dependencies to use, possibly using newer versions that have been released since the package was published.

https://doc.rust-lang.org/cargo/commands/cargo-install.html#...
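
Concretely (ripgrep here is just an example binary to install):

  # re-resolves dependencies, ignoring the Cargo.lock shipped with the package
  cargo install ripgrep

  # honours the packaged Cargo.lock instead
  cargo install --locked ripgrep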

MrJohz · 2 years ago
This is installing binaries from crates.io directly (i.e. running "cargo install ripgrep" instead of "sudo apt..." or whatever equivalent). Admittedly, this is often a suggested installation method for some tools built in Rust, but it's usually an option of last resort, and most people should be using their usual package manager for this.

Cargo does respect the package lock file when building a project from source (i.e. when you checkout a project, and run `cargo build` in that project directory). This happens, for example, when running builds in CI, or preparing a new release for each individual package manager.

hobofan · 2 years ago
This only applies to the `cargo install` command to install binaries from crates.io.

If you work in a repository on your own project and use basically any other cargo command (e.g. `cargo build`, `cargo run`), the lock file is of course taken into account (otherwise it would be near-useless).

Dowwie · 2 years ago
These concerns pertain to every language using a centralized package authority. None of them -- crates.io, npm, PyPI, hex -- are adequately funded by the multi-billion dollar industries they enable. So they aren't as resilient and secure as they could be. The solution isn't to create yet another package authority but to reinforce those that the ecosystems are already relying on.
metaltyphoon · 2 years ago
Npm is owned by Microsoft no?
pasloc · 2 years ago
In my opinion this just fragments the Rust ecosystem and creates more problems than it solves. What about just having mirrors of crates.io?

And I completely disagree that someone other than the authors should distribute or manage the package. They know best how and what needs to be done in certain situations (security fixes and so on). Of course it is good to have independent reviews, but these independent people should not be in charge of updating and maintaining these packages/libraries.

jkbbwr · 2 years ago
Having worked with C/C++ projects, I can say managing dependencies is downright painful. About the best option we found was to treat everything as build-from-source and then git submodule our dependencies in. This is still not good, but at least it gave us a consistent environment.
palata · 2 years ago
> Managing dependencies is downright painful.

The risk when it is too easy is that you suddenly pull half of the Internet into your project. In C++, the cost of having a dependency makes me choose carefully if I want a dependency or not, and if I do, if I want this particular dependency (Who develops it? Is it maintained? If it stops being maintained, can I easily move to an alternative, or could I take over maintenance? ...).

> About the best option we found was to treat everything as build from source and then git submodule our dependencies in.

If you mean having a git submodule and doing something like `add_subdirectory(dependencyA)` in CMake, please don't do that. Building from source is fine, but then you can totally install the dependency locally and have your build system (be it CMake, Meson, ...) fetch it. At least I started doing it the way it is described here [1] and it works well for me.

[1]: https://www.acarg.ch/posts/cmake-deps/
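
A rough sketch of that "install locally, then let the build system find it" idea, with made-up names and prefix:

  # build and install dependencyA into a local prefix, out of the source tree
  cmake -S dependencyA -B build-dependencyA -DCMAKE_INSTALL_PREFIX="$HOME/.local/deps"
  cmake --build build-dependencyA
  cmake --install build-dependencyA

  # configure your own project against that prefix; find_package() can then locate it
  cmake -S myproject -B build-myproject -DCMAKE_PREFIX_PATH="$HOME/.local/deps"
  cmake --build build-myproject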

vilunov · 2 years ago
Requirement to "install dependencies locally" is part of the pain with C++ dep management. Not having to do it makes builds much easier to define, as you don't need to supports lots of distros and hope that everyone of them is up to date.
kouteiheika · 2 years ago
> The risk when it is too easy is that you suddenly pull half of the Internet into your project. In C++, the cost of having a dependency makes me choose carefully if I want a dependency or not, and if I do, if I want this particular dependency (Who develops it? Is it maintained? If it stops being maintained, can I easily move to an alternative, or could I take over maintenance? ...).

There is no such risk, as you're not playing roulette. No one is pulling a gun to your head and forcing you to pull in hundreds of dependencies.

You can do exactly the same in Rust as you do in C++ and be conservative in what you use for dependencies. This is what I do, and it works great.

Now, that said, I agree that for the ecosystem as a whole this is a problem and people do tend to be too trigger happy when pulling in dependencies.