The metadata problem is related to the fact that pip had an unsound resolution algorithm that amounted to "resolve something optimistically, hope it works, and backtrack when you get stuck."
I did a lot of research along the lines that led to uv 5 years ago and came to the conclusion that, installing from wheels, you can set the resolution up as an SMT problem the same way Maven does and solve it right the first time. There was a PEP to publish metadata files for wheels on PyPI, but I'd already built something that could pull the metadata out of a wheel with just three HTTP range requests. I believed that any given project might depend on a legacy egg, and in those cases you can build that egg into a wheel via a special process and store it in a private repo (a must for the perfect Python build system).
Back in the days of eggs you couldn't count on having the metadata until you ran setup.py, which forced pip to be unreliable because so much stuff got installed and uninstalled in the process of a build.
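To make the range-request trick mentioned above concrete, here is a rough sketch of how a client can read a wheel's METADATA without downloading the whole file, by handing zipfile a seekable view backed by HTTP Range requests. HttpRangeFile and remote_wheel_metadata are made-up names, the code assumes the server reports Content-Length and honors Range headers (PyPI's file host does), and a real implementation batches and caches reads so the whole operation takes only a few requests; this is an illustration, not how uv or pip actually implement it.

    import io
    import urllib.request
    import zipfile

    class HttpRangeFile:
        """Minimal read-only, seekable 'file' backed by HTTP Range requests (sketch)."""

        def __init__(self, url):
            self.url = url
            self.pos = 0
            # One request up front to learn the total size of the wheel.
            head = urllib.request.Request(url, method="HEAD")
            with urllib.request.urlopen(head) as resp:
                self.size = int(resp.headers["Content-Length"])

        def seekable(self):
            return True

        def seek(self, offset, whence=io.SEEK_SET):
            if whence == io.SEEK_SET:
                self.pos = offset
            elif whence == io.SEEK_CUR:
                self.pos += offset
            else:  # io.SEEK_END
                self.pos = self.size + offset
            return self.pos

        def tell(self):
            return self.pos

        def read(self, n=-1):
            if n is None or n < 0:
                n = self.size - self.pos
            if n <= 0 or self.pos >= self.size:
                return b""
            end = min(self.pos + n, self.size) - 1
            req = urllib.request.Request(
                self.url, headers={"Range": f"bytes={self.pos}-{end}"}
            )
            with urllib.request.urlopen(req) as resp:
                data = resp.read()
            self.pos += len(data)
            return data

    def remote_wheel_metadata(wheel_url):
        # zipfile only needs the end-of-central-directory record, the central
        # directory, and the one member we ask for -- a handful of small reads.
        with zipfile.ZipFile(HttpRangeFile(wheel_url)) as zf:
            name = next(n for n in zf.namelist() if n.endswith(".dist-info/METADATA"))
            return zf.read(name).decode("utf-8")

Calling remote_wheel_metadata() on a wheel URL returns the METADATA text (including the Requires-Dist lines a resolver cares about) for a few kilobytes of transfer instead of the whole wheel.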
There is a need for a complete answer for dev and private builds, I'll grant that. Private repos like the ones we're used to in the Maven world would help.
Maybe the solution will be for tools like uv or poetry to warn if dynamic metadata is used and strongly discourage it. Then over time the users of packages that use dynamic metadata will start to urge the package authors to stop using it.
I wouldn't bet on this one. I know a lot of Python package maintainers who would sooner kill their project than adapt to a standard they don't like. For example, see flake8's stance on even supporting pyproject.toml files, which have been the standard for years: https://github.com/PyCQA/flake8/issues/234#issuecomment-8128...
I know because I'm the one who added pyproject.toml support to mypy, 3.5 years ago. Python package developers can rival Linux kernel maintainers in their resistance to change.
> The challenge with dynamic metadata in Python is vast, but unless you are writing a resolver or packaging tool, you're not going to experience the pain as much.
But that is by choice. I, as a user, am forced to debug this pile of garbage whenever things go wrong, so in a way it's even worse for users. It's a running joke in the machine learning community that the hard part of machine learning is dealing with Python packages.
A lot of the problem seems to be driven by a desire to have editable installs. I personally have never understood why having editable installs is such an important need. When I'm working on a Python package and need to test something, I just run
python -m pip install --user <package_name>
and I now have a local installation that I can use for testing.
That would require you to re-install the local package you're developing against after every code change. Very few people will want to do that, and it's potentially very slow.
It’s also a step not needed by most other ecosystems.
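For context, the step that editable installs remove looks like this: you install the working copy once in editable mode and Python then imports it straight from the source tree, so code edits are picked up on the next import without re-running the install (changes to metadata such as dependencies or entry points still need a reinstall). With standard pip that is

    python -m pip install -e .

run from the project directory.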
Go (a.k.a. Golang), with its network-first import system (i.e. import "example.org/foo/bar"), has solved the problem in a surprisingly simple way. You just add a "replace" directive in a go.mod file and you can point your import (and all child imports) to any directory on the filesystem.
Potentially, perhaps. But it's certainly not slow for the cases where I use it: a pure Python package whose dependencies are already installed and are not changing (only the package itself is). Under those conditions, the command line I gave takes a couple of seconds to run.
https://about.scarf.sh/post/python-wheels-vs-eggs
> I'd already built something that could pull the metadata out of a wheel with just three HTTP range requests.
Range requests are used by both uv and pip if the index supports it, but they have to make educated guesses about how reliable that metadata is. The main problems are local packages during development and source distributions.
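A sketch of the index-side path this refers to: with the JSON Simple API from PEP 691 plus the sidecar metadata files from PEP 658/714, a client can ask the index whether a wheel's METADATA is published separately and fetch just that file. The function name and the fallback behavior here are made up for illustration, and a real resolver would also verify the metadata hashes the index reports.

    import json
    import urllib.request

    # Hypothetical helper: ask the index (PEP 691 JSON Simple API) whether the
    # separately published METADATA from PEP 658/714 exists for a given wheel,
    # and fetch just that file instead of touching the wheel at all.
    def wheel_metadata_via_index(project, wheel_filename):
        url = f"https://pypi.org/simple/{project}/"
        req = urllib.request.Request(
            url, headers={"Accept": "application/vnd.pypi.simple.v1+json"}
        )
        with urllib.request.urlopen(req) as resp:
            index = json.load(resp)
        for f in index["files"]:
            if f["filename"] != wheel_filename:
                continue
            # "core-metadata" (PEP 714) or the older "dist-info-metadata" key
            # means the index serves the wheel's METADATA next to the wheel.
            if f.get("core-metadata") or f.get("dist-info-metadata"):
                with urllib.request.urlopen(f["url"] + ".metadata") as meta:
                    return meta.read().decode("utf-8")
        return None  # not advertised: fall back to range requests or a full download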
> It's also a step not needed by most other ecosystems.
From what I can gather, most other ecosystems don't even have the problem under discussion.
I get that Python is, strictly speaking, an older language. But it isn't like these are at all new considerations.