I tried PDM earlier this year and there are a few things worth noting:
- PEP 582, which it is based on, is still in draft, and some tools (VS Code) won't fully support it until it's accepted.
- If you want to develop or test using different Python versions, you still need to use a virtual environment. PDM does handle this for you though.
IMO, the Python packaging ecosystem has been a dumpster fire for a long time, but the flurry of recent PEPs that provide a standard way of defining a package with pyproject.toml have made it so much better. Now it's just a matter of the tools catching up.
Dumpster fire compared to what? Nuget? NPM? It may be that some more recent languages managed to start a saner solution and keep it sane.
But really, it's a hard problem, between cross-platform support, backwards compatibility, security concerns, hosting, most authors being volunteers and so on.
And still, even with "just" pip or even conda I am enjoying the Python experience more than some other packaging solutions I've seen.
It's objectively a dumpster fire. I don't care about other languages also being a dumpster fire on the packaging front, because I don't use other languages :) It also doesn't help me to know that NPM or .NET have rubbish packaging systems, that's their problem and not mine. I'd primarily want python's packaging system to be good.
I mean, just look at this thread. Someone asks "So in light of this, what should I use for python packaging?", and they get two dozen different answers loaded with weirdness like pyenv, python-venv, virtualenvwrapper, etc... If I wasn't using python, I would've thought this is some cruel python in-joke that outsiders don't get.
Just looking at those names, I'm already confused as to what the hell each of them does and why I need them.
But let's go back to pip and conda. Conda is unbearably slow. Pip is not entirely reliable at resolving versions properly. The interaction between conda and pip is also not entirely cooperative: if you use conda, you should not use pip (or only minimally), because it'll result in a mess.
Yes, packaging is hard, but it feels like python has managed to solve (or not solve) it in a uniquely obtuse and bad way so far.
Hopefully the slew of new PEPs will finally bring some clarity to this mess.
Whoa there. Setting aside the stagnation of Perl due to the v6 debacle (which, by the way, Python 3 came very close to succumbing to), CPAN is widely recognized as a very successful package system, and is frequently the envy of many other languages. DBI, the Net:: space, and many others just work, and the package names follow common sense.
I started a new Python project last month. I tried both Poetry and PDM but decided not to use either of them. PDM is currently basically a one-man show, and Poetry's docs aren't great: the doc pages look pretty, but they only describe command-line usage and don't explain how to configure metadata. Most importantly, Poetry does not support the standard PEP 621 yet.
So I stick with this setup:
- Use pyenv to manage different Python versions and virtual environments.
- Use the standard PEP621 specification as a high-level dependency description: https://www.python.org/dev/peps/pep-0621/#example
- Use pip freeze > requirements.txt as a "lockfile".
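That setup pairs pyenv with the standard pyproject.toml metadata. A minimal PEP 621 sketch of such a file might look like this (the project name and version pins here are made-up examples, not from the comment above):

```toml
[project]
name = "my-app"            # hypothetical project name
version = "0.1.0"
requires-python = ">=3.8"
dependencies = [
    "requests>=2.25",
    "click>=8.0",
]
```

Any PEP 621-aware build backend can consume this, while pip freeze still captures the exact resolved versions.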
It's pretty simple. Check in the lock file, and then run
$ poetry install
to replicate it.
> - Use pip freeze > requirements.txt as a "lockfile".
There are lots of reasons not to do this anymore, and Dependency Hell is real, and has been for 25 years with RedHat RPMs, etc.
Even if you don't want to rely upon poetry for building in prod, poetry can still export a requirements.txt file for you, so you're not locked into using poetry, but you still get to specify the high level packages you want, and let it solve the dep graph for you.
That probably works for smaller projects without many dependencies, but it’s just going to install the sub-dependency versions that satisfy whatever comes last in the requirements file. The pip docs describe that situation here: https://pip.pypa.io/en/latest/topics/dependency-resolution/
The pip docs also suggest using pip-tools to create lock files. Pip-tools is only for creating lock files (it’s not trying to fix virtualenvs like poetry is), and it works great.
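A toy model of that "whatever comes last" failure mode looks like this (hypothetical package names and pins; this is not pip's actual algorithm):

```python
# Toy model of naive requirements installation: later pins simply
# clobber earlier ones, which is the failure mode described above.
def naive_install(requirements):
    installed = {}
    for name, version in requirements:
        installed[name] = version  # last pin for a name wins
    return installed

reqs = [
    ("requests", "2.25.0"),
    ("urllib3", "1.26.0"),   # needed by something early in the file
    ("urllib3", "1.25.11"),  # a later line silently downgrades it
]
print(naive_install(reqs))  # {'requests': '2.25.0', 'urllib3': '1.25.11'}
```

A real resolver (or pip-tools at lock time) would instead intersect the constraints and either pick a version satisfying both or report a conflict.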
Yeah our codebase has a requirements file that takes over an hour to install with the new dependency resolver (and over 10 minutes using the deprecated resolver.) This is on a 6c/12t ryzen with 32g of ram and a gigabit connection.
I believe pipenv (not pyenv!) is also a viable option for correct versioning, though I'm not actually sure whether it or pip-tools is more actively developed these days. Last I used pipenv though (~2 years ago), it was a nicer virtualenv + pip-tools combination, but had worse version resolution / less useful verbose output for no apparent reason (since iirc it shares tons of code with pip-tools).
Putting that aside though: yes, 100% pip-tools or an equivalent (which pyenv is not). It's the only sane way to both freeze dependencies for reproducibility, and maintain upgradability. I've used pip-tools for years on many large, complex projects, and it has been a huge benefit every single time. And it routinely revealed significant library-incompatibility problems that teams had only luckily dodged due to not using feature X yet, because pip's resolver has been so braindead for forever.
> - Use pip freeze > requirements.txt as a "lockfile".
This is not and has never ever been correct. It makes it infinitely harder to install an application vs the standard `pip install -e .` which works on every package manager and avoids PYTHONPATH issues, as well as being able to publish your application to PyPI for easy installation (as simple as pip install --user app or pipx install app).
I 100% agree. I've made over 100 projects in the past 8 years of being deep into Python, ranging from enterprise software to little hobby projects, and I've settled on exactly this setup. It's great when you have multiple projects on the same machine with different dependencies and versions, maybe even specially forked and altered versions of libraries. It's super versatile and easy to share, and I've had no problems running my software at all. I've even written a script so that each time I commit to git it quickly generates a new requirements file, so it's always up to date. Thank you for sharing.
Genuine question: If I am starting a Python project NOW, which one do I use? I have been using pipenv for quite some time and it works great, but locking speed has been problematic, especially after your project grows large enough (minutes waiting for it to lock without any progress warning at all).
Should I just upgrade to Poetry or should I just dive headfirst into PDM? Keep myself at Pipenv? I'm at a loss. Thanks in advance!
The Python standard library is great and it's a nice language if you like the syntax, but aside from a few constants like Django, Flask, and Pandas, the ecosystem feels like it is slowly turning into a fragmented mess.
If you're building a package, pip install -e . is preferable to -r requirements.txt. Most projects don't need and shouldn't use requirements.txt. The only ones that do are where you're shipping a whole tested environment out, like a docker image for deployment. And in that case you need to be using something like pip-tools to keep that requirements file up to date.
Same, I use two small bash scripts in my project's bin/ directory: "venv-create" for creating the .venv/ and "venv-python" for running the effective version of Python from the .venv/. This sets environment variables such as PYTHONPATH and PYTHONDONTWRITEBYTECODE and provides a single approach for running project files and packages.
I get versioned requirements files for the project base requirements, and also for each (version, implementation) of Python in case there are changes, and this has proven to be reliable for me.
It's all about finding the minimal-but-complete convenience/ergonomics solution to the, err, inconvenience of packaging. I also marvel that when I attempt to explain these things to experienced programmers, I only manage to convince them 50% of the time at most.
That's what I use as well. It works great, it's built in and it's easy to use and understand. Only issue is when you upgrade the version of Python you're running. In that case you might need to rebuild your virtualenv, but that's super easy.
I use the same solution to have multiple versions of Ansible installed.
If you need to run multiple versions of Python, then virtualenvs might not be enough, but that's honestly not a problem I have. New version of Python? Great, I just rebuild my virtualenv and get back to work.
One of the most important rules I have regarding working in Python is: never, never ever install ANYTHING with the global pip. Everything goes into virtualenvs.
... bless my `zsh` shell history for these incantations. I don't think I have any hope of remembering it -- probably because of all the old virtualenv incantations!
Kind of agree with pipenv though. It's painfully slow, but it abstracts away having to worry about various requirements files (eg: dev vs prod) and the .lock keeps things consistent.
PDM author here. If anyone wants to know the advantage of __pypackages__ over virtualenv-based tools (venv, virtualenv, poetry, pipenv), here it is:
The fact that virtualenvs come with a cloned (or symlinked) interpreter makes them fragile when users want to upgrade the host interpreter in place, unless you keep all the old installations on your system, which is what pyenv does. You can imagine how many interpreters, including virtualenv-embedded ones, end up on your machine.
You can regard __pypackages__ as a virtualenv WITHOUT the interpreter; it can easily work with any Python interpreter you choose, as long as it has the same major.minor version as the packages folder.
If you're building an application, use Poetry. If you're building a library, use Flit. Use PEP621 metadata in pyproject.toml regardless.
Poetry is much more focused on managing dependencies for applications than dealing with libraries that have to be used by other libraries or applications. See this deep discussion for some timely/relevant examples: https://iscinumpy.dev/post/bound-version-constraints/
Sold everyone on using Poetry a few months ago and am now red-faced, as we have a litany of problems and wasted time due to it. We are now sitting on some bleeding-edge branch because specific dependencies cannot work at all without some newfangled feature, and everyone wishes we were just using plain virtualenv, since we had far fewer problems with that.
Great. I have heard "anecdata evidence" that sometimes poetry fails to install a combination of packages or something along those lines, did you find any of those shenanigans in your own experience?
Alternatively, pyenv and pyenv-virtualenv for shell integration and seamless virtualenv activation.
To be fair, I'm not saying there's anything wrong with virtualenvwrapper, just that I've never used it and for my purposes the above solution works well.
This doesn’t solve dependency management?
All it does is separate your env so you can install what you need there.
But installing with pip is still subject to version incompatibility etc.
I’d just use pip and requirements files if you can. It’s doubtful that your requirements are sufficiently complex as to require a more complex resolver, although that depends on your ML needs.
Having used PDM now for several projects, it's my preferred package manager over poetry and others. Its dependency resolver is both faster and more forgiving than poetry's. I also like the built-in task management system similar to npm's.
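If I recall the shape correctly, that task system lives under a [tool.pdm.scripts] table in pyproject.toml, roughly like this (the task names and commands are examples, not from PDM's docs):

```toml
[tool.pdm.scripts]
test = "pytest tests/"
lint = "flake8 src/"
serve = "python -m my_app"   # hypothetical entry module
```

You then invoke tasks with `pdm run test`, much like `npm run test` in the JS world.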
Quoting the GitHub page[0]:

> PDM is meant to be a next generation Python package management tool. It was originally built for personal use. If you feel you are going well with Pipenv or Poetry and don't want to introduce another package manager, just stick to it. But if you are missing something that is not present in those tools, you can probably find some goodness in pdm.
Having used PDM a bit, its ambition in my opinion may not be to replace existing tools, but rather to experiment and implement the most recent PEPs related to packaging.
While you can argue about PEP 582[1] implementation (which is still in draft), PDM doesn't prevent anyone from using virtual environments, and even provides a plugin[2] to support that.
PDM also implements PEP 631[3], which most other package managers have been reluctant to support or slow to adopt.
Thanks for the kind words on PDM. At the time I created PDM I didn't want it to be similar to any other package manager, so I chose PEP 582, and I thought I could experiment with more new things on top of it.
But as PDM matures and is acknowledged by the Python packaging people, I am also working hard to make PDM fit more people's workflows. Fortunately, it has a strong plugin system: you can add virtualenv support (pdm-venv), a publish command (pdm-publish), and more. In the future, I would like to see it eventually push the iteration of PEP 582 forward and get it finalized.
Just made an account to say this. I am really impressed by your projects. I first found out about pdm after writing a small plugin for marko (which is amazing by the way) and checking out your github profile. I find what you write to be really well thought out and approachable.
The big distinguisher of PDM is that it supports PEP 582[0]. That means it works less like pip and more like npm in the JS world. To quote PEP 582:
> This PEP proposes to add to Python a mechanism to automatically recognize a __pypackages__ directory and prefer importing packages installed in this location over user or global site-packages. This will avoid the steps to create, activate or deactivate "virtual environments". Python will use the __pypackages__ from the base directory of the script when present.
Thus, the idea of PDM is that it will create a directory, called `__pypackages__` in the root of your project and in that folder it'll populate all the dependencies for that project. Then, when you run scripts in the root folder of your project, your Python install will see that there's a `__pypackages__` folder and use that folder to look up dependencies.
This style of "dependencies inside the project directory" is similar to how npm of the Javascript ecosystem works, where it creates a `node_modules/` folder in the root of your project and fills that folder with the dependencies for your project. This style of dependency management is different from other package managers such as Poetry (Python), Pip (Python), go (Golang), and cargo (Rust), all of which instead have a sort of "secret folder acting as cache of dependencies at particular versions", a folder that's usually pretty hidden out of the way, in which the package manager automatically manages the acquisition, storage, and versioning/version resolution (Poetry, Go, Cargo, all do this but Pip does not).
That's a very fast and probably wrong rundown on what makes this package manager different from others.
I’ve long been of the opinion that pip and venv (and sometimes pyenv) is good enough. PEP 582 is a rare instance where a new packaging proposal makes sense right away when I read it and could beat pip and venv in simplicity.
It seems functionally similar to venv but has the benefit of standardizing the location of dependencies to __pypackages__/3.x/*. With venv the developer selects some arbitrarily named directory that is sometimes but not always .venv/*.
Basically PDM supports project-specific Python package installs. This is different from how Python has traditionally worked, where packages are installed globally for the user running it. Why is this important? Because with virtual environments it's easy to forget to activate one, run a pip install or upgrade, and clobber your computer or server's Python environment. It also avoids the confusing issue where someone updates their PATH variable while in a venv, but then it's no longer there after exiting the virtual environment.
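One common guard against the "forgot to activate" mistake described above is to check sys.prefix before installing anything; this is a widely used idiom, not specific to any tool:

```python
import sys

def in_virtualenv() -> bool:
    """True when running inside a venv/virtualenv: activation redirects
    sys.prefix, while base_prefix (or real_prefix on old virtualenv)
    keeps pointing at the original interpreter installation."""
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base
```

A deploy or setup script can call this and bail out before it ever runs a pip install into the global site-packages.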
PDM may also be a good fit for Blender, because of the per-project approach. Blender doesn't come with a package manager and has a varied relationship to the system-installed Python interpreters depending on platform and install choices.
Scripts, Plugins etc for Blender are currently distributed in a very ad-hoc way, and it is hard to get adoption with plugins that require more elaborate dependencies, especially binary modules.
If you want to be relaxed about dependencies, you can use "pip-chill".
python3 -m venv venv && source venv/bin/activate && pip install -r requirements.txt
I mostly code in JavaScript and (obviously) use NPM a lot, and it makes me wonder.
Maybe 3.11 can make python packaging less of a beautiful disaster.
This will mark me out as a Luddite but I am still quite happy with a 5 line setup.py and “pip install .”
https://news.ycombinator.com/item?id=29446715
[0]: https://github.com/pdm-project/pdm
[1]: https://www.python.org/dev/peps/pep-0582/
[2]: https://github.com/pdm-project/pdm-venv
[3]: https://www.python.org/dev/peps/pep-0631/
EDIT: Oh, I should say: If it's meant to take over the world, say so, as well!
[0] - https://www.python.org/dev/peps/pep-0582/
I have not come across PEP 582, thank you for linking.
Also this will just pollute your source directories with generated directories and files that shouldn't be there.
I'll still use Poetry, but this could be paving the way for Poetry to work without virtualenvs one day as well.