Ask HN: Who's building on Python NoGIL?

PyO3 0.23.0 was a big release I’ve been tinkering with extensively. Support for “free-threaded Python” is a headline feature, and I imagine NoGIL Python will be extremely nice for Rust interoperability, so there is definitely interest in that crate. Also could be huge for queueing data for GPUs, api servers, and bulk data fetching.

For whatever reason (maybe post 2to3 PTSD), Python community seems not extremely eager to jump on latest versions of Python and it often takes a long time for popular libraries to support the latest and greatest, so I’d recommend patience and baby steps

https://github.com/PyO3/pyo3/releases/tag/v0.23.0

hamandcheese · 8 months ago

> For whatever reason (maybe post 2to3 PTSD), Python community seems not extremely eager to jump on latest versions of Python

Well, you'd think after the 2 to 3 debacle, python might take backwards compatibility more seriously, but they don't.

Follow semver, and stop breaking things on 3.x. If it's deprecated in 3.x, don't remove it until 4.

sgarland · 8 months ago

Python doesn’t follow semver [0], it follows a general major.minor.bugfix, but with an extremely liberal definition of minor (“less earth-shattering,” as they describe it).

PEP 387 [1] requires that introduced incompatibilities have a “large benefit to breakage ratio,” and that any deprecations last a minimum of two years.

FWIW, Kubernetes has a similar approach. Breaking changes occur all the time with “minor” version updates.

[0]: https://docs.python.org/3/faq/general.html

[1]: https://peps.python.org/pep-0387/

skeledrew · 8 months ago

I don't think there's even a plan for a v4. The fallout from 2 to 3 was that bad. So to keep improvements going takes deprecating something several versions before removing, and research is done to find how popular that particular thing is to determine its candidacy for removal. Thus it's best practice to pin all dependencies, and read the release notes before doing a version update.

ensignavenger · 8 months ago

Why should Python follow semver? There are plenty of successful projects that don't use semver. If you feel so strongly about them making the change, then make a case for it.

throwaway127482 · 8 months ago

What have they broken on 3.x? Genuine question, as I haven't followed Python's development super closely.

Uptrenda · 8 months ago

My thoughts exactly. Python was supposed to be this ultra-portable thing. But... I am finding myself having to write patches to get my software to work on different Python versions.

People who have Python 3 installed can be on many different versions. The thing is, depending on the version, quite often bug fixes included in later versions aren't in older versions. So if you want to make your code work -- got to get the patches in manually, monkey patch broken code, and do it that way. Then there's the seemingly random deprecation of standard library modules / other breaking changes.

I take python version support seriously because if people install your packages you'll be outsourcing all of the above crap to the user. They might not even know how to 'upgrade' python. Or end up on the wrong version. If your package doesn't work when they install it they'll just move on to something else. Python is a total shit show for packaging.
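For illustration, the kind of per-version patching described above often comes down to small shims like the one below. This is a minimal sketch, not anyone's actual package code: `fib` is a hypothetical function used only to exercise the shim. The one fact it relies on is that `functools.cache` was added in Python 3.9, so older interpreters need the `lru_cache` fallback.

```python
import functools
import sys

# functools.cache only exists on Python 3.9+; fall back to the
# equivalent unbounded lru_cache on older interpreters.
if sys.version_info >= (3, 9):
    cache = functools.cache
else:
    cache = functools.lru_cache(maxsize=None)


@cache
def fib(n: int) -> int:
    """Memoized Fibonacci (hypothetical example, only here to use the shim)."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

Keeping the version check in one place means the rest of the package can use `cache` without caring which interpreter it runs on.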

brianwawok · 8 months ago

Meh. I’d rather we move faster and make the world better even if we have to update code a bit more. Never that hard.

CamouflagedKiwi · 8 months ago

> For whatever reason (maybe post 2to3 PTSD), Python community seems not extremely eager to jump on latest versions of Python

I don't think it's the 2 -> 3 thing any more, that was a while ago. Honestly there are just a lot of things that don't work well in 3.x.0 Python releases. For example, 3.12.0 had the per-interpreter GIL thing; I tried that in 3.12.0 and ran into a completely breaking issue almost immediately. They were responsive & helpful and did fix the first issue in 3.12.1, but we still had more issues with parts of the C API which seemed to work in 3.11 and better again in 3.13, but it really felt like that needed another release to solidify. (Also you can't import datetime in that setup in 3.12, which is also a pretty big deal-breaker).

I can only imagine the free-threading thing will need at least the same kind of time to work the kinks out, although it is nice to see them moving in that direction.

scott_w · 8 months ago

> For whatever reason (maybe post 2to3 PTSD), Python community seems not extremely eager to jump on latest versions of Python

I suspect this is a mix of Python 3 being “good enough” for most cases and companies not updating their stacks that often. I think most of us came into Python professionally around 2.7 so the need to keep updating our version hasn’t been heavily ingrained into our thinking.

weinzierl · 8 months ago

Oh, what a surprise, I thought PyO3 was dead. Glad to see it's not!

I've always used threads despite the GIL. I haven't tried NoGIL and am waiting to find out how many bugs it surfaces. I do get the impression that multi-threaded Python code is full of hazards that the GIL covers up. There will have to be locks inserted all over the place. CPython should have simply been retired as part of the 2 to 3 transition. It was great in its day, on 1-core machines with the constraints of that era. I have a feeling of tragedy that this didn't happen and now it can never be repaired. I probably wouldn't use Python for web projects these days. I haven't done anything in Elixir yet but it looks like about the best option. (I've used Erlang so I think I have a decent idea of what I'd be getting into with Elixir).

tgma · 9 months ago

In a strange way, Python being so bad at interpreting bytecodes and limited by the GIL, plus being good at interfacing with C cheaply (unlike Go and Java), induced a programming style that is extremely suited for data-parallel computing, which is the way to efficiently scale compute in today's SIMD/GPU world. If you wanted to be efficient, you had to prepare your data ahead of time and hand it off. Any intermediate interaction with that data would ruin your performance. That's mostly how efficient Python libraries and ecosystem are built.

Weakness may have turned into a strength.

amelius · 8 months ago

Why would NoGIL change that, though? It's not like large data-parallel operations can suddenly be done efficiently in Python if you remove the GIL. The problem with GIL afaik is mostly latency problems in interactive applications.

crabbone · 8 months ago

Not really... What I see in practice is that Python's shortfalls are being covered by throwing more hardware at it (besides the more efficient, but also more complex, option: rewriting in C).

There are all kinds of micro-optimizations, as in: one has to know which Pandas operations are going to be more expensive than others, and organize the code accordingly, but these things often teach programmers the wrong ideas. It's not uncommon in Python world that a superior solution (from algorithmic perspective, i.e. the one that should use less time or space) is in practice inferior to a solution that's implemented in C. And so, writing more efficient Python code comes down to knowing which functions are faster or cheaper in some other way, but it doesn't generalize and doesn't transfer to other languages.

What usually happens in a situation like this is that the developers of the language (or a product, a framework etc. that suffers a similar fate) start optimizing the bad solutions (because they are the go-to tool for their users) instead of actually improving the language (the product, the framework etc.). To give some examples of this happening in Python: there's a lot of work dedicated to the performance of lists and dicts. But, if anyone really wanted performance, they'd have to look for more specialized collections, rather than optimizing very generic ones.
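One small standard-library illustration of the "specialized collection" point, offered as a sketch rather than a benchmark: using a plain list as a FIFO queue makes every `pop(0)` shift the remaining elements, while `collections.deque` is built for popping from the left. The function names `drain_list` and `drain_deque` are made up for this example.

```python
from collections import deque


def drain_list(items):
    """Drain a FIFO backed by a plain list: pop(0) is O(n) per call."""
    queue = list(items)
    out = []
    while queue:
        out.append(queue.pop(0))  # shifts every remaining element left
    return out


def drain_deque(items):
    """Drain a FIFO backed by a deque: popleft() is O(1) per call."""
    queue = deque(items)
    out = []
    while queue:
        out.append(queue.popleft())
    return out
```

Both functions return the same result; the difference only shows up in how the cost grows with queue length.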

grandimam · 9 months ago

Can you elaborate more on the data-parallel computing part?

lucaswiman · 8 months ago

> I do get the impression that multi-threaded Python code is full of hazards that the GIL covers up.

It was a design goal of free-threading that any race condition which can occur in pure Python code without the GIL must have already been possible on the same code with the GIL. To achieve that, they added various locks which fix CPython primitive operations that were only threadsafe because of the GIL. That's why single-threaded execution is slower without the GIL.

However, it's possible there were races which were very unlikely with the GIL that are much more likely now, e.g. one requiring multiple context switches in evaluating a single statement like `a.b = c[a.b].d()`. With the GIL, that might require multiple operations in that assignment taking >5ms. If all those operations are fast, the race might almost never come up, but without the GIL, a different thread could mutate a.b as often as it wants.

I haven't heard of this coming up in practice, but it's certainly something that could in principle happen. The amount of FUD about it in every discussion about python finally making progress on threading seems excessive though.
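A minimal sketch of the kind of race being discussed, using a read-modify-write on a shared attribute. `SharedCounter` and `hammer` are hypothetical names for this example. The unlocked version is racy in principle with or without the GIL; the locked version gives the same answer on both builds.

```python
import threading


class SharedCounter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def unsafe_increment(self):
        # "self.value += 1" is a load, an add, and a store; a thread
        # switch can land between them, so updates can be lost.
        self.value += 1

    def safe_increment(self):
        # Holding the lock makes the whole read-modify-write exclusive.
        with self._lock:
            self.value += 1


def hammer(increment, threads=8, iterations=10_000):
    """Call increment() from several threads at once and wait for them."""
    def worker():
        for _ in range(iterations):
            increment()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
```

With `safe_increment` the final count is always `threads * iterations`; with `unsafe_increment` it may come out lower, and how often that happens depends on the interpreter build and timing.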

fulafel · 8 months ago

So what would the alternative history have been if CPython was retired after Python 3 came out in 2008, what would we be using now? IronPython or GraalPy?

kevin_thibedeau · 8 months ago

It would have suffered the same fate as Perl 6 and we'd all be on 2.1x.

throwaway81523 · 8 months ago

PyPy was a thing and it could have been the basis of Python 3. In fact people DID keep using Python 2 long after Python 3 came out, even though Python 3 used CPython. So they may have gone for much bigger wins. Perl 6 was another matter since it changed the Perl language drastically.

upghost · 8 months ago

This is going to be bananas for libpython-clj[1]. One of the biggest limiting factors right now is that you can't mix Java/Clojure concurrency with Python concurrency, you need to have a really clear separation of concurrency models. But with this, you will be able to freely mix Clojure and Python concurrency. Just from a compositional standpoint, Clojure atoms and core.async with Python functions will be fantastic. More practically, this will unlock a lot of performance gains with PyTorch and Tensorflow which historically we've had to lock to single threaded mode. Yay!

[1]: https://github.com/clj-python/libpython-clj

jwindle47 · 8 months ago

I’m here for it :) love the Clojure approach to symbiosis. Parens consume all the things!

bobxmax · 8 months ago

Neat, guess we'll finally see if cross-language concurrency stops being such a pain.

bionhoward · 8 months ago

carlsborg · 8 months ago

It's merged into CPython 3.13 but labeled as experimental.

Single-threaded CPU-bound workloads suffer in benchmarks (vs. I/O workloads) until they put back the specializing adaptive interpreter (PEP 659) in 3.14. Docs say a 40% hit now; the target is 10% at the next release.

C extensions will have to be re-built and ported to support free threaded mode.

Some interesting and impactful bits of open source work for those with a C++ multithreading background.
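For anyone wanting to check what they are actually running, here is a small sketch. It assumes CPython 3.13's `sys._is_gil_enabled()` and the `Py_GIL_DISABLED` build variable; `free_threading_status` is a made-up helper name. On a free-threaded build, importing a C extension that has not declared free-threading support can switch the GIL back on, which is why the runtime check is separate from the build check.

```python
import sys
import sysconfig


def free_threading_status():
    """Return (built_free_threaded, gil_currently_enabled)."""
    # Py_GIL_DISABLED is 1 on free-threaded builds, 0 or None otherwise.
    built = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() was added in 3.13; on older versions the
    # GIL is always on, so default to True when the probe is missing.
    probe = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = bool(probe()) if probe is not None else True
    return built, gil_enabled


if __name__ == "__main__":
    built, gil = free_threading_status()
    print(f"free-threaded build: {built}, GIL enabled right now: {gil}")
```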

santiagobasulto · 8 months ago

May I ask which benchmarks you saw? I was looking for reliable ones and couldn’t find any.

The official docs reference those numbers.

trollbridge · 8 months ago

In new code I try to use threads, but certain things like yield which rely on async are simply too common and useful to stop using.

So far in production if I need to use multiple cores, I use multiple processes and design apps that way. The discipline this imposes does seem to result in better apps than I wrote in an environment like Java with tons of threads.

bhouston · 8 months ago

> In new code I try to use threads, but certain things like yield which rely on async are simply too common and useful to stop using.

Huh? In Python you have to choose either threads or async/await? Why not combine both of them? I am so confused. C# allows for both to be combined quite naturally. And JavaScript as well allows for workers with async/await.

elcomet · 8 months ago

What do you mean? Async/await uses threads

throwaway81523 · 9 months ago

shlomo_z · 9 months ago

I have the same question! I love Python and asynchronous stuff, and I do not know too much about threading.

Is threading potentially better for IO bound tasks than async?

MathMonkeyMan · 9 months ago

Potentially, but probably not. The benefit of a parallel-enabled interpreter would be two CPU cores executing bytecode instructions at the same time in the same interpreter. So, you could have one python thread working on one set of data, and another python thread working on another set of data, and the two threads would not interfere with each other much or at all. Today, with the global interpreter lock, only one of those threads can be executing bytecode at a time.
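A rough sketch of that "one thread per set of data" pattern, using only the standard library. The function names are invented for the example. On a GIL build the threads take turns executing bytecode, so this gives no CPU speedup; on a free-threaded build the same code can use several cores. The result is identical either way.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(chunk):
    """Pure-Python, CPU-bound work on one independent chunk of data."""
    total = 0
    for x in chunk:
        total += x * x
    return total


def parallel_sum_of_squares(data, workers=4):
    # Split into independent chunks so no thread touches another's data.
    size = max(1, (len(data) + workers - 1) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, chunks))
```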

shlomo_z · 8 months ago

> Today, with the global interpreter lock, only one of those threads can be executing bytecode at a time.

Yes, but Python now has a version without the GIL, which prompted this post in the first place. So my question is: now, if I use a version of Python 3.13 without the GIL, can a threaded Flask app do better than an AIOHTTP server?

KaiserPro · 8 months ago

> Is threading potentially better for IO bound tasks than async?

It's a pick-your-poison type of deal.

I _personally_ hate async. It's a surprise goto, which I don't like. However, in my testing asyncio performs the same as threading (this was with socket-based IO).

If you lose the GIL, then threading _should_ be much faster, as it should scale to how many cores you have. But that's only if you have enough IO to saturate a core.

Quite honestly I'd tell you not to mix threads with asyncio. As you note: IO bound tasks aren't CPU hogs and there's little benefit to mixing it with threads. It will lead to unnecessary bugs, complexity, and problems with event loop management.

Asyncio can run tens of thousands of tasks if it's used properly. If you think something will block it you should check out "process pool executors." Note that it's very tricky to share resources like sockets between processes, so it's kind of another reason to avoid stuff like this.

I think Python 3.14 will have interpreter pools for even more concurrency options.
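For reference, the "process pool executor" route mentioned above looks roughly like this. It is a sketch under assumptions: `cpu_heavy`, `handle`, `main`, and `run` are hypothetical names, and the work function is a stand-in. The real pieces are `loop.run_in_executor` and the executors from `concurrent.futures`.

```python
import asyncio
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor


def cpu_heavy(n: int) -> int:
    """Stand-in for blocking, CPU-bound work (hypothetical example)."""
    return sum(i * i for i in range(n))


async def handle(executor: Executor, n: int) -> int:
    loop = asyncio.get_running_loop()
    # The event loop keeps serving other tasks while the executor works.
    return await loop.run_in_executor(executor, cpu_heavy, n)


async def main(executor: Executor) -> list:
    return await asyncio.gather(*(handle(executor, n) for n in (10, 100, 1000)))


def run(use_processes: bool = False) -> list:
    # Threads by default; pass use_processes=True for the process-pool
    # variant (call it under an `if __name__ == "__main__":` guard).
    pool_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with pool_cls(max_workers=2) as executor:
        return asyncio.run(main(executor))
```

With a process pool the arguments and results are pickled and sent between processes, which is the sharing friction described above. With a thread pool nothing is copied, but on a GIL build the CPU-bound work still competes with the event loop for the interpreter.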

solidasparagus · 8 months ago

> there's little benefit to mixing it with threads

> If you think something will block it you should check out "process pool executors." Note that it's very tricky to share resources like sockets between processes, so it's kind of another reason to avoid stuff like this.

Isn't that the benefit of no-gil? The ability to run CPU-intensive operations without incurring the overhead and friction of multiprocessing? Now you can do multicore processing while also having shared memory

PaulHoule · 8 months ago

My RSS reader is written in async Python, but I think that was a mistake, so I built my image sorter to use gunicorn, which means I have to run it inside WSL on Windows, but it actually works really well. My "image sorter" is actually a lot of different things (it has a webcrawler in it, a tagging system, will probably take over the RSS reader's job someday) but it does an unholy mix of (i) "things that require significant CPU" (like... math) and (ii) "things that require just a touch of CPU" like serving images.

I found that (i) was blocking (ii) making the image sorter unusable.

So far though that is processes and not threads.

For the last few weeks for the hell of it I've been writing a small and very pedagogical chess playing program in Python (trying to outdo Lisp) and once I got the signs figured out in the alpha-beta negamax algorithm it can now beat my tester most of the time. (When I had the signs wrong it managed to find the fool's mate which is not too surprising in retrospect since it looks ahead enough plies)

That was my major goal but I'd also like to try an MCTS chess program which is more of a leap into the unknown. Unlike alpha-beta, MCTS can be almost trivially parallelized (run 16 threads of it for, say, 0.1 s, merge the trees, repeat, ...) and threads would be a convenient way to handle concurrency here, although multiprocessing ought to be good enough. So I am thinking about using a non-GIL Python, but on the other hand I could also rewrite in Java and get a 50x or so speedup and great thread support.

(Note the problem here is that unlike games where you fill up a board, chess doesn't really progress when you play out random moves. With random moves for instance you can't reproduce White's advantage at the beginning of the game, and if your evaluation function can't see that, you are doing really badly. I need a weak player for the playouts that plays well enough that it can take advantage of situations that real players can take advantage of at least some of the time. A really good move ordering function for an alpha-beta search might do the trick.)
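The "run N threads, merge, repeat" shape can be sketched without any chess at all. Everything below is a stand-in: `MOVES` and `WIN_PROB` are made-up data replacing real playouts, and the workers keep flat per-move statistics instead of a tree, so this only shows the independent-search-then-merge step and is not a real MCTS.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

MOVES = ("a", "b", "c")                    # hypothetical root moves
WIN_PROB = {"a": 0.3, "b": 0.6, "c": 0.5}  # made-up stand-in for playouts


def search(seed, playouts):
    """One worker: run playouts and return (visits, wins) per root move."""
    rng = random.Random(seed)  # per-worker RNG, no shared mutable state
    visits, wins = Counter(), Counter()
    for _ in range(playouts):
        move = rng.choice(MOVES)
        visits[move] += 1
        if rng.random() < WIN_PROB[move]:
            wins[move] += 1
    return visits, wins


def parallel_search(workers=4, playouts=2000):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(search, range(workers), [playouts] * workers))
    visits, wins = Counter(), Counter()
    for v, w in results:  # merge step: add up statistics across workers
        visits.update(v)
        wins.update(w)
    best = max(MOVES, key=lambda m: wins[m] / visits[m] if visits[m] else 0.0)
    return best, visits, wins
```

Because each worker owns its statistics until the merge, no locks are needed during the search itself. Whether the threads actually run in parallel depends on the interpreter build.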

Started working on a no-gil gRPC Python implementation but super low priority.

Has anyone started deploying nogil at scale in prod?

melodyogonna · 8 months ago

I don't think it is ready to be used in prod. The feature is still experimental.

No, I am not personally aware of anyone using it in prod.