> Maybe now I'll be able to actually figure out what data to send libraries without actually reading their source code.
One could hope, but any library abusing kwargs in all of its methods is showing it's willing to do only the bare minimum to make its code usable, let alone readable and self-documenting.
It feels like we're going in cycles. C was somewhat lax with type checking, so C++ and Java were made stricter. Looking to escape the tyranny of static typing, we got the rise of Python, Ruby, and JavaScript, which in turn left us with a desire that Rust, Go, and TypeScript now fulfill. I wonder what the next step is. LLMs are extremely broad in what they accept, but don't exactly fill the same niches.
A big use case for kwargs is not breaking compatibility and not having to copy/paste a ton of parameters when just forwarding them. But that's exactly the case which is difficult to type correctly.
If you didn't use **kwargs you'd have to copy and paste every kwarg and its default value from the superclass into the subclass, which is ridiculous IMO.
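As a sketch of that forwarding pattern (the class names here are made up for illustration):

```python
class Base:
    def __init__(self, color="black", width=1, style="solid"):
        self.color, self.width, self.style = color, width, style

class Labeled(Base):
    def __init__(self, label, **kwargs):
        # Forward everything to Base instead of copy/pasting each
        # parameter and its default value into this signature.
        super().__init__(**kwargs)
        self.label = label

item = Labeled("axis", width=3)  # color/style keep Base's defaults
```

The subclass stays in sync with the superclass automatically, which is exactly the property a type checker can't see without something like Unpack.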
> be able to actually figure out what data to send libraries without actually reading their source code
Just reading this sent a chill down my spine. I have horrible memories of having to read the code to figure out what something was doing (in JavaScript, Python, Ruby, etc), due to the disaster of an anti-feature called dynamic typing having been used.
Untyped languages tend to be at the forefront of paradigms, and typed languages come in toward the end when reliability and need for tooling are more important than innovation/discovery.
In the 90s a bunch of kids were building websites with LAMP stacks while serious engineers were building aging/about-to-be-irrelevant desktop software in serious, typed languages.
It's not an anti-feature. Those languages gained popularity in part because they were dynamically typed. But this debate is decades old and people always dogmatically choose one side, so there's not really any point in trying to convince you. Alan Kay made the argument back in the 1970s that type systems were always too limiting, so for him classes and late binding were the answer.
what's especially interesting about this is that it could create a new "meta" for static typing in Python.
One significant issue with static typing in Python is how much boilerplate is required to use types when also doing the sorts of things that dynamic Python is really good at - for instance proxying functions. If you want to do that now and preserve the types, you need to re-declare the types of everything in the wrapper.
Now, if the underlying function already made use of Unpack, you could "reuse" that type in your own wrapper with low boilerplate and less chance of things diverging in hard-to-refactor ways.
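A hedged sketch of that reuse in the PEP 692 style (`PlotOpts`, `plot`, and `logged_plot` are made-up names; `typing.Unpack` needs Python 3.11+, so the annotations below are strings to keep older interpreters happy):

```python
from typing import TypedDict

try:
    from typing import Unpack  # available from Python 3.11
except ImportError:
    Unpack = None  # annotations below are strings, so this is harmless

class PlotOpts(TypedDict, total=False):
    color: str
    linewidth: int

def plot(x: float, **kwargs: "Unpack[PlotOpts]") -> dict:
    return dict(kwargs)

def logged_plot(x: float, **kwargs: "Unpack[PlotOpts]") -> dict:
    # The wrapper reuses PlotOpts instead of re-declaring every option,
    # so the two signatures can't silently diverge.
    print("plotting", x)
    return plot(x, **kwargs)
```

The wrapper's kwargs are checked against the same TypedDict as the wrapped function's, with no per-parameter boilerplate.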
Yep, it’s incomplete, and much more importantly not machine-readable. These days I want all my code to pass strict mypy. It’s mostly possible, and bliss when it works, but libraries (ab)using kwargs throw a spanner in that. Libraries where everything is a kwarg and the docs have to be consulted are a killjoy. And they cause tons of bugs from misuse!
Software is always user-upgradable on Linux. Just install it somewhere in your home directory. GNU Stow [0] can be helpful as a very lightweight way to manage the packages.
(Of course, then you take on the responsibility of keeping up with patch releases yourself, which is why we use distros. But if it's just a small number of packages on top of a distro-managed base system, it's perhaps not so bad.)
99% of my more_itertools imports are exactly for this.
There are one or two other things from more_itertools that I think should make it into itertools. I'd actually like to see statistics from huge monorepos/open source about usage of the various more_itertools functions.
What do you mean by "empty sequence in"? The function doesn't raise if the input iterable is empty: it only raises if the chunk size n is 0. While that does have a natural interpretation of returning an infinite sequence of empty tuples, such a behavior would be qualitatively different than for other chunk sizes. The caller would never be able to retrieve any elements from the input iterable, and the output would be infinite even if the input is finite. In that light, it makes some sense (IMO) to avoid letting applications hit such an edge case unintentionally.
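For reference, a rough pure-Python equivalent of 3.12's itertools.batched (close to the recipe in the CPython docs) makes that edge case explicit:

```python
from itertools import islice

def batched(iterable, n):
    # Yield tuples of up to n items; raise rather than loop forever on n < 1.
    if n < 1:
        raise ValueError("n must be at least one")
    it = iter(iterable)
    while batch := tuple(islice(it, n)):
        yield batch
```

So `list(batched("ABCDEFG", 3))` gives `[('A', 'B', 'C'), ('D', 'E', 'F'), ('G',)]`, an empty input simply yields nothing, and `n == 0` raises up front.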
> PEP 669 defines a new API for profilers, debuggers, and other tools to monitor events in CPython. It covers a wide range of events, including calls, returns, lines, exceptions, jumps, and more. This means that you only pay for what you use, providing support for near-zero overhead debuggers and coverage tools. See sys.monitoring for details.
Low-overhead instrumentation opens up a whole bunch of interesting interactive use cases (i.e. Jupyter etc.), and as the author of one library that relies heavily on instrumentation (https://github.com/ipyflow/ipyflow), I'm very keen to explore the possibilities here.
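For a flavor of the API, here's a minimal call counter using sys.monitoring (version-guarded since it's 3.12+ only; the tool name is arbitrary):

```python
import sys

call_count = 0

if sys.version_info >= (3, 12):
    mon = sys.monitoring

    def on_py_start(code, instruction_offset):
        # Fires whenever a Python function starts executing.
        global call_count
        call_count += 1

    mon.use_tool_id(mon.PROFILER_ID, "demo-counter")
    mon.register_callback(mon.PROFILER_ID, mon.events.PY_START, on_py_start)
    mon.set_events(mon.PROFILER_ID, mon.events.PY_START)

    def work():
        return sum(range(10))

    work()

    mon.set_events(mon.PROFILER_ID, 0)  # stop monitoring
    mon.free_tool_id(mon.PROFILER_ID)   # release the tool slot
```

The "pay for what you use" part is that only the events you subscribe to via set_events have any cost; everything else runs at full speed.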
Summary, sorry for poor formatting, I'm not sure HN can do a list of any kind?
New features
More flexible f-string parsing, allowing many things previously disallowed (PEP 701).
Support for the buffer protocol in Python code (PEP 688).
A new debugging/profiling API (PEP 669).
Support for isolated subinterpreters with separate Global Interpreter Locks (PEP 684).
Even more improved error messages. More exceptions potentially caused by typos now make suggestions to the user.
Support for the Linux perf profiler to report Python function names in traces.
Many large and small performance improvements (like PEP 709 and support for the BOLT binary optimizer), delivering an estimated 5% overall performance improvement.
Type annotations
New type annotation syntax for generic classes (PEP 695).
New override decorator for methods (PEP 698).
Deprecations
The deprecated wstr and wstr_length members of the C implementation of unicode objects were removed, per PEP 623.
In the unittest module, a number of long deprecated methods and classes were removed. (They had been deprecated since Python 3.1 or 3.2).
The deprecated smtpd and distutils modules have been removed (see PEP 594 and PEP 632). The setuptools package continues to provide the distutils module.
A number of other old, broken and deprecated functions, classes and methods have been removed.
Invalid backslash escape sequences in strings now warn with SyntaxWarning instead of DeprecationWarning, making them more visible. (They will become syntax errors in the future.)
The internal representation of integers has changed in preparation for performance enhancements. (This should not affect most users as it is an internal detail, but it may cause problems for Cython-generated code.)
Isolated subinterpreters (PEP 684): Just how isolated are those? Is it simply a more complicated version of multiprocessing, with all the same drawbacks (communication via pipes/socket/some-other-stream)?
What is or isn't "pythonic" is largely determined by the community not the language. Nothing is stopping the Python community from monkey patching everything like the Ruby community does.
The f-string changes arrived because there was a need to formalize the syntax: so that other Python parsers could implement it, so that CPython could move off having a special sub-parser just for f-strings, and so that it became possible to decide whether weird edge cases were bugs or features.
Once formalized, it was decided not to put arbitrary limits on it just because people can write badly formatted code; people can already do that, and it's up to the Python community to choose what is or isn't "Pythonic".
FYI, one of the things I'm really looking forward to is being able to write: f"Rows: {'\n'.join(rows)}\n Done!"
Python tends to be permissive and rely on convention over preventing certain practices, sometimes summed up as "we are all consenting adults." E.g., there's multiple inheritance, no private variables, and monkeypatching. I see this change as in the same vein. This change also makes it conceptually simpler [0]. It also appears to reduce technical debt by reducing differences between expressions in f-strings and in the rest of the language.
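To make the change concrete, here's the before/after for the backslash case (the 3.12-only snippet is built via eval so this file still parses on older interpreters):

```python
import sys

rows = ["one", "two"]

# Pre-3.12: backslashes were banned inside the expression part, so the
# separator had to be hoisted into a variable first.
sep = "\n"
old_style = f"Rows: {sep.join(rows)}"

# 3.12+ (PEP 701): the '\n' can live inline.
if sys.version_info >= (3, 12):
    new_style = eval('''f"Rows: {'\\n'.join(rows)}"''')
    assert new_style == old_style
```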
They keep improving the error messages, and this is one of my favorite features. But I'd love it if we could get some real rich text. I don't know if anyone else uses rich, but it has infected all my programs now. Not just to print with colors, but because it makes debugging so much easier. Not just print(f"{var=}") but the handler[0,1]. Color is so important to these types of things, and so is formatting. Plus, the progress bars are nice and have almost completely replaced tqdm for me[2]. They're just easier and prettier.
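For anyone curious, wiring rich into stdlib logging is only a few lines (rich is third-party and optional, hence the import guard; this follows the pattern from rich's logging docs):

```python
import logging

try:
    from rich.logging import RichHandler  # pip install rich
except ImportError:
    RichHandler = None  # fall back to plain stdlib logging

handler = RichHandler() if RichHandler is not None else logging.StreamHandler()
logging.basicConfig(level=logging.INFO, format="%(message)s", handlers=[handler])

log = logging.getLogger("demo")
log.info("colorful output and pretty tracebacks, if rich is installed")
```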
[2] Side note: does anyone know how to get these properly working when using DDP with pytorch? I get flickering when using this and I think it is actually down to a pytorch issue and how they're handling their loggers and flushing the screen. I know pytorch doesn't want to depend on rich, but hey, pip uses rich so why shouldn't everyone?
I love and use rich too, but gosh I hope that libraries don't start depending on it just because pip does.
It has a lot of dependencies of its own, and dependency creep is real. I know pytorch isn't exactly lightweight in terms of dependencies. But I prefer using libraries that make an effort to pull in only absolutely necessary dependencies.
Yeah, sorry, I don't think I was clear. I don't want to just drop rich into the Python source (for the reasons you mention). But I do think they could take some of the ideas from there. Formatting is really the most important aspect here, especially around traces, because these are the real work amplifiers. So much time is spent debugging that the better our debugging tools are, the more work __everyone__ gets done. But debugging is a strangely underappreciated area. I think you could do colors with just simple ANSI escape codes (same as you'd do in POSIX). I'm just using rich as an example of style.
Re: pytorch: it's love-hate for me. I do think they should incorporate things that are extremely common and solve daily issues. As a simple example, new users are often confused by loading and saving models when using distributed data parallel (DDP), because it creates an extra "module" name in the state_dict and so can require different save/load code depending on whether you're doing distributed training or not. This can be quite annoying. Similarly, there are no built-in infinite samplers, which are common among generative modelers: people who don't iterate over epochs of data, but rather steps. There are of course many ways to deal with this, but given how prolific the pattern is (and has been since 2015), it makes sense that there just be a built-in dataloader for it. I'd argue things like progress bars and loggers would also be highly beneficial, especially because pytorch's forte is generating research code.
I think the support for isolated subinterpreters with separate Global Interpreter Locks is the most interesting new feature in Python. It's probably not the best way to offer some sort of concurrency, but it's still a step closer to maybe one day getting rid of the GIL.
Since it currently lacks any way to transfer objects between interpreters other than pickling, does it offer any advantage over the multiprocessing module?
Not for pure Python code; but there are massive advantages for mixed C(++) and Python: I can now have multiple subinterpreters running concurrently and accessing the same shared state in a thread-safe C++ library.
Previously this required rewriting the whole C++ library to support either pickling (multiplying the total memory consumption by the number of cores) or allocating everything in shared memory (which means normal C++ types like `std::string` are unusable; you need to switch to e.g. boost::interprocess).
Now it's sufficient to pickle a pointer to a C++ object as an integer, and it'll still be a valid pointer in the other subinterpreter.
It may not be a step towards that. Ruby has guilds, which are a very similar idea, and they are explicitly not working towards removing the GIL altogether at this point. Matz did a full keynote at the latest Euruko defending not working towards removing it entirely. See https://www.youtube.com/watch?v=5WmhTMcnO7U&t=1244s for the full talk if you are interested.
I find the convergent evolution of features in these two languages pretty funny, as it is very clear that they don't really look at implementation details of the other language even if they quite often land on ideas that are pretty close in practice.
1. https://docs.python.org/3.12/whatsnew/3.12.html#pep-692-usin...
The Azure SDK is full of them, making liberal use of kwargs.pop. What a nightmare.
Now I just have to wait 5 more years until 3.12 is sufficiently old that work lets me use it.
Bets on user-upgradable Python on Linux by 2030?
[0] https://www.gnu.org/software/stow/
https://realpython.com/python-versions-docker/#running-pytho...
https://www.cherryservers.com/blog/install-python-on-ubuntu
Use asdf. You don’t want to manage your project’s dependencies at the system level any more than you have to.
for i in range((len(lst) + batch_size - 1) // batch_size):
    batch = lst[i * batch_size : (i + 1) * batch_size]
It's basically web workers / isolates for Python.
It can, just use `-` characters for bullets, like in Markdown. They won't render as Unicode bullet points but dashes are fine enough.
- If you don't - Your list looks like this
Whereas:
- This list
- Has two newlines between
- Each item
FYI, one of the things I'm really looking forward to is being able to write: f"Rows: {'\n'.join(rows)}\n Done!"
Which in Python 3.11 is illegal syntax.
[0]: https://peps.python.org/pep-0701/#how-to-teach-this
[0] https://rich.readthedocs.io/en/stable/logging.html
[1] Try this example: https://github.com/Textualize/rich/blob/master/examples/exce...
But we're digressing. These are just opinions.
It's scheduled to be delivered next year if I'm not mistaken.
The new syntax for generic types is also a very nice QOL improvement, you can now just do:
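For example, a PEP 695 generic class (sketched here via exec, since the new syntax is a parse error on interpreters before 3.12):

```python
import sys

# PEP 695 (3.12): `class Stack[T]: ...` replaces the old explicit
# TypeVar("T") + Generic[T] dance.
SRC = """
class Stack[T]:
    def __init__(self) -> None:
        self.items: list[T] = []

    def push(self, item: T) -> None:
        self.items.append(item)

    def pop(self) -> T:
        return self.items.pop()
"""

popped = None
if sys.version_info >= (3, 12):
    ns = {}
    exec(SRC, ns)
    stack = ns["Stack"]()
    stack.push(1)
    popped = stack.pop()
```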
https://www.python.org/downloads/release/python-3115/