Clear articulation of why Numpy syntax is provincial and difficult to learn. Perhaps the clearest part clicks when the author compares the triple-nested loop to the numpy-function approach. The former is, IMO (and the author agrees), much easier to understand and more universal.
Pretty sure Numpy's einsum[1] function allows all of this reasoning in vanilla numpy (albeit with a different interface, which I assume the author likes less than theirs). I'm quite sure that first example of how annoying numpy can be could be written much more simply with einsum.
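To illustrate (a toy example of my own, not the article's actual code): a triple-nested loop computing `C[i, j] = sum_k A[i, k] * B[k, j]` collapses into a single subscript string.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))

# Triple-nested loop version: C[i, j] = sum over k of A[i, k] * B[k, j]
C_loop = np.zeros((3, 5))
for i in range(3):
    for j in range(5):
        for k in range(4):
            C_loop[i, j] += A[i, k] * B[k, j]

# Same computation: the indices are spelled out once, in the subscript string
C_einsum = np.einsum("ik,kj->ij", A, B)

assert np.allclose(C_loop, C_einsum)
```

The subscript string plays the same role as the loop variables: repeated letters are summed over, and the letters after `->` name the output axes.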
I think you could fold linear solves into a "generalized Einstein notation", but the other option is to support more complex array types, in which case a batched linear solve can be reframed as a single linear solve against a block-diagonal matrix.
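A small sketch of what I mean (toy sizes, diagonal boost just to keep the random systems invertible):

```python
import numpy as np

rng = np.random.default_rng(1)
# A batch of 3 independent 4x4 systems: A_i @ x_i = b_i
As = rng.standard_normal((3, 4, 4)) + 4 * np.eye(4)  # boost keeps them invertible
bs = rng.standard_normal((3, 4))

# Batched solve: np.linalg.solve maps over the leading batch axis
xs_batched = np.linalg.solve(As, bs[..., None])[..., 0]

# The same computation reframed as ONE solve against a block-diagonal matrix
big_A = np.zeros((12, 12))
for i in range(3):
    big_A[4 * i:4 * (i + 1), 4 * i:4 * (i + 1)] = As[i]
xs_block = np.linalg.solve(big_A, bs.reshape(-1)).reshape(3, 4)

assert np.allclose(xs_batched, xs_block)
```

The two are mathematically identical; the block-diagonal framing just trades the batch axis for a bigger (and sparser) single system.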
Sure, but einsum needs a syntax and concepts of its own, and more importantly it does not work if you need to do something other than a limited set of matrix operations.
I think I'll just use Julia, since it has great libraries for matrix operations, for mapping loops onto GPUs, for handling indices, and so on. And if you need some Python library, it's easily available via the fantastic PyCall library.
This is a big improvement over numpy, but I don't see much of a compelling reason to go back to Python.
> In DumPy, every time you index an array or assign to a dp.Slot, it checks that all indices have been included.
Not having to specify all indices makes for more generic implementations. Sure, the broadcasting rules could be simpler and more consistent, but in the meantime (implicit) broadcasting is what makes NumPy so powerful and flexible.
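A concrete example of the genericity I mean (my own sketch): thanks to implicit broadcasting, the same few lines work unchanged for a single vector or an arbitrary batch.

```python
import numpy as np

def standardize(x, axis=-1):
    """Subtract the mean and divide by the std along `axis`.

    Implicit broadcasting means the same code handles a single vector,
    a matrix of rows, or any higher-dimensional batch.
    """
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=axis, keepdims=True)  # keepdims so the result broadcasts back
    std = x.std(axis=axis, keepdims=True)
    return (x - mean) / std

v = standardize([1.0, 2.0, 3.0])               # shape (3,)
m = standardize(np.arange(12.0).reshape(3, 4)) # shape (3, 4), row-wise
assert v.shape == (3,) and m.shape == (3, 4)
assert np.allclose(m.mean(axis=-1), 0.0)
```

Requiring every index to be written out would force a separate implementation (or at least a separate call pattern) per input rank.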
Also I think straight up vmap would be cleaner IF Python did not intentionally make lambdas/FP so restricted and cumbersome apparently due to some emotional reasons.
For solo work, the terseness might work, but usually only in the short term. Code I wrote 6 months ago looks like someone else's code. For team work, I'd prefer to be explicit where possible. It saves both my teammates' time and my own (when I eventually forget my own code 6 months from now).
I’ve known some people who didn’t want to learn the syntax of numpy and did it all in loops, and the code was not easy to read. It was harder to read. The fundamental issue is that operations on high dimensional arrays are very difficult to reason about. Numpy can probably be improved, but I don’t think loops are the answer.
The point here is not that it's loops per se; the point is that the indexing is explicit. That seems like a big win to me. The article's ~10 non-trivial examples all make the code easier to read and, more importantly, make it clear exactly what the code is doing. It is true that some operations are difficult to reason about, and that's where explicit indexing really helps.

The article resonates with me because I do want to learn numpy syntax. I've written hundreds of programs with numpy, spent countless hours doing battle with it, and I feel like I'm no better off now than someone who's brand new to it. The indexing is constantly confounding; nothing ever just works. Anytime you see `None` or `axis=` inside an operation, it's a tell: bound to be difficult to comprehend. I'm always having to guess at some combination of reshape, dstack, hstack, transpose, and five other shape-changers I'm forgetting, just to get something to work, and it's difficult to read and understand later. It feels like there is no debugging, only rewriting.

I keep reading the manual for einsum over again, and I've used it, but I can't explain how, why, or when to use it; it seems like the thing you resort to when no other indexing works. The ability to do straightforward, explicit, non-clever indexing as if you were writing loops seems like a pretty big step forward.
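A concrete instance of the `None` tell (a standard pairwise-distance idiom, my own toy example): the loop version says what it computes; the vectorized one makes you decode axis-insertion to see it.

```python
import numpy as np

X = np.arange(6.0).reshape(3, 2)  # 3 points in 2-D

# Loop version: D[i, j] = squared distance between point i and point j
D_loop = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        D_loop[i, j] = np.sum((X[i] - X[j]) ** 2)

# Vectorized version: each `None` inserts a new axis, so shapes
# (3, 1, 2) and (1, 3, 2) broadcast to (3, 3, 2) before summing
D_vec = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)

assert np.allclose(D_loop, D_vec)
```

Both produce the same matrix; only the first states the indices it is actually iterating over.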
I involuntarily whispered "reshape" to myself near the top of your comment. Numpy is a very different way for me to think and I have similar feelings to what you're describing.
I could never understand why people use dstack, hstack and the like. I think plain np.stack with the axis specified explicitly is easier to write and to read.
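For comparison (small shape demo): the h/v/dstack family each hard-code an axis, and dstack even reshapes 2-D inputs, whereas np.stack and np.concatenate say the axis out loud.

```python
import numpy as np

a = np.ones((2, 3))
b = np.zeros((2, 3))

# Each of these bakes in its own axis convention:
assert np.hstack([a, b]).shape == (2, 6)
assert np.vstack([a, b]).shape == (4, 3)
assert np.dstack([a, b]).shape == (2, 3, 2)  # silently promotes 2-D to 3-D

# Explicit-axis equivalents:
assert np.stack([a, b], axis=0).shape == (2, 2, 3)       # new leading axis
assert np.concatenate([a, b], axis=1).shape == (2, 6)    # same as hstack
assert np.concatenate([a, b], axis=0).shape == (4, 3)    # same as vstack
```

With the explicit forms, the reader never has to remember which letter maps to which axis.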
For transposes, np.einsum can be easier to read, as it lets you use (single-character, admittedly) axis specifiers to "name" them.
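For example (toy shapes): the permutation tuple in np.transpose is positional, while the einsum subscripts let you rearrange named axes directly.

```python
import numpy as np

x = np.zeros((2, 3, 4))

# np.transpose takes a permutation of axis *positions*...
y1 = np.transpose(x, (2, 0, 1))

# ...while einsum lets you label each axis with a letter and
# rearrange the labels, which can be easier to read
y2 = np.einsum("ijk->kij", x)

assert y1.shape == y2.shape == (4, 2, 3)
```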
The real question, to which I have absolutely no answer, is not about syntax; it's about concepts: what is a better way to think about higher-dimensional arrays than loops and indices? I'm convinced that something better exists, and that if it existed, encoding it in a sufficiently expressive (i.e. probably not Python) language would give us the corresponding syntax. But trying to come up with a better syntax without a better conceptual model won't get us very far.
Then again, maybe even that is wrong! "Notation as a tool for thought" and all that. Maybe "dimension-munging" in APL really is the best way to do these things, once you really understand it.
English. "Write me a Python function or program that does X, Y, and Z on U and V using W." That will be the inevitable outcome of current trends, where relatively-primitive AI tools are used to write slightly more sophisticated code than would otherwise be written, which in turn is used to create slightly less-primitive AI tools.
For example, I just cut-and-pasted the author's own cri de coeur into Claude: https://claude.ai/share/1d750315-bffa-434b-a7e8-fb4d739ac89a Presumably at least one of the vectorized versions it replied with will work, although none is identical to the author's version.
When this cycle ends, high-level programs and functions will be as incomprehensible to most mainstream developers as assembly is today. Today's specs are tomorrow's programs.
Not a bad thing, really. And overdue, as the article makes all too clear. But the transition will be a dizzying one, with plenty of collateral disruption along the way.
In the way that `ggplot2` tries to abstract those common "high dimensional" graphing components into an intuitive grammar, such that you may in many places be able to guess the sequence of commands correctly, I would love to see an equivalent ergonomic notation. This gets part way there by acknowledging the problem.
Mathematical operations aren't obliged to respect the Zen of Python, but I like the thought that we could make it so that most have an obvious expression.
It seems like a neat idea. If it can just be layered on top of Jax pretty easily… I dunno, seems so simple it might actually get traction?
I wish I could peek at the alternative universe where Numpy just didn't include broadcasting. Broadcasting is a sort of ridiculous idea. Trying to multiply an NxM matrix by a 1x1 matrix should return an error, not perform some other operation totally unrelated to matrix multiplication!
Broadcasting is an excellent idea. Implementations do have warts, but e.g. pytorch would be really painful without it.
Broadcasting is a sort of generalization of the idea of scalar-matrix product. You could make that less "ridiculous" by requiring a Hadamard product with a constant value matrix instead.
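To make that concrete (my own small demo): the elementwise `*` with a 1x1 array is exactly the scalar-matrix product, while `@` (true matrix multiplication) already rejects the shape mismatch, as the earlier comment wants.

```python
import numpy as np

A = np.arange(6.0).reshape(2, 3)
c = np.array([[2.0]])  # a 1x1 "constant value matrix"

# Elementwise (Hadamard-style) product: broadcasting spreads the single
# value across every element, i.e. the ordinary scalar-matrix product
assert np.allclose(A * c, 2.0 * A)

# True matrix multiplication refuses the incompatible shapes
try:
    A @ c
    raised = False
except ValueError:
    raised = True
assert raised
```

So the complaint is really about `*`, not about matrix multiplication: `@` never broadcasts its core dimensions.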
I'm not convinced by the loops proposal. TensorFlow had this kind of lazy evaluation (except I guess TF was the worst of both worlds), and it makes debugging very difficult, to the point that I believe it's the main reason PyTorch won out. Such systems are great if they work perfectly, but they never do.
NumPy definitely has some rough edges, I sympathize with the frustrations for sure.
[1]: https://numpy.org/doc/stable/reference/generated/numpy.einsu...
https://dynomight.net/numpy/
> This is a big improvement over numpy, but I don't see much of a compelling reason to go back to Python.

Being able to use sane programming workflows is quite a compelling reason.