If your arrays have more than two dimensions, please consider using Xarray [1], which adds dimension naming to NumPy arrays. Broadcasting and alignment then becomes automatic without needing to transpose, add dummy axes, or anything like that. I believe that alone solves most of the complaints in the article.
Compared to NumPy, Xarray is a little thin in certain areas like linear algebra, but since it's very easy to drop back to NumPy from Xarray, what I've done in the past is add little helper functions for any specific NumPy stuff I need that isn't already included, so I only need to understand the NumPy version of the API well enough one time to write that helper function and its tests. (To be clear, though, the majority of NumPy ufuncs are supported out of the box.)
I'll finish by saying, to contrast with the author, I don't dislike NumPy, but I do find its API and data model to be insufficient for truly multidimensional data. For me three dimensions is the threshold where using Xarray pays off.

[1] https://xarray.dev
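To make the named-dimension point concrete, here's a minimal sketch (the dimension names and the `xr_solve` helper are invented for illustration): broadcasting aligns on names, and `xr.apply_ufunc` is one way to write the kind of NumPy helper described above.

    import numpy as np
    import xarray as xr

    a = xr.DataArray(np.random.rand(3, 4), dims=("y", "x"))
    b = xr.DataArray(np.random.rand(4), dims=("x",))
    c = a * b  # aligned by dimension name; no transposes or dummy axes needed

    # One way to wrap a NumPy-only routine as a small helper:
    def xr_solve(mat, rhs):
        return xr.apply_ufunc(
            np.linalg.solve, mat, rhs,
            input_core_dims=[["row", "col"], ["row"]],
            output_core_dims=[["col"]],
        )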
Xarray is great. It marries the best of Pandas with Numpy.
Indexing like `da.sel(x=some_x).isel(t=-1).mean(["y", "z"])` makes code so easy to write and understand.
Broadcasting is never ambiguous because dimension names are respected.
It's very good for geospatial data, allowing you to work in multiple CRSs with the same underlying data.
We also use it a lot for Bayesian modeling via Arviz [1], since it makes the extra dimensions you get from sampling your posterior easy to handle.
Finally, you can wrap many arrays into datasets, with common coordinates shared across the arrays. This allows you to select `ds.isel(t=-1)` across every array that has a time dimension.

[1] https://www.arviz.org/en/latest/
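A rough sketch of the dataset point (variable and dimension names invented):

    import numpy as np
    import xarray as xr

    t = np.arange(10)
    temp = xr.DataArray(np.random.rand(10, 5), dims=("t", "x"), coords={"t": t})
    flux = xr.DataArray(np.random.rand(10), dims=("t",), coords={"t": t})

    ds = xr.Dataset({"temp": temp, "flux": flux})
    last = ds.isel(t=-1)  # selects the last time step in every variable with a t dimension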
Is there anything similar to this for something like Tensorflow, Keras or Pytorch? I haven't used them super recently, but in the past I needed to do all of the things you just described in painful to debug ways.

The docs say that it's a prototype feature, and I think it has been that way for a few years now, so no idea how production-ready it is.

https://docs.pytorch.org/docs/stable/named_tensor.html
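A taste of that prototype API, per the linked docs (names arbitrary; the feature is explicitly marked prototype, so details may have changed):

    import torch

    x = torch.randn(2, 3, 4, names=("batch", "channel", "time"))
    x.mean("time")                           # reduce by name rather than position
    x.align_to("time", "batch", "channel")   # reorder dimensions by name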
Thanks for sharing this library. I will give it a try.
For a while I had a feeling that I was perhaps a little crazy for seeming to be the only person who really dislikes the use of things like `array[:, :, None]` and so forth.
Along those lines, for biosignal processing, NeuroPype [0] also builds on numpy and implements named axes for n-dimensional tensors, with the ability to store per-element data (i.e. channel names, positions, etc.) for each axis.

[0] https://www.neuropype.io/docs/
Life goes full circle sometimes. I remember that Numpy roughly came out of the amalgamation of the Numeric and Numarray libraries; I want to imagine that the Numarray people kept fighting these past 20 years to prove they were the right solution, at some point took some money from Elon Musk and renamed to Xarray [0], and finally started beating Numpy.

[0] most of the above is fiction
One of the reasons why I started using Julia was because the NumPy syntax was so difficult. Going from MATLAB to NumPy I felt like I suddenly became a mediocre programmer, spending less time on math and more time on "performance engineering" of just trying to figure out how to use NumPy right. Then when I went to Julia it made sense to vectorize when it felt good and write a loop when it felt good. Because both are fast, you can focus on what makes the code easiest to read and understand. This blog post encapsulates exactly that experience and feeling.
Also, treating things like `np.linalg.solve` as a black box that is the fastest thing in the world, that you could never do better than, so please mangle your code to call it correctly... that's just wrong. There are many reasons to build problem-specific linear algebra kernels, and that's something that is inaccessible without going deeper. But that's a different story.
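To make the problem-specific-kernel point concrete, here's a hedged sketch (my example, not the commenter's): a tridiagonal system solved with the generic dense solver versus SciPy's structure-aware banded one.

    import numpy as np
    from scipy.linalg import solve_banded

    n = 1000
    main = 2.0 * np.ones(n)
    off = -1.0 * np.ones(n - 1)
    b = np.ones(n)

    # Generic dense solve: O(n^3), ignores the structure entirely.
    A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    x_dense = np.linalg.solve(A, b)

    # Banded solve: O(n), same answer.
    ab = np.zeros((3, n))
    ab[0, 1:] = off    # superdiagonal
    ab[1, :] = main    # main diagonal
    ab[2, :-1] = off   # subdiagonal
    x_banded = solve_banded((1, 1), ab, b)

    assert np.allclose(x_dense, x_banded)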
The reason is quite simple. Julia is a language designed for scientific computation. Numpy is a library frankenstein-grafted onto a language that isn't really designed for scientific computation.
We can only hope that Julia somehow wins and those of us forced to work in Python because of network effects can be freed.
Julia really needs to generalize and clarify its deployment story before it could possibly take off. It was built with a promise of generality but its tethered packaging is an albatross.
> Going from MATLAB to NumPy I felt like I suddenly became a mediocre programmer, spending less time on math and more time on "performance engineering" of just trying to figure out how to use NumPy right.
Without readily vectorizable operations, Matlab is about as slow as Python.
Slowness of Python is a huge pain point, and Julia has a clear advantage here. Sadly Julia is practically just unusable beyond quite niche purposes.
Python does now have quite serviceable JIT hacks that let one escape vectorization tricks, but they are still hacks, and a performant Python alternative would be very welcome. Sadly there aren't any.
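Numba is one such JIT hack; a minimal sketch (toy function mine) of escaping vectorization by just writing the loop:

    import numpy as np
    from numba import njit

    @njit
    def pairwise_min_dist(points):
        # Plain nested loops, compiled to machine code by Numba.
        n, d = points.shape
        best = np.inf
        for i in range(n):
            for j in range(i + 1, n):
                s = 0.0
                for k in range(d):
                    diff = points[i, k] - points[j, k]
                    s += diff * diff
                if s < best:
                    best = s
        return np.sqrt(best)

    print(pairwise_min_dist(np.random.rand(500, 3)))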
One of the things I suspect will happen very soon is that all languages will become as fast as each other, and your reason to use one over the other for performance might no longer exist. I know this sounds like optimistic, wishful thinking, but hear me out:
Modern AIs are literal translation engines. That's their origin. The context that lets them do what they do was originally there to allow good translations. It doesn't matter if the translation is programming language output, or actual language output. I can today ask an AI to rewrite a chunk of Matlab code into Rust and it works! There's even code generators that will utilize the GPU where sensible.
We're really not that far from being able to write code in any language and, transparently behind the scenes, have it actually running on a more performant backend when needed. Ideally you keep the Matlab/Python frontend and the translation happens in the intermediate layers in a two-way fashion, so step-through/debug works as expected.
Based on playing with the above steps manually, with good results, we're almost at the stage of just needing some glue to make it all work. Write in Matlab/Python, and run as fast as any LLVM-backed language.
I don’t understand this. Julia is a general-purpose language that I increasingly use to replace bash scripts. I’ve also happily used it for web scraping tasks¹, interactive web² projects³, generation of illustrations and animations, and, yes, scientific calculations. Aside from the language itself, which is fun to program in, its brilliant package system massively facilitates composing libraries into a solution.

1 https://lee-phillips.org/amazonJuliaBookRanks
2 https://www.admin-magazine.com/Archive/2025/85
3 https://lee-phillips.org/pluckit

Why? Just the ecosystem?

This is true of modern Fortran also.

[1] https://lfortran.org/
[2] "in recent years" because that's when I became aware of this stuff, not to say there wasn't effort before then
"Then when I went to Julia it made sense to vectorize when it felt good and write a loop when it felt good. Because both are fast, focus on what makes the code easiest to read an understand."
Is MATLAB materially different? Loops are slow (how slow depends on the version), and the fastest thing in the world is the utter perfection of the black box called '\'.
The last time I seriously used Matlab, over 10 years ago, they had implemented JIT compilation and often straightforward loops were a lot faster than trying to vectorize. And definitely less error-prone.
Iirc, ‘\’ was just shorthand for solving a system of equations, which you could mentally translate into the appropriate LAPACK function without loss of precision. You did have to be a little more careful about making extra copies than in Fortran or C (or even Python). But nothing was magic.
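For what it's worth, a rough sketch of that mental translation in Python terms (my mapping, not an official one):

    import numpy as np
    from scipy import linalg

    A = np.random.rand(5, 5)
    b = np.random.rand(5)

    x = linalg.solve(A, b)  # square A: LAPACK gesv
    x_pos = linalg.solve(A @ A.T + np.eye(5), b, assume_a="pos")  # SPD: Cholesky-based solve
    x_ls, res, rank, sv = np.linalg.lstsq(
        np.random.rand(8, 5), np.random.rand(8), rcond=None
    )  # rectangular: least squares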
Compared to Matlab (and to some extent Julia), my complaints about numpy are summed up in these two paragraphs:
> Some functions have axes arguments. Some have different versions with different names. Some have Conventions. Some have Conventions and axes arguments. And some don’t provide any vectorized version at all.
> But the biggest flaw of NumPy is this: Say you create a function that solves some problem with arrays of some given shape. Now, how do you apply it to particular dimensions of some larger arrays? The answer is: You re-write your function from scratch in a much more complex way. The basic principle of programming is abstraction—solving simple problems and then using the solutions as building blocks for more complex problems. NumPy doesn’t let you do that.
Usually when I write Matlab code, the vectorized version just works, and if there are any changes needed, they're pretty minor and intuitive. With numpy I feel like I have to look up the documentation for every single function, transposing and reshaping the array into whatever shape that particular function expects. It's not very consistent.
Yeah, in case anyone else has the misfortune of having to work with multi-dimensional data in MATLAB, I'd recommend the Tensor Toolbox, Tensorlab, or the N-way Toolbox.
There are certainly some aspects of it that are inelegant, in the interests of backwards compatibility, but otherwise I don't know what you are talking about. Matlab supports >2d arrays just fine, and has for at least 20 years.
> Now, how do you apply it to particular dimensions of some larger arrays?
What exactly is the complaint here? If I write a function on a 2x2 array, there's no way to apply it to tiles in a 3x2x2 array? But there is - take a slice and squeeze. What's the issue?
Edit: if the issue is some weird version of "why can't I apply my function to an arbitrary set of indices" (like the 0,1 in a 2x2x2 array) I'm at a loss for words (because that's a completely different shape - indeed it's the entire array).
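For reference, both routes exist in NumPy; a small sketch (toy function mine) comparing slice-and-loop with `np.vectorize` and a gufunc-style signature:

    import numpy as np

    def det2(m):
        # a function written for a single 2x2 array
        return m[0, 0] * m[1, 1] - m[0, 1] * m[1, 0]

    stack = np.random.rand(3, 2, 2)

    # slice-and-loop:
    out_loop = np.array([det2(stack[i]) for i in range(stack.shape[0])])

    # generalized over leading dimensions via a signature:
    det2_batched = np.vectorize(det2, signature="(m,n)->()")
    out_vec = det2_batched(stack)

    assert np.allclose(out_loop, out_vec)

(Note that `np.vectorize` is a convenience loop, not a fast path.)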
My main issue with numpy is that it's often unclear what operations will be vectorized or how they will be vectorized, and you can't be explicit about it the way you can with Julia's dot syntax.
There are also lots of gotchas related to types returned by various functions and operations.
A particularly egregious example: for a long time, the class for univariate polynomial objects was np.poly1d. It had lots of conveniences for doing the usual operations on polynomials.
If you multiply a poly1d object P on the right by a complex scalar z0, you get what you probably expect: a poly1d with coefficients scaled by z0.
If however you multiply P on the left by z0, you get back an array of scaled coefficients -- there's a silent type conversion happening.
So:

    P * z0  # gives a poly1d
    z0 * P  # gives an array
I know that this is due to Python's associativity rules and laissez-faire approach to datatypes, but it's fairly ugly to debug something like this!
Another fun gotcha with poly1d: if you want to access the leading coefficient for a quadratic, you can do so with either P.coef[0] or P[2]. No one will ever get these confused, right?
These particular examples aren't really fair because the numpy documentation describes poly1d as a part of the "old" polynomial API, advising new code to be written with the `Polynomial` API -- although it doesn't seem it's officially deprecated and no warnings are emitted when you use poly1d.
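For reference, a quick sketch of that newer API (example mine); coefficients run low-to-high, unlike poly1d:

    from numpy.polynomial import Polynomial

    p = Polynomial([3, 2, 1])   # 3 + 2x + x^2
    print(p(2.0))               # evaluates to 11.0
    print(p.coef[-1])           # leading coefficient, one unambiguous accessor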
Anyway, there are similar warts throughout the library. Lots of gotchas having the shape of silent type conversions and inconsistent datatypes returned by the same method depending on its inputs that are downright nightmarish to debug.

The types returned as you described are exactly what I’d expect and idiomatic for Python. This isn’t a problem with numpy.
The main problem (from my point of view) of the Python data science ecosystem is a complete lack of standardization on anything.
You have ten different libraries that try to behave like 4 other languages and the only point of standardization is that there is usually something like .to_numpy() function. This means that most of the time I was not solving any specific problem related to data processing, but just converting data from a format one library would understand to something another library would understand.
In Julia (a language with its own problems, of course) things mostly just work. The library for calculating uncertainties interacts well with a library handling units, and all this works fine with the plotting library, or libraries solving differential equations, etc. In Python, this required quite a lot of boilerplate.
Nobody has mentioned array-api (and data-apis more generally), which is trying to standardize the way people interact with arrays across the Python ecosystem.

https://github.com/data-apis/array-api

https://data-apis.org/blog/announcing_the_consortium/
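The gist, as a hedged sketch (this uses the array-api-compat helper package; exact entry points may differ): code written against the standard namespace can run on any conforming array library.

    from array_api_compat import array_namespace

    def softmax(x, axis=-1):
        xp = array_namespace(x)  # resolves to numpy, torch, cupy, ... at runtime
        e = xp.exp(x - xp.max(x, axis=axis, keepdims=True))
        return e / xp.sum(e, axis=axis, keepdims=True)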
Sounds like a great idea, but difficult to achieve. The announcement blog post was almost 5 years ago; do you know what the impact of this project has been in practice?
The author brings up some fair points. I feel like I had all sorts of small grievances transitioning from Matlab to Numpy. Slicing data still feels worse on Numpy than Matlab or Julia, but this doesn't justify the licensing costs for the Matlab stats/signal processing toolbox.
The issues presented in this article are mostly related to tensors of rank >2. Numpy was originally just matrices, so it is not surprising that it has problems in this domain. A dedicated library like Torch is certainly better, but Torch is difficult in its own ways. IDK, I guess the author's conclusion that numpy is “the worst array language other than all the other array languages” feels correct. Maybe a lack of imagination on my part.
I thought I'd do something smart: inline all the matrix multiplications into the einsums of the vectorized multi-head attention implementation from the article, and set optimize="optimal" to make use of the optimal matrix chain multiplication algorithm (https://en.wikipedia.org/wiki/Matrix_chain_multiplication) to get a nice performance boost.
This is indeed twice as fast as the vectorized implementation, but, disappointingly, the naive implementation with loops is even faster. Here is the code if someone wants to figure out why the performance is like that: https://pastebin.com/raw/peptFyCw
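For context, a small sketch of the flag in question (shapes arbitrary): einsum can plan a contraction order, and `np.einsum_path` reports what it picked.

    import numpy as np

    A = np.random.rand(64, 128)
    B = np.random.rand(128, 256)
    C = np.random.rand(256, 32)

    out = np.einsum("ij,jk,kl->il", A, B, C, optimize="optimal")
    path, report = np.einsum_path("ij,jk,kl->il", A, B, C, optimize="optimal")
    print(report)  # chosen contraction order and estimated FLOP counts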
My guess is that einsum could do a better job of considering cache coherency when evaluating the sum.

On CPU or GPU?
My main problem with numpy is that the syntax is verbose. I know that from a programming language perspective this may not be considered a drawback (it might even be a strength). But in practice, the syntax is a pain compared to Matlab or Julia, where the code is easier to read and understand, and more consistent with math notation.
The syntax of actual array languages can be beautifully concise and expressive. You can express mathematical formulas in ways that makes sense when you read a single line, and once you get used to the notation and some originally non-intuitive quirks (from a programming background perspective), you can write in very few lines what otherwise would take you several lines of rather ugly, less readable code.
In my view, python+numpy is not actually an array language. Numpy as a library adds vectorization operations to python in order to help with speed. This is different. It does not (intend to) bring the advantages that array language syntax has, even if it was a bit more consistent.
Compare this MATLAB snippet:

    A = [1, 2; 3, 4];
    x = [5; 6];
    y = A * x;

with this uglier version:

    import numpy as np

    A = np.array([[1, 2], [3, 4]])
    x = np.array([[5], [6]])
    y = A @ x