I'll take this opportunity to point out that if you're doing anything NumPy-related that seems too slow, you should run Numba on it. In my case we were doing a lot of cosine distance calculations, and our inference time sped up 10x simply by running the cosine distance function from NumPy through Numba. It's as easy as adding a decorator.
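For reference, a minimal sketch of what that looks like (my own toy version, not our production code; the explicit loop is deliberately un-NumPy-like because Numba compiles loops to fast machine code):

    import numpy as np
    from numba import njit

    @njit(cache=True)
    def cosine_distance(u, v):
        # Explicit loop: slow in pure Python, fast once Numba JIT-compiles it.
        dot = 0.0
        nu = 0.0
        nv = 0.0
        for i in range(u.shape[0]):
            dot += u[i] * v[i]
            nu += u[i] * u[i]
            nv += v[i] * v[i]
        return 1.0 - dot / np.sqrt(nu * nv)

    a = np.random.rand(512)
    b = np.random.rand(512)
    print(cosine_distance(a, b))  # first call compiles, later calls are fast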
Taichi vs. Numba: As its name indicates, Numba is tailored for NumPy. Numba is recommended if your functions involve vectorized operations on NumPy arrays. Compared with Numba, Taichi enjoys the following advantages:
Taichi supports multiple data types, including struct, dataclass, quant, and sparse, and allows you to adjust memory layout flexibly. This feature is extremely desirable when a program handles massive amounts of data. Numba, however, performs best only when dealing with dense NumPy arrays.
Taichi can call different GPU backends for computation, making large-scale parallel programming (such as particle simulation or rendering) as easy as winking. But it would be hard even to imagine writing a renderer in Numba.
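For anyone who hasn't seen it, a minimal Taichi kernel looks roughly like this (my sketch, not from the article; ti.init falls back to CPU if no GPU backend is available):

    import taichi as ti

    ti.init(arch=ti.gpu)  # picks CUDA/Vulkan/Metal if present, otherwise falls back to CPU

    n = 1024
    pixels = ti.field(dtype=ti.f32, shape=(n, n))

    @ti.kernel
    def fill():
        for i, j in pixels:  # outermost loop is automatically parallelized on the backend
            pixels[i, j] = (i + j) / (2 * n)

    fill()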
Except some people don't read the article, and anyone who already assumes NumPy is "very" optimized might gloss over that line without reading much into it. That line also doesn't say that you might get a 10x speed-up using Numba. I remember when I first came across Numba: I searched HN for references and didn't find many stories or comments praising it, so I skipped it initially. Having HN comments might be useful for future HN'ers.
There is Numba, and then there is Nuitka if you want to compile to a binary. I'm not sure the two work together. But Taichi may work with Nuitka for runtime optimization as a binary.
The equivalent of Nuitka for numerical code is Pythran, which compiles Python to highly optimized C++ code. I have been getting the best speedups with the fewest changes when using it, compared to Numba or Cython (haven't tested Taichi yet).
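For the curious, Pythran usage is roughly this (a sketch under my assumptions; the export comment declares the argument types, and the file compiles ahead of time to a native extension module):

    # cosdist.py -- compile with: pythran -O3 -march=native cosdist.py
    # pythran export cosine_distance(float64[], float64[])
    import numpy as np

    def cosine_distance(u, v):
        # Ordinary NumPy code; Pythran translates it to C++ at compile time.
        return 1.0 - np.dot(u, v) / np.sqrt(np.dot(u, u) * np.dot(v, v))

After compiling, "import cosdist" picks up the native module instead of the pure-Python file.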
Do jax/tensorflow/pytorch work with numba? I.e. can you pass one of their arrays through a numba function and have it (a) not crash (b) support backprop?
1) "Diffussion" is species vs time equals species spatial laplacian.
2) The "reaction" equations are non-painfully derived from Baez stochastic Petri nets/chemical reaction networks in [1] (species vs time = multivariate polynomial in species, "space dependant rate equation")
So Reaction-Diffusion is just adding up. Species vs time = species spatial laplacian plus multivariate polynomial in species. One more for the toolbox!
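In symbols, my reading of the above (u_i are the species concentrations, D_i the diffusion coefficients, and f_i the polynomial coming from the reaction network):

    % reaction-diffusion: rate of change = diffusion term + reaction term
    \frac{\partial u_i}{\partial t} = D_i \nabla^2 u_i + f_i(u_1, \ldots, u_n)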
It's just alternating between sharpening and smoothing. Sharpening hallucinates new information; smoothing diffuses and erases it. So there is this constant interplay, with new hallucinations laid down over previous ones.
Mathematicians and biologists have a hammer, so everything looks like a nail.
I'm interested in understanding this comment but I don't know where to start. I love the way Reaction Diffusion simulations look and I've coded it up a few times. But I don't understand what you mean by "species vs time". (Some of the other technical language seems more Google-able, but "species vs time" isn't turning up anything obvious.)
I think GP is referring to the heat equation. "Species" is the concentration of the "stuff" that you're describing mathematically. Call that u(x, t), where x is a spatial coordinate, t is time, and u is a real-valued function.
Then the diffusive part says
du(x, t)/dt = \nabla_x^2 u(x, t).
The \nabla^2 term is the Laplacian: a multivariate form of the second derivative.
The equation says that a short time from now, u(x, t) will change in proportion to the average value of u, calculated over a small ball surrounding the point x, minus the value of u at the point x itself.
If there's less "stuff" in the points that neighbour x than at x itself, the function will decrease over time. Similarly if there's more stuff at the neighbours of x, u(x, t) will increase. This is the basis of diffusive behaviour.
(Edit: I think the equation in the article is wrong, unless I've misunderstood something: they have a delta (first derivative) when they should have a nabla (laplacian))
An extremely interesting area. I keep wanting to use it for something but haven't had a good use case yet, nor frankly do I think I really understand it.
This code is recursive and generates set partitions for large N values (N larger than 12). It essentially works by skipping small partitions and small subsets to target desirable set partitions; solutions that don't skip those suffer from combinatorial explosion.
I did not write this code. I want to test it later with Taichi, but I'm curious whether Taichi can run it faster.
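The SO link above is truncated, so this is only a toy sketch of the idea as I understand it: recursive partition generation that prunes any branch whose blocks can no longer all reach a minimum size k.

    def partitions_min_size(items, k):
        # Assign each item to an existing block or a new block, pruning branches
        # where the remaining items can't bring every block up to size k.
        def rec(i, blocks):
            remaining = len(items) - i
            deficit = sum(max(0, k - len(b)) for b in blocks)
            if deficit > remaining:
                return  # prune: too many undersized blocks left to fill
            if i == len(items):
                yield [list(b) for b in blocks]
                return
            x = items[i]
            for b in blocks:
                b.append(x)
                yield from rec(i + 1, blocks)
                b.pop()
            blocks.append([x])
            yield from rec(i + 1, blocks)
            blocks.pop()
        yield from rec(0, [])

    for p in partitions_min_size([1, 2, 3, 4], 2):
        print(p)  # only partitions whose blocks all have at least 2 elements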
Slightly off topic but the choice of name is interesting given that Tai chi is well-known for its slow movements and being practiced by the elderly at the park.
I practice tai chi and I'm not elderly (though not exactly young either :-) ). Tai chi is actually very hard to do well because it requires a lot of flexibility (I mean a lot).
You should see what tai chi lessons in Chinese colleges look like: full of students who don't like any kind of sport but must choose a PE class. And yes, I was one of them.
Unfortunately, Pythran is missing from the comparison. Pythran works in a lot of cases and is easy to use: you just declare Python types. I would like to see a comparison with Taichi, as Taichi also seems interesting.
What do you mean, disappointing? I have consistently been getting the best results with Pythran. That said, it is strongly focused on numerical code, so your mileage may vary for other code. You should also add compiler optimisation flags to get the best performance.
Regarding Nuitka, AFAIK its goal is not speed-up, and performance gains are pretty modest in most cases.
I thought it was about a parsing issue in Python when doing "import taichi as ti" vs. "import taichi". No, it's just presenting Taichi, a Python package for parallel computation.
EDIT: title of the thread was "Accelerate Python code 100x by import taichi as ti" like TFA
Me too - it wouldn't be unheard of in a language where referencing multiple.levels.of.variable in a loop is orders of magnitude slower than doing "a = multiple.levels.of.variable" outside the loop and referencing a inside of it.
*may have been fixed in recent versions of Python - I heard of this many years ago!
Isn’t that expected behaviour, as you’re only looking up “a” once when you do it outside the loop, while doing it every time when inside the loop?
Because any reference in the whole hierarchy could change during the looping (e.g. one could say “multiple.levels = {}” at some point), the interpreter really would need to check it every time unless it can somehow “prove” that these changes will never happen / haven’t happened.
Just keeping a reference to “a” is semantically very different, and I’d consider that a normal optimisation.
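A quick way to see the effect being discussed (my own sketch; the gap is real in CPython, though how large depends on the version):

    import timeit

    class Levels:
        pass

    multiple = Levels()
    multiple.levels = Levels()
    multiple.levels.of = Levels()
    multiple.levels.of.variable = 42

    def lookup_in_loop():
        total = 0
        for _ in range(1_000_000):
            total += multiple.levels.of.variable  # three attribute lookups per iteration
        return total

    def hoisted():
        a = multiple.levels.of.variable  # looked up once, before the loop
        total = 0
        for _ in range(1_000_000):
            total += a
        return total

    print(timeit.timeit(lookup_in_loop, number=1))
    print(timeit.timeit(hoisted, number=1))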
https://en.m.wikipedia.org/wiki/Nuitka
1) "Diffussion" is species vs time equals species spatial laplacian.
2) The "reaction" equations are non-painfully derived from Baez stochastic Petri nets/chemical reaction networks in [1] (species vs time = multivariate polynomial in species, "space dependant rate equation")
So Reaction-Diffusion is just adding up. Species vs time = species spatial laplacian plus multivariate polynomial in species. One more for the toolbox!
[1] https://arxiv.org/abs/1209.3632
Mathematicians and biologists have a hammer, so everything looks like a nail.
How much faster can the code in those SO answers be?
https://stackoverflow.com/questions/73473074/speed-up-set-pa...
Was Nuitka better? Pythran is quite simple to install and use in Jupyter.