At the risk of shedding it to bikes, one point that the author makes is that Zig's lack of operator overloading makes him write vector math like this:
    if (discriminant > 0.0) {
        // I stared at this monster for a while to ensure I got it right
        return uv.sub(n.mul(dt)).mul(ni_over_nt).sub(n.mul(math.sqrt(discriminant)));
    }
He signs off with:
> How do C programmers manage?
The answer is simple: we assign names to intermediate results. Now, I have absolutely no idea what that expression computes, because I suck at graphics programming and math in general. Please pretend that these are proper mathy terms:
    if (discriminant > 0.0) {
        const banana = uv.sub(n.mul(dt));
        const apple = banana.mul(ni_over_nt);
        const pear = n.mul(math.sqrt(discriminant));
        return apple.sub(pear);
    }
I'm convinced that there's a proper mathy/lighting-y word for each component in that expression. Of course this approach totally breaks down if you're copying expressions from papers or articles without understanding why they are correct (which is how I do all my graphics programming). I do find that naming variables is often a great way to force myself to grok what's going on.
> I’m convinced that there’s a proper mathy / lighting-y word for each component in that expression.
Sometimes yes, but often no. Frequently this kind of expression is the result of solving an equation, so it’s just an expression.
Graphics people often use two approaches for sub-expressions:
- You can name them with the same letters that are in the expression, just with the punctuation & operators removed, for example:
    const uvMinusNTimesDt = uv.sub(n.mul(dt));
- Alternatively, just like with equations, math people often freely assign single letter names to variables without worrying about semantic meaning.
    const q = uv.sub(n.mul(dt));
Nothing really wrong with naming sub-expressions after fruits, or single letters, or spelling them out explicitly.
It might be worth reflecting on what your goals are when naming, and on whether it matters what things are named. As software engineers, our biases lean toward choices that improve readability and maintainability. But for a specific equation that will never change once it works correctly, our preconceived notions about good software design and best practices might not actually apply. It might be more important to document the source of the equation than to make the implementation readable.
People say “the two hardest problems in computer science are cache invalidation, naming, and off-by-one errors”, and we often design systems so that we don’t have to think about cache invalidation and off-by-one errors (e.g. list.map rather than a for loop). I often wonder if we should think about minimizing how often we name things as well.
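To make that concrete, here's a tiny Rust sketch (the prices and the tax computation are made up for illustration): the iterator version has no index to bound-check and no loop counter to name.

```rust
fn main() {
    let prices = [100u32, 250, 40];

    // Indexed loop: we have to name an index and get its bounds right.
    let mut with_tax = Vec::new();
    for i in 0..prices.len() {
        with_tax.push(prices[i] * 110 / 100);
    }

    // Iterator version: no index to get wrong, no counter to name.
    let with_tax2: Vec<u32> = prices.iter().map(|p| p * 110 / 100).collect();

    assert_eq!(with_tax, with_tax2);
    println!("{:?}", with_tax2); // [110, 275, 44]
}
```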
Some time ago I stopped solving equations by hand and started using software for that. Nowadays, when I need them solved, I sometimes don’t even write code; here’s an example: https://stackoverflow.com/questions/1351746/find-a-tangent-p...
Symbolic math software does more “optimizations” than even offline compilers. For example, I have never seen an optimizer that is aware that sin(x)^2 + cos(x)^2 = 1.
In my experience writing math code, the intermediate values get quite goofy names, e.g. orthogonal-vector-to-plane-bisecting-input-vector-and-first-column-vector.
But I'd rather use crazily descriptive names than hide the meaning away. Otherwise the code gets quite incomprehensible when you reread it three months later.
Anyone else hit this problem? I suspect most people just reference a paper or book and use the letters to match the source ( x/y/n/m/etc. )
I think people's brains must work differently, because for me this would be a terrible way to do it. I simply cannot read and comprehend math with very long descriptive variable names.
Whenever I see people do that, I have to write down the equation with single letters, and then look at it.
I am more of a literate programming type of person. I prefer writing longer explanations of code. But usually as a header. I like keeping the core of the code as clean and noise free as possible.
So I write code more like a math or physics book I guess. The equations are kept simple and clutter free, and then there is a body of text above or below explaining how to think about it. I tend to prefer using a lot of unicode, because following conventions helps me a lot. If I see a t₀, t and Δt variable e.g. I immediately get a sense of what sort of variables this is and how they are related. If instead it said start_time_of_incident, current_time and time_difference_between_events I could not quickly parse and internalize that.
> I suspect most people just reference a paper or book and use the letters to match the source ( x/y/n/m/etc. )
If you do this (and I'll certainly admit that I do this as well) make sure you link to the paper in question somewhere in either comments or the documentation so that future developers can find the canonical reference to what the variables mean.
That assumes there is a geometrically meaningful description. A lot of times a computation has no intrinsic meaning. Algebra is going to rearrange and cancel terms into nonsense.
A long name can tell you _what_ a value is. But it’s of zero use in explaining _why_ it’s being used.
The only solution is a lengthy comment explaining the process. Here's some ballistic trajectory code I wrote a while back: https://github.com/forrestthewoods/lib_fts/blob/26a0d115eb44... Good luck making sense of any single term!
Yep, agree! I think extremely long variable names (when used locally), if necessary, are wildly underrated. I mean, we all know horror stories like SimpleBeanFactoryAwareAspectInstanceFactory, but they are really design problems much more than naming problems. They gave long variables a bad rep, undeservedly so. Inside an expression, names like that truly do wonders.
So why not just write the original formula into the comment above the code? Maybe it's not the purest approach, but it sure would help parsing the code in this case.
Yep. If the code doesn't clearly show the original intent, document your intent in a comment next to the code so future you or others can double check. Esp in gfx engines where some blocks are just math translated one to one to code.
(This is partly a joke and partly a serious suggestion; there are definitely people out there who would find that easier to write, especially if the IDE rendered it properly. I stole it by searching for ni_over_nt and finding http://viclw17.github.io/2018/08/05/raytracing-dielectric-ma... )
The most principled approach there, I think, would be to build a little Expression data-structure, and then feed it to an evaluation routine, trusting the optimizer to compile the whole thing down to something efficient. If Rust didn't have operator overloading anyway, you could do the job with procedural macros.
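As a rough illustration of that expression-tree idea (the `Expr` type, `eval`, and the sample inputs below are all my own invention, not anything from the article): every intermediate result is represented explicitly in the tree, without having to name any of them.

```rust
// A tiny expression tree over 3-component vectors and scalars.
enum Expr {
    Vec3([f64; 3]),
    Scalar(f64),
    Sub(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>), // component-wise; scalars broadcast
}

fn eval(e: &Expr) -> [f64; 3] {
    match e {
        Expr::Vec3(v) => *v,
        Expr::Scalar(s) => [*s; 3], // broadcast scalar to all lanes
        Expr::Sub(a, b) => {
            let (a, b) = (eval(a), eval(b));
            [a[0] - b[0], a[1] - b[1], a[2] - b[2]]
        }
        Expr::Mul(a, b) => {
            let (a, b) = (eval(a), eval(b));
            [a[0] * b[0], a[1] * b[1], a[2] * b[2]]
        }
    }
}

fn main() {
    use Expr::*;
    // (uv - n*dt) * ni_over_nt, with made-up inputs
    let uv = Vec3([1.0, 2.0, 3.0]);
    let n = Vec3([0.0, 1.0, 0.0]);
    let expr = Mul(
        Box::new(Sub(Box::new(uv), Box::new(Mul(Box::new(n), Box::new(Scalar(0.5)))))),
        Box::new(Scalar(2.0)),
    );
    println!("{:?}", eval(&expr)); // [2.0, 3.0, 6.0]
}
```

Whether an optimizer really compiles the tree walk down to the straight-line math is the open question; with operator overloading (or procedural macros) you'd build the same tree at compile time instead.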
In practice, if I had to write that quasi-monstrosity in something like C, I'd probably just comment it as clearly and liberally as possible, to the point where I manage to reassure the reader that the final expression is correct; and then add a "// See above: please DO NOT edit this expression directly!" comment as an extra caution.
Why would you need to do that with small vectors? The compiler is going to do that anyway. Eigen originally took this approach because it could avoid heap-allocating temporary variables before C++ had move operations. That isn't applicable here.
The fruity method should be preferred: any compiler worth using will immediately eliminate or register-allocate the variables, and you don't end up with undebuggable spaghetti code.
This is how I've been helping my daughter understand some maths problems at school. Break it down and name everything, then write the sum with the names in and it makes sense.
> But rendering in separate threads turned out to be (unsurprisingly) harder than the way I would do it in C++...It was a bit frustrating to figure out how to accomplish this. Googling yielded a few stack overflow posts with similar questions, and were answered by people basically saying use my crate!
Based on some discussion in r/rust (https://www.reddit.com/r/rust/comments/c7t5za/writing_a_smal...) I went ahead and added a Rayon-based answer to that SO question (https://stackoverflow.com/a/56840441/823869). That's been the de facto standard for data parallelism in Rust for the last few years. But the article highlights that discovering the de facto standards is still a challenge for new Rust users -- does anyone know of a well-maintained list of the 10-20 most critical crates that new users should familiarize themselves with after reading The Book? Things like Rayon and lazy_static. The ranked search results at https://crates.io/crates?sort=recent-downloads are almost good enough, but they include a lot of transitive dependencies that new users shouldn't care about. (E.g. `regex` is a very important crate, but `aho-corasick` is usually only downloaded as a dependency of `regex`.)
> But the article highlights that discovering the de facto standards is still a challenge for new Rust users -- does anyone know of a well-maintained list of the 10-20 most critical crates that new users should familiarize themselves with after reading The Book?
I came across this exact thing recently: https://github.com/brson/stdx
I will say, the one example they have there which is sort-of analogous to "render each pixel of this image in parallel" is the "draw a julia set" one [0], and it's a very bad way of convincing a C/C++ programmer that Rust is good at this sort of thing. Even if the "loop over all rows in the main thread, adding to the pool a lambda that loops over each column" is somehow optimized in a good data-parallel way (I doubt it compares favorably performance-wise with "#pragma omp parallel for"), the lambdas then push each finished pixel into a channel along with their coordinates. The main thread then has to literally loop through every pixel and read from the channel for each and every one.
The natural way to do that in C/C++ is to just write the pixel to the bitmap in each thread. There are no race conditions here (everything is embarrassingly parallel); just write the resulting pixel to the bitmap and be done with it. The only reason to have that channel with all that overhead (and that final synchronization on the main thread) is to satisfy the borrow checker, which is just silly in this case. It adds a tremendous amount of overhead just to make it idiomatic Rust.
It's true that you can do it the "C++ way" in Rust using unsafe and raw pointers (and there's probably crates that can do the "parallel for" in a way that compares well with OpenMP), but as a graphics programmer who's done a lot of this sort of thing, that piece of code made a very bad first impression of Rust as a high-performance language.
EDIT: also, the description is wrong. It says: "ThreadPool::execute receives each pixel as a separate job". No it doesn't, it receives each scanline as a separate job. It might be better if the pool had each pixel as a separate job, but that's not what the code is doing.
[0]: https://rust-lang-nursery.github.io/rust-cookbook/concurrenc...
I still don't understand why Rayon is not part of the Rust standard library.
Rust was created to make multithreading on the CPU easy and safe, and Rayon is the clear winner in this space; to me it feels like something that should be part of Rust.
The Rust stdlib is specifically intended to be as lightweight as possible (while still providing the idiomatic abstractions that might be needed throughout the ecosystem, e.g. `std::future`) in order to avoid Python's "dead batteries" problem.
What the Rust ecosystem is still lacking is a quasi-standard "Rust Platform" of best-practice library components where the community can freely deprecate something when a clearly better replacement comes along. There are a few "community" websites that try to provide guidance wrt. some parts of the Rust ecosystem, but nothing that feels even close to official or consensus-driven.
If Rayon were in the standard library, we would never be able to make breaking changes, whereas in a crate we can possibly release a Rayon 2.0. For instance, we might want to make some low-level changes in the way iterators split their workload, per current experiments in rayon-adaptive: https://github.com/rayon-rs/rayon/issues/616
What a delightful post. Author wrote a very nice description of things they learned from a little weekend project. No preaching or ranting or opinionating. Just “I did a thing and here’s what I learned”. That’s easily my favorite type of blog post.
> I wrapped my objects in atomic reference counters, and wrapped my pixel buffer in a mutex
Rust people, is there a way to tell the compiler that each thread gets its own elements? Do you really have to either (unnecessarily) add a lock or reach for unsafe?
It's a library, so only half an answer to your question, but there's a fantastic library called Rayon, created by one of the core contributors to the Rust language itself, Niko Matsakis. It lets you use Rust's iterator API to do extremely easy parallelism:
    list.iter().map(<some_fn>)
becomes:
    list.par_iter().map(<some_fn>)
Seeing as, in the original example code, the final copies into minifb have to be sequential because of the lock anyway, all the usage of synchronization primitives (and in fact the whole loop) could be replaced with something like:
    let rendered: Vec<_> = buffers.par_iter().map(<rendering>).collect();
    for buffer in rendered.iter() {
        // The copy from the article
    }
I've not written much Rust in a while, so maybe the state of the art is different now, but there are a lot of ways to avoid having to reach specifically for synchronization primitives.
If you want to use completely safe Rust, you could probably get the Vec<u32> as a `&mut [u32]`, then use `.split_at()` on the slice to chop up the buffer into multiple contiguous sub-pieces for each thread. Collect up those pieces behind a struct for easier usage. It would cost you an extra pointer + length for each subpiece, but that's the price for guaranteeing that no thread reaches outside the contiguous intervals assigned to it.
EDIT: As mentioned by a sibling, `chunks_mut` [0] is probably closer to what you want in this instance. If you have to get chunks of various sizes -- for instance, if the number of threads doesn't evenly divide the buffer into nice uniform tiles -- you'd need to drop down to the `split_at` level anyway.
[0]: https://doc.rust-lang.org/std/primitive.slice.html#method.ch...
> Rust people, is there a way to tell the compiler that each thread gets its own elements?
That's what `local_pixels` does in the post. Where things get trickier is when you want to share write access to a single shared buffer in a non-overlapping way (e.g. `buffer` in the post). To do this you need to either resort to unsafe, or prove to the compiler that the writes aren't overlapping. One way to do the latter is to get a slice (which Vec is convertible into), split up that slice (the standard library has plenty of methods for this: https://doc.rust-lang.org/std/slice/index.html ), and then give each thread those non-overlapping slices.
Yes, the standard library has many methods for splitting up a single mutable slice into multiple non-overlapping mutable slices. There's split_at_mut(), which just splits at an index, or split_mut(), which splits using a predicate, or chunks_mut(), which gives you an iterator over non-overlapping mutable subslices of a given length, and more.
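As a sketch of how those pieces fit together for the parallel-rendering case (this uses `std::thread::scope`, which was stabilized in Rust 1.63, long after this discussion; the "render" step is a stand-in):

```rust
use std::thread;

fn main() {
    let width = 4;
    let height = 4;
    let mut buffer = vec![0u32; width * height];

    // Split the buffer into non-overlapping scanline chunks; each thread
    // gets exclusive &mut access to its own row, so no lock or unsafe is needed.
    thread::scope(|s| {
        for (row, chunk) in buffer.chunks_mut(width).enumerate() {
            s.spawn(move || {
                for (col, px) in chunk.iter_mut().enumerate() {
                    *px = (row * width + col) as u32; // stand-in for "render pixel"
                }
            });
        }
    });

    // All scoped threads have joined here, and buffer is usable again.
    println!("{:?}", &buffer[..4]); // [0, 1, 2, 3]
}
```

Back when this thread was written, the same shape was achieved with crossbeam's scoped threads; the borrow checker accepts it because each `&mut` chunk provably doesn't overlap the others.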
> The ability to return values from if expressions and blocks is awesome and I don’t know how I’ve managed to live up until now without it. Instead of conditionally assigning to a bunch of variables it is better to return them from an if expression instead.
The example shown (aside from the fact it's assigning a tuple, which is a different point) would naturally be a ternary in C/C++. Does the awesomeness kick in for more complicated examples where you want intermediate vars etc in the two branches?
It also generalizes beyond two branches: switch in Zig and match in Rust are expressions too (https://ziglang.org/documentation/master/#switch and https://doc.rust-lang.org/reference/expressions/match-expr.h...). Compared to a ternary:
* you can more easily handle more than 2 options
* you can more easily have a bunch of code in each case
* you don't have a new syntax for the compiler or programmer to have to know about; everything is just consistently an expression
It is nice, and those of us who say so are usually aware of the various alternative workarounds with ternaries etc.
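For what it's worth, a minimal Rust sketch of the pattern (the variable names and values are made up):

```rust
fn main() {
    let discriminant = 1.5_f64;

    // `if` is an expression: both variables are initialized exactly once,
    // and the compiler rejects a branch that forgets one of them.
    let (refracted, attenuation) = if discriminant > 0.0 {
        (true, 1.0)
    } else {
        (false, 0.5)
    };

    // `match` scales the same idea past two cases.
    let label = match (refracted, attenuation > 0.9) {
        (true, true) => "refracted, bright",
        (true, false) => "refracted, dim",
        (false, _) => "reflected",
    };

    println!("{}", label); // prints "refracted, bright"
}
```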
Really cool post. I always dreamed of writing a small realtime, raytraced game, with lights etc. Basically a roguelike in 3D. I never managed to finish it, and this project reminds me of it.
I really don't think these languages should be set in opposition to one another as much as they are. I mean, I get why: they occupy almost the same niche, take quite different views on safety and language design, and both are trying to grow and compete.
But still, I think there is space for both of them.
I think it would be really cool to spend my working hours programming in Rust for safety, and to switch to Zig whenever I have to write an unsafe block (all of it compiling to WebAssembly :o)
In my mind/opinion, Rust is a potential replacement for C++, while Zig is a potential replacement for C, and both have their place in the world.
Rust feels more restrictive, but that may be the right approach for building large software projects with big, "diverse" (in terms of skill level) teams.
Zig is smaller and feels more nimble, and might be better suited for smaller teams working on smaller projects, and for getting results faster, while avoiding most of C's darker corners.