Readit News
totalperspectiv commented on The Impossible Optimization, and the Metaprogramming to Achieve It   verdagon.dev/blog/impossi... · Posted by u/melodyogonna
omnicognate · 2 months ago
The language here is Mojo, which the article seems to assume you know: it doesn't say enough for you to deduce it until halfway through, after multiple code examples. I don't know how you're supposed to know this, as even the blog it's on is mostly about Vale. From the intro I was expecting it to be about C++.
totalperspectiv · 2 months ago
The author works for Modular. He shared the write-up on the Mojo Discord. I think Mojo users were the intended audience.
totalperspectiv commented on Removing newlines in FASTA file increases ZSTD compression ratio by 10x   log.bede.im/2025/09/12/zs... · Posted by u/bede
bede · 3 months ago
Thanks for reminding me to benchmark this!
totalperspectiv · 3 months ago
I've only tested this when writing my own parser, where I could skip the record-end checks, so idk if this improves perf on an existing parser. Excited to see what you find!
totalperspectiv commented on Removing newlines in FASTA file increases ZSTD compression ratio by 10x   log.bede.im/2025/09/12/zs... · Posted by u/bede
totalperspectiv · 3 months ago
Removing the line-wrapping newlines from the FASTA/FASTQ convention also dramatically improves parsing perf, since you don't have to do as much lookahead to find record ends.
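For illustration (a sketch, not from the thread): with unwrapped FASTA every record is exactly two lines, so the parser can jump record to record; with conventional wrapped FASTA it must scan line by line for the next `>` header to know where a sequence ends.

```python
def parse_unwrapped(text):
    """Parse FASTA where each sequence sits on one line:
    every record is exactly (header line, sequence line)."""
    lines = text.splitlines()
    records = []
    for i in range(0, len(lines), 2):  # stride straight to the next record
        header = lines[i]
        assert header.startswith(">"), "expected a header line"
        records.append((header[1:], lines[i + 1]))
    return records


def parse_wrapped(text):
    """Parse conventional line-wrapped FASTA: a sequence ends only
    when the next '>' (or EOF) is found, so every line must be
    inspected to locate record boundaries."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

Both produce the same records for equivalent input; the unwrapped variant just never has to look ahead.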
totalperspectiv commented on Removing newlines in FASTA file increases ZSTD compression ratio by 10x   log.bede.im/2025/09/12/zs... · Posted by u/bede
semiinfinitely · 3 months ago
FASTA is a candidate for the stupidest file format ever invented and a testament to the massive gap in perceived vs actual programming ability of the average bioinformatician.
totalperspectiv · 3 months ago
> a testament to the massive gap in perceived vs actual programming ability of the average bioinformatician.

This is not really a fair statement. Literally all software bears the weight of some early poor choice that keeps getting carried forward by sheer momentum. The FASTA and FASTQ formats are exceptionally dumb, though.

totalperspectiv commented on Matmul on Blackwell: Part 2 – Using Hardware Features to Optimize Matmul   modular.com/blog/matrix-m... · Posted by u/robertvc
whimsicalism · 4 months ago
jw: why do you use mojo here over triton or the new pythonic cute/cutlass?
totalperspectiv · 4 months ago
Because I was originally writing some very CPU-intensive SIMD stuff, which Mojo is also fantastic for. Once I got that working and running nicely, I decided to try getting the same algo running on a GPU since, at the time, they had just open-sourced the GPU parts of the stdlib. It was really easy to get going with.

I have not used Triton/CuTe/CUTLASS though, so I can't compare against anything other than CUDA, really.

totalperspectiv commented on Matmul on Blackwell: Part 2 – Using Hardware Features to Optimize Matmul   modular.com/blog/matrix-m... · Posted by u/robertvc
whimsicalism · 4 months ago
Yes, it looks like they have some sort of metaprogramming setup (nicer than C++) for doing this: https://www.modular.com/mojo
totalperspectiv · 4 months ago
I can confirm, it’s quite nice.
totalperspectiv commented on Matmul on Blackwell: Part 2 – Using Hardware Features to Optimize Matmul   modular.com/blog/matrix-m... · Posted by u/robertvc
subharmonicon · 4 months ago
The blog post is about using an NVIDIA-specific tensor core API that they have built to get good performance.

Modular has been pushing the notion that they are building technology that allows writing HW-vendor neutral solutions so that users can break free of NVIDIA's hold on high performance kernels.

From their own writing:

> We want a unified, programmable system (one small binary!) that can scale across architectures from multiple vendors—while providing industry-leading performance on the most widely used GPUs (and CPUs).

totalperspectiv · 4 months ago
They let you write a kernel for NVIDIA, or AMD, that takes full advantage of the hardware of either one, then throw in a compile-time if-statement to switch which kernel to use based on the hardware available.

So you can support either vendor with as-good-as-vendor-library performance. That's not lock-in, to me at least.

It's not as good as the compiler being able to just magically produce optimized kernels for arbitrary hardware, though; fully agree there. But it's a big step forward from CUDA/HIP.
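A rough sketch of that dispatch pattern, in Python and with hypothetical names (Python has no compile-time branching; in Mojo the branch is resolved during compilation, so only the matching kernel ends up in the binary):

```python
def nvidia_matmul(a, b):
    """Hypothetical kernel tuned for NVIDIA hardware (plain matmul here)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def amd_matmul(a, b):
    """Hypothetical kernel tuned for AMD hardware (plain matmul here)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def detect_vendor():
    """Stand-in for querying the GPU at hand; hardcoded for this sketch."""
    return "nvidia"


# In Mojo this selection would happen at compile time; in Python the
# closest analogy is picking the implementation once, at import time.
matmul = nvidia_matmul if detect_vendor() == "nvidia" else amd_matmul
```

The point of the pattern is that the per-vendor tuning lives inside each kernel, while the call site stays vendor-neutral.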

totalperspectiv commented on Matmul on Blackwell: Part 2 – Using Hardware Features to Optimize Matmul   modular.com/blog/matrix-m... · Posted by u/robertvc
saagarjha · 4 months ago
Is anyone using Modular? Curious how you find it compares against the competitors in this space.
totalperspectiv · 4 months ago
I have used Mojo quite a bit. It’s fantastic and lives up to every claim it makes. When the compiler becomes open source I fully expect it to really start taking off for data science.

Modular also has a paid platform for serving models called MAX. I've not used that, but I've heard good things.

totalperspectiv commented on Matmul on Blackwell: Part 2 – Using Hardware Features to Optimize Matmul   modular.com/blog/matrix-m... · Posted by u/robertvc
subharmonicon · 4 months ago
TLDR: In order to get good performance you need to use vendor-specific extensions that result in the same lock-in Modular has been claiming they will enable you to avoid.
totalperspectiv · 4 months ago
I don’t follow your logic. Mojo can target multiple GPU vendors. What is the Modular-specific lock-in?
totalperspectiv commented on Optimising for maintainability – Gleam in production at Strand   gleam.run/case-studies/st... · Posted by u/Bogdanp
0cf8612b2e1e · 4 months ago
For Gleam/Erlang, is there an easy way to package up an executable you can distribute without also shipping Erlang?
totalperspectiv · 4 months ago
I can't speak to Gleam, but for Elixir I just used Burrito to create a single executable: https://github.com/burrito-elixir/burrito I think it works for just Erlang too.

u/totalperspectiv · Karma: 992 · Cake day: November 7, 2016