sujayakar (u/sujayakar)

sujayakar commented on GCP Outage status.cloud.google.com/... · Posted by u/thanhhaimai

atonse · 3 months ago

Getting a lot of errors for Claude Sonnet 4 (Cursor) and Gemini Pro.

Nooooo I'm going to have to use my brain again and write 100% of my code like a caveman from December 2024.

sujayakar · 3 months ago

switch to auto mode and it should still work!

sujayakar commented on Data Compression Nerds Hate This One Trick [video] media.ccc.de/v/eh22-8-mor... · Posted by u/doener

sujayakar · 4 months ago

Here's an interesting negative result.

After watching this video, my first thought was whether recent results from columnar compression (e.g. https://docs.vortex.dev/references#id1) applied "naively" like QOI would have good results.

I started with a 1.79MiB sprite file for a 2D game I've been hacking on, and here are the results:

  PNG: 1.79 MiB
  QOI: 2.18 MiB
  BtrBlocks: 3.69 MiB

(Source: https://gist.github.com/sujayakar/aab7b4e9df01f365868ec7ca60...)

So, there's magic to being Quite OK that is more than just applying compression techniques from elsewhere :)

sujayakar commented on Parameter-free KV cache compression for memory-efficient long-context LLMs arxiv.org/abs/2503.10714... · Posted by u/PaulHoule

hinkley · 5 months ago

Will we see models built on b-trees to deal with memory requirements? Have we already?

sujayakar · 5 months ago

Deepseek is already using SSDs for their KV cache: https://github.com/deepseek-ai/3FS

sujayakar commented on Succinct data structures blog.startifact.com/posts... · Posted by u/pavel_lishin

sujayakar · 6 months ago

I really love this space: Navarro's book is an excellent survey.

Erik Demaine has a few great lectures on succinct data structures too: L17 and L18 on https://courses.csail.mit.edu/6.851/spring12/lectures/

sujayakar commented on Why Rust nextest is process-per-test sunshowers.io/posts/nexte... · Posted by u/jicea

amelius · 8 months ago

According to some of these reasons every library should run in its own process too.

sujayakar · 8 months ago

that's roughly what the wasm component model is aiming for!

https://hacks.mozilla.org/2019/11/announcing-the-bytecode-al...

sujayakar commented on Ask HN: Have you ever seen a pathfinding algorithm of this type? blog.breathingworld.com/r... · Posted by u/Farer

sujayakar · 8 months ago

can you specify the algorithm in more detail?

this looks to be solving a different problem than A*, which operates over discrete graphs. this looks to be operating in 2D continuous space instead.

so, what is the algorithm for finding the optimal point on the obstacle's outline for bypass (4)? is it finding the point on the outline nearest the destination?

then, how do you subsequently "backtrack" to a different bypass point on the obstacle if the first choice of bypass point doesn't work out?

there's something interesting here for trying to directly operate on 2D space rather than discretizing it into a graph, but I'm curious how the details shake out.

sujayakar commented on Static search trees: faster than binary search curiouscoding.nl/posts/st... · Posted by u/atombender

sujayakar · 8 months ago

this is unbelievably cool. ~27ns overhead for searching for a u32 in a 4GB set in memory is unreal.

it's interesting that the wins for batching start diminishing at 8. I'm curious then how the subsequent optimizations fare with batch size 8 (rather than 128).

smaller batch sizes are nice since the batch size determines how much request throughput we'd need to saturate this system. at batch size 8, we need 1s / ~30ns * 8 = ~266M searches per second to fully utilize this algorithm.
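
as a quick check of that arithmetic, here's a minimal sketch. it assumes, like the estimate above, that one batch completes every ~30ns; the function name is just made up for illustration.

```python
def searches_per_second_to_saturate(batch_latency_ns: float, batch_size: int) -> float:
    """Searches/s needed to keep the search loop busy.

    Assumption: one full batch completes every `batch_latency_ns` nanoseconds.
    """
    batches_per_second = 1e9 / batch_latency_ns  # 1s expressed in ns / time per batch
    return batches_per_second * batch_size


if __name__ == "__main__":
    for batch_size in (8, 128):
        rate = searches_per_second_to_saturate(30, batch_size)
        print(f"batch size {batch_size}: {rate / 1e6:.0f}M searches/s")
```

under the same assumption, batch size 128 needs 16x the request rate of batch size 8 to stay saturated, which is why the smaller batch is attractive.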

the multithreading results are also interesting -- going from 1 to 6 threads only improves overhead by 4x. curious how this fares on a much higher core count machine.

sujayakar commented on WebGL Fluid Simulation paveldogreat.github.io/We... · Posted by u/ChadNauseam

Mossly · 8 months ago

I'll always have a soft spot for this earlier implementation which at lower resolutions has a kind of cyberpunk netrunner aesthetic, and at higher resolutions an almost ethereal ghostlike quality: https://haxiomic.github.io/projects/webgl-fluid-and-particle...

sujayakar · 8 months ago

I love playing with it at UltraHigh quality and 1 solver iteration. It reminds me of gradually incorporating one ingredient into another when cooking: like incorporating flour into eggs when making pasta.

sujayakar commented on Show HN: Whirlwind – Async concurrent hashmap for Rust github.com/fortress-build... · Posted by u/willothy

conradludgate · 10 months ago

I don't think I'd recommend using this in production. The benchmarks look good, but by immediately waking the waker[0], you've effectively created a spin-lock. They may work in some very specific circumstances, but they will most likely in practice be more costly to your scheduler (which likely uses locks btw) than just using locks

[0]: https://github.com/fortress-build/whirlwind/blob/0e4ae5a2aba...

sujayakar · 10 months ago

+1. I'd be curious how much of a pessimization to uncontended workloads it'd be to just use `tokio::sync::RwLock`.

and, if we want to keep it as a spinlock, I'm curious how much the immediate wakeup compares to using `tokio::task::yield_now`: https://docs.rs/tokio/latest/tokio/task/fn.yield_now.html

sujayakar commented on The PlanetScale vectors public beta planetscale.com/blog/anno... · Posted by u/ksec

tanoku · 10 months ago

These are very relevant questions! Thank you!

We're storing IDs from a ghost column that is created in the table where you're inserting vector data. This works very well in practice and allows updating the value of the vectors in the table, because they're translated into a delete + insert in the vector index by updating the ghost ID.

We have abstracted away the quantization system from the index; for the initial release, vector data is stored in raw blocks, like in the paper. Query performance is good, but disk usage is high. We're actively testing different quantization algorithms to see which ones we end up offering on GA. We're hoping our beta users will help us guide this choice!

Incremental updates and MVCC are _extremely tricky_, for both correctness and performance. As you've surely noticed, the hard thing here is that the original paper is very focused on LSM trees, because it exploits the fact that LSM trees get compacted lazily to perform incremental updates to the posting lists ('merges'). MySQL (and Postgres, and all relational databases, really) are B-tree based, and in-place updates for B-trees are expensive! I think we came up with very interesting workarounds for the problem, but it's quite a bit to drill down into in an HN comment. Please stay tuned for our whitepaper. :)

sujayakar · 10 months ago

looking forward to it!

I'd be curious if y'all end up supporting adding filter attributes to the inverted index that can then be pushed down into the posting list traversal.

for example, a restaurant search app may have (1) an embedding for each restaurant but also (2) a cuisine. then, if a restaurant has `cuisine = Italian`, we'd also store its ghost ID in a `cuisine:Italian` posting list.

at query time, the query planner could take a query like `SELECT * FROM t1 WHERE cuisine = 'Italian' ORDER BY DISTANCE(..)` and emit a plan that efficiently intersects the `cuisine:Italian` posting list with the union of the partitions' posting lists.

this feels to me like a potential strength of the inverted indexing approach compared to graph-based approaches, which struggle with general filtering (e.g. the Filtered-DiskANN paper).
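
to make the filtered plan concrete, here's a rough sketch in Python. everything in it is hypothetical (the dict-of-sets layout, the `filtered_search` name, the toy data) and it says nothing about how PlanetScale actually stores posting lists; it just shows intersecting a `cuisine:Italian` posting list with the union of the probed partitions' posting lists before ranking by distance.

```python
from typing import Dict, List, Sequence, Set, Tuple

Vector = Tuple[float, ...]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance (enough for ranking)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def filtered_search(
    query: Vector,
    probed_partitions: Sequence[int],
    partition_postings: Dict[int, Set[int]],  # partition id -> ghost IDs
    filter_postings: Dict[str, Set[int]],     # e.g. "cuisine:Italian" -> ghost IDs
    filter_key: str,
    vectors: Dict[int, Vector],               # ghost ID -> embedding
    limit: int,
) -> List[int]:
    allowed = filter_postings.get(filter_key, set())
    candidates: Set[int] = set()
    for partition in probed_partitions:
        # Intersect per partition so non-matching IDs never reach the
        # distance computation; the union of these intersections equals
        # the intersection with the union of the partitions.
        candidates |= partition_postings[partition] & allowed
    ranked = sorted(candidates, key=lambda gid: squared_distance(query, vectors[gid]))
    return ranked[:limit]


if __name__ == "__main__":
    partition_postings = {0: {1, 2, 3}, 1: {4, 5}, 2: {6}}
    filter_postings = {"cuisine:Italian": {2, 4, 6}, "cuisine:Thai": {1, 3, 5}}
    vectors = {
        1: (0.0, 0.0), 2: (1.0, 0.0), 3: (0.0, 1.0),
        4: (2.0, 2.0), 5: (3.0, 3.0), 6: (9.0, 9.0),
    }
    print(filtered_search((0.0, 0.0), [0, 1], partition_postings,
                          filter_postings, "cuisine:Italian", vectors, limit=10))
```

a real index would keep the posting lists sorted (or as bitmaps) so the intersection can skip ahead instead of materializing sets, but the shape of the plan is the same.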