https://www.joelonsoftware.com/2000/04/06/things-you-should-...
The analogy I prefer when teaching attention is celestial mechanics. Tokens are like planets in latent space. The attention mechanism is a kind of "gravity": each token pushes and pulls the others around to refine their meaning. But instead of distance and mass, this gravity is proportional to semantic inter-relatedness, and instead of physical space it plays out in a latent space.
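The analogy can be made concrete with a minimal sketch: one step of (simplified, weight-free) scaled dot-product attention, where each token moves toward a similarity-weighted average of the others. The shapes and update rule here are illustrative only, not from any particular model.

```python
import numpy as np

def attention_step(tokens):
    """tokens: (n, d) array of token embeddings -- the "planets"."""
    d = tokens.shape[1]
    # "Gravity" strength: dot-product similarity instead of mass/distance.
    scores = tokens @ tokens.T / np.sqrt(d)        # (n, n)
    # Row-wise softmax turns similarities into attention weights.
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    # Each token is pulled toward the weighted average of all tokens.
    return weights @ tokens

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))   # five tokens in an 8-dim latent space
y = attention_step(x)
```

Real transformers add learned query/key/value projections and a residual connection, but the pull-toward-related-tokens dynamic is the same.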
https://github.com/danielvarga/transformer-as-swarm
Basically a boid simulation where a swarm of birds can collectively solve MNIST. The goal is not some new SOTA architecture, it is to find the right trade-off where the system already exhibits complex emergent behavior while the swarming rules are still simple.
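For readers unfamiliar with boids, the classic local rules (cohesion, separation, alignment) are simple to state. The sketch below is a generic toy version of those rules; the parameters and structure are illustrative and not taken from the linked repository.

```python
import numpy as np

def boid_step(pos, vel, r=1.0, dt=0.1, k=0.05):
    """One swarm update. pos, vel: (n, 2) arrays of positions/velocities."""
    new_vel = vel.copy()
    for i in range(len(pos)):
        diff = pos - pos[i]                  # vectors toward other boids
        dist = np.linalg.norm(diff, axis=1)
        near = (dist < r) & (dist > 0)       # neighbors within radius r
        if near.any():
            cohesion = diff[near].mean(axis=0)                    # move toward neighbors
            separation = -(diff[near] / dist[near, None]**2).mean(axis=0)  # avoid crowding
            alignment = vel[near].mean(axis=0) - vel[i]           # match neighbor velocity
            new_vel[i] += k * (cohesion + separation + alignment)
    return pos + dt * new_vel, new_vel
```

The repo's question is essentially how little can be added to rules like these before the swarm can carry out a computation like classifying MNIST digits.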
It is currently abandoned due to a serious lack of free time (*), but I would consider collaborating with anyone willing to put in some effort.
(*) In my defense, I’m not slacking meanwhile: https://arxiv.org/abs/2510.26543 https://arxiv.org/abs/2510.16522 https://www.youtube.com/watch?v=U5p3VEOWza8
Folie A Deux Ex Machina
The first part of the comment is very valuable. “I looked at it and it made me feel extremely strange almost immediately“. That is very good to know.
The second bit I’m less sure about. What do they mean by “check to make sure this can't trigger migraines or seizures”? What kind of check are they expecting? Literature research? Experiments? The word “check” makes it sound as if they think this is some easy-to-do thing, like how you could “double check” the spelling of a word using a dictionary.
In this paper, the task is to learn how to multiply, strictly from AxB=C examples, with 4-digit numbers. Their vanilla transformer can't learn it, but the one with (their variant of) chain-of-thought can. These are transformers that have never encountered written text, and are too small to understand any of it anyway.
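To make the setup concrete, here is a hypothetical illustration of the two training formats the comment describes: bare A×B=C strings versus a chain-of-thought expansion into shifted partial products. The exact tokenization and format in the paper will differ; this only shows the kind of intermediate steps that make the task learnable.

```python
def plain_example(a, b):
    # Bare input/output pair: the model must learn multiplication end to end.
    return f"{a}*{b}={a * b}"

def cot_example(a, b):
    # Chain-of-thought style: decompose b into digits and accumulate
    # shifted partial products, exposing the intermediate computation.
    steps, total = [], 0
    for i, d in enumerate(reversed(str(b))):
        p = a * int(d) * 10**i
        total += p
        steps.append(f"{a}*{d}e{i}={p}")
    return " ; ".join(steps) + f" ; sum={total}"

print(plain_example(1234, 5678))
print(cot_example(1234, 5678))
```

The point of the paper, as described above, is that a small transformer can fit the second format but not the first.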
It’s a bizarre debate when it’s glaringly obvious that small contributions matter and big contributions matter as well.
But which contributes more, they ask? Who gives a shit, really?
Funding agencies? Should they prioritize established researchers or newcomers? Should they support many smaller grant proposals or fewer large ones?