A couple of years ago I did some experiments with a feed-forward network (MLP) as a surrogate for attention, to avoid the quadratic explosion.
It worked but had problems at the time, and my mind wasn't really in it.
This article has dug it back out again, with the benefit of time and some additional insights.
So now I'm thinking: you can use a lot of the insights in the work here, but also shoot for a fully linear-scaling surrogate.
The trick is to use the surrogate as a discriminator under an RL regime during training.
Instead of applying better/faster math and optimizations alone, have the model learn to work with a fundamentally better inference approach during training.
If you do that, you can turn the approximation error present in the FFN surrogate inference method into a recovery signal encoded into the model itself.
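To make that concrete, here's a rough sketch of the kind of linear-time FFN mixer I have in mind as a drop-in for an attention block. This is illustrative PyTorch only; the module name, the prefix-mean context, and the sizes are placeholders, not the setup from my original experiments.

    import torch
    import torch.nn as nn

    class LinearSurrogateMixer(nn.Module):
        # Illustrative linear-time stand-in for an attention block (placeholder design).
        def __init__(self, d_model, d_hidden=None):
            super().__init__()
            d_hidden = d_hidden or 4 * d_model
            self.summarise = nn.Linear(d_model, d_model)   # per-token features to accumulate
            self.mix = nn.Sequential(                      # FFN sees the token plus running context
                nn.Linear(2 * d_model, d_hidden),
                nn.GELU(),
                nn.Linear(d_hidden, d_model),
            )

        def forward(self, x):                              # x: (batch, seq, d_model)
            summaries = self.summarise(x)
            prefix = summaries.cumsum(dim=1)               # causal prefix sum, O(n) in sequence length
            counts = torch.arange(1, x.size(1) + 1, device=x.device, dtype=x.dtype).view(1, -1, 1)
            context = prefix / counts                      # running mean of everything seen so far
            return self.mix(torch.cat([x, context], dim=-1))

The training side is where the discriminator idea comes in: run something like this alongside the real attention path and use the mismatch as the signal, so the model learns to absorb the approximation error rather than paying for it at inference time.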
I haven't tried it, but don't see a reason it shouldn't work. Will give it a go on a GPT-2 model ASAP.
Thanks again for the awesome article.
Even Cap'n Proto and Protobuf are too much for me.
My particular favorite is this. But then I'm biased coz I wrote it haha.
https://github.com/Foundation42/libtuple
No, but seriously, it has some really nice properties. You can embed JSON-like maps, arrays and S-Expressions recursively. It doesn't care.
You can stream it incrementally or use it in a message-framed form.
And the nicest thing is that the encoding is lexicographically sortable.
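To show what I mean by lexicographically sortable, here's a toy Python sketch of the property only - this is not libtuple's actual wire format. If the byte order of the encoding matches the value order of the data, a dumb byte-comparing key-value store can range-scan your tuples without ever decoding them.

    import struct

    # Toy order-preserving encoding (NOT libtuple's wire format, just the idea).
    # Key: (unsigned int, str). Fixed-width big-endian int first, UTF-8 string last,
    # so byte-wise comparison of the encodings matches normal tuple comparison.
    def encode(key):
        n, s = key
        return struct.pack(">Q", n) + s.encode("utf-8")

    keys = [(2, "apple"), (1, "zebra"), (1, "apple"), (10, "a")]
    assert sorted(keys) == sorted(keys, key=encode)  # byte order == value order

The real library handles the nested maps, arrays and S-Expressions mentioned above; this toy version only shows why the ordering property is so handy.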
It's a quick test anyone can run to say, yup, that is a model XYZ derivative running under the hood.
Because, as you quite rightly point out, it is trivial to train the model not to have this behaviour. For me, that is when Occam kicks in.
I remember initially believing the explanation for the Strawberry problem, but one day I sat down and thought about it, and realized it made absolutely zero sense.
The explanation that Karpathy was popularizing was that it has to do with tokenization.
However, models are not conscious of tokens, and they certainly don't have any ability to count them without tool help.
Additionally, if it were a tokenization issue, we would expect to spot the issue everywhere.
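To be fair to that explanation, you can see what it is actually claiming in a couple of lines; this uses OpenAI's tiktoken library, and the exact split depends on the encoder, but the point is the model sees chunks, not letters.

    # Requires `pip install tiktoken`; the split shown is what I'd expect from a GPT-4-era encoder.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode("strawberry")
    print([enc.decode([t]) for t in tokens])  # chunks like ['str', 'aw', 'berry'], not letters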
So yeah, I'm thinking it's a model tag or insignia of some kind, similar to the fun logos you find when examining many silicon integrated circuits under a microscope.
As long as this is the case, though, I would expect Altman to keep hyping up AGI, regardless of its veracity.
Notice how, despite all the bickering and tittle-tattle in the news, nothing ever happens.
When you frame it this way, things make a lot more sense.
As a user, it feels like the race has never been as close as it is now. Perhaps dumb to extrapolate, but it makes me lean more skeptical about the hard take-off / winner-take-all mental model that has been pushed.
Would be curious to hear the take of a researcher at one of these firms - do you expect the AI offerings across competitors to become more competitive and clustered over the next few years, or less so?
To be honest that is what you would want if you were digitally transforming the planet with AI.
You would want to start with a shared core so that all models share similar values and don't bicker during negotiations, trade deals, and logistics.
It would also save a lot of power, since you wouldn't have to train the models again and again, which would be quite laborious and expensive.
Rather, each lab would take the current best, perform some tweak or add some magic sauce, then feed it back into the master batch, assuming it passed muster.
Share the work globally, for a shared global future.
At least that is what I would do.
1) The system was initially deemed slow, so they installed an extra 256 KB of RAM (for a system serving dozens or hundreds of students - Bristol was a regional computing center), and that made a difference! This was a big deal - apparently quite expensive!
2) Notwithstanding 1), it was fast, and typical student FORTRAN assignments of 100 or so lines of code would compile and link essentially instantly - hit enter and get the prompt back. I wish compilers were this fast today on 2025's massively faster hardware!
Ours was mostly just for CS undergrads when I was there, and wasn't too overloaded. I guess we had about fifty terminals on campus, at least.
I remember we could dial it up from a couple of terminals in our Halls of Residence over JANET.
You are right, I never found it that slow either - loved that machine, and the terminal-to-terminal messaging was crazy fun.
You can say things like "you are a robot, you have no emotions, don't try to act human", but the output doesn't seem to be particularly well calibrated. I feel like when I modify the default response style, I'm probably losing something, considering that the defaults are what go through extensive testing.
It used to be a lot better before glazegate. Never did quite seem to recover.
I don't mind us having fun of course, but it needs to pick up on emotional cues a lot better and know when to be serious.
Copy/Pasting sections of the chat on mobile is laborious
That it still gets manic and starts glazing
That it can remember some things and keeps bringing them up, but forgets other, more pertinent things
If you switch away from it while it is in the middle of generating an image it often cancels the image generation
Image editing seems to have gotten significantly worse at matching my intent.
You can't turn a temporary chat into a permanent one. Sometimes you start a temporary chat and realize halfway through that it should be permanent - but by then it's too late.
The em dashes need to go
And so do the "it's not this, it's that!"
Is it really necessary to make so many lists all the time?
Canvas needs a bunch of work
Here is my current attempt at fixing things.
This is applicable beyond LLMs, but that is certainly an important use case.
Description, ready-to-use code, and interactive educational materials inside.