Let's say that we have roughly 150k unique tokens (going by modern open models). Let's also say that we have an embedding dimensionality of 1k. This implies that we have at most 1k degrees of freedom (or rank) on our output. The model is able to pick any single one of the 150k tokens as the top token, but the expressivity of the _probability distribution_ is inherently limited to 1k unique linear components.
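To make the rank argument concrete, here is a toy sketch with deliberately tiny, made-up sizes (`VOCAB`, `DIM` and `CONTEXTS` are illustrative stand-ins, not real model dimensions). It builds a logit matrix as hidden states times an unembedding matrix and checks that its rank never exceeds the embedding dimension, however many contexts or tokens there are. Exact `Fraction` arithmetic is used so the rank is not blurred by floating-point error.

```python
import random
from fractions import Fraction

def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def rank(matrix):
    """Exact rank via Gaussian elimination over Fractions."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, n_rows):
            if m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == n_rows:
            break
    return r

rng = random.Random(0)
VOCAB, DIM, CONTEXTS = 12, 3, 20  # toy sizes, assumed for illustration only

# One hidden state per context (CONTEXTS x DIM) and an unembedding (DIM x VOCAB).
hidden = [[rng.randint(-5, 5) for _ in range(DIM)] for _ in range(CONTEXTS)]
unembed = [[rng.randint(-5, 5) for _ in range(VOCAB)] for _ in range(DIM)]

logits = matmul(hidden, unembed)  # CONTEXTS x VOCAB
logit_rank = rank(logits)
print(f"logit matrix is {CONTEXTS}x{VOCAB}, rank = {logit_rank} (bound: {DIM})")
```

Note that the bound applies to the logits. The softmax itself is nonlinear, but the log-probabilities are just the logits minus a per-context constant, so their rank can exceed the embedding dimension by at most one.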
A space with 12,288 dimensions (GPT-3 size) can fit more than 40 billion nearly perpendicular vectors.
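The intuition behind "nearly perpendicular" can be checked on a small scale. The sketch below does not attempt to verify the 40 billion figure; it only samples a handful of random Gaussian vectors and compares the largest pairwise |cosine| in a low-dimensional and a higher-dimensional space. The dimensions (8 and 2048) and the vector count are arbitrary choices, kept small so that pure Python finishes quickly.

```python
import math
import random

def random_unit_vector(dim, rng):
    """Sample a Gaussian vector and normalize it to unit length."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def max_abs_cosine(dim, count, rng):
    """Largest |cosine similarity| over all pairs of `count` random unit vectors."""
    vectors = [random_unit_vector(dim, rng) for _ in range(count)]
    worst = 0.0
    for i in range(count):
        for j in range(i + 1, count):
            cos = sum(a * b for a, b in zip(vectors[i], vectors[j]))
            worst = max(worst, abs(cos))
    return worst

rng = random.Random(0)
COUNT = 30  # arbitrary number of vectors for the demo

low_dim_worst = max_abs_cosine(8, COUNT, rng)
high_dim_worst = max_abs_cosine(2048, COUNT, rng)

print(f"dim=8:    max |cos| = {low_dim_worst:.3f}")
print(f"dim=2048: max |cos| = {high_dim_worst:.3f}")
```

Typical cosines between random vectors shrink roughly like one over the square root of the dimension, which is why the number of vectors that fit within a fixed angular tolerance of perpendicular grows so quickly with dimension.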
Not Scaling is about crossing and shoring up your moat. Scaling is when your app has enough hype around it that your customers recruit new customers, and all you have to do is add new machines or shard the database or whatever to handle increased demand.
Antiscaling is when you turn into the thing that everyone hates about the modern web. Antiscaling is when intelligence agencies want to talk to you about how your chat app is used by terrorists, when cities want to licence access to your ridesharing or takeaway delivery app, when pieces of legislation are passed that are specifically designed to target your company, when you're sufficiently well-known as the founder of an app that people are making memes about you and tracking your personal movements.
You don't have to take over the world; you just have to make money. People who try to change the world often change it for the worse. Just try to make something useful, and maybe the world will like it.
How about extreme and utter irrelevance (such as after building a thing nobody wants)?
Or how about this, arguably the most common: slightly successful; nobody hates it but nobody loves it either. Something people feel mildly positive about, but there is zero “hype” and also no “moat” and nobody cares enough to hate it.