So perhaps the fact that they "cannot know red" is ultimately irrelevant for an LLM too?
I solved my spam problem with gemma3:27b-it-qat, and my benchmarks show that this is the size at which the current models start becoming useful.
- On iPhone keyboards, some sort of tiny language model suggests what it thinks are the most likely follow-up words as you type. You only have to tap a suggested word if it matches what you were about to write.
- Speculative decoding is a technique that uses smaller models to speed up inference for bigger models.
I'm sure smart people will invent other future use cases too.
What they are actually saying: given one correct quoted sentence, the model has a 42% chance of predicting the next sentence correctly.
So, assuming you start with the first sentence and tell it to keep going, the odds of it staying on track are 0.42^n, where n is the sentence index.
It seems to me that if they didn't keep correcting it over and over with real quotes, it wouldn't even get to the end of the first page before descending into wild-fanfiction territory, with errors accumulating and growing as the text went on.
EDIT: As the article states, for an entire 50-token excerpt to be correct, the probability of each output token has to be fairly high. So perhaps it would be more accurate to view it as 0.985^n, where n is the token index. The long-term result is the same: unless every token is correct, it will stray further and further from the source.
VCs are already doubting whether the billions invested in data centers will ever generate a profit [1, 2].
AI companies will need to generate profits at some point. Would people still be this optimistic about Claude et al. if they had to pay, say, $500 per month for its current capabilities? Probably not.
So far the only company generating real profits from AI is Nvidia.
[1] https://www.goldmansachs.com/insights/articles/will-the-1-tr...
[2] https://www.nytimes.com/2025/06/02/business/ai-data-centers-...
Sure, they are perhaps 6 months behind the closed-source models, and the hardware to run the biggest and best ones isn't really consumer-grade yet (how many years until regular people have GPUs with 200+ GB of VRAM? That's only one order of magnitude away).
But they're already out there. They will only ever get better. And they will never disappear due to the company going out of business or investors raising prices.
I personally only care about the closed-source proprietary models insofar as they let me glimpse what I'll soon have access to freely and privately on my own machine. Even if all of them went out of business today, LLMs would still have a permanent effect on our future and on how I work.
What do you mean?
If I have some task that requires 1000 hours, and I'm able to shave it down to one hour, then I did just "save" 999 hours -- in the same way that if something costs $5 and I pay $4, I saved $1.
But the LLM bill will still invoice you for all of that "saved" work regardless.
Isn’t software engineering a lot more than just writing code? And I mean like, A LOT more?
Informing product roadmaps, balancing tradeoffs, understanding relationships between teams, prioritizing between separate tasks, pushing back on tech debt, responding to incidents, it’s a feature and not a bug, …
I’m not saying LLMs will never be able to do this (who knows?), but I’m pretty sure SWEs won’t be the only role affected (or even the most affected) if it comes to this point.
Where am I wrong?
* The world increasingly runs on computers.
* Software/Computer Engineers are the only people who actually truly know how computers work.
Thus it seems to me highly unlikely that we won't have a job.
What that job entails I do not know. Programming as we do it today might not be something we spend much time on in the future, just as most people today don't spend much time handling punched cards or replacing vacuum tubes. But there will still be other work to do; I don't doubt that.
I find this sentiment increasingly worrisome. It's entirely clear that every last human will be beaten on code design in the coming years (I'm not going to argue whether it's 1 or 5 years away; who cares?).
I wish people would stop holding on to what amounts to nothing, and instead think and talk more about what can be done in a new world. We need good ideas, and I think this could be a place to advance them.
However, in real-life work situations, that "perfect information" prerequisite will be a big hurdle, I think. Design can depend on any number of vague agreements and lots of domain-specific knowledge, things a senior software architect has only learnt by being at the company for a long time. It will be very hard for an LLM to make all the correct decisions without that knowledge.
Sure, if you write down a summary of every meeting you've attended over the past 12 months and attach your entire company Confluence to the prompt, perhaps then the LLM can design the right architecture. But is that realistic?
More likely I think the human will do the initial design and specification documents, with the aforementioned things in mind, and then the LLM can do the rest of the coding.
Not because it would have been technically impossible for the LLM to do the code design, but because it would have been practically impossible to craft, from a blank sheet, the prompt that produces the desired result.
Now add a twist:
- Senders pay a small fee to send a message.
- Relaying devices earn a micro-payment (could be tokens, sats, etc.) for carrying the message one hop further.
- End-to-end encrypted, fully decentralized, optionally anonymous.
Basically, a “postal network” built on people’s phones, without needing a traditional internet connection. Works best in areas with patchy or no internet, or under censorship.
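The per-hop payment idea can be sketched as simple escrow bookkeeping. This is a hypothetical illustration only: the `Message` fields, the flat fee-per-hop model, and the settlement-by-hop-list approach are all my assumptions, not a protocol spec.

```python
# Hypothetical sketch of per-hop micro-payment bookkeeping for a
# store-and-forward mesh message. All names and the fee model are
# illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class Message:
    payload: bytes       # end-to-end encrypted by the sender
    fee_remaining: int   # sats/tokens the sender escrowed up front
    fee_per_hop: int     # micro-payment each relay earns
    hops: list = field(default_factory=list)  # relay IDs, for settlement

def relay(msg: Message, relay_id: str) -> bool:
    """Carry the message one hop if the escrow still covers the fee."""
    if msg.fee_remaining < msg.fee_per_hop:
        return False     # budget exhausted: drop instead of forwarding
    msg.fee_remaining -= msg.fee_per_hop
    msg.hops.append(relay_id)
    return True

m = Message(payload=b"...", fee_remaining=10, fee_per_hop=3)
for phone in ("A", "B", "C", "D"):
    relay(m, phone)
print(m.hops)  # only the hops the escrow could pay for
```

One consequence this makes visible: the sender's escrow bounds the route length, so a message with a 10-unit budget and 3-unit hops dies after three relays, which doubles as a crude spam/flooding limit.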
Obvious challenges:
- Latency and reliability (it's not real-time).
- Abuse/spam prevention.
- Power consumption and user opt-in.
- Viable incentive structures.
What do you think? Is this viable? Are there real-world use cases where this would actually be useful, or is it just a neat academic toy?