Readit News
aubanel commented on AGI is an engineering problem, not a model training problem   vincirufus.com/posts/agi-... · Posted by u/vincirufus
aubanel · 16 hours ago
The premise "LLMs have reached a plateau" is false IMO.

Here are the metrics by which the author defines this plateau: "limited by their inability to maintain coherent context across sessions, their lack of persistent memory, and their stochastic nature that makes them unreliable for complex multi-step reasoning."

If you try to benchmark any proxy of the points above, for instance "can models solve problems that require multiple steps in agentic mode" (PlanBench, BrowseComp; I've even built custom benchmarks), the progress between models is very clear and shows no sign of slowing down.

And this does translate to real-world tasks: yesterday, I had GPT-5 build me complex React charts in one shot, whereas previous models needed much more supervision.

I think we're moving the goalposts too fast for LLMs, and that's what can lead us to believe they've plateaued. Just try using past models for your current tasks (you can use open models to be sure they were not updated) and see them struggle.

aubanel commented on Open models by OpenAI   openai.com/open-models/... · Posted by u/lackoftactics
_ache_ · 19 days ago
To be fair, it's with the help of OpenAI. They did it together, before the official release.

https://ollama.com/blog/gpt-oss

aubanel · 19 days ago
From experience, it's much more engineering work on the integrator's side than on OpenAI's. Basically they provide you their new model in advance, but they don't know the specifics of your system, so it's normal that you do most of the work. Thus I'm particularly impressed by Cerebras: they only have a few models supported for their extreme perf inference, it must have been huge bespoke work to integrate.
aubanel commented on OpenAI claims gold-medal performance at IMO 2025   twitter.com/alexwei_/stat... · Posted by u/Davidzheng
aubanel · a month ago
- AI competing is "wholly unfair"

- "[AI is] far away from being substantially being better than MCTs"

^ pick only one

aubanel commented on I fought in Ukraine and here's why FPV drones kind of suck   warontherocks.com/2025/06... · Posted by u/_tk_
aubanel · 2 months ago
One of the key points of the article is "I feel FPV drones to be mostly a failure because their success rate is low". Why is that a failure? Even if a $500 drone has only a 10% success rate, it's still a win if the target is a $1M piece of equipment!
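To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It uses only the figures from the comment ($500 per drone, 10% success rate, $1M target) and assumes each attempt is independent and that the drone is the only cost (no operator time, logistics, or losses), which is a simplification. The function name is made up for illustration.

```python
def expected_cost_per_hit(unit_cost: float, success_rate: float) -> float:
    """Average spend per successful strike.

    With independent attempts, 1 / success_rate drones are needed
    on average for one success (mean of a geometric distribution).
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return unit_cost / success_rate


# Figures taken from the comment; everything else is an assumption.
drone_cost = 500.0          # dollars per drone
success_rate = 0.10         # 10% of drones hit their target
target_value = 1_000_000.0  # dollars of equipment destroyed per hit

cost_per_hit = expected_cost_per_hit(drone_cost, success_rate)
exchange_ratio = target_value / cost_per_hit

print(f"Expected drone spend per hit: ${cost_per_hit:,.0f}")
print(f"Target value destroyed per dollar spent: {exchange_ratio:,.0f}x")
```

Under these assumptions the attacker spends about $5,000 in drones per destroyed target, so even a low success rate leaves a large margin. The real question is how much the ignored costs eat into it.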
aubanel commented on Guess I'm a rationalist now   scottaaronson.blog/?p=890... · Posted by u/nsoonhui
elt895 · 2 months ago
Are there other philosophy- or history-grounded sources that are comparable? If so, I’d love some recommendations. Yudkowsky and others have their problems, but their texts have interesting points, are relatively easy to read and understand, and you can clearly see which real issues they’re addressing. From my experience, alternatives tend to fall into two categories: 1. Genuine classical philosophy, which is usually incredibly hard to read, and after 50 pages I have no idea what the author is even talking about anymore. 2. Basically self-help books that take one or very few ideas and repeat them ad nauseam for 200 pages.
aubanel · 2 months ago
I've read Bertrand Russell's "A History of Western Philosophy" and it's the first ever philosophy book that I didn't drop after 10 pages, because of 2 things: 1- He's a logician (or at least has the same STEM kind of logic that we use), so he builds his reasoning logically and not via bullshit associations like plays on words or contrived jumps. 2- He's not afraid to say "this philosopher said that, it was an error", which is a real novelty compared to other scholars who don't feel authorized to criticise even obvious errors. Really recommend!
aubanel commented on Is-even-ai – Check if a number is even using the power of AI   npmjs.com/package/is-even... · Posted by u/modinfo
thiht · 3 months ago
> I am looking to hire someone with 10 years of experience with is-even-ai

is-even-ai is only 7 months old, so 10 years of experience is impossible. This is clearly a joke!

aubanel · 3 months ago
No it's not for a 17.1428x engineer like me!
aubanel commented on DDoSecrets publishes 410 GB of heap dumps, hacked from TeleMessage   micahflee.com/ddosecrets-... · Posted by u/micahflee
ulrikrasmussen · 3 months ago
"I'm a CEO. We're SaaS. I'm a CEO."
aubanel · 3 months ago
Don't be too harsh, he added "we're telecom" somewhere
aubanel commented on Is-even-ai – Check if a number is even using the power of AI   npmjs.com/package/is-even... · Posted by u/modinfo
66yatman · 3 months ago
This is a joke right?
aubanel · 3 months ago
Certainly not, it's actually possible to add 3 float32 numbers with 90% precision using AI! With a recent breakthrough, the team is working on pushing that to 10, we have enough cracked engineers to hope to make it happen soon!
aubanel commented on Qwen3: Think deeper, act faster   qwenlm.github.io/blog/qwe... · Posted by u/synthwave
gtirloni · 4 months ago
> We believe that the release and open-sourcing of Qwen3 will significantly advance the research and development of large foundation models

How does "open-weighting" help other researchers/companies?

aubanel · 4 months ago
There's already a lot of info in there: model architecture and mechanics.

Using the model to generate synthetic data also allows you to distil its reasoning power into other models that you train, which is very powerful.

On top of that, Qwen's technical reports, which follow model releases after some time, are generally very information-rich. For instance, check this report for Qwen Omni, it's really good: https://huggingface.co/papers/2503.20215

aubanel commented on Widespread power outage in Spain and Portugal   bbc.com/news/live/c9wpq8x... · Posted by u/lleims
aubanel · 4 months ago
-> Power outage -> Wild guess that Russia did it, with zero proof or even a hint -> "This is a war of aggression, we should respond"

Are you suggesting attacking Russia, based on absolutely thin air?

u/aubanel

Karma: 397 · Cake day: August 3, 2022