Readit News
dbagr commented on GPT-5   openai.com/gpt-5/... · Posted by u/rd
kybernetikos · 19 days ago
ChatGPT5 in this demo:

> For an airplane wing (airfoil), the top surface is curved and the bottom is flatter. When the wing moves forward:

> * Air over the top has to travel farther in the same amount of time -> it moves faster -> pressure on the top decreases.

> * Air underneath moves slower -> pressure underneath is higher

> * The pressure difference creates an upward force - lift

Isn't that explanation of why wings work completely wrong? There's nothing that forces the air to cover the top distance in the same time that it covers the bottom distance, and in fact it doesn't. https://www.cam.ac.uk/research/news/how-wings-really-work

Very strange to use a mistake as your first demo, especially while talking about how it's PhD level.
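For illustration, here is a back-of-the-envelope check (my own numbers, not from the demo: sea-level air density, 100 m/s airspeed, and an upper surface roughly 2% longer than the lower one, which is typical for a subsonic airfoil) showing why the equal-transit-time story can't account for lift even on its own terms:

```python
# Back-of-the-envelope check of the "equal transit time" explanation.
# Assumed values (not from the demo): sea-level density, 100 m/s airspeed,
# upper-surface path ~2% longer than the lower one.
rho = 1.225          # air density at sea level, kg/m^3
v_bottom = 100.0     # speed along the lower surface, m/s
path_ratio = 1.02    # upper path length / lower path length

# Equal-transit-time assumption: same time over a longer path -> faster on top.
v_top = v_bottom * path_ratio

# Bernoulli's equation gives the pressure difference from the speed difference.
dp = 0.5 * rho * (v_top**2 - v_bottom**2)   # Pa, i.e. N per m^2 of wing

print(f"Predicted lift per m^2: {dp:.0f} N")
# A loaded airliner needs roughly 5000-7000 N/m^2 of wing loading, so the
# equal-transit assumption underpredicts lift by more than an order of
# magnitude -- the air on top actually moves much faster than it implies.
```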

dbagr · 19 days ago
dbagr commented on Sleep all comes down to the mitochondria   science.org/content/blog-... · Posted by u/A_D_E_P_T
dbagr · a month ago
This has been known for a long time to those interested in the field.
dbagr commented on Hierarchical Reasoning Model   arxiv.org/abs/2506.21734... · Posted by u/hansmayer
esafak · a month ago
Composition is the whole point of deep learning. Deep as in multilayer, multilevel.
dbagr · a month ago
You need recursion at some point: you can't account for all possible scenarios of combinations, as you would need an infinite number of layers.
dbagr commented on Grok 4 Launch [video]   twitter.com/xai/status/19... · Posted by u/meetpateltech
z7 · 2 months ago
How do you explain Grok 4 achieving new SOTA on ARC-AGI-2, nearly doubling the previous commercial SOTA?

https://x.com/arcprize/status/1943168950763950555

dbagr · 2 months ago
As I said, either by benchmark contamination (the benchmark is semi-private, and could have been obtained by people at other companies whose models have been benchmarked) or by having more compute.
dbagr commented on Grok 4 Launch [video]   twitter.com/xai/status/19... · Posted by u/meetpateltech
vessenes · 2 months ago
Agreed. I noticed a quick flyby of a bad “reasoning smell” in the baseball World Series simulation, though - it looks like it pulled some numbers from polymarket, reasoned a long time, and then came back with the polymarket number for the Dodgers but presented as its own. It was a really fast run through, so I may be wrong, but it reminds me that it’s useful to have skeptics on the safety teams of these frontier models.

That said, these are HUGE improvements. Providing we don’t have benchmark contamination, this should be a very popular daily driver.

On coding - 256k context is the only real bit of bad news. I would guess their v7 model will have longer context, especially if it’s better at video. Either way, I’m looking forward to trying it.

dbagr · 2 months ago
Either they overtook other LLMs by simply using more compute (which is reasonable to think, as they have a lot of GPUs), or I'm willing to bet there is benchmark contamination. I don't think their engineering team came up with any better techniques than those used in training other LLMs, and Elon has a history of making deceptive announcements.
dbagr commented on Ask HN: Who wants to be hired? (May 2025)    · Posted by u/whoishiring
dbagr · 4 months ago
Location: France
Remote: Yes or hybrid
Willing to relocate: Yes
Technologies: Playwright, JavaScript/HTML, Bash
Résumé/CV: https://drive.google.com/file/d/1SkC2fa3sKozpCvDC1QPMUBT9--3...
Email: dbagory[at]icloud[dot]com

I am a QA engineer with experience in automated test development for the web, web development, system administration, software integration, and writing documentation. I am interested in any role that requires one or more of these skills.

dbagr commented on QwQ: Alibaba's O1-like reasoning LLM   qwenlm.github.io/blog/qwq... · Posted by u/amrrs
dbagr · 9 months ago
This sounds like an RNN with extra steps.

u/dbagr

Karma: 22 · Cake day: August 9, 2024

About
This account was made for sharing under my civil identity (I have another one).