Readit News
dbagr commented on GPT-5   openai.com/gpt-5/... · Posted by u/rd
kybernetikos · 19 days ago
ChatGPT5 in this demo:

> For an airplane wing (airfoil), the top surface is curved and the bottom is flatter. When the wing moves forward:

> * Air over the top has to travel farther in the same amount of time -> it moves faster -> pressure on the top decreases.

> * Air underneath moves slower -> pressure underneath is higher

> * The pressure difference creates an upward force - lift

Isn't that explanation of why wings work completely wrong? There's nothing that forces the air to cover the top distance in the same time that it covers the bottom distance, and in fact it doesn't. https://www.cam.ac.uk/research/news/how-wings-really-work

Very strange to use a mistake as your first demo, especially while talking about how it's PhD level.
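For illustration, here is a back-of-the-envelope check (my own numbers, not from the demo: sea-level air density, 100 m/s airspeed, and an upper surface roughly 2% longer than the lower one, which is typical for a subsonic airfoil) showing why the equal-transit-time story can't account for lift even on its own terms:

```python
# Back-of-the-envelope check of the "equal transit time" explanation.
# Assumed values (not from the demo): sea-level density, 100 m/s airspeed,
# upper-surface path ~2% longer than the lower one.
rho = 1.225          # air density at sea level, kg/m^3
v_bottom = 100.0     # speed along the lower surface, m/s
path_ratio = 1.02    # upper path length / lower path length

# Equal-transit-time assumption: same time over a longer path -> faster on top.
v_top = v_bottom * path_ratio

# Bernoulli's equation gives the pressure difference from the speed difference.
dp = 0.5 * rho * (v_top**2 - v_bottom**2)   # Pa, i.e. N per m^2 of wing

print(f"Predicted lift per m^2: {dp:.0f} N")
# A loaded airliner needs roughly 5000-7000 N/m^2 of wing loading, so the
# equal-transit assumption underpredicts lift by more than an order of
# magnitude -- the air on top actually moves much faster than it implies.
```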

dbagr · 19 days ago
dbagr commented on Sleep all comes down to the mitochondria   science.org/content/blog-... · Posted by u/A_D_E_P_T
dbagr · a month ago
This has been known for a long time to those interested in the field.
dbagr commented on Hierarchical Reasoning Model   arxiv.org/abs/2506.21734... · Posted by u/hansmayer
esafak · a month ago
Composition is the whole point of deep learning. Deep as in multilayer, multilevel.
dbagr · a month ago
You need recursion at some point: you can't account for all possible scenarios of combinations, as you would need an infinite number of layers.
dbagr commented on Grok 4 Launch [video]   twitter.com/xai/status/19... · Posted by u/meetpateltech
z7 · 2 months ago
How do you explain Grok 4 achieving new SOTA on ARC-AGI-2, nearly doubling the previous commercial SOTA?

https://x.com/arcprize/status/1943168950763950555

dbagr · 2 months ago
As I said, either by benchmark contamination (the benchmark is semi-private, and could have been obtained by people at other companies whose models have been benchmarked) or by having more compute.
dbagr commented on Grok 4 Launch [video]   twitter.com/xai/status/19... · Posted by u/meetpateltech
vessenes · 2 months ago
Agreed. I noticed a quick flyby of a bad “reasoning smell” in the baseball World Series simulation, though - it looks like it pulled some numbers from polymarket, reasoned a long time, and then came back with the polymarket number for the Dodgers but presented as its own. It was a really fast run through, so I may be wrong, but it reminds me that it’s useful to have skeptics on the safety teams of these frontier models.

That said, these are HUGE improvements. Providing we don’t have benchmark contamination, this should be a very popular daily driver.

On coding - 256k context is the only real bit of bad news. I would guess their v7 model will have longer context, especially if it’s better at video. Either way, I’m looking forward to trying it.

dbagr · 2 months ago
Either they overtook other LLMs by simply using more compute (which is reasonable to think, as they have a lot of GPUs), or I'm willing to bet there is benchmark contamination. I don't think their engineering team came up with any better techniques than those used in training other LLMs, and Elon has a history of making deceptive announcements.
dbagr commented on Ask HN: Who wants to be hired? (May 2025)    · Posted by u/whoishiring
dbagr · 4 months ago
Location: France
Remote: Yes or hybrid
Willing to relocate: Yes
Technologies: Playwright, JavaScript/HTML, Bash
Résumé/CV: https://drive.google.com/file/d/1SkC2fa3sKozpCvDC1QPMUBT9--3...
Email: dbagory[at]icloud[dot]com

I am a QA engineer with experience in automated test development for the web, web development, system administration, software integration, and writing documentation. I am interested in any role that requires one or more of these skills.

dbagr commented on QwQ: Alibaba's O1-like reasoning LLM   qwenlm.github.io/blog/qwq... · Posted by u/amrrs
dbagr · 9 months ago
This sounds like an RNN with extra steps.

u/dbagr

Karma: 22 · Cake day: August 9, 2024

About
This account was made for sharing under my civil identity (I have another one).