The thing to remember about the HN crowd is that it can be a bit cynical. At the same time, realize that everyone is judging AI progress not on headroom or synthetic-data usage, but on how well it feels like it's doing, on external benchmarks, on hallucinations, and on how much value it's really delivering. The concern is that for all the enthusiasm, generative AI's hard problems still seem unsolved, output quality is seeing diminishing returns, and actually applying it outside language settings has been challenging.
- offline and even online benchmarks are terrible unless they're actually a standard product experiment (A/B test, etc.). Evaluation science is extremely flawed.
- skepticism is healthy!
- measure on delivered value vs promised value!
- there are hard problems! Possibly ones that require paradigm shifts that need time to develop!
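To make the "standard product experiment" point concrete: the gold standard is randomizing users into control and treatment and testing whether the difference in a conversion-style metric is significant. A minimal sketch of a two-proportion z-test (the counts below are hypothetical, just for illustration):

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates.
    conv_* = number of conversions, n_* = number of users in each arm.
    Returns (z, p_value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both arms are equal.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF (via erf).
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical experiment: control vs. an LLM-assisted feature.
z, p = two_proportion_ztest(conv_a=120, n_a=1000, conv_b=150, n_b=1000)
print(f"z={z:.2f}, p={p:.4f}")  # z≈1.96, p≈0.05
```

This is exactly the kind of evidence that's far more trustworthy than an offline benchmark score, which is the point of the bullet above.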
But
- delivered value and developments alone are extraordinary. Problems originally thought unsolvable are now completely tractable or solved, even if you rightfully don't trust eval numbers like LLMArena, marketing copy, and offline evals.
- output quality is seeing diminishing returns? I cannot understand this argument at all. We have scaled the first good idea with great success. Do people really believe this is the end of the line, that we're out of great ideas? We've just scratched the surface.
- even with a "feels" approach, people are unimpressed?? It's subjective, and you are welcome to be unimpressed. But I just cannot understand or fathom how one could be.
There's a general negativity bias on the internet (and probably in humans at large) which skews the discourse on this topic as on any other - but there are plenty of active, creative LLM enthusiasts here.
It would be interesting to see some analysis of HN data to understand just how accurate my perception is; of course, that wouldn't clear up the bias issue.
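A toy sketch of what such an analysis could look like, assuming a crude keyword-based sentiment proxy. The comment texts and keyword lists here are hypothetical placeholders; a real analysis would fetch actual comments (e.g. via the public HN API) and use a proper sentiment model rather than keyword matching.

```python
# Hypothetical keyword lists, not a validated sentiment lexicon.
POSITIVE = {"impressive", "useful", "solved", "tractable", "amazing"}
NEGATIVE = {"hype", "plateau", "unsolved", "overrated", "diminishing"}

def sentiment_score(comment: str) -> int:
    """Crude score: +1 per positive keyword present, -1 per negative one."""
    words = {w.strip(".,!?;:").lower() for w in comment.split()}
    return len(words & POSITIVE) - len(words & NEGATIVE)

# Hypothetical comments standing in for fetched HN data.
comments = [
    "This is genuinely impressive and useful day to day.",
    "Mostly hype; quality has hit a plateau.",
    "Hard problems remain unsolved, returns are diminishing.",
]
scores = [sentiment_score(c) for c in comments]
print(scores)  # [2, -2, -2]
```

Even a toy like this would at least turn "my perception" into a measurable ratio, though (as noted above) it can't correct for who chooses to comment in the first place.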