Edit: Summary for anyone who didn't follow this saga at the time: https://www.ignorance.ai/p/the-fable-of-reflection-70b
Shumer is at best a fool and at worst a con artist.
Edit: Summary for anyone who didn't follow this saga at the time: https://www.ignorance.ai/p/the-fable-of-reflection-70b
Shumer is at best a fool and at worst a con artist.
Note that nothing about that depends on it being a local or remote model, it was just less of a concern for local models in the past because most of them did not have tool calling. OpenClaw, for all the cool and flashy uses, is also basically an infinite generator for lethal trifecta problems because its whole pitch is combining your data with tools that can both read and write from the public internet.
I mean in 2 years the entire mentality shifted. Most people on HN were just completely and utterly wrong (also quite embarrassing if you read how self assured these people were, this is like 70 percent of HN at the time).
First AI is clearly not a stochastic parrot and second it hasn’t taken our jobs yet but we can all see that potential up ahead.
Now we get articles like this saying your skills will atrophy with AI because the entire industry is using it now.
I think it’s clear. Everyone’s skills will atrophy. This is the future. I fully expect in the coming decades that the generation after zoomers have never coded ever without the assistance of AI and they will have an even harder time finding jobs in software.
Also: because the change happened so fast you see tons of pockets of people who aren’t caught up yet. People who don’t realize that the above is the overarching reality. You’ll know you’re one of these people if AI hasn’t basically taken over your work place and you and your coworkers aren’t going all in on Claude or Codex. Give it another 2 years and everyone will flip here too.
Your memory of the discourse of that era has apparently been filtered by your brain in order to support the point you want to make. Nobody who thoughtlessly adopted an extreme position at a hinge point where the future was genuinely uncertain came out of that looking particularly good.
YouTube playlist: https://www.youtube.com/playlist?list=PLiPvV5TNogxIS4bHQVW4p...
Using Octave for a beginning ML class felt like the worst of both worlds - you got the awkward, ugly language of MATLAB without any of the upsides of MATLAB-the-product because it didn't have the GUI environment or the huge pile of toolbox functions. None of that is meant as criticism at Octave as a project, it's fine for what it is, it just ended up being more of a stumbling block for beginners than a booster in that specific context.
I understand what point you're trying to make, but Protasevich would have been a better example. Beware of whose airspace you fly over.
Interesting; I thought I'd read that even the very best players only average ~90% accuracy, whereas the best engines average 99.something%?
Edit: Even lower-level cheated games are rarely 100% accurate for the whole game, cheaters usually mix in some bad or natural moves knowing that the engine will let them win anyways. That's why analysis is usually on critical sections, if someone normally plays with a 900 rating but spikes to 100% accuracy every time there's a critical move where other options lose, that's a strong suggestion they're cheating. One of the skills of a strong GM is sniffing out situations like that and being able to calculate a line of 'only moves' under pressure, so it's not nearly as surprising when they pull it off.
The way people cheat online is by running a chess engine that analyzes the state of the board in their web browser/app and suggests moves and/or gives a +/- rating reflecting the balance of the game. Sometimes people run it on another device like their phone to evade detection, but the low-effort ways are a browser extension or background app that monitors the screen. The major online chess platforms are constantly/daily banning significant amounts of people trying to cheat in this way.
Chess.com and Lichess catch these cheaters using a variety of methods, some of which are kept secret to make it harder for cheaters to circumvent them. One obvious way is to automatically compare people's moves to the top few engine moves and look for correlations, which is quite effective for, say, catching people who are low-rated but pull out the engine to help them win games occasionally. It's not that good for top-level chess because a Magnus or Hikaru or basically anyone in the top few hundred players can bang out a series of extremely accurate moves in a critical spot - that's why they're top chess players, they're extremely good. Engine analysis can still catch high-level cheaters, but it often takes manual effort to isolate moves that even a world-champion-class human would not have come up with, and offers grounds for suspicion and further investigation rather than certainty.
For titled events and tournaments, Chess.com has what's effectively a custom browser (Proctor) that surveils players during their games, capturing their screen and recording the mics and cameras that Chess.com requires high-level players to make available to show their environment while they play. This is obviously extremely onerous for players, but there's often money on the line and players do not want to play against cheaters either so they largely put up with the inconvenience and privacy loss.
Despite all of the above, high-level online cheating still happens and some of it is likely not caught.
Edit: More information on Proctor here: https://www.chess.com/proctor
I use Opus 4.6 all day long and this is not my experience at all. Maybe if you're writing standard CRUD apps or other projects well-represented in the training data. Anyone who has written "real" software knows that it's lots of iterating, ambiguity and shifting/opposing requirements.
The article seems to be written in order to feed into some combination of hype/anxiety. If the author wants to make a more compelling case for their stance I would suggest they build and deploy some of this software they're supposedly getting the LLM to perfectly create.
Yes, it's a very useful tool, but these sort of technically-light puff pieces are pretty tiresome and reflect poorly on the people who author and promote them. Also, didn't this guy previously make up some benchmarks that turned out to be bogus? https://www.reddit.com/r/LocalLLaMA/comments/1fd75nm/out_of_...
https://www.newsweek.com/i-couldnt-play-rules-so-i-became-en...
It's kind of a sad read. He would benefit a lot from getting outside the startup bubble and talking to some people who do useful work for a living instead of riding internet fads and growthmaxxing via viral social media posts.