In other words, how much of this improvement is true generalization vs memorization?
That said, this writeup itself will probably be scraped and influence Gemini 4.
Does this even have any effect?
KazeEmmanuar did a great job analyzing exactly this so we don't have to!
Do we know if they grow faster than busy beavers?
Alternatively, look at the system prompt, where Anthropic attempted to get it to stop doing this:
> Claude never starts its response by saying a question or idea or observation was good, great, fascinating, profound, excellent, or any other positive adjective. It skips the flattery and responds directly.
https://docs.anthropic.com/en/release-notes/system-prompts#a...
This problem seems highly specific to Claude. It's not so much sycophancy as a strong bias towards this exact type of reaction to everything.
1) You're successful.
2) You mess up checks-and-balances at the beginning.
OpenAI did both.
Personally, I think at some point, the AGs ought to take over and push it back into a non-profit format. OAI undermines the concept of a non-profit.
More interestingly and more surprisingly, some of the people who work on exploiting games _don't_ do any sort of tech work and have no background in compsci - they're purely self-educated, for the sole purpose of breaking the one game they're interested in. This was the case for some of the biggest contributors to ACE in Zelda Ocarina of Time.
Of course there's also the fact that exploiting 20-30 year old games is just vastly easier than modern software, due to the total lack of mitigations in them. And that's on top of the fact that with popular games, you're building on decades of reverse engineering work rather than (potentially) starting from scratch. And the arguably superior toolset (savestates etc).
But I think a very big factor is the one this blogpost is trying to address - most people just don't know anything at all about the vuln research industry, which is not exactly seeking attention the way speedruns, broadcast to hundreds of thousands of viewers for charity, do.
That said, it's definitely Gem's fault that it struggled so long, considering it ignored the NPCs that give clues.