So this seems like a nonsensical test to me.
All 7 books come to ~1.75M tokens, so they don't quite fit yet. (At this rate of progress, mid-April should do it.) For now you can fit the first 4 books (~733K tokens).
Results: Opus 4.6 found 49 out of 50 officially documented spells across those 4 books. The only miss was "Slugulus Eructo" (a vomiting spell).
Freaking impressive!
https://www.wizardemporium.com/blog/complete-list-of-harry-p...
Why is this impressive?
Do you think it's actually ingesting the books and only using those as a reference? Is that how LLMs work at all? It seems more likely it's predicting these spell names from all the other references it has found on the internet, including lists of spells.
I did not ask about random companies that went under for whatever reason; I asked a specific question about users and revenue.
Uber, which you cite as a success, is only just starting to make any money, and any original investors are very unlikely to see a return given the huge amounts ploughed in.
MicroStrategy has transformed itself, same company, same founder, similar scam 20 years later, only this time they're peddling bitcoin as the bright new future. I'm surprised they didn't move on to GAI.
Qualcomm is now selling itself as an AI-first company. Is it one, or is it just trying to ride the next bubble?
Even if GAI becomes a roaring success, the prominent companies now are unlikely to be those with lasting success.
This LLM did it in (checks notes):
> Over nearly 2,000 Claude Code sessions and $20,000 in API costs
It may build, but does it boot? (That was also a significant and distinct next milestone. Also, will it blend?) Looks like yes!
> The 100,000-line compiler can build a bootable Linux 6.9 on x86, ARM, and RISC-V.
The next milestone is: is the generated code correct? The jury is still out on that one even for production compilers. And then there's the performance of the generated code.
> The generated code is not very efficient. Even with all optimizations enabled, it outputs less efficient code than GCC with all optimizations disabled.
Still a really cool project!
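On the correctness question: one standard technique is differential testing (the Csmith approach), where the same randomly generated program is built with a trusted compiler and with the compiler under test, and the runtime results are compared. A toy sketch of the idea in Python, with two function implementations standing in for the two compilers' binaries:

```python
import random

def reference(xs):
    """Stands in for a binary built by the trusted compiler (e.g. GCC)."""
    return sorted(xs)

def under_test(xs):
    """Stands in for a binary built by the compiler under test."""
    return sorted(xs, key=lambda x: x)  # same behaviour, different code path

def differential_test(trials=1000, seed=42):
    """Feed both 'binaries' random inputs and flag any divergence."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]
        if reference(list(xs)) != under_test(list(xs)):
            return f"mismatch on input {xs}"
    return f"{trials} trials, no divergence"

print(differential_test())  # → 1000 trials, no divergence
```

This only shows the harness shape; a real campaign would generate C programs and diff the output of the compiled binaries, which catches wrong-code bugs without needing a formal proof of correctness.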
I'm curious about your take on the references the GAI might have used to create such a project, and whether that matters.
How long did it take for horses to be super-seeded by cars?
How long did power tools take to become the norm for tradesmen?
This has gone unbelievably fast.
AI will get there and replace humans at many tasks; machine learning already has. I'm not completely sure that generative AI will be the route we take. It is certainly superficially convincing, but those three years have not in fact seen huge progress IMO - huge amounts of churn and marketing versions, yes, but not huge amounts of concrete progress or upheaval. Lots of money has been spent, for sure! It is telling, for me, that many of the real founders at OpenAI stepped away - and I don't think that's just Altman; they're skeptical of the current approach.
PS Superseded.
In my experience this isn't true. People just assume their code is wrong and mess with it until they inadvertently do something that works around the bug. I've personally reported 17 bugs in GCC over the last 2 years and there are currently 1241 open wrong-code bugs.
Here's an example of a simple to understand bug (not mine) in the C frontend that has existed since GCC 4.7: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105180
LLMs, on the other hand, are non-deterministic, unpredictable, and fuzzy by design. That makes them less than ideal for producing output which is provably correct - sure, you can generate output and then laboriously check it; some people find that useful, some have yet to.
It's a little like using Bitcoin to replace currencies - sure you can do that, but it includes design flaws which make it fundamentally unsuited to doing so. 10 years ago we had rabid defenders of these currencies telling us they would soon take over the global monetary system and replace it, nowadays, not so much.
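The "generate, then laboriously check the output" loop mentioned above can at least be mechanized when the check itself is deterministic. A minimal sketch, where `flaky_generate` is a hypothetical stand-in for an LLM API call and the deterministic check is simply "does it parse as JSON":

```python
import json
import random

random.seed(0)  # seeded only so this demo is reproducible

def flaky_generate(prompt: str) -> str:
    """Stand-in for an LLM call: non-deterministic, sometimes malformed."""
    if random.random() < 0.3:
        return '{"answer": 42'  # truncated JSON
    return '{"answer": 42}'

def generate_checked(prompt: str, retries: int = 5) -> dict:
    """Sample until the output passes a deterministic validator."""
    for _ in range(retries):
        out = flaky_generate(prompt)
        try:
            return json.loads(out)
        except json.JSONDecodeError:
            continue
    raise RuntimeError("no valid output after retries")

print(generate_checked("compute the answer")["answer"])  # → 42
```

Note the asymmetry this illustrates: the checker only catches the failures you thought to validate for, so it makes the fuzziness tolerable rather than making the output provably correct.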
Everyone sees the downsides, but the upside is the thing everyone is in denial about. It's like: yeah, there are downsides, but why is literally everyone using it?
At least Bitcoin transactions are deterministic.
Not many would want to use an AI currency (mostly works; always says "Oh, you are 100% right" after losing one's money).
In a similar way LLMs seem to me to be solving the wrong problem - an elegant and interesting solution, but a solution to the wrong problem (how can I fool humans into thinking the bot is generally intelligent), rather than the right problem (how can I create a general intelligence with knowledge of the world). It's not clear to me we can jump from the first to the second.