Well, those spec issues are usually not documented, or new engineers won't know where to find a full list of them. That means the architecturally insecure OSes might be more secure in specific areas due to all the investment put into them over time. So, recommending the "higher-security design" might actually lower security.
For techniques like Fil-C, the issues include abstraction-gap attacks and implementation problems. For the former, Fil-C's model might mismatch the legacy code in some ways (e.g., Ada/C FFI with trampolines). Also, the interactions between legacy code and Fil-C code might introduce new bugs, because an integration is essentially a new program. This has actually happened in a few research works.
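As a toy illustration of the abstraction gap (plain C with made-up names, not Fil-C's actual API or mechanism), consider a trampoline that launders a buffer through void* plus a caller-supplied length, so whatever the checked side knows about the buffer's real size never crosses the boundary:

    #include <stdio.h>
    #include <string.h>

    /* Legacy-style callback interface: an opaque pointer plus a length
       the caller merely promises is correct. */
    typedef void (*legacy_cb)(void *buf, size_t len);

    static void legacy_trampoline(legacy_cb cb, void *buf, size_t len) {
        cb(buf, len);              /* the callee sees only an opaque void* */
    }

    static void fill(void *buf, size_t len) {
        memset(buf, 'A', len);     /* overflows if len lies about buf      */
    }

    int main(void) {
        char small[8];
        /* Nothing at this boundary ties len to the real size of small;
           passing 64 here would be the kind of bug that only exists in
           the integrated program. */
        legacy_trampoline(fill, small, sizeof small);
        printf("%.*s\n", (int)sizeof small, small);
        return 0;
    }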
I haven't reviewed Fil-C myself: I've forgotten too much C, and the author was really clever. It might be hard to prove the absence of bugs in it. Even so, it might still be very helpful in securing C programs.
The minimum, though, is that all copyrighted works the supplier has legal access to can be copied, transformed arbitrarily, and used for training. They can also share those works, and transformed versions of them, with anyone else who already has legal access to that data. No contract, including terms of use, can override that. And they can freely scrape it, though perhaps with daily limits imposed to avoid destructive scraping.
That might be enough to collect, preprocess, and share datasets like The Pile, RefinedWeb, and uploaded content the host shares (e.g., The Stack, YouTube). We can do a lot with big models trained that way. We can also synthesize other data from them with less risk.
Now they add another training run on top of it that is, in principle, prone to the same issues, except they reward the model for factuality instead of likeability. This is cool, but why not apply the same reward strategy to the answer itself?
I don't consider that very intelligent or more emergent than other behaviors. Now, if nothing like that was in the training data (pure honesty, no confessions), it would be very interesting if it replied with lies and confessions, because it wasn't pretrained to lie or confess the way the above model likely was.
We have done grocery pickup for years but the pickup lanes are almost always empty while dozens of shoppers walk into the store.
To me, shopping for groceries by hand is a waste of time but it clearly has some utility for a lot of people.
I wonder if that inertia is making traditional grocery shopping stickier than it should be and disincentivizing optimization.
I hope consumer tastes will change because there’s no reason for us to all walk into a giant warehouse every week.
Finally, walking into stores lets you connect with people. Those who repent and follow Jesus Christ are told to share His Gospel with strangers so they can be forgiven and have eternal life. We're also to be good to them in general, listening and helping, from the short person reaching for items placed too high to the cashier who needs a friendly word.
We, along with non-believers, also get opportunities out of this when God makes us bump into the right people at the right time. They may become spouses, friends, or business partners; it's often called networking. However, Christians are to keep in mind God's sovereign control of every detail. Many of these encounters are one-time or temporary events or observations just meant to make our lives more interesting.
Most of the above isn't available in online ordering, which filters almost all of the human experience down to a narrow, efficient process a cheap AI could likely do. That process usually has no impact on eternity for anyone. Further, it has less impact on other people. And I have fewer of the experiences God designed us to have, including the bad ones that build our character, like patience and forgiveness.
So, while I prefer online shopping, I try to pray that God motivates me to shop in stores at times and do His will there. Many interesting things, including impacts on people, continue to happen. Some events hit a person so hard that, even as a non-believer, they know God was behind it. I'm grateful for the stores that provide these opportunities to us.
Alternatively, use a language like ZL that embeds C/C++ in a macro-supporting, high-level language (e.g., Scheme). Encode higher-level concepts in it and generate human-readable, low-level code. F* did this. Now you get C with higher-level features we can train AIs on.
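A rough sketch of the idea in plain C (made-up names, not ZL's or F*'s real syntax or tooling): describe a higher-level concept once, then mechanically emit readable, low-level C from it.

    #include <stdio.h>

    /* "Macro": expand a bounds-checked getter concept into ordinary C. */
    static void emit_checked_getter(const char *strct, const char *field) {
        printf("/* generated: bounds-checked getter for %s.%s */\n", strct, field);
        printf("int %s_get_%s(const struct %s *self, size_t i, int *out) {\n",
               strct, field, strct);
        printf("    if (self == NULL || i >= self->%s_len) return -1;\n", field);
        printf("    *out = self->%s[i];\n", field);
        printf("    return 0;\n");
        printf("}\n");
    }

    int main(void) {
        emit_checked_getter("point_list", "xs");
        return 0;
    }

The generated text is ordinary C anyone (or any model) can read, while the higher-level intent lives in the generator.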
So, I think they could default to doing it for small demonstrators.
https://arxiv.org/abs/2501.00663
https://arxiv.org/pdf/2504.13173
Is there any other company that's openly publishing their research on AI at this level? Google should get a lot of credit for this.
We post a lot of research on the mlscaling sub if you want to look back through it.
I don't know what's so special about this paper.
- They claim to use MLA to reduce the KV cache by 90%. Yeah, Deepseek invented that for Deepseek V2 (and also used it in V3, Deepseek R1, etc.); see the back-of-the-envelope sketch after this list.
- They claim to use a hybrid linear attention architecture. So does Deepseek V3.2 and that was weeks ago. Or Granite 4, if you want to go even further back. Or Kimi Linear. Or Qwen3-Next.
- They claimed to save a lot of money by not doing a full pre-train run for millions of dollars. Well, so did Deepseek V3.2... Deepseek hasn't done a full $5.6M pretraining run since Deepseek V3 in 2024. Deepseek R1 is just a $294k post-train on top of the expensive V3 pretrain run. Deepseek V3.2 is just a hybrid linear attention post-train run; I don't know the exact price, but it's probably just a few hundred thousand dollars as well.
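Back-of-the-envelope sketch of where that kind of KV-cache number comes from, with purely illustrative shapes rather than any model's real configuration: cache one small latent per token instead of full per-head K and V.

    #include <stdio.h>

    int main(void) {
        const double heads = 128, head_dim = 128;  /* assumed MHA shape      */
        const double latent_dim = 512;             /* assumed compressed dim */

        double full_kv = 2.0 * heads * head_dim;   /* K and V per token/layer */
        double latent  = latent_dim;               /* one latent per token    */

        printf("full KV per token per layer: %.0f values\n", full_kv);
        printf("latent per token per layer:  %.0f values\n", latent);
        printf("reduction:                   %.1f%%\n",
               100.0 * (1.0 - latent / full_kv));
        return 0;
    }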
Hell, GPT-5, o3, o4-mini, and gpt-4o are all post-trains on top of the same expensive pre-train run for gpt-4o in 2024. That's why they all have the same information cutoff date.
I don't really see anything new or interesting in this paper that Deepseek V3.2 hasn't already sort of done (just on a bigger scale). Not exactly the same, but is there anything amazingly new that's not in Deepseek V3.2?
Don't forget the billion dollars or so of GPUs they had access to, which they left out of that accounting. Also, the R&D cost of the Meta model they originally used. Then they added $5.6 million on top of that.