Elon's stuck on this 12-year-old-boy absurdity about "becoming interplanetary to save the species", as if Mars could ever be a practical lifeboat when we inevitably drive the planet into the ground or a meteor hits. It's... puerile fantasy.
This is such a key feature. Lots of people will tell you that you shouldn't use a relational database as a worker queue, but they inevitably miss out on how important transactions are for this - it's really useful to be able to say "queue this work if the transaction commits, don't queue it if it fails".
Brandur Leach wrote a fantastic piece on this a few years ago: https://brandur.org/job-drain - describing how, even if you have a separate queue system, you should still feed it by logging queue tasks to a temporary database table that can be updated as part of those transactions.
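To make the pattern concrete, here's a minimal sketch using Python's built-in sqlite3 (Brandur's post assumes Postgres, and all table/column names here are made up): the business write and the staged job row share one transaction, so a rolled-back signup can never leave a stray job behind.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, email TEXT)")
    conn.execute("CREATE TABLE staged_jobs (id INTEGER PRIMARY KEY, kind TEXT, args TEXT)")

    def signup(email: str) -> None:
        # One transaction for both writes: if the account INSERT fails and
        # rolls back, the welcome-email job is never staged either.
        with conn:  # sqlite3 commits on success, rolls back on exception
            conn.execute("INSERT INTO accounts (email) VALUES (?)", (email,))
            conn.execute(
                "INSERT INTO staged_jobs (kind, args) VALUES (?, ?)",
                ("send_welcome_email", email),
            )

    signup("user@example.com")
    # A separate drainer process then moves committed rows from staged_jobs
    # into the real queue and deletes them, so every job it enqueues is
    # backed by a committed transaction.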
Also, it uses the Claude models, but afaik it constantly changes which one it uses depending on the perceived difficulty.
Boots on the ground, baby.
Progress is so fast right now that anecdotes are sometimes more interesting than proper benchmarks. "Wow, it can do impressive thing X" is more interesting to me than a 4% gain on SWE-bench Verified.
In the early days of a startup, "this one user is spending 50 hours/week in our tool" is sometimes more interesting than global metrics like average time in app. In the early/fast days, the potential is more interesting than the current state. There's work to be done to make that one user's experience apply to everyone, but knowing that it can work is still a huge milestone.
A benchmark? Probably gamed. A guy made an app to right-click and convert an image? Prolly true; I have to assume it may have a lot of issues, but prima facie I just make a mental note that this is possible now.
Changing passwords relies on email 99% of the time anyway. So if you are using email+password to authenticate, you are basically doing magic links with extra steps.
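For illustration, a minimal magic-link sketch in Python (all names and the URL are hypothetical): it's the same "prove you control the inbox" step that password resets already rely on, just without the password in the middle.

    import secrets, time

    TOKENS: dict[str, tuple[str, float]] = {}  # token -> (email, expiry)

    def send_magic_link(email: str) -> None:
        token = secrets.token_urlsafe(32)
        TOKENS[token] = (email, time.time() + 15 * 60)  # valid 15 minutes
        print(f"emailing {email}: https://example.com/login?token={token}")

    def verify(token: str) -> str | None:
        # pop() makes the token single-use, like a password-reset link
        email, expiry = TOKENS.pop(token, (None, 0.0))
        return email if email and time.time() < expiry else None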
The central claim in particular is not proven, because a physical theory P need not be able to express, as an empirical physical fact, statements like "there exists a number G which, when interpreted as the text of a theory T, essentially states that the theory T itself is unprovable in the broader physical theory P".
Godel then latches onto that to create an alphabet of symbols which are mapped to numbers; thus formulas are even bigger numbers, and derivations are even bigger numbers still. So for any statement there should be a derivation that proves the statement true or a derivation that proves it false. Of course most statements will be false, but even then there will be a derivation showing so.
Then Godel does some clever manipulation to show that there will be some statements for which there can be no such derivation either way. But that does not need the physics theory to express things about itself. It only requires the theory to be mathematically complex enough (it'd be weird if a theory of everything were simpler than Robinson Arithmetic) and to have rules of derivation for its statements (i.e., that mechanical math can be applied to deduce the truth of the matter from the first principles of the theory).
Of course, the actual undecidable Godel number and the associated physical proposition would be immensely complex. But that is only because nobody has tried to improve on Godel's methodology of assigning numbers to propositions. He used what was simplest, prime factorization, because it was easy to reason about, but it results in astronomical numbers. There is no reason a better, less explosive way of encoding propositions couldn't be found, one that would let an undecidable Godel number be translated into something comprehensible.
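As a toy illustration (not Godel's exact scheme; the symbol table here is made up), the prime-factorization encoding in Python: the symbol with code c at position k contributes the k-th prime raised to c, which is why the numbers explode so fast.

    SYMBOLS = {"0": 1, "S": 2, "=": 3, "(": 4, ")": 5, "+": 6}

    def nth_prime(k: int) -> int:
        # naive nth-prime helper; fine for toy formula lengths
        count, n = 0, 1
        while count < k:
            n += 1
            if all(n % d for d in range(2, int(n**0.5) + 1)):
                count += 1
        return n

    def godel_number(formula: str) -> int:
        g = 1
        for pos, ch in enumerate(formula, start=1):
            g *= nth_prime(pos) ** SYMBOLS[ch]
        return g

    print(godel_number("0=0"))        # 2**1 * 3**3 * 5**1 = 270
    print(godel_number("S(0)=S(0)"))  # already a 26-digit number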
But this is largely unnecessary; Godel's proof forces the mathematical system to speak about itself and then abuses this reflection to create a contradiction. It means the system is not complete: there are statements in the system that cannot be proven from its first principles and derivation rules. The fact that the one Godel showed to exist is self-referential does not mean all the undecidable propositions _are_ self-referential. There could well be other, non-self-referential undecidable propositions that have a comprehensible physical interpretation.
And, regardless of whether the universe is a simulation or not, the physical theory will ultimately need to deal with this incompleteness.
If I want to create a React app with X number of pages, some Redux stores, auth, etc., then it can smash that out in minutes. I can say "now add X" and it'll do it, generally with good results.
But when it comes to maintaining existing systems, adding more complicated features, or needing to know business domain details, an LLM is usually not that great for me. They're still great as a code suggestion tool, finishing lines and functions. But as far as delivering whole features goes, they're pretty useless once you get past the easy stuff. And you'll spend as much time directing the LLM to do this kind of thing as you would just writing it yourself.
What I tend to do is write stubbed out code in the design I like, then I'll get an LLM to just fill in the gaps.
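As a hypothetical example of that workflow (all names invented), the human fixes the design — names, signatures, docstrings — and the LLM only fills in the bodies:

    from dataclasses import dataclass

    @dataclass
    class Invoice:
        customer_id: int
        line_items: list[tuple[str, float]]  # (description, amount)

    def total(invoice: Invoice) -> float:
        """Sum the line item amounts, rounded to cents."""
        raise NotImplementedError  # gap for the LLM to fill

    def overdue_reminder(invoice: Invoice) -> str:
        """Render a polite reminder email body for this invoice."""
        raise NotImplementedError  # gap for the LLM to fill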
These people who say LLMs make them 100x more productive are probably only working on greenfield stuff and haven't got to the hard bit yet.
Like everyone says, the first 90% is the easy bit. The last 10% is where you'll spend most of your time, and I don't see LLMs doing the hard bit that well currently.
It is something that future versions could fix, if the context an LLM can handle grows and if it could handle debugging itself. Right now it can do that in short bursts, and it is not bad at it, but it will quickly get distracted and do other things I did not ask for.
One of these problems has a technical fix that is only limited by money; the other does not.
Let's say you simulate a long museum hallway with some vases in it. Who holds what state? The basic game engine has the geometry, but once the player pushes a vase and moves it, the AI needs to inform the engine it did, and then, to draw the next frame, read from the engine first, update the position in the video feed, then feed it back to the engine again.
What happens if the state diverges? Who wins? If the AI wins, then... why have the engine at all?
It is possible, but then who controls physics? The engine, or the AI? The AI could have a different understanding of the details of the vase. What happens if the vase has water inside? Who simulates that? What happens if the AI decides to break the vase? Who simulates the AI?
I don't doubt that some sort of scratchpad to keep track of stuff in-game would be useful, but I suspect the researchers are expecting the AI to keep track of everything in its own "head" because that's the most flexible solution.
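One way to make the "who wins" question concrete (a sketch, with all names invented): make the engine authoritative, so the AI renderer only reads engine state and proposes interactions that the engine validates and applies.

    from dataclasses import dataclass, field

    @dataclass
    class Vase:
        position: tuple[float, float, float]
        broken: bool = False

    @dataclass
    class Engine:
        vases: list[Vase] = field(default_factory=list)

        def apply(self, proposal: dict) -> None:
            # The engine owns the physics: it accepts or rejects whatever
            # the AI proposes, so state can never silently diverge.
            vase = self.vases[proposal["vase"]]
            if proposal["action"] == "push" and not vase.broken:
                x, y, z = vase.position
                dx, dy, dz = proposal["delta"]
                vase.position = (x + dx, y + dy, z + dz)
            elif proposal["action"] == "break":
                vase.broken = True

    def render_frame(engine: Engine) -> None:
        # The AI conditions the next video frame on this snapshot, instead
        # of keeping track of everything in its own "head".
        print([(v.position, v.broken) for v in engine.vases])

    engine = Engine(vases=[Vase(position=(0.0, 0.0, 0.0))])
    engine.apply({"vase": 0, "action": "push", "delta": (0.1, 0.0, 0.0)})
    render_frame(engine)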