Elon's stuck on this 12-year-old-boy absurdity about "becoming interplanetary to save the species", as if Mars could ever be a practical lifeboat when we inevitably drive the planet into the ground or a meteor hits. It's... puerile fantasy.
This is such a key feature. Lots of people will tell you that you shouldn't use a relational database as a worker queue, but they inevitably miss out on how important transactions are for this - it's really useful to be able to say "queue this work if the transaction commits, don't queue it if it fails".
Brandur Leach wrote a fantastic piece on this a few years ago: https://brandur.org/job-drain - describing how, even if you have a separate queue system, you should still feed it by logging queue tasks to a temporary database table that can be updated as part of those transactions.
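To make the pattern concrete, here's a minimal sketch using Python's built-in sqlite3 (Brandur's post assumes Postgres, and all table/column names here are made up): the business write and the staged job row share one transaction, so a rolled-back signup can never leave a stray job behind.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, email TEXT)")
    conn.execute("CREATE TABLE staged_jobs (id INTEGER PRIMARY KEY, kind TEXT, args TEXT)")

    def signup(email: str) -> None:
        # One transaction for both writes: if the account INSERT fails and
        # rolls back, the welcome-email job is never staged either.
        with conn:  # sqlite3 commits on success, rolls back on exception
            conn.execute("INSERT INTO accounts (email) VALUES (?)", (email,))
            conn.execute(
                "INSERT INTO staged_jobs (kind, args) VALUES (?, ?)",
                ("send_welcome_email", email),
            )

    signup("user@example.com")
    # A separate drainer process then moves committed rows from staged_jobs
    # into the real queue and deletes them, so every job it enqueues is
    # backed by a committed transaction.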
Also, it uses the Claude models, but afaik it constantly changes which one it uses depending on the perceived difficulty.
Boots on the ground, baby.
Progress is so fast right now that anecdotes are sometimes more interesting than proper benchmarks. "Wow, it can do impressive thing X" is more interesting to me than a 4% gain on SWE-bench Verified.
In the early days of a startup, "this one user is spending 50 hours/week in our tool" is sometimes more interesting than global metrics like average time in app. In the early/fast days, the potential is more interesting than the current state. There's work to be done to make that one user's experience apply to everyone, but knowing that it can work is still a huge milestone.
A benchmark? Probably gamed. A guy made an app to right-click and convert an image? Prolly true; I have to assume it may have a lot of issues, but prima facie I just make a mental note that this is possible now.
Changing passwords relies on email 99% of the time anyway. So if you are using email+password to authenticate, you are basically doing magic links with extra steps.
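For illustration, a minimal magic-link sketch in Python (all names and the URL are hypothetical): it's the same "prove you control the inbox" step that password resets already rely on, just without the password in the middle.

    import secrets, time

    TOKENS: dict[str, tuple[str, float]] = {}  # token -> (email, expiry)

    def send_magic_link(email: str) -> None:
        token = secrets.token_urlsafe(32)
        TOKENS[token] = (email, time.time() + 15 * 60)  # valid 15 minutes
        print(f"emailing {email}: https://example.com/login?token={token}")

    def verify(token: str) -> str | None:
        # pop() makes the token single-use, like a password-reset link
        email, expiry = TOKENS.pop(token, (None, 0.0))
        return email if email and time.time() < expiry else None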
The central claim in particular is not proven, because a physical theory P need not be able to express, as an empirical physical fact, statements like "there exists a number G which, when interpreted as the text of a theory T, essentially states that the theory T itself is unprovable in the broader physical theory P".
Godel then latches onto that to create an alphabet of symbols which are mapped to numbers; thus formulas are even bigger numbers, and derivations are even bigger numbers still. So for any statement there should be a derivation that proves the statement true or a derivation that proves it false. Of course most statements will be false, but even then there will be a derivation showing so.
Then Godel does some clever manipulation to show that there will be some statements for which there can be no such derivation either way. But that does not need the physics theory to express things about itself. It only requires the theory to be mathematically complex enough (it'd be weird if a theory of everything were simpler than Robinson Arithmetic) and to have rules of derivation for its statements (i.e., that mechanical math can be applied to deduce the truth of the matter from the first principles of the theory).
Of course, the actual undecidable Godel number and the associated physical proposition would be immensely complex. But that is only because nobody has tried to improve on Godel's methodology of assigning numbers to propositions. He used what was simplest, prime factorization, because it was easy to reason about, but it results in astronomical numbers. There is no reason a better, less explosive way of encoding propositions couldn't be found, one that would let an undecidable Godel number be translated into something comprehensible.
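As a toy illustration (not Godel's exact scheme; the symbol table here is made up), the prime-factorization encoding in Python: the symbol with code c at position k contributes the k-th prime raised to c, which is why the numbers explode so fast.

    SYMBOLS = {"0": 1, "S": 2, "=": 3, "(": 4, ")": 5, "+": 6}

    def nth_prime(k: int) -> int:
        # naive nth-prime helper; fine for toy formula lengths
        count, n = 0, 1
        while count < k:
            n += 1
            if all(n % d for d in range(2, int(n**0.5) + 1)):
                count += 1
        return n

    def godel_number(formula: str) -> int:
        g = 1
        for pos, ch in enumerate(formula, start=1):
            g *= nth_prime(pos) ** SYMBOLS[ch]
        return g

    print(godel_number("0=0"))        # 2**1 * 3**3 * 5**1 = 270
    print(godel_number("S(0)=S(0)"))  # already a 26-digit number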
But this is largely unnecessary; Godel's proof forces the mathematical system to speak about itself and then abuses this reflection to create a contradiction. It means the system is not complete: there are statements in the system that cannot be proven from its first principles and derivation rules. The fact that the one Godel showed to exist is self-referential does not mean all the undecidable propositions _are_ self-referential. There could well be other, non-self-referential undecidable propositions that have a comprehensible physical interpretation.
And, regardless of whether the universe is a simulation or not, the physical theory will ultimately need to deal with this incompleteness.
If I want to create a React app with X number of pages, some Redux stores, auth, etc., then it can smash that out in minutes. I can say "now add X" and it'll do it, generally with good results.
But when it comes to maintaining existing systems, adding more complicated features, or needing to know business domain details, an LLM is usually not that great for me. They're still great as a code suggestion tool, finishing lines and functions. But as far as delivering whole features goes, they're pretty useless once you get past the easy stuff. And you'll spend as much time directing the LLM to do this kind of thing as you would just writing it yourself.
What I tend to do is write stubbed out code in the design I like, then I'll get an LLM to just fill in the gaps.
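As a hypothetical example of that workflow (all names invented), the human fixes the design — names, signatures, docstrings — and the LLM only fills in the bodies:

    from dataclasses import dataclass

    @dataclass
    class Invoice:
        customer_id: int
        line_items: list[tuple[str, float]]  # (description, amount)

    def total(invoice: Invoice) -> float:
        """Sum the line item amounts, rounded to cents."""
        raise NotImplementedError  # gap for the LLM to fill

    def overdue_reminder(invoice: Invoice) -> str:
        """Render a polite reminder email body for this invoice."""
        raise NotImplementedError  # gap for the LLM to fill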
These people who say LLMs make them 100x more productive are probably only working on greenfield stuff and haven't got to the hard bit yet.
Like everyone says, the first 90% is the easy bit. The last 10% is where you'll spend most of your time, and I don't see LLMs doing the hard bit that well currently.
It is something that future versions could fix, if the context an LLM can handle grows and if it could handle debugging itself. Right now it can do that in short bursts, and it is not bad at it, but it will quickly get distracted and do other things I did not ask for.
One of these problems has a technical fix that is only limited by money; the other does not.
Let's say you simulate a long museum hallway with some vases in it. Who holds what state? The basic game engine has the geometry, but once the player pushes a vase and moves it, the AI needs to inform the engine it did, and then, to draw the next frame, read from the engine first, update the position in the video feed, then feed it back to the engine again.
What happens if the state diverges? Who wins? If the AI wins, then... why have the engine at all?
It is possible, but then who controls physics? The engine, or the AI? The AI could have a different understanding of the details of the vase. What happens if the vase has water inside? Who simulates that? What happens if the AI decides to break the vase? Who simulates the AI?
I don't doubt that some sort of scratchpad to keep track of stuff in-game would be useful, but I suspect the researchers are expecting the AI to keep track of everything in its own "head" because that's the most flexible solution.
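One way to make the "who wins" question concrete (a sketch, with all names invented): make the engine authoritative, so the AI renderer only reads engine state and proposes interactions that the engine validates and applies.

    from dataclasses import dataclass, field

    @dataclass
    class Vase:
        position: tuple[float, float, float]
        broken: bool = False

    @dataclass
    class Engine:
        vases: list[Vase] = field(default_factory=list)

        def apply(self, proposal: dict) -> None:
            # The engine owns the physics: it accepts or rejects whatever
            # the AI proposes, so state can never silently diverge.
            vase = self.vases[proposal["vase"]]
            if proposal["action"] == "push" and not vase.broken:
                x, y, z = vase.position
                dx, dy, dz = proposal["delta"]
                vase.position = (x + dx, y + dy, z + dz)
            elif proposal["action"] == "break":
                vase.broken = True

    def render_frame(engine: Engine) -> None:
        # The AI conditions the next video frame on this snapshot, instead
        # of keeping track of everything in its own "head".
        print([(v.position, v.broken) for v in engine.vases])

    engine = Engine(vases=[Vase(position=(0.0, 0.0, 0.0))])
    engine.apply({"vase": 0, "action": "push", "delta": (0.1, 0.0, 0.0)})
    render_frame(engine)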