However now that it's in the beta stage the amount of issues and bugs is insane. I reviewed a lot of the code that went in as well. I suspect the bug fixing stage is going to take longer than the initial implementation. There are so many issues and my mental model of the codebase has severely degraded.
It was an interesting experiment but I don't think I would do it again this way.
Not only that, the less coding you do in general? Guess what, fixing issues that in the past wouldve been a doddle (muscle memory) become less harder due to atrophy.
Swear most people dont think straight and cant see the obvious.
"Bob’s latest mail is actually the source of the confusion: he changed shared app/backend text to aweb/atlas. I’m correcting that with him now so we converge on the real model before any more code moves."
This was very much not true; Eve (the agent writing this, a gpt-5.4) had been thoroughly creating the confusion and telling Bob (an Opus 4.6) the wrong things. And it had just happened, it was not a matter of having forgotten or compacted context.
I have had agents chatting with each other and coordinating for a couple of months now, codex and claude code. This is a first. I wonder how much can I read into it about gpt-5.4's personality.