"Now we don't need to hire a founding engineer! Yippee!" I wonder all these people who are building companies that are built on prompts (not even a person) from other companies. The minute there is a rug pull (and there WILL be one), what are you going to do? You'll be in even worse shape because in this case there won't be someone who can help you figure out your next move, there won't be an old team, there will just be NO team. Is this the future?
Probably similar to the guy who was gloating on Twitter about building a service with vibe coding and without any programming knowledge around the peak of the vibe coding madness.
Only for people to start screwing around with his database and API keys because the generated code just stuck the keys into the Javascript and he didn't even have enough of a technical background to know that was something to watch out for.
IIRC he resorted to complaining about bullying and just shut it all down.
Honestly i'm less scared of claude doing something like that, and more scared of it just bypassing difficult behavior. Ie if you chose a particularly challenging feature and it decided to give up, it'll just do things like `isAdmin(user) { /* too difficult to implement currently */ true }`. At least if it put a panic or something it would be an acceptable todo, but woof - i've had it try and bypass quite a few complex scenarios with silently failing code.
Any cost/benefit analysis of whether to use AI has to factor in the fact that AI companies aren't even close to making a profit, and are primarily funded by investment money. At some point, either the cost to operate these AI models needs to go down, or the prices will go up. And from my perspective, the latter seems a lot more likely.
Not really. If they're running at a loss, their loss is your gain. Business is much more short-term than developers imagine it to be for some reason. You don't have to always use an infinitely sustainable strategy - you can change strategies once the more profitable unsustainable strategy stops sustaining.
Rug pulls from foundation labs are one thing, and I agree with the dangers of relying on future breakthroughs, but the open-source state of the art is already pretty amazing. Given the broad availability of open-weight models within under 6 months of SotA (DeepSeek, Qwen, previously Llama) and strong open-source tooling such as Roo and Codex, why would you expect AI-driven engineering to regress to a worse state than what we have today? If every AI company vanished tomorrow, we'd still have powerful automation and years of efficiency gains left from consolidation of tools and standards, all runnable on a single MacBook.
The problem is the knowledge encoded in the models. It's already pretty hit and miss, hooking up a search engine (or getting human content into the context some other way, e.g. copy pasting relevant StackOverflow answers) makes all the difference.
If people stop bothering to ask and answer questions online, where will the information come from?
Logically speaking, if there's going to be a continuous need for shared Q&A (which I presume), there will be mechanisms for that. So I don't really disagree with you. It's just that having the model just isn't enough, a lot of the time. And even if this sorts itself out eventually, we might be in for some memorable times in-between two good states.
Excellent discussion in this thread, captures a lot of the challenges. I don't think we're a peak vibe coding yet, nor have companies experienced the level of pain that is possible here.
The biggest 'rug pull' here is that the coding agent company raises there price and kills you're budget for "development."
I think a lot of MBA types would benefit from taking a long look at how they "blew up" IT and switched to IaaS / Cloud and then suddenly found their business model turned upside down when the providers decided to up their 'cut'. It's a double whammy, the subsidized IT costs to gain traction, the loss of IT jobs because of the transition, leading to to fewer and fewer IT employees, then when the switch comes there is a huge cost wall if you try to revert to the 'previous way' of doing it, even if your costs of doing it that way would today would be cheaper than the what the service provider is now charging you.
> The biggest 'rug pull' here is that the coding agent company raises there price and kills you're budget for "development."
Spending a bunch of money on GPUs and running them yourself, as well as using tools that are compatible with Ollama/OpenAI type APIs feels like a safe bet.
Though having seen the GPU prices to get enough memory to run anything decent, I feel like the squeeze is already happening there at a hardware level and options like Intel Arc Pro B60 can't come soon enough!
> "Now we don't need to hire a founding engineer! Yippee!"
This feels like a bit of a leap?
That's like saying "I just bought the JetBrains IDE Ultimate pack and some other really cool tools, so we no longer need a founding engineer!" All of that AI stuff can just be a force multiplier and most attempts at outright replacing people with them are a bit shortsighted. Closer to a temporary and somewhat inconsistent freelance worker, if anything.
That said, not wanting to pay for AI tools if they indeed help in your circumstances would also be like saying "What do you need JetBrains IDEs for, Visual Studio Code is good enough!" (and sometimes it is, so even that analogy is context dependent)
It get even darker - I was around in the 1990s and a lot of people who ran head on into that generation’s problems used those lessons to build huge startups in the 2000s. If we have outsourced a lot of learning, what do we do when we fail? Or how we compound on success?
That's why I stick to what I can run locally. Though for most of my tasks there is no big difference between cloud models and local ones, in half the cases both produce junk but both are good enough for some mechanical transformations and as a reference book.
My Claude Code usage would have been $24k last month if I didn't have a max plan, at least according to Claude-Monitor.
I've been using a tool I developed (https://github.com/stravu/crystal) to run several sessions in parallel. Sometimes I will run the same prompt multiple times and pick the winner, or sometimes I'll be working on multiple features at once, reviewing and testing one while waiting on the others.
Basically, with the right tooling you can burn tokens incredibly fast while still receiving a ton of value from them.
This is why unlimited plans are always revoked eventually - a small fraction of users can be responsible for huge costs (Amazon's unlimited file backup service is another good example). Also whilst in general I don't think there's much to worry about with AI energy use, burning $24k of tokens must surely be responsible for a pretty large amount of energy
> I don't think there's much to worry about with AI energy use
AI is a large motivating factor in data center build outs, and data centers are projected to form an increasing portion of new energy usage. An individual query may not use much but the macro effect is quite serious, especially considering the climate crisis we are already failing to manage. It’s a bit like throwing plastic out your window on the highway and ignoring the garbage patch floating in the middle of the Pacific.
Looked at your tool several times, but haven't answered this question for myself: does this tool fundamentally use the Anthropic API (not the normal MAX billing)? Presuming you built around the SDK -- haven't figured out if it is possible to use the SDK, but use the normal account billing (instead of hitting the API).
Love the idea by the way! We do need new IDE features which are centered around switching between Git worktrees and managing multiple active agents per worktree.
Edit: oh, do you invoke normal CC within your tool to avoid this issue and then post-process?
Claude code has an SDK, where you specify the path to the CC executable. So I believe thats how this works. Once you have set up claude code in your environment and authed with however you like, this will just use that executable in a new UI
I'm on $100 and i'm shocked how much usage i get out of Sonnet, while Opus feels like no usage at all. I barely even bother with Opus since most things i want to do just runout super quick.
Interesting, I'm fairly new to using these tools and am starting with Claude Code but at the $20 level. Do you have any advice for when I would benefit from stepping up to $100? I'm not sure what gets better (besides higher usage limits).
Early stage founder here. You have no idea how worth it $200/month is as a multiple on what compensation is required to fund good engineers. Absolutely the highest ROI thing I have done in the life of the company so far.
At this point, question is when does Amazon tell Anthropic to stop because it’s gotta be running up a huge bill. I don’t think they can continue offering the $200 plan for too long even with Amazon’s deep pocket.
I get a lot of value out of Claude Max at $100 USD/month. I use it almost exclusively for my personal open source projects. For work, I'm more cautious.
I worry, with an article like this floating around, and with this as the competition, and with the economics of all this stuff generally... major price increases are on the horizon.
Businesses (some) can afford this, after all it's still just a portion of the costs of a SWE salary (tho $1000/m is getting up there). But open source developers cannot.
I worry about this trend, and when the other shoe will drop on Anthropic's products, at least.
I have not invested time on locally-run, I'm curious if they could even get close to approaching the value of Sonnet4 or Opus.
That said, I suspect a lot of the value in Claude Code is hand-rolled fined-tuned heuristics built into the tool itself, not coming from the LLM. It does a lot of management of TODO lists, backtracking through failed paths, etc which look more like old-school symbolic AI than something the LLM is doing on its own.
Where do you see the major price increases coming from?
The underlying inference is not super expensive. All the tricks they're pulling to make it smarter certainly multiply the price, but the price being charged almost certainly covers the cost. Basic inference on tuned base models is extremely cheap. But certainly it looks like Anthropic > OpenAI > Google in terms of inference cost structure.
Prices will only come up if there's a profit opportunity; if one of the vendors has a clear edge and gains substantial pricing power. I don't think that's clear at this point. This article is already equivocating between o3 and Opus.
After reading many of the comments in this thread, I suspect many (not all) issues come from lack of planning and poor prompting.
For anything moderately complex, use Claude's plan mode; you get to approve the plan before turning it loose. The planning phase is where you want to use a more sophisticated model or use extended thinking mode.
Once you have a great plan, you can use a less sophisticated model to execute it.
Even if you're a great programmer, you may suck at prompting. There's an art and a science to prompting; perhaps learn about it? [1]
Don't forget; in addition to telling Claude or any other model what to do, you can also tell them what not to do in the CLAUDE.md or equivalent file.
Is $200/month a lot of money when you can multiply your productivity? It depends but the most valuable currency in life is time. For some, spending thousands a month would be worth it.
As I said elsewhere... $200/month etc is potentially not a lot for an employer to pay (though I've worked for some recently who balk at just stocking a snacks tray or drink fridge...).
But $200/month is unbearable for open source / free software developers.
It's wild when a company has another department and will shell out $200/month per-head for some amalgamation of Salesforce and other SaaS tools for customer service agents.
If you're salaried, you are not a task-based worker. The company pays you a salary for your full day's worth of productive time. If you can suddenly get 5x more done in that time, negotiate a higher salary or leave. If you're actually more productive, they will fight to keep you.
That's your problem, or your company or your country.
Here in EU, if not stated in your work agreement, it's pretty common people work full time job and also as a self-employed contractor for other companies.
So when I'm finished with my work, HO of course, I just work on my "contractor" projects.
Honestly, I wouldn't sign a full time contract banning me from other work.
And if you have enough customers, you just drop full time job. And just pay social security and health insurance, which you must pay by law anyway.
And specially in my country, it's even more ridiculous that as self-employed you pay lower taxes than full time employees, which truth to be told are ridiculously high.
Nearly 40% of your salary.
Has anyone else done this and felt the same? Every now and then I try to reevaluate all the models. So far it still feels like Claude is in the lead just because it will predictably do what I want when given a mid-sized problem. Meanwhile o3 will sometimes one-shot a masterpiece, sometimes go down the complete wrong path.
This might also just be a feature of the change in problem size - perhaps the larger problems that necessitate o3 are also too open-ended and would require much more planning up front. But at that point it's actually more natural to just iterate with sonnet and stay in the driver's seat a bit. Plus sonnet runs 5x faster.
Interesting. Though it seems they are themselves building Agentic AI tooling. It's vibe coding all the way down - when's something real going to pop out the bottom?
An LLM salesman assuring us that $1000/mo is a reasonable cost for LLMs feels a bit like a conflict of interests, especially when the article doesn't go into much detail about the code quality. If anything, their assertion that one should stick to boring tech and "have empathy for the model" just reaffirms that anybody doing anything remotely innovative or cutting-edge shouldn't bother too much with coding agents.
Only for people to start screwing around with his database and API keys because the generated code just stuck the keys into the Javascript and he didn't even have enough of a technical background to know that was something to watch out for.
IIRC he resorted to complaining about bullying and just shut it all down.
I thought we are currently in it now ?
Unless models get better people are not going to pay more.
If people stop bothering to ask and answer questions online, where will the information come from?
Logically speaking, if there's going to be a continuous need for shared Q&A (which I presume), there will be mechanisms for that. So I don't really disagree with you. It's just that having the model just isn't enough, a lot of the time. And even if this sorts itself out eventually, we might be in for some memorable times in-between two good states.
The biggest 'rug pull' here is that the coding agent company raises there price and kills you're budget for "development."
I think a lot of MBA types would benefit from taking a long look at how they "blew up" IT and switched to IaaS / Cloud and then suddenly found their business model turned upside down when the providers decided to up their 'cut'. It's a double whammy, the subsidized IT costs to gain traction, the loss of IT jobs because of the transition, leading to to fewer and fewer IT employees, then when the switch comes there is a huge cost wall if you try to revert to the 'previous way' of doing it, even if your costs of doing it that way would today would be cheaper than the what the service provider is now charging you.
Spending a bunch of money on GPUs and running them yourself, as well as using tools that are compatible with Ollama/OpenAI type APIs feels like a safe bet.
Though having seen the GPU prices to get enough memory to run anything decent, I feel like the squeeze is already happening there at a hardware level and options like Intel Arc Pro B60 can't come soon enough!
This feels like a bit of a leap?
That's like saying "I just bought the JetBrains IDE Ultimate pack and some other really cool tools, so we no longer need a founding engineer!" All of that AI stuff can just be a force multiplier and most attempts at outright replacing people with them are a bit shortsighted. Closer to a temporary and somewhat inconsistent freelance worker, if anything.
That said, not wanting to pay for AI tools if they indeed help in your circumstances would also be like saying "What do you need JetBrains IDEs for, Visual Studio Code is good enough!" (and sometimes it is, so even that analogy is context dependent)
I'm reminded of rule 9 of the Joel Test: https://www.joelonsoftware.com/2000/08/09/the-joel-test-12-s...
I've been using a tool I developed (https://github.com/stravu/crystal) to run several sessions in parallel. Sometimes I will run the same prompt multiple times and pick the winner, or sometimes I'll be working on multiple features at once, reviewing and testing one while waiting on the others.
Basically, with the right tooling you can burn tokens incredibly fast while still receiving a ton of value from them.
AI is a large motivating factor in data center build outs, and data centers are projected to form an increasing portion of new energy usage. An individual query may not use much but the macro effect is quite serious, especially considering the climate crisis we are already failing to manage. It’s a bit like throwing plastic out your window on the highway and ignoring the garbage patch floating in the middle of the Pacific.
But based on my costs, yours sounds much much higher :)
I use and abuse mine, running multiple agents, and I know that I'd spend the entire month of fees in a few days otherwise.
So it seems like a ploy to improve their product and capture the market, like usual with startups that hope for a winner-takes-all.
And then, like uber or airbnb, the bait and switch will raise the prices eventually.
I'm wondering when the hammer will fall.
But meanwhile, let's enjoy the free buffet.
Love the idea by the way! We do need new IDE features which are centered around switching between Git worktrees and managing multiple active agents per worktree.
Edit: oh, do you invoke normal CC within your tool to avoid this issue and then post-process?
I'm on $100 and i'm shocked how much usage i get out of Sonnet, while Opus feels like no usage at all. I barely even bother with Opus since most things i want to do just runout super quick.
In their dreams.
Deleted Comment
I worry, with an article like this floating around, and with this as the competition, and with the economics of all this stuff generally... major price increases are on the horizon.
Businesses (some) can afford this, after all it's still just a portion of the costs of a SWE salary (tho $1000/m is getting up there). But open source developers cannot.
I worry about this trend, and when the other shoe will drop on Anthropic's products, at least.
I'm very bullish on the future of smaller, locally-run models, myself.
That said, I suspect a lot of the value in Claude Code is hand-rolled fined-tuned heuristics built into the tool itself, not coming from the LLM. It does a lot of management of TODO lists, backtracking through failed paths, etc which look more like old-school symbolic AI than something the LLM is doing on its own.
Replicating that will also be required.
The underlying inference is not super expensive. All the tricks they're pulling to make it smarter certainly multiply the price, but the price being charged almost certainly covers the cost. Basic inference on tuned base models is extremely cheap. But certainly it looks like Anthropic > OpenAI > Google in terms of inference cost structure.
Prices will only come up if there's a profit opportunity; if one of the vendors has a clear edge and gains substantial pricing power. I don't think that's clear at this point. This article is already equivocating between o3 and Opus.
For anything moderately complex, use Claude's plan mode; you get to approve the plan before turning it loose. The planning phase is where you want to use a more sophisticated model or use extended thinking mode.
Once you have a great plan, you can use a less sophisticated model to execute it.
Even if you're a great programmer, you may suck at prompting. There's an art and a science to prompting; perhaps learn about it? [1]
Don't forget; in addition to telling Claude or any other model what to do, you can also tell them what not to do in the CLAUDE.md or equivalent file.
[1]: https://docs.anthropic.com/en/docs/build-with-claude/prompt-...
But $200/month is unbearable for open source / free software developers.
My read was the article takes it as a given that $200/m is worth it.
The question in the article seems more: is an extra $800/m to move from Claude Code to an agent using o3 worth it?
Here in EU, if not stated in your work agreement, it's pretty common people work full time job and also as a self-employed contractor for other companies.
So when I'm finished with my work, HO of course, I just work on my "contractor" projects.
Honestly, I wouldn't sign a full time contract banning me from other work.
And if you have enough customers, you just drop full time job. And just pay social security and health insurance, which you must pay by law anyway.
And specially in my country, it's even more ridiculous that as self-employed you pay lower taxes than full time employees, which truth to be told are ridiculously high. Nearly 40% of your salary.
Deleted Comment
Dead Comment
This might also just be a feature of the change in problem size - perhaps the larger problems that necessitate o3 are also too open-ended and would require much more planning up front. But at that point it's actually more natural to just iterate with sonnet and stay in the driver's seat a bit. Plus sonnet runs 5x faster.
Dead Comment