Your recommendation makes sense as a strategy to follow ahead of time, before you're in that flow state. But now you're relying on people to have known about the question beforehand, and have this strategy worked out ahead of time.
If you're going to rely on this so heavily, maybe you should make that strategy more official, and surface it to users ahead of time - maybe in some kind of security configuration wizard or something. Relying on them to interrupt flow and work it out is asking too much when it's a security question that doesn't have obvious implications.
The year has 86400*365 = 31536000 seconds. Thus 63072000000 tokens can be generated. As pricing is usually given per 1M tokens generated, this is 63072 such packages.
Now let's write off the investment over 3 years: 250,000/63072 = 3.96. So almost $4 per 1M tokens generated, with prompt processing included.
Model was a Deepseek 671B 32B MoE.
Looks to me that $20 for a month of coding is not very sustainable - let's enjoy the party while VCs are financing it! And keep an eye on your consumption...
Electricity costs seem negligible at ~$10,000 per year at 10cts per kWh, but the overall cost would be ~10% higher if electricity is more like 30cts, as it is in Europe.
Edit: as other commenters pointed out, the 2200t/s figure is per single GPU, so the result needs to be divided by 16: $4/16 = $0.25. This actually somewhat matches the DeepSeek API pricing.
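For anyone who wants to plug in their own numbers, here is a rough sketch of the arithmetic above. It assumes the 2,000 t/s implied by the 63072000000 tokens/year figure, and it treats $250,000 as the yearly write-off, as the division above does. The function name and parameters are made up for illustration.

```python
SECONDS_PER_YEAR = 86_400 * 365  # 31,536,000


def cost_per_million_tokens(annual_cost_usd, tokens_per_second, gpus=1):
    """Yearly cost divided by the number of 1M-token packages generated per year."""
    tokens_per_year = tokens_per_second * gpus * SECONDS_PER_YEAR
    packages_per_year = tokens_per_year / 1_000_000
    return annual_cost_usd / packages_per_year


# Original estimate: throughput treated as the total for the whole machine.
original = cost_per_million_tokens(250_000, 2_000)

# After the edit: the same throughput is per GPU, and there are 16 GPUs.
corrected = cost_per_million_tokens(250_000, 2_000, gpus=16)

print(f"original:  ${original:.2f} per 1M tokens")
print(f"corrected: ${corrected:.2f} per 1M tokens")
```

Adding electricity to `annual_cost_usd` (e.g. 250_000 + 10_000) shows how little it moves the result at these rates.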
The VC money is there until they can solve the optimization problems
“if the user configures ‘always allow’ for any command”
<system-reminder> IMPORTANT: this context may or may not be relevant to your tasks. You should not respond to this context unless it is highly relevant to your task. </system-reminder>
Perhaps a small proxy between Claude Code and the API to enforce following CLAUDE.md may improve things… I may try this