This is the kind of pricing that I expect most AI companies are gonna try to push for, and it might get even more expensive with time. When you see the delta between what OpenAI is currently burning and what they bring home, the sweet spot is going to be hard to find.
Whether you find that you get $250 worth out of that subscription is going to be the big question
I agree, and the problem is that "value" != "utilization".
It costs the provider the same whether the user is asking for advice on changing a recipe or building a comprehensive project plan for a major software product - but the latter provides much more value than the former.
How can you extract an optimal price from the high-value use cases without making it prohibitively expensive for the low-value ones?
Worse, the "low-value" use cases likely influence public perception a great deal. If you drive the general public off your platform in an attempt to extract value from the professionals, your platform may never grow to the point that the professionals hear about it in the first place.
I wonder who will be the first to bite the bullet and try charging different rates for LLM inference depending on whether it's for commercial purposes. Enforcement would be a nightmare but they'd probably try to throw AI at that as well, successfully or not.
I pay for both ChatGPT and Grok at the moment. I often find myself not using them as much as I had hoped for the $50 a month they cost me. I think if I were to shell out $250, I'd better be using it for a side project that brings in cash flow. But I am not sure I could come up with anything at this point given current AI capabilities.
Value-capture pricing is a fantasy often spouted by salesmen. Current-era AI systems have limited differentiation, so the final price will trend toward the cost of running the system.
So far I have not been convinced that any particular platform is more than 3 months ahead of the competition.
See: Nvidia's product segmentation by VRAM and FP64 performance, while still shipping CUDA for even the lowliest budget-turd MX150 GPU. Compare with AMD, who just tells consumer-grade customers to get bent with respect to GPU compute.
I feel prices will come down a lot for "viable" AI; not everyone needs the latest and greatest at rock-bottom prices. That's assuming AGI is just a pipe dream with LLMs, as I suspect.
Not necessarily. The prevailing paradigm is that performance scales with size (of data and compute power).
Of course, this is observably false as we have a long list of smaller models that require fewer resources to train and/or deploy with equal or better performance than larger ones. That's without using distillation, reduced precision/quantization, pruning, or similar techniques[0].
The real thing we need is more investment into reducing the computational resources needed to train and deploy models, and into model optimization (the best example being llama.cpp). I can tell you from personal experience that there is much lower interest in this type of research, and I've seen plenty of works rejected because "why train a small model when you can just tune a large one?" or "does this scale?"[1] I'd also argue that this is important because there's not infinite data nor compute.
[0] https://arxiv.org/abs/2407.05694
[1] Those works will outperform the larger models. The question is good, but it creates a barrier to funding: it costs a lot to test at scale, you can't get funding without good evidence, and results often aren't considered evidence until they're published. There are always more questions and every work is limited, but small-compute works face a higher bar than big-compute works.
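To make the quantization point concrete, here's a minimal sketch of post-training dynamic quantization with PyTorch. The toy model and the expected benefits are illustrative assumptions, not benchmark claims.

    # Minimal sketch: post-training dynamic quantization with PyTorch.
    # "ToyModel" is a stand-in for any trained nn.Module.
    import torch
    import torch.nn as nn

    class ToyModel(nn.Module):
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

        def forward(self, x):
            return self.net(x)

    model = ToyModel().eval()

    # Convert Linear layers to int8 for inference; weights shrink roughly 4x and
    # CPU inference typically speeds up, at some (often small) accuracy cost.
    quantized = torch.ao.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

    x = torch.randn(1, 512)
    print(quantized(x).shape)  # torch.Size([1, 10])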
> GPUs will keep getting cheaper. [...] but 2025-level performance, at least, shouldn't get more expensive.
This generation of GPUs has worse performance for more money than the previous one; at best, $/perf has been flat for the past few generations. Given current fab realities, and given that GPUs benefit from the biggest dies possible, it doesn't seem likely that there will be any price scaling in the near future, not unless something drastically changes fabrication prices.
Costs more than seats for Office 365, Salesforce and many productivity tools. I don't see management gleefully running to give access to whole departments. But then again, if you could drop headcount by just 1 on a team by giving it to the rest, you probably come out ahead.
The problem with all of these is that SOTA models keep changing. I thought about getting OpenAI's Pro subscription, and then Gemini flew ahead and was free. If I get this then sooner or later OpenAI or Anthropic will be back on top.
I wonder if there's an opportunity here to abstract away these subscription costs and offer a consistent interface and experience?
For example - what if someone were to start a company around a fork of LiteLLM? https://litellm.ai/
LiteLLM, out of the box, lets you create a number of virtual API keys. Each key can be assigned to a user or a team, and can be granted access to one or more models (and their associated keys). Models are configured globally, but can have an arbitrary number of "real" and "virtual" keys.
Then you could sell access to a host of primary providers - OpenAI, Google, Anthropic, Groq, Grok, etc. - through a single API endpoint and key. Users could switch between them by changing a line in a config file or choosing a model from a dropdown, depending on their interface.
Assuming you're able to build a reasonable userbase, presumably you could then contract directly with providers for wholesale API usage. Pricing would be tricky, as part of your value prop would be abstracting away marginal costs, but I strongly suspect that very few people are actually consuming the full API quotas on these $200+ plans. Those that are, are likely to be working directly with the providers to reduce both cost and latency, too.
The other value you could offer is consistency. Your engineering team's core mission would be providing a consistent wrapper for all of these models - translating between OpenAI-compatible, Llama-style, and Claude-style APIs on the fly.
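To make that concrete, here's a minimal sketch using the litellm Python package's completion() call. The model strings and environment-variable setup are illustrative assumptions, and a real product would presumably sit behind the LiteLLM proxy with virtual keys rather than doing this in client code.

    # Minimal sketch: one call shape, several providers, via the litellm package.
    # Assumes provider API keys are set as environment variables
    # (e.g. OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY).
    from litellm import completion

    MODELS = [
        "gpt-4o",                      # OpenAI
        "claude-3-5-sonnet-20240620",  # Anthropic
        "gemini/gemini-1.5-pro",       # Google
    ]

    def ask(model: str, prompt: str) -> str:
        # litellm normalizes every provider to an OpenAI-style response object.
        resp = completion(model=model, messages=[{"role": "user", "content": prompt}])
        return resp.choices[0].message.content

    # Switching providers is just switching the model string.
    for m in MODELS:
        print(m, "->", ask(m, "In one sentence: why might LLM pricing trend toward cost?")[:80])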
Is there already a company doing this? If not, do you think this is a good or bad idea?
I think the biggest hurdle would be complying with the TOS. I imagine OpenAI et al. would not be fans of sharing quotas across individuals in this way.
The Gemini 2.5 Pro 05/06 release was, by Google's own reported benchmarks, worse in 10/12 cases than the 3/25 version. Google rerouted all traffic for the 3/25 checkpoint to the 05/06 version in the API.
I’m also unsure who needs all of these expanded quotas because the old Gemini subscription had higher quotas than I could ever anticipate using.
You can just surf between Gemini, DeepSeek, Qwen, etc. using them for free. I can't see paying for any AI subscription at this point as the free models out there are quite good and are updated every few months (at least).
I am willing to pay for up to 2 models at a time, but I am constantly swapping subscriptions around. I think I've started and cancelled GPT and Claude subscriptions at least 3-4 times each.
This 100%. Unless you are building a product around the latest models and absolutely must squeeze the latest available oomph, it's more advantageous to just wait a little bit.
I wonder why anyone would pay these days, unless it's for features outside of the chatbot. Between Claude, ChatGPT, Mistral, Gemini, Perplexity, Grok, DeepSeek and so on, how do you ever really run out of free "wannabe pro"?
The global average salary is somewhere in the region of $1,500 a month.
There’s lots of people and companies out there with $250 to spend on these subscriptions per seat, but on a global scale (where Google operates), these are pretty niche markets being targeted. That doesn’t align well with the multiple trillions of dollars in increased market cap we’ve seen over the last few years at Google, Nvda, MS etc.
This is one of those assumed truisms that turns out to be false upon close scrutiny, and there's a bit of survivorship bias in that we tend to look at the technologies that had the mass appeal and market forces to make them cheaper and available to all. But there's tons of new tech that's effectively unobtainable to the vast majority of populations, heck, even to nation states. With the current prohibitive costs (in terms of processing power, energy, and data centers) of training these next-generation models, and the walled gardens that have been erected, there's no reason to believe the good stuff is going to get cheaper anytime soon, in my opinion.
The technology itself is not useful. What they're really selling is the data it was trained on, most of which was generated by students and the working class. So there's a unique extra layer of exploitation in these pricing models.
The $250 is just rate limiting at the moment. It isn't a real price; I doubt it is based on cost recovery or on what they think the market can bear.
They need users to make them a mature product, and this rate-limits the number of users while putting a stake in the ground to help understand the "value" they can attribute to the product.
The story being told at Wall Street is that this is a once-in-an-era revolution in work akin to the Industrial Revolution. That’s driving multiple trillions of dollars in market cap into the companies in AI markets.
That story doesn’t line up with a product whose price point limits it to fewer than 25-50mn subscriptions shared between 5 inference vendors.
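A quick back-of-envelope using only the numbers above (25-50mn subscriptions, $250/month, "multiple trillions" in added market cap, which I'll assume means roughly $3T):

    # Back-of-envelope using the figures in the comment above.
    subs = 50e6                  # optimistic end of the 25-50mn subscription range
    annual_price = 250 * 12      # $250/month -> $3,000/year per seat
    revenue = subs * annual_price
    market_cap_added = 3e12      # assumed reading of "multiple trillions"
    print(revenue / 1e9)               # 150.0 -> ~$150B/year, split across ~5 vendors
    print(market_cap_added / revenue)  # 20.0  -> years of gross revenue, before any costs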
I've toyed with Gemini 2.5 briefly and was impressed... but I just can't bring myself to see Google as an option as an inference provider. I don't trust them.
Actually, that's not true. I do trust them - I trust them to collect as much data as possible and to exploit those data to the greatest extent they can.
I'm deep enough into AI that what I really want is a personal RAG service that exposes itself to an arbitrary model at runtime. I'd prefer to run inference locally, but that's not yet practical for what I want it to do, so I use privacy-oriented services like Venice.ai where I can. When there's no other reasonable alternative I'll use Anthropic or OpenAI.
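Something like this toy retrieval layer is what I mean, assuming a locally-run sentence-transformers embedder and whatever chat model you plug the prompt into; it's a sketch, not a recommendation of specific tools.

    # Toy sketch of a local retrieval layer that can sit in front of any model.
    # Assumes the sentence-transformers package; swap in any embedder you trust locally.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    embedder = SentenceTransformer("all-MiniLM-L6-v2")  # runs locally on CPU

    notes = [
        "Venice.ai is my current privacy-oriented inference provider.",
        "My RAG index lives on local disk, never in a provider's account.",
        "Fall back to Anthropic or OpenAI only when nothing else works.",
    ]
    index = embedder.encode(notes, normalize_embeddings=True)

    def retrieve(query: str, k: int = 2) -> list[str]:
        q = embedder.encode([query], normalize_embeddings=True)[0]
        scores = index @ q  # cosine similarity, since vectors are normalized
        return [notes[i] for i in np.argsort(scores)[::-1][:k]]

    def build_prompt(query: str) -> str:
        # Only the retrieved snippets leave the machine, not the whole note store.
        context = "\n".join(retrieve(query))
        return f"Context:\n{context}\n\nQuestion: {query}"

    print(build_prompt("Which provider do I use when privacy matters?"))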
I don't trust any of the big providers, but I'm realizing that I have baseline hostility toward Google in 2025.
Understanding that no outside provider is going to care about your privacy, and will always choose to sell you their crappy advertisements and push their agenda on you, is the first step in building a solution. In my opinion that solution will come in the form of a personalized local AI agent that acts as the gatekeeper for all information the user receives from and sends to the outside world: a fully context-aware agent that has the user's interests in mind, only provides user-approved context to other AI systems, and filters everything coming back to the user for spam, agenda manipulation, etc. Basically a very advanced spam blocker of the future that is 100% local and fully user-controlled and calibrated. I think we should all be working on something like this if we want to keep our sanity in this brave new world.
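As a very rough sketch of the outbound half of that gatekeeper, assuming nothing beyond the Python standard library and a redaction list the user maintains themselves:

    # Toy sketch: a local "gatekeeper" that strips user-flagged details from
    # anything sent to an outside AI service. Purely illustrative.
    import re

    # Patterns the user has chosen to keep local (assumed examples).
    PRIVATE_PATTERNS = {
        "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
        "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    }

    def redact_outbound(text: str) -> str:
        # Replace anything matching a private pattern before it leaves the machine.
        for label, pattern in PRIVATE_PATTERNS.items():
            text = pattern.sub(f"[{label} withheld]", text)
        return text

    def allow_inbound(text: str, blocked_phrases: list[str]) -> bool:
        # Drop incoming content containing phrases the user has flagged (spam, etc.).
        lowered = text.lower()
        return not any(p.lower() in lowered for p in blocked_phrases)

    msg = "Reach me at jane@example.com or +1 555 123 4567 about the project plan."
    print(redact_outbound(msg))
    print(allow_inbound("LIMITED TIME OFFER!!!", ["limited time offer"]))  # False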
To be clear, I don't trust Venice either. It just seems less likely to me that they would both lie about their collection practices and be able to deeply exploit the data.
I definitely want locally-managed data at the very least.
I pay for OpenAI Pro but this is a clear no for me. I just don't get enough value out of Gemini to justify a bump from $20 / month to $250.
If they really want to win they should undercut OpenAI and convince people to switch. For $100 / month I'd downgrade my OpenAI Pro subscription and switch to Gemini Ultra.
I mean OpenAI already loses money on their Pro line. So it's less selling $1 for $0.50 and more selling $1 for $0.25 because the guy down the street sells it for $0.50
I paid for it and Google Flow was upgraded to Ultra, but Gemini still shows Pro and asks me to upgrade. When I go to "upgrade," it says I am already on Google Ultra.
Hmm, interesting. There's basically no information about what makes Ultra worth that much money in concrete terms, except "more quota". One interesting tidbit I've noticed is that it seems Google One (or whatever it's called now) also carries a YouTube subscription. So far I'm still on the "old" Google One for storage for my family and myself, and I have a separate YouTube subscription for the same. I still haven't seen a clear upgrade path, or even a discount based on how much I have left on the old subscription, if I ever choose to do so (why would I?).
edit: also, the Google AI Ultra link leads to AI Pro and there's no Ultra to choose from. GG Google, as always with their "launches".
I believe Imagen 4 and Veo 3 (the newest image/video models) and the "deep think" variant are Ultra-only. (Is it worth it? That's a different question.)
Why do people keep on saying that corporations will pay these price-tags?
Most corporations really keep a very tight lid on their software license costs. A $250 license will only be granted to individuals who clear very high justification barriers, and the resulting envy effects will be a horror for HR.
I think it will rather be individuals paying out of their own pockets to boost their internal results.
And outside of those areas in California where apples cost $5 in the supermarket I don't see many individuals capable of paying these rates.
This isn't really out of line with many other SaaS licenses that companies pay for.
This also includes things like video and image generation, where certain departments might previously have been paying thousands of dollars for images or custom video. I can think of dozens of instances where a single Veo 2/3 video clip would have been more than good enough to replace something we previously had to pay a lot of money for and waste a lot of time acquiring.
You might be comparing this to one-off developer tool purchases, which come out of different budgets. This is something that might come out of the Marketing Team's budget, where $250/month is peanuts relative to all of the services they were previously outsourcing.
I think people are also missing the $20/month plan right next to it. That's where most people will end up. The $250/month plan is only for people who are bumping into usage limits constantly or who need access to something very specific to do their job.
We just signed up to spend $60+/month for every dev to have access to Copilot because the ROI is there. If $250/month saves several hours per month for a person, it makes financial sense.
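Back-of-envelope on that (the fully loaded hourly cost is my own assumed figure, not anything from the thread):

    # Rough ROI check for a $250/month seat.
    seat_cost = 250           # $/month
    loaded_hourly_cost = 100  # assumed fully loaded cost of one developer hour
    print(seat_cost / loaded_hourly_cost)  # 2.5 hours saved per month to break even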
When Docker pulled their subscription shenanigans, the global auto parts manufacturer I work for wasn't delighted when they saw $5 (or was it 7?)/month/user, but was ready to suck it up for a few hundred devs.
They noped right out when it turned out to be more like $20/month/user, not payable by purchase order, and instead spent a developer month cobbling together our own substitute involving Windows Subsystem for Linux, because it would pay off within two months.
The big problem for companies is that every SaaS vendor they use wants to upsell AI add-on licensing upgrades. Companies won’t buy the AI option for every app they’re licensing today. Something will have to give.
Nobody outside of the major players (Microsoft, Google, Apple, Salesforce) has enough product suite eyeball time to justify a first-party subscription.
Most companies didn't target it in their first AI release because there was revenue lying on the ground. But the market will rapidly pressure them to support BYOLLM in their next major feature build.
They're still going to try to charge an add-on price on top of BYOLLM... but that margin is going to compress substantially.
Which means we're probably T minus one year from everyone outside the above-mentioned players being courted with revenue-sharing deals in exchange for making one LLM provider their "preferred" solution with easier BYOLLM. (E.g. Microsoft pays SaaS Vendor X behind the scenes to drive BYOLLM traffic their way.)
I foresee a slightly different outcome: If companies can genuinely enhance worker productivity with LLMs (for many roles, this will be true), then they can expand their business without hiring more people. Instead of firing, they will slow the rate of hiring. Finally, the 250 USD/month license isn't that much of a cost burden if you start with the most senior people, then slowly extend the privilege to lower and lower levels, carefully deciding if the role will be positively impacted by access to a high quality LLM. (This is similar to how Wall Street trading floors decide who gets access to expensive market data via Reuters or Bloomberg terminal.)
For non-technical office jobs, LLMs will act like a good summer intern, and help to suppress new graduate hiring. Stuff like HR, legal, compliance, executive assistants, sales, marketing/PR, and accounting will all greatly benefit from LLMs. Programming will take much longer because it requires incredibly precise outputs.
One low-hanging fruit for programming and LLMs: what if Microsoft created a plug-in for the VBA editor in Microsoft Office (Word, Excel, etc.) that could help write VBA code? For more than 25 years, I have watched non-technical people use VBA, and I have generally been impressed with the results. Sure, their code looks like shit and everything has hard-coded limits, but it helps them do their work faster. It is a small miracle what people can teach themselves with (1) a few chapters of an introductory VBA book, (2) some blog posts / Google searches, and (3) macro recording. If you added (4) an LLM, it would greatly boost the productivity of Microsoft Office power users.
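As a sketch of what that plug-in's core loop might amount to, here's a toy version using the openai Python client; the model name, the prompt, and the idea that this resembles anything Microsoft would actually ship are all assumptions on my part.

    # Toy sketch: turn a plain-English request into a VBA macro via an LLM.
    # Assumes the openai package and an OPENAI_API_KEY in the environment.
    from openai import OpenAI

    client = OpenAI()

    def draft_vba_macro(request: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o",  # assumed model name, for illustration only
            messages=[
                {"role": "system",
                 "content": "You write small, well-commented Excel VBA macros. "
                            "Return only VBA code."},
                {"role": "user", "content": request},
            ],
        )
        return resp.choices[0].message.content

    print(draft_vba_macro(
        "Loop over column A on Sheet1 and highlight cells with a value over 100."
    ))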
I don’t see any benefit to removing humans in order to achieve the exact same level of efficiency… wouldn’t that just straight-up guarantee a worse product unless your employees were absolutely all horrendous to begin with?
> How can you extract an optimal price from the high-value use cases without making it prohibitively expensive for the low-value ones?
They successfully solved it with advertising... and they also had the ability to cache results.
Much like social media, this will end in “if you aren’t paying for the product, then you are the product.”
Moore's law should help as well, shouldn't it? GPUs will keep getting cheaper.
Unless the models also get more GPU hungry, but 2025-level performance, at least, shouldn't get more expensive.
Maybe I'm misremembering, but I thought Moore's law doesn't apply to GPUs?
> I’m also unsure who needs all of these expanded quotas because the old Gemini subscription had higher quotas than I could ever anticipate using.
"Google AI Ultra" is a consumer offering though, there's no API to have quotas for?
Have you tried, say, O1 Pro Mode? And if you have, do you find it as good as whatever free models you use?
If you haven't, it's kind of weird to do the comparison without actually having tried it.
The Gemini subscription is monthly, so not too much lock-in if you want to change later.
> There’s lots of people and companies out there with $250 to spend on these subscriptions per seat, but on a global scale (where Google operates), these are pretty niche markets being targeted.
The global average salary earner isn't doing a computer job that benefits from AI.
I don't understand the point of this comparison.
Do you mean the current half-baked implementations, or just the idea of AI in general?
> I don't understand the point of this comparison.
I don't understand the point of "AI."
> also, the Google AI Ultra link leads to AI Pro and there's no Ultra to choose from.
Average Google product launch.
> This also includes things like video and image generation, where certain departments might previously have been paying thousands of dollars for images or custom video.
This is not the use case of AI Ultra.