Two new Gemini models, reduced 1.5 Pro pricing, increased rate limits, and more

This price drop is significant. For <128,000 tokens they're dropping from $3.50/million to $1.25/million, and output from $10.50/million to $2.50/million.

For comparison, GPT-4o is currently $5/million input and $15/million output and Claude 3.5 Sonnet is $3/million input and $15/million output.

Gemini 1.5 Pro was already the cheapest of the frontier models and now it's even cheaper.

pzo · a year ago

whats confusing they have different pricing for output. Here [1] it's $5/million output (starting 1st october) and on vertex AI [2] it's $2.5/1 million (starting 7 october) - but characters - so it's overall gonna be more expensive if you wanna compare to equivalent 1 million tokens. It's actually even more confusing to know what kind of characters they mean? 1 byte? UTF-8?

[1] https://ai.google.dev/pricing

[2] https://cloud.google.com/vertex-ai/generative-ai/pricing

Deathmax · a year ago

They do mention how characters are counted in the Vertex AI pricing docs: "Characters are counted by UTF-8 code points and white space is excluded from the count"

monsieurpooh · a year ago

They CHANGED the pricing for the first link; originally the output said price drop from $10.50 to $2.50 but now it says $10.50 to $5.00.

RhodesianHunter · a year ago

I wonder if they're pulling the wall-mart model. Ruthlessly cut costs and sell at-or-below costs until your competitors go out of business, then ratchet up the prices once you have market dominance.

lacker · a year ago

Probably not. Do they really believe they are going to knock OpenAI out of business, when the OpenAI models are better?

Instead I think they are going after the "Android model". Recognize they might not be able to dethrone the leader who invented the space. Define yourself in the marketplace as the cheaper alternative. "Less good but almost as good." In the end, they hope to be one of a small number of surviving members of an valuable oligopoly.

bko · a year ago

Isn't Walmart still incredibly cheap? They have a net margin of 2.3%

I think that's one of those things competitors complain about that never actually happens (the raising prices part).

https://www.macrotrends.net/stocks/charts/WMT/walmart/net-pr...

sangnoir · a year ago

There's lot of room to cut margins in the AI stack right now (see Nvidia's latest report); low prices are not an sure indication of predatory pricing. Which company do you think is most likely to have the lowest training and inference costs between Anthropic, OpenAI and Google? My bet goes to the one designing,producing and using their own TPUs.

charlie0 · a year ago

Yes, and it's the exact same thing OpenAI/Microsoft and Facebook are doing. In Facebook's case, they are giving it away for free.

entropicdrifter · a year ago

You think Google would engage in monopolistic practices like that?

Because I do

GaggiX · a year ago

GPT-4o is 2.5/10$. Unless you look at an old checkpoint. GPT-4o was the cheapest frontier model before.

simonw · a year ago

I can’t see that price on https://openai.com/api/pricing/ - it’s listing $5/m input and $15/m output for GPT-4o right now.

No wait, correction: That’s confusing: it lists 4o first and then lists gpt-4o-2024-08-06 as $2.50/$10.

lossolo · a year ago

> For comparison, GPT-4o is currently $5/million input and $15/million output and Claude 3.5 Sonnet is $3/million input and $15/million output.

Google is the only one of the three that has its own data centers and custom inference hardware (TPU).

ekkk · a year ago

It doesn't matter if it's cheap, it's unusable.

copperx · a year ago

Can you expound on why do you find it unusable?

monsieurpooh · a year ago

They CHANGED the pricing from $2.50 to $5.00 stealthily unannounced. Look at the web site again; it says $5 per million now, and this comment on this website might be the ONLY evidence in the world that I wasn't gaslighting myself or hallucinating!

Has anyone used Gemini Code Assist? I'm curious how it compares with Github Copilot and Cursor.

mil22 · a year ago

I have used Github Copilot extensively within VS Code for several months. The autocomplete - fast and often surprisingly accurate - is very useful. My only complaint is when writing comments, I find the completions distracting to my thought process.

I tried Gemini Code Assist and it was so bad by comparison that I turned it off within literally minutes. Too slow and inaccurate.

I also tried Codestral via the Continue extension and found it also to be slower and less useful than Copilot.

So I still haven't found anything better for completion than Copilot. I find long completions, e.g. writing complete functions, less useful in general, and get the most benefit from short, fast, accurate completions that save me typing, without trying to go too far in terms of predicting what I'm going to write next. Fast is the key - I'm a 185 wpm on Monkeytype, so the completion had better be super low latency otherwise I'll already have typed what I want by the time the suggestion appears. Copilot wins on the speed front by far.

I've also tried pretty much everything out there for writing algorithms and doing larger code refactorings, and answering questions, and find myself using Continue with Claude Sonnet, or just Sonnet or o1-preview via their native web interfaces, most of the time.

kendallchuang · a year ago

I see, perhaps with Gemini because the model is larger it takes longer to generate the completions. I would expect with a larger model it would perform better on larger codebases. It sounds like for you, it's faster to work on a smaller model with shorter more accurate completions rather than letting the model guess what you're trying to write.

imp0cat · a year ago

Have you tried Gitlab Duo and if so, what are your thoughts on that?

spiralk · a year ago

The Aider leaderboards seem like a good practical test of coding usefulness: https://aider.chat/docs/leaderboards/. I haven't tried Cursor personally but I am finding Aider with Sonnet more useful that Github Copilot and its nice to be able to pick any model API. Eventually even a local model may be viable. This new Gemini model does not rank very high unfortunately.

kendallchuang · a year ago

Thanks for the link. That's unfortunate, though perhaps the benchmarks will be updated after this latest Gemini release. Cursor with Sonnet is great, I'll have to give Aider a try as well.

therein · a year ago

I know you aren't necessarily talking about in-editor code assist but something about in-editor AI cloud code assist makes me super uncomfortable.

It makes sense I need to be careful not to commit secrets to public repositories but now I have to avoid not only saving credentials into a file but even to paste them by accident into my editor?

danmaz74 · a year ago

I tried it briefly and didn't like it. On the other hand, I found Gemini pro better than sonnet or 4o at some more complex coding tasks (using continue.dev)

dudus · a year ago

I use it and find it very helpful. Never tried cursor or copilot though

therein · a year ago

I tried Cursor the other day. It was actually pretty cool. My thought was, I'll open this open source project and use it to grok my way around the codebase. It was very helpful. After that I accidentally pasted an API secret into the document; had to consider it compromised and re-issued the credential.

spotlmnop · a year ago

God, it sucks

naiv · a year ago

This sounds interesting:

"We will continue to offer a suite of safety filters that developers may apply to Google’s models. For the models released today, the filters will not be applied by default so that developers can determine the configuration best suited for their use case."

ancorevard · a year ago

This is the most important update.

Pricing and speed doesn't matter when your call fails because of "safety".

CSMastermind · a year ago

Also Google's safety filters are absolutely awful. Beyond parody levels of bad.

This is a query I did recently that got rejected for "safety" reasons:

Who are the current NFL starting QBs?

Controversial I know, I'm surprised I'd be willing to take the risk with submitting such a dangerous query to the model.

KaoruAoiShiho · a year ago

There's still basic filters even if you take all the ones that you can turn off from the UI all off. It's still not capable of summarizing some YA novels I tried to feed it because of those filters.

panarky · a year ago

The "safety" filters used to make Gemini models nearly unusable.

For example, this prompt was apparently unsafe: "Summarize the conclusions of reputable econometric models that estimate the portion of import tariffs that are absorbed by the exporting nation or company, and what portion of import tariffs are passed through to the importing company or consumers in the importing nation. Distinguish between industrial commodities like steel and concrete from consumer products like apparel and electronics. Based on the evidence, estimate the portion of tariffs passed through to the importing company or nation for each type of product."

I can confirm that this prompt is no longer being filtered which is a huge win given these new lower token prices!

FergusArgyll · a year ago

Any opinions on pro-002 vs pro-exp-0827 ?

Unlike others here I really appreciate the gemini API, it's free and it works. I haven't done too many complicated things with it but I made a chatbot for the terminal, a forecasting agent (for metaculus challenge) and a yt-dlp auto namer of songs. The point for me isn't really how it compares to openAI/anthropic, it's a free API key and I wouldn't have made the above if I had to pay just to play around

bn-l · a year ago

I’ve used it. The API is incredibly buggy and flakey. A particular pain point is the “recitation error” fiasco. If you’re developing a real world app this basically makes the Gemini api unusable. It strikes me as a kind of “Potemkin” service.

Google is aware of the issue and it has been open on google's bug tracker since March 2024: https://issuetracker.google.com/issues/331677495

There is also discussion on GitHub: https://github.com/google-gemini/generative-ai-js/issues/138

It stems from something google added intentionally to prevent copyright material being returned verbatim (ala the NYT openai fiasco), so they dialled up the "recitation" control (the act of repeating training data—and maybe data they should not have legally trained on).

Here are some quotes from the bug tracker page:

> I got this error by just asking "Who is Google?"

> We're encountering recitation errors even with basic tutorials on application development. When bootstrapping a Spring Boot app, we're flagged for the pom.xml being too similar to some blog posts.

> This error is a deal breaker... It occurs hundreds of times a day for our users and massively degrades their UX.

screye · a year ago

The recitation error is a big deal.

I was ready to champion gemini use across my organization, and the recitation issue curbed any enthusiasm I had. It's opaque and Google has yet to suggest a mitigation.

Your comment is not hyperbole. It's a genuine expression of how angry many customers are.

3choff · a year ago

It seems the recitation problem has been fixed with the latest models. In my tests, the answer generation no longer stops prematurely. Before I resume a project that was on hold due to this issue, I'm gathering feedback from other users about their experiences with this and the new models. Are you still having this problem?

Deleted Comment

summerlight · a year ago

Looks like they are more focused on the economical aspect of those large models? Like 90~95% performance of other frontier models at 50%~70% price.

Workaccount2 · a year ago

They are going for large corporate customers. They are a brand name with deep pockets and a pretty risk-adverse model.

So even if Gemini sucks, they'll still win over execs being pushed to make a decision.

hadlock · a year ago

Not even trying to be snarky, but their lack of ability to offer products for more than a handful of years, does not lend google towards being chosen by large corporate customers. I know a guy who works in cloud sales and his government customers are PISSED they are sunsetting one of their PDF products and are being forced to migrate that process. The customer was expecting that to work for 10+ years and after a ~3 year onboarding process, they have 6 months to migrate. If my neck was on the line after buying the google PDF product, I wouldn't even short list them for an AI product.

That doesn't seem like much of a plan given their trailing position in the cloud space and the fact that Microsoft and AWS both have their own offerings.

TIPSIO · a year ago

I do like the trend.

Imagine if Anthropic or someone eventually release a Claude 3.5 but at like a whopping 10x its current speed.

Would be incredibly more useful and game changing than a slow o1 model that may or not be x percent smarter.

bangaladore · a year ago

Sonnet 3.5 is fast for its quality. But yeah, it's nowhere near Google's flash models. But I assume that is largely just because its a much smaller model.

sgt101 · a year ago

We might see that with the inference ASICs later this year I guess?

anotherpaulg · a year ago

The new Gemini models perform basically the same as the previous versions on aider's code editing benchmark. The differences seem within the margin of error.

https://aider.chat/docs/leaderboards/

frankdenbow · a year ago

Interview with the product lead: https://x.com/rowancheung/status/1838611170061918575?

They put a 42 minute video on twitter? That's brave.

mh- · a year ago

https://www.youtube.com/watch?v=WQvMdmk8IkM

throwup238 · a year ago

> What makes the new model so unique?"
>> Yeah It's a good question. I think it's maybe less so of what makes it unique and more so the general trajectory of the trend that we're on.*

Disappointing.