It makes sense for OpenAI to overpay for wrapper companies that have distribution - a good analogy is British pub (bar) companies. By the mid-2000s they were struggling: low margins, a rising cost base, expensive debt.
What saved them? Heineken. They didn't care if the pubs made much of a profit - they made positive margins on having people drink their beer at market prices. They just wanted to increase volume. So they bought up several major players. In 2008 they acquired Scottish & Newcastle's operations, later bought Star Pubs & Bars, which had 1,049 leased and tenanted pubs, and finally half of Punch Taverns.
The same strategy can work for OpenAI - buy up the wrapper companies, and make sure YOUR models are served to the user base.
"The company must change its mindset and become proactive in its approach to compliance. I have decided this can best be achieved by the imposition of a sanction that will serve as a deterrent to future non-compliant conduct by Star and other pub-owning businesses."
Nice analogy, although a simpler way to say it is vertical integration, a known term for the phenomenon with a whole class of benefits.
One of those benefits brings to mind another analogy: Apple. The AI model and the tooling are kind of like the hardware and software. By co-developing them you can make a better product and certainly something hard to compete with.
"We have something called A-SWE, which is the Agentic Software Engineer. And this is where it starts to get really interesting because it’s not just about augmenting a developer, like a Copilot might do, but actually saying, ‘Hey, this thing can go off and build an app.’ It can do the pull request, it can do the QA, it can do the bug testing, it can write the documentation."
The few times I've used Cursor or Claude Code for tasks beyond simple tests or features, I found myself spending more time correcting their errors than I would have spent writing the code from scratch.
I like Cursor and use it daily, but none of its models are even close to being able to take nontrivial work. Besides, it quickly gets expensive if you’re using the smarter models.
IMO these AI tools will become to software engineers what CAD is to mechanical and civil engineers. Can they work without it? Sure, but why would they?
This is because Cursor is not sending the full context even when you drag and drop things inside the chat box.
I started getting worse results from Cursor too. Then Gemini 2.5 Pro dropped with 1M context, so I repomixed my project, popped it into AI Studio, and asked it to make me prompts I can feed into Cursor to fix the issues I have.
Gemini has the whole picture, and the prompts it creates tell Cursor which items to change and how.
It's pretty obvious that these tools are not replacements for developers as of yet. I've tried them, and they are very nifty, and can even do some boring tasks really well, but you can't substitute actual developer skill (yet). But everybody is holding their breath because it looks like they might eventually reach that level, and the time frame for that "eventually" is unknown.
I use Cursor, and these tools don't encode a causal representation of how something works. Hence they cannot root-cause and fix bugs. Fixing issues with 99% correctness snowballs into a hot mess very soon due to compounding errors. Somehow this is missed by people who don't actually code for a living.
The question isn’t “can AI code?”, the question is “can AI keep coding?”.
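To put a rough number on the compounding-errors point: a minimal sketch, assuming each change is independently correct with probability 0.99 (a simplification, since real errors are rarely independent). The function name is made up for illustration.

```python
def chance_all_steps_correct(per_step_accuracy: float, steps: int) -> float:
    """Probability that every one of `steps` changes is correct,
    assuming each change is right independently of the others."""
    return per_step_accuracy ** steps


if __name__ == "__main__":
    # How quickly "99% correct" decays as the number of chained changes grows.
    for steps in (10, 50, 100, 500):
        probability = chance_all_steps_correct(0.99, steps)
        print(f"{steps:>4} steps: {probability:.3f}")
```

Under that assumption, roughly a third of 100-step runs finish with no mistake at all, which is the "can AI keep coding?" problem in one line of arithmetic.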
How do any of these companies create “an AI Software Engineer”? Scraping knowledge posted by actual engineers on StackOverflow? Scraping public (& arguably private) GitHub repos created by actual engineers? What happens when all of them are out of a job? AI gets trained on knowledge generated by AI? Where will the incremental gain come from?
It’s like saying I will teach myself to cook better food by only learning from recipe books I created based on the knowledge I already have.
This sounds like the ouroboros snake eating its own tail, which it is. But because tool use lets it compile and run code, it can generate code for, say, Rust that does a thing, iterate until the borrow checker is no longer angry, run the code to assert it does what it claims to, and then feed that working code into the training set as good code (and the non-working code as bad). Even using only the recipe books you already had, doing a lot of cooking practice would make you a better cook. And once you know the recipes in the book well, mixing and matching them (egg preparation from one, flour ratios from another) is something a good cook would simply get a feel for, learning what works and what doesn't, even if they only ever used that one book.
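The generate, run, and label loop described above can be sketched roughly like this. It is only an illustration: Python stands in for Rust, `exec` stands in for a real compiler and sandbox, and `label_candidates` and `check_add` are hypothetical names, not part of any actual training pipeline.

```python
from typing import Callable


def label_candidates(
    candidates: list[str],
    check: Callable[[dict], bool],
) -> tuple[list[str], list[str]]:
    """Run each candidate source string and sort it into good or bad.

    A candidate is "good" only if it executes without raising and the
    supplied check passes. Anything else (syntax error, crash, wrong
    answer) is labeled "bad".
    """
    good: list[str] = []
    bad: list[str] = []
    for source in candidates:
        namespace: dict = {}
        try:
            # Stand-in for "compile and run". Never exec untrusted code
            # outside a sandbox in a real system.
            exec(source, namespace)
            passed = check(namespace)
        except Exception:
            passed = False
        (good if passed else bad).append(source)
    return good, bad


def check_add(namespace: dict) -> bool:
    """Assert the generated function does what it claims to."""
    add = namespace["add"]
    return add(2, 3) == 5 and add(-1, 1) == 0


candidates = [
    "def add(a, b):\n    return a - b\n",  # runs, but gives the wrong answer
    "def add(a, b)\n    return a + b\n",   # does not even parse
    "def add(a, b):\n    return a + b\n",  # correct
]

good, bad = label_candidates(candidates, check_add)
```

The point of the sketch is that the labels come from execution rather than from the model's own opinion of its output, which is what keeps the loop from being purely self-referential.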
OpenAI‘s early investment in Cursor was a masterstroke. Acquiring Windsurf would be another.
The next advances in coding AI depend on real-world coding data, especially how professional developers use agentic AI for coding + other tasks.
RL works well on sufficiently large base models as shown by rapid progress on verifiable problems with good training data, e.g. competition math, competitive coding problems, scientific question answering.
Training LLMs on detailed interaction data from AI-powered IDEs could become a powerful flywheel leading to the automation of practical coding.
> How many developers want to have usage analytics of their editors helping companies build functionality that aspires to replace them? This is silly.
Honestly, too many. Software engineers can be really, really dumb. I think it has something to do with assuming they're really smart.
But even unwilling developers may be forced to participate (see the recent Shopify CEO email), despite knowing full well what's going on. I mean, tons of people have already had to go through the humiliation of training their offshore replacements before getting laid off, and that's a much more in-your-face situation.
Many of them likely won’t switch immediately. They could also try to keep them with sweet offers, like generous usage quotas, early access to the latest models, etc.
Once sufficient data is gathered, the next generation models will be among the very best at agentic coding, which leads to stronger stickiness, and so on.
> Training LLMs on detailed interaction data from AI-powered IDEs could become a powerful flywheel leading to the automation of practical coding.
I agree. But this is a more general flywheel effect. OpenAI has 500M users generating trillions of interactive tokens per month. Those chat sessions are sequences of interactions, where downstream context can be used to judge prior responses. Basically, in hindsight, you check "has this LLM response been good or bad?", and generate a score. You can expand the window to multiple related chats. So you can leverage extended context and hindsight for judging response quality. Using that data you can fine-tune an RLHF model, and with it fine-tune the base model.
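As a toy illustration of the hindsight idea: a minimal sketch, assuming the "judge" is just a keyword check on the user's next message. In practice the judge would presumably be another model; the `Turn` type, the phrase lists, and the function names here are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    role: str  # "user" or "assistant"
    text: str


# Hypothetical signal phrases; a real judge would be far more nuanced.
NEGATIVE = ("doesn't work", "still broken", "wrong", "error")
POSITIVE = ("thanks", "works", "perfect")


def hindsight_score(follow_up: str) -> int:
    """Score a response by how the user reacted to it afterwards."""
    text = follow_up.lower()
    if any(phrase in text for phrase in NEGATIVE):
        return -1
    if any(phrase in text for phrase in POSITIVE):
        return 1
    return 0


def score_session(turns: list[Turn]) -> list[tuple[str, int]]:
    """Pair each assistant response with a score from the next user turn."""
    scored: list[tuple[str, int]] = []
    for index, turn in enumerate(turns):
        if turn.role != "assistant":
            continue
        follow_up = next(
            (later.text for later in turns[index + 1:] if later.role == "user"),
            None,
        )
        if follow_up is not None:
            scored.append((turn.text, hindsight_score(follow_up)))
    return scored


session = [
    Turn("user", "How do I reverse a list in place?"),
    Turn("assistant", "Call lst.sort()."),
    Turn("user", "That's wrong, it sorts the list instead."),
    Turn("assistant", "Call lst.reverse()."),
    Turn("user", "Thanks, that works."),
]

scored = score_session(session)
```

The resulting (response, score) pairs are the kind of data that could then feed the fine-tuning step mentioned above.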
But it's not just hindsight analysis. Sometimes users test or implement projects in the real world, and the LLM gets to see idea validation. Other times they elicit tacit experience from humans. That is what I think forms an experience flywheel. LLM being together with humans during problem solving, internalizing approaches, learning from outcomes.
Besides problem-solving assistance, LLMs are used in counselling, companionship, and therapeutic roles. People chat with LLMs to understand and clarify their goals. These are generative teleological models. They are also used by 90% of students, if I am to believe a random article.
So the triad of uses for LLMs is: professional problem solving, goal setting/therapy, and learning. All three benefit from the flywheel effect of interacting with millions of people.
This is beginning to look a bit like OpenAI is becoming to startups what Facebook was in the Instagram and WhatsApp era. Back then Facebook were far more established, and mobile was a big catalyst, but the sums being mentioned here are very large.
We should all start building the products that we think will terrify OpenAI most.
To follow up on this train of thought, I feel like it might be game over for everyone else if Google decides to release their own Cursor-like VS Code clone and stays consistent with the incredible context and free tier for Gemini Pro that you currently get from the online/web UI Gemini and Google AI Studio.
The question is, do they want to go in that direction? And if they do, do they only allow Gemini models, or do they open it up to a choice of various models (including models not related to Google/Gemini) and/or BYOK? I don't see why not, because I believe they will slaughter Cursor, Windsurf, et al. if so ...
I'm not sure it matters whether or not it's a fork of VS Code.
Zed is a standalone editor written from scratch and it hasn't had the same success as Cursor (yet).
JetBrains IDEs are my absolute favorite. JetBrains controls the entire stack but they haven't had the same results yet. Cursor and Claude Code have some sort of product differentiation here that is hard to argue against.
I think it speaks to the SV bubble that the single most valuable application of their LLM that they can think of would be software development.
One of the oddities of Instagram and WhatsApp is both of them were twists on what the expected formula for user value was at the time. (Retro photos and international SMS replacement respectively).
Not at all the same, because these startups are ultimately dependent on OpenAI or OpenAI-like model providers to be able to exist. So OpenAI isn't preemptively quashing competitors like Facebook did, rather moving further up and down the chain (chips -> data centers -> foundation models -> fine-tuned models -> AI-powered products) to expand their business.
I don’t quite understand why OpenAI would pay so much when there’s a solid open-source alternative like Cline. I tried both, and feel that Cline with DeepSeek v3 is comparable to Cursor and more cost-effective.
People do this exact thing all the time. Facebook paid $1 billion for Instagram when Facebook's cash and marketable securities were only $9-10 billion, even though Facebook already had a mobile social media app.
People clearly want the subscription model where they don’t have to worry about API keys and such. I bet a huge chunk of this market is non technical people who can’t code and don’t realize how bad the code they’re writing is when Windsurf and Cursor chop off the context to make it cheaper to run.
It's funny that in under a year we went from Sam Altman publicly saying that OpenAI was going to "steamroll" startups that were building products within its blast radius to OpenAI now offering multiple billions of dollars for those same startups.
My uninformed and perhaps overly charitable interpretation: he warned them they were going to be steamrolled, they built their product anyway, and now OpenAI is buying them because (1) OpenAI doesn't want the negative publicity of steamrolling them all, and (2) OpenAI has the money and is a bit too lazy to build a clone.
I mostly use Claude, but have recently been playing with Gemini 2.5 and ChatGPT 4.1, and they've been great too, with slightly different strengths and weaknesses.
Usability/performance/etc. aside, I get such a sense of magic and wonder with the new Agent mode in VS Code. Watching a little AI actually wander around the code and make decisions on how to accomplish a task is so unfathomably cool.
"The company must change its mindset and become proactive in its approach to compliance. I have decided this can best be achieved by the imposition of a sanction that will serve as a deterrent to future non-compliant conduct by Star and other pub-owning businesses."
https://www.gov.uk/government/news/heineken-pub-company-fine...
"We have something called A-SWE, which is the Agentic Software Engineer. And this is where it starts to get really interesting because it’s not just about augmenting a developer, like a Copilot might do, but actually saying, ‘Hey, this thing can go off and build an app.’ It can do the pull request, it can do the QA, it can do the bug testing, it can write the documentation."
https://www.youtube.com/watch?v=2kzQM_BUe7E The relevant discussion about A-SWE begins around the 11:26 mark (686 seconds).
Surely then they have no swe reqs right?
Until we play with it, it doesn't exist.
edit: typo.
And MSFT has many end game options to dump free IDEs on the market with integrated AI.
When you have just raised $40 billion and you spend $3 billion on a company whose product you also build yourself, that is dumb as rocks.