There's only one core problem in AI worth solving for most startups building AI-powered software: context.
No matter how good the AI gets, it can't answer questions about what it doesn't know. It can't perform a process for which it doesn't know the steps or the rules.
No LLM is going to know enough about some new drug in a pharma's pipeline, for example, because it doesn't know about the internal resources spread across multiple systems in an enterprise. (And if you've ever done a systems integration in any sufficiently large enterprise, you know that this is a "people problem" and usually not a technical problem).
I think the startups that succeed will understand that it all comes down to classic ETL: identify the source data, understand how to navigate systems integration, pre-process and organize the knowledge, train or fine-tune a model or have the right retrieval model to provide the context.
There's fundamentally no other way. AI is not magic; it can't know about trial ID 1354.006 except for what it was trained on and what it can search for. Even coding assistants like Cursor are really solving a problem of ETL/context and will always be. The code generation is the smaller part; getting it right requires providing the appropriate context.
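To make the ETL framing concrete, here's a deliberately tiny sketch; the "systems", helper names, and keyword-overlap retrieval are all illustrative stand-ins, not a real stack:

```python
SOURCES = {  # toy in-memory stand-ins for real enterprise systems
    "sharepoint": ["Trial 1354.006 protocol v3 is stored under /clinical/oncology."],
    "confluence": ["The dosing schedule for the phase II arm was amended in March."],
    "email":      ["Mette led the integration project for the trial data warehouse."],
}

def extract():
    # E: pull raw records out of each system (in reality: APIs, exports, crawlers)
    for system, docs in SOURCES.items():
        for doc in docs:
            yield {"system": system, "text": doc}

def transform(record):
    # T: clean/normalize and attach whatever metadata retrieval will need
    return {**record, "tokens": set(record["text"].lower().split())}

INDEX = [transform(r) for r in extract()]   # L: load into a (toy) in-memory index

def retrieve_context(question, k=2):
    # naive keyword-overlap ranking; a real system would use BM25 or embeddings
    q = set(question.lower().split())
    ranked = sorted(INDEX, key=lambda r: len(q & r["tokens"]), reverse=True)
    return [r["text"] for r in ranked[:k]]

print(retrieve_context("who led the trial data integration project"))
```

The model only ever sees what that last function hands it; everything upstream is the ETL work.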
This is why I strongly suspect that AI will not play out the way the Web did (upstarts unseat giants) and will instead play out like smartphones (giants entrench and balloon).
If all that matters is what you can put into context, then AI really isn't a product in most cases. The people selling models are actually just selling compute, so that space will be owned by the big clouds. The people selling applications are actually just packaging data, so that space will be owned by the people who already have big data in their segment: the big players in each industry. All competitors at this point know how important data is, and they're not going to sell it to a startup when they could package it up themselves. And most companies will prefer to just use features provided by the B2B companies they already trust, not trust a brand new company with all the same data.
I fully expect that almost all of the AI wins will take the form of features embedded in existing products that already have the data (like GitHub with Copilot), not brand new startups who have to try to convince companies to give them all their data for the first time.
Yup. And it's already playing out that way. Anthropic, OpenAI, Gemini - technically none of them is an upstart. All have hyperscalers backing and subsidizing their model training (AWS, Azure, and GCP, respectively). It's difficult to discern where the segmentation between compute and models is here.
> AI will not play out the way the Web did (upstarts unseat giants)
Yes, I agree.
I recently spoke to a doctor who wanted to do a startup, one part of which is an AI agent that can provide consumers with second opinions on medical questions. For this to be safe, it will require access not only to patient data, but possibly to front-line information from content origins like UpToDate, because that content is a necessity for grounded answers about information that's not in the training set and not publicly available via search.
The obvious winner is UpToDate, which owns that data and the pipeline for originating more content. If you want to build the best AI agent for medical analysis, you need to work with UpToDate.
> ...not brand new startups who have to try to convince companies to give them all their data for the first time.
Yes. I think of Microsoft and SharePoint, for example. Enterprises that are using SharePoint for document and content storage have already organized a subset of their information in a way that benefits Microsoft as concerns AI agents that are contextually aware of your internal data.
> will instead play out like smartphones (giants entrench and balloon).
Someone correct me if I'm wrong, but didn't smartphones go the "upstarts unseat giants" way? Apple wasn't a phone-maker, and became huge in the phone-market after their launch. Google also wasn't a phone-maker, yet took over the market slowly but surely with their Android purchase.
I barely see any Motorola, Blackberry, Nokia or Sony Ericsson phones anymore, yet those were the giants at one time. Now it's all iOS/Android, two "upstarts" initially.
> The people selling models are actually just selling compute
Yes, fully agreed. Anything AI is discovering in your dataset could have been found by humans, and it could have been done by a more efficient program. But that would require humans to carefully study it and write the program. AI lets you skip the novel analysis of the data and writing custom programs by using a generalizable program that solves those steps for you by expending far more compute.
I see it as, AI could remove the most basic obstacle preventing us from applying compute to vast swathes of problems - and that's the need to write a unique program for the problem at hand.
I think you're downplaying how well Cursor is doing "code generation" relative to other products.
Cursor can do at least the following "actions":
* code generation
* file creation / deletion
* run terminal commands
* answer questions about a code base
I totally agree with you on ETL (it's a huge part of our product https://www.definite.app/), but the actions an agent takes are just as tricky to get right.
Before I give Cursor a task, I often doubt it's going to be able to pull it off, and I'm constantly impressed by how deep it can go to complete a complex task.
This really puzzles me. I tried Cursor and was completely underwhelmed. The answers it gave (about a 1.5M loc messy Spring codebase) were surface-level and unhelpful to anyone but a Java novice. I get vastly better work out of my intern.
To add insult to injury, the IntelliJ plugin threw spurious errors. I ended up uninstalling it and marking my calendar to try again in 6 months.
Yet some people say Cursor is great. Is it something about my project? I can't imagine how it deals with a codebase that is many millions of tokens. Or is it something about me? I'm asking hard questions because I don't need to ask the easy ones.
What are people who think Cursor is great doing differently?
So isn't Cursor just a tool for Claude or ChatGPT to use? Another example would be a flight-booking engine. So why can't an AI just talk directly to an IDE? This is hard because the process has changed, with the human needing to be in the middle.
So isn't AI useless without the tools to manipulate?
I'm very "bullish" on AI in general but find Cursor incredibly underwhelming, because there is little value add compared to basically any other AI coding tool that goes beyond autocomplete. Cursor emphatically does not understand large codebases, and smaller codebases (a few files) can just be pasted into a chat context in the worst case.
I agree with you at this time, but there are a couple things I think will change this:
1. Agentic search can allow the model to identify what context is needed and retrieve the needed information, internally or externally, through APIs or search (see the sketch after this list).
2. I received an offer from OpenAI of free credits if I shared my API data with them; in other words, they are paying for industry-specific data, probably to fine-tune niche models.
There could be some exceptions around UI/UX in specific verticals, but the value of these fine-tuned, sector-specific instances will erode over time. They will likely remain a niche, since enterprises want maximum configurability while more out-of-the-box solutions are oriented toward SMEs.
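For point 1, here's a minimal sketch of what such an agentic-search loop can look like; `llm` and `search` are placeholders for any chat-model call and any internal or external search API, and the SEARCH/ANSWER convention is just an illustrative protocol:

```python
def agentic_answer(question, llm, search, max_steps=5):
    """Let the model decide what context it still needs, then go fetch it."""
    notes = []
    for _ in range(max_steps):
        prompt = (
            f"Question: {question}\n"
            "Notes gathered so far:\n" + "\n".join(notes or ["(none)"]) + "\n"
            "Reply with either 'SEARCH: <query>' to gather more context "
            "or 'ANSWER: <final answer>'."
        )
        reply = llm(prompt)
        if reply.startswith("SEARCH:"):
            query = reply[len("SEARCH:"):].strip()
            # the model pulls its own context into the next iteration
            notes.append(f"{query} -> {search(query)}")
        else:
            return reply.removeprefix("ANSWER:").strip()
    return "No answer within the step budget."
```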
It comes down to moats. Does OpenAI have a moat? It's leading the pack, but the competitors always seem to be catching up to it. We don't see network effects with it yet like with social networks, unless OpenAI introduces household robots for everyone or something, builds a leading marketshare in that segment, and the rich data from these household bots is enough training data that one can't replicate with a smaller robot fleet.
And AI is too fundamental a technology for a "loss leader, biggest wallet wins" strategy, like the one used by Uber, to work.
API access can be restricted. A big part of why Twitter got authwalled was so that AI models can't train on it. Stack Overflow added a no-AI-models clause to its free data dump releases (supposedly CC-licensed); they want to be paid if you use their data for AI models.
All you've proposed is moving the context problem somewhere else. You still need to build the search index. It's still a problem of building and providing context.
To your first point, the LLM still can’t know what it doesn’t know.
Just like you can't google for a movie if you don't know the genre, any scenes, or any actors in it, an AI can't build its own context if it didn't have good enough context already.
IMO that’s the point most agent frameworks miss. Piling on more LLM calls doesn’t fix the fundamental limitations.
TL;DR an LLM can’t magically make good context for itself.
I think you’re spot on with your second point. The big differentiators for big AI models will be data that’s not easy to google for and/or proprietary data.
Lucky they got all their data before people started caring.
It’s not even just the lack of access to the data, so much hidden information to make decisions is not documented at all. It’s intuition, learned from doing something in a specific context for a long time and only a fraction of that context is accessible.
Anyone that's done any amount of systems integration in enterprises knows this.
"Let me talk to Lars; he should know because his team owns that system."
"We don't have any documentation on this, but Mette should know about it because she led the project."
> No matter how good the AI gets, it can't answer about what it doesn't know. It can't perform a process for which it doesn't know the steps or the rules
This is exactly the motivation behind https://github.com/OpenAdaptAI/OpenAdapt: so that users can demonstrate their desktop workflows to AI models step by step (without worrying about their data being used by a corporation).
Context is important, but it takes about two weeks to build a context-collection bot and integrate it into Slack. The hard part is not technical (AIs can rapidly build a company-specific and continually updated knowledge base); it's political. Getting a drug company to let you tap Slack and email and docs etc. is dauntingly difficult.
Difficult to impossible. Their vendors are already working on AI features, so why would they risk adding a new vendor when a vendor they've already approved will have substantially the same capabilities soon?
This problem will be eaten by OpenAI et al. the same way the careful prompting strategies used in 2022/2023 were eaten. In a few years we will have context lengths of 10M+ or online fine tuning, combined with agents that can proactively call APIs and navigate your desktop environment.
Providing all context will be little more than copying and pasting everything, or just letting the agent do its thing.
Super careful or complicated setups to filter and manage context probably won't be needed.
Context requires VRAM that grows quadratically with sequence length. That is why OpenAI hasn't even supported a 200k context length yet for its 4o model.
Is there a trick that bypasses this scaling constraint while strictly preserving the attention quality? I suspect that most such tricks lead to performance loss while deep in the context.
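For a sense of scale, a back-of-the-envelope calculation assuming naive attention that materializes the full fp16 score matrix (the head count is an arbitrary illustrative choice). As far as I know, kernels like FlashAttention avoid materializing this matrix and keep exact attention with roughly linear memory, but the compute still grows quadratically, and the cheaper approximations (sparse or linear attention) are the ones that tend to lose quality deep in the context.

```python
# Memory for ONE layer's attention score matrices, if materialized naively in fp16.
def score_matrix_gib(seq_len, n_heads=32, bytes_per_elem=2):
    return seq_len ** 2 * n_heads * bytes_per_elem / 2**30

for n in (8_000, 128_000, 1_000_000):
    print(f"{n:>9,} tokens -> {score_matrix_gib(n):>12,.1f} GiB per layer")
```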
I agree, but I do see one realistic solution to the problem you describe. Every product on the market is independently integrating an LLM right now that has access to their product's silo of information. I can imagine a future where a corporate employee interacts with one central LLM that in turn understands the domain of expertise of all the other system-specific LLMs. Given that knowledge, the central one can orchestrate prompting and processing responses from the others.
We've been using this pattern forever with traditional APIs, but the huge hurdle is that the information in any system you integrate with is often both complex and messy. LLMs handle the hard work of dealing with ambiguity and variation.
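A sketch of that hub-and-spoke pattern; the assistant registry, the routing prompt, and `llm` are all hypothetical glue, not any particular framework:

```python
def central_assistant(question, llm, assistants):
    """assistants maps a system name ("hr", "crm", ...) to a callable wrapping that system's own LLM."""
    # 1. The central model picks which system-specific assistants are relevant.
    routing = llm(
        f"Which of these systems can help answer the question: {', '.join(assistants)}?\n"
        f"Question: {question}\nReply with a comma-separated list of system names."
    )
    chosen = [name.strip() for name in routing.split(",") if name.strip() in assistants]

    # 2. Fan the question out to the chosen assistants and collect partial answers.
    partials = {name: assistants[name](question) for name in chosen}

    # 3. The central model synthesizes one answer from the partial answers.
    combined = "\n".join(f"[{name}] {answer}" for name, answer in partials.items())
    return llm(f"Question: {question}\nSystem answers:\n{combined}\nWrite the final answer.")
```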
I agree that context is one core focus, but I really don't agree that it's the only thing a startup can focus on.
Context aside, you have the generation aspect of it, which can be very important (models trained to output good SQL, or good legal contracts, etc). You have the UI, which is possibly the most important element of a good AI product (think the difference between an IDE and Copilot - very very different UX/UI for the same underlying model).
Context is incredibly important, and I agree that people are downplaying some aspects of ETL here (though this isn't standard ETL in some cases). But it's not even close to being everything.
Startups can still win against big players by building better products faster (with AI), collecting more and better data to feed AI, and then feeding that into better AI automation for customers. Big players won't automatically win, but more data is a moat that gives them room to mess up for a long time and still pull out ahead. Even then, big companies already compete against one another, and swallowing a small AI startup can help them, so starting one can also make sense.
I found that fine-tuning and RAG can be replaced with tool calling for some specialized domains, e.g. real-time data. Even things like the user's location can be tool-called, so context can be obtained reliably. I also note that GPT-4o and better are smart enough to chain together different functions you give them, but not reliably. System prompting helps some, but the non-determinism of AI today is both awesome and a curse.
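A minimal sketch of the tool-calling version of this; the tool names and the JSON convention are mine for illustration, not any vendor's schema:

```python
import json
from datetime import datetime, timezone

def get_current_time() -> str:
    return datetime.now(timezone.utc).isoformat()

def get_user_location() -> str:
    return "Copenhagen, DK"  # in practice: supplied by the device or app session

TOOLS = {"get_current_time": get_current_time, "get_user_location": get_user_location}

def run_tool_call(model_output: str) -> str:
    """Assumes the model was prompted to reply like {"tool": "...", "args": {...}}."""
    call = json.loads(model_output)
    return str(TOOLS[call["tool"]](**call.get("args", {})))

print(run_tool_call('{"tool": "get_user_location", "args": {}}'))
```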
All of these comments are premised on this technology staying still. A model with memory and the ability to navigate the computer (we are already basically halfway there) would easily eliminate the problems you describe.
HN, I find, also has a tendency to fall prey to the bitter lesson.
There is a second, related problem: continuous learning. AI models won’t go anywhere as long as their state resets on each new session, and they revert to being like the new intern on their first day.
And that's why the teams that really want to unlock AI will understand that the core problem is really systems integration and ETL; the AI needs to be aware of the entire corpus of relevant information through some mechanism (tool use, search, RAG, graph RAG, etc.) and the startups that win are the ones that are going to do that well.
You can't solve this problem with more compute nor better models.
I've said it elsewhere in this discussion, but the LLM is just a magical oven that's still reliant on good ingredients being prepped and put into the oven before hitting the "bake" button if you want amazing dishes to pop out. If you just want Stouffer's Mac & Cheese, it's already good enough for that.
Yeah seems like context is the AI version of cache invalidation, in the sense of the joke that "there's only 2 hard problems in computer science, cache invalidation and naming things". It all boils down to that (that, and naming things)
I think this argument only makes sense if you believe that AGI and/or unbounded AI agents are "right around the corner". For sure, we will progress in that direction, but when and if we truly get there–who knows?
If you believe, as I do, that these things are a lot further off than some people assume, I think there's plenty of time to build a successful business solving domain-specific workflows in the meantime, and eventually adapting the product as more general technology becomes available.
Let's say 25 years ago you had the idea to build a product that can now be solved more generally with LLMs–let's say a really effective spam filter. Even knowing what you know now, would it have been right at the time to say, "Nah, don't build that business, it will eventually be solved with some new technology?"
I don't think it's that binary. We've had a lot of progress over the last 25 years; much of it in the last two. AGI is not a well defined thing that people easily agree on. So, determining whether we have it or not is actually not that simple.
Mostly people either get bogged down in deep philosophical debates or simply start listing things that AI can and cannot do (and why they believe that is the case). Some of those things are codified in benchmarks. And of course the list of stuff that AIs can't do is getting items removed from it on a regular basis, at an accelerating rate. That acceleration is the problem. People don't deal well with adapting to exponentially changing trends.
At some arbitrary point, when that list has a certain length, we may or may not have AGI. It really depends on your point of view. But of course, most people score poorly on the same benchmarks we use for testing AIs. There are some specific groups of things where humans still do better. But there are also a lot of AI researchers working on those things.
Consider OpenAI's products as an example. GPT-3 (2020) was a massive step up in reasoning ability from GPT-2 (2019). GPT-3.5 (2022) was another massive step up. GPT-4 (2023) was a big step up, but not quite as big. GPT-4o (2024) was marginally better at reasoning, but mostly an improvement with respect to non-core functionality like images and audio. o1 (2024) is apparently somewhat better at reasoning at the cost of being much slower. But when I tried it on some puzzle-type problems I thought would be on the hard side for GPT-4o, it gave me (confidently) wrong answers every time. 'Orion' was supposed to be released as GPT-5, but was reportedly cancelled for not being good enough. o3 (2025?) did really well on one benchmark at the cost of $10k in compute, or even better at the cost of >$1m – not terribly impressive. We'll see how much better it is than o1 in practical scenarios.
To me that looks like progress is decelerating. Admittedly, OpenAI's releases have gotten more frequent and that has made the differences between each release seem less impressive. But things are decelerating even on a time basis. Where is GPT-5?
>Let's say 25 years ago you had the idea to build a product
I resemble that remark ;)
>that can now be solved more generally with LLMs
Nope, sorry, not yet.
>"Nah, don't build that business, it will eventually be solved with some new technology?"
Actually I did listen to people like that to an extent, and started my business with the express intent of continuing to develop new technologies which would be adjacent to AI when it matured. Just better than I could at my employer where it was already in progress. It took a couple years before I was financially stable enough to consider layering in a neural network, but that was 30 years ago now :\
Wasn't possible to benefit with Windows 95 type of hardware, oh well, didn't expect a miracle anyway.
Heck, it's now been a full 45 years since I first dabbled in a bit of the ML with more kilobytes of desktop memory than most people had ever seen. I figured all that memory should be used for something, like memorizing, why not? Seemed logical. Didn't take long to figure out how much megabytes would help, but they didn't exist yet. And it became apparent that you could only go so far without a specialized computer chip of some kind to replace or augment a microprocessor CPU. What kind, I really had no idea :)
I didn't say they resembled 25-year-old ideas that much anyway ;)
>We've had a lot of progress over the last 25 years; much of it in the last two.
I guess it's understandable this has been making my popcorn more enjoyable than ever ;)
Agreed. There's a difference between developing new AI, and developing applications of existing AI. The OP seems to blur this distinction a bit.
The original "Bitter Lesson" article referenced in the OP is about developing new AI. In that domain, its point makes sense. But for the reasons you describe, it hardly applies at all to applications of AI. I suppose it might apply to some, but they're exceptions.
You think it will be 25 years before we have a drop-in replacement for most office jobs?
I think it will be less than 5 years.
You seem to be assuming that the rapid progress in AI will suddenly stop.
I think if you look at the history of compute, that is ridiculous. Making the models bigger or work more is making them smarter.
Even if there is no progress in scaling memristors or any exotic new paradigm, high speed memory organized to localize data in frequently used neural circuits and photonic interconnects surely have multiple orders of magnitude of scaling gains in the next several years.
> You seem to be assuming that the rapid progress in AI will suddenly stop.
And you seem to assume that it will just continue for 5 years. We've already seen the plateau start. OpenAI has tacitly acknowledged that they don't know how to make a next generation model, and have been working on stepwise iteration for almost 2 years now.
Why should we project the rapid growth of 2021–2023 5 years into the future? It seems far more reasonable to project the growth of 2023–2025, which has been fast but not earth-shattering, and then also factor in the second derivative we've seen in that time and assume that it will actually continue to slow from here.
> You seem to be assuming that the rapid progress in AI will suddenly stop.
> I think if you look at the history of compute, that is ridiculous. Making the models bigger or work more is making them smarter.
It's better to talk about actual numbers to characterise progress and measure scaling:
"
By scaling I usually mean the specific empirical curve from the 2020 OAI paper. To stay on this curve requires large increases in training data of equivalent quality to what was used to derive the scaling relationships.
"[^2]
"I predicted last summer: 70% chance we fall off the LLM scaling curve because of data limits, in the next step beyond GPT4.
[…]
I would say the most plausible reason is because in order to get, say, another 10x in training data, people have started to resort either to synthetic data, so training data that's actually made up by models, or to lower quality data."[^0]
“There were extraordinary returns over the last three or four years as the Scaling Laws were getting going,” Dr. Hassabis said. “But we are no longer getting the same progress.”[^1]
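For anyone who wants the curve itself: the "2020 OAI paper" is Kaplan et al., "Scaling Laws for Neural Language Models". Quoting the fitted power laws from memory, so treat the exponents as approximate:

```latex
% Approximate fits from Kaplan et al. (2020): test loss L versus
% parameters N, dataset size D (tokens), and compute C.
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \quad \alpha_N \approx 0.076
\qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \quad \alpha_D \approx 0.095
\qquad
L(C_{\min}) \approx \left(\frac{C_c}{C_{\min}}\right)^{\alpha_C}, \quad \alpha_C \approx 0.050
```

Staying on these curves means growing N, D, and C together, which is exactly where the data-quality concern above bites.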
Also, office jobs will be adapted to be a better fit for what AI can do, just as manufacturing jobs were adapted so that at least some tasks could be completed by robots.
Not my downvote, just the opposite but I think you can do a lot in an office already if you start early enough . . .
At one time I would have said you should be able to have an efficient office operation using regular typewriters, copiers, filing cabinets, fax machines, etc.
And then you get Office 97, zip through everything and never worry about office work again.
I was pretty extreme having a paperless office when my only product is paperwork, but I got there. And I started my office with typewriters, nice ones too.
Before long Google gets going. Wow. No-ads information superhighway, if this holds it can only get better. And that's without broadband.
But that's besides the point.
Now it might make sense for you to at least be able to run an efficient office on the equivalent of Office 97 to begin with. Then throw in the AI or let it take over and see what you get in terms of output, and in comparison. Microsoft is probably already doing this in an advanced way. I think a factor that can vary over orders of magnitude is how the machine leverages the abilities and/or tasks of the nominal human "attendant".
One type of situation would be where a less-capable AI could augment a defined worker more effectively than even a fully automated alternative utilizing 10x more capable AI. There's always some attendant somewhere so you don't get a zero in this equation no matter how close you come.
Could be financial effectiveness or something else, the dividing line could be a moving target for a while.
You could even go full paleo and train the AI on the typewriters and stuff just to see what happens ;)
But would you really be able to get the most out of it without the momentum of many decades of continuous improvement before capturing it at the peak of its abilities?
For me, general intelligence from a computer will be achieved when it knows when it's wrong. You may say that humans also struggle with this, and I'd agree - but I think there's a difference between general intelligence and consciousness, as you said.
I think one thing ignored here is the value of UX.
If a general AI model is a "drop-in remote worker", then UX matters not at all, of course. I would interact with such a system in the same way I would one of my colleagues and I would also give a high level of trust to such a system.
If the system still requires human supervision or works to augment a human worker's work (rather than replace it), then a specific tailored user interface can be very valuable, even if the product is mostly just a wrapper of an off-the-shelf model.
After all, many SaaS products could be built on top of a general CRM or ERP, yet we often find a vertical-focused UX has a lot to offer. You can see this in the AI space with a product like Julius.
The article seems to assume that most of the value brought by AI startups right now is adding domain-specific reliability, but I think there's plenty of room to build great experiences atop general models that will bring enduring value.
If and when we reach AGI (the drop-in remote worker referenced in the article), then I personally don't see how the vast majorities of companies - software and others - are relevant at all. That just seems like a different discussion, not one of business strategy.
The value of UX is being ignored, as the magical thinking has these AIs being fully autonomous, which will not work. The phrase "the devil's in the details" needs to be imprinted on everyone's screens, because the details of a "drop-in remote worker" are several Grand Canyons yet to be realized. This civilization is vastly more complex than you, dear reader, realize, and the majority of that complexity is not written down.
Also, the UX of your potential "remote workers" are vitally important! The difference between a good and a bad remote worker is almost always how good they are at communicating - both reading and understanding tickets of work to be done and how well they explain, annotate, and document the work they do.
At the end of the day, someone has to be checking the work. This is true of humans and of any potential AI agent, and the UX of that is a big deal. I can get on a call and talk through the code another engineer on my team wrote and make sure I understand it and that it's doing the right thing before we accept it. I'm sure at some point I could do that with an LLM, but the worry is that the LLM has no innate loyalty or sense of its own accuracy or honesty.
I can mostly trust that my human coworker isn't bullshitting me and any mistakes are honest mistakes that we'll learn from together for the future. That we're both in the same boat where if we write or approve malicious or flagrantly defective code, our job is on the line. An AI agent that's written bad or vulnerable code won't know it, will completely seriously assert that it did exactly what it was told, doesn't care if it gets fired, and may say completely untrue things in an attempt to justify itself.
Any AI "remote worker" is a totally different trust and interaction model. There's no real way to treat it like you would another human engineer because it has, essentially, no incentive structure at all. It doesn't care if the code works. It doesn't care if the team meets its goals. It doesn't care if I get fired. I'm not working with a peer, I'm working with an industrial machine that maybe makes my job easier.
I guess part of the point is that the value of the UX will quickly start to decrease as more tasks or parts of tasks can be done without close supervision. And that is subject to the capabilities of the models, which continue to improve.
I suggest that before we satisfy _everyone_'s definition of AGI, more and more people may decide we are there as their own job is automated.
The UX at that point, maybe in 5 or 10 or X years, might be a 3d avatar that pops up in your room via mixed reality glasses, talks to you, and then just fires off instructions to a small army of agents on your behalf.
Nvidia actually demoed something a little bit like that a few days ago. Except it lives on your computer screen and probably can't manage a lot of complex tasks on its own. Yet.
Or maybe at some point it doesn't need sub agents and can just accomplish all of the tasks on its own. Based on the bitter lesson, specialized agents are probably going to have a limited lifetime as well.
But I think it's worth having the AGI discussion as part of this because it will be incremental.
Personally, I feel we must be pretty close to AGI because Claude can do a lot of my programming for me. I still have to make important suggestions, and routinely for obvious things, but it is much better than me at filling in all the details and has much broader knowledge.
And the models do keep getting more robust, so I seriously doubt that humans will be better programmers overall for much longer.
Which is an easier way to interact with your bank? Writing a business letter, or filling out a form?
I suspect that we will still be filling out forms, because that’s a better UI for a routine business transaction. It’s easier to know what the bank needs from you if it’s laid out explicitly, and you can also review the information you gave them to make sure it’s correct.
AI could still be helpful for finding the right forms, auto-filling some fields, answering any questions you might have, and checking for common errors, but that’s only a mild improvement from what a good website already does.
And yes, it’s also helpful for the programmers writing the forms. But the bank still needs people to make sure that any new forms implement their consumer interactions correctly, that the AI assist has the right information to answer any questions, and that it’s all legal.
Chat models make UI redundant. Who will want to learn how to use some app's custom interface when they are used to just asking it to do what they want/need? Chat is the most natural interface for humans. UX will eventually just be about steering models to kiss your butt in the right way, and the bar for this will be low, as language-interaction problems are going to be obvious even to teenagers.
The amount of work going into RLHF/DPO/instruct tuning and other types of post training is because UX is very important. The bar is high and the difficulty of making a model with a good UX for a given use case is high.
A drop in remote worker will still require their work to be checked and their access to the systems they need to do their work secured in case they are a bad actor.
I think the core problem at hand for people trying to use AI in user-facing production systems is "how can we build a reliable system on top of an unreliable (but capable) model?". I don't think that's the same problem that AI researchers are facing, so I'm not sure it's sound to use "bitter lesson" reasoning to dismiss the need for software engineering outright and replace it with "wait for better models".
The article sits on an assumption that if we just wait long enough, the unreliability of deep learning approaches to AI will just fade away and we'll have a full-on "drop-in remote worker". Is that a sound assumption?
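As one concrete (and hedged) example of the kind of engineering that question implies: validate the model's structured output and retry with feedback, failing loudly instead of silently. All the names here are illustrative:

```python
import json

class UnreliableOutput(Exception):
    pass

def reliable_json(llm, prompt, required_keys, max_retries=3):
    """Wrap an unreliable model call so downstream code sees validated data or a clean failure."""
    last_err = None
    for attempt in range(max_retries):
        nudge = "" if attempt == 0 else f"\nYour previous reply was invalid ({last_err}). Reply with JSON only."
        raw = llm(prompt + nudge)
        try:
            data = json.loads(raw)
            missing = [k for k in required_keys if k not in data]
            if missing:
                raise ValueError(f"missing keys: {missing}")
            return data
        except (json.JSONDecodeError, ValueError) as err:
            last_err = err
    raise UnreliableOutput(f"model never produced valid output: {last_err}")
```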
Well. We were working on a search engine for industry suppliers since before the whole AI hype started (we even applied to YC once), and hit a brick wall at some point where it got too hard to improve search result quality algorithmically. To understand what that means: we gathered lots of data points from different sources, tried to reconcile them into unified records, then find the best match for a given sourcing case based on that. But in a lot of cases, both the data wasn't accurate enough to identify what a supplier was actually manufacturing, and the sourcing case itself wasn't properly defined, because users found it too hard to come up with good keywords for their search.
Then, LLMs entered the stage. Suddenly, we became able to both derive vastly better output from the data we got, and also offer our users easier ways to describe what they were looking for, find good keywords automatically, and actually deliver helpful results!
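A sketch of what I imagine the "find good keywords automatically" step looks like; the prompt and function are my guess at the shape of it, not their actual product:

```python
import json

def sourcing_keywords(llm, sourcing_case: str) -> list[str]:
    """Turn a buyer's free-text sourcing case into precise search keywords."""
    prompt = (
        "A buyer describes what they need to source:\n"
        f"{sourcing_case}\n\n"
        "Return a JSON list of 5-10 precise search keywords describing the "
        "manufacturing capability required (processes, materials, certifications)."
    )
    return json.loads(llm(prompt))

# sourcing_keywords(llm, "housing for an outdoor sensor, seawater resistant, ~10k units/year")
# might yield something like ["aluminum die casting", "IP67 enclosure", "marine-grade anodizing"]
```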
This was only possible because AI augments our product well and really provides a benefit in that niche, something that would just not have been possible otherwise. If you plan on founding a company around AI, the best advice I can give you is to choose a problem that similarly benefits from AI, but does exist without it.
The author discusses the problem from the point of engineering, not from business. When you look at it from business perspective, there is a big advantage of not waiting, and using whatever exists right now to solve the business problem, so that you can get traction, get funding, grab marketshare, build a team, and when the next day a better model will come, you can rewrite your code, and you would be in a much better position to leverage whatever new capabilities the new models provide; you know your users, you have the funds, you built the right UX...
The best strategy, going by your experience, is to jump on a problem as soon as there is an opportunity to solve it and generate lots of business value within the next 6 months. The trick is finding the subproblem that is worth a lot right now and could not be solved 6 months ago. A couple of AI-sales startups have "succeeded" quite well doing that (e.g. 11x); now they are in a good position to build from there (whether they will succeed in building a unicorn is another question, it just looks like they are in a good position now).
Very true. Most code written today will probably be obsolete in 2050. So why write it? Because it puts you in a good strategic position to keep leading in your space.
It's a little depressing how many highly valued startups are basically just wrappers around LLMs that they don't own. I'd be curious to see what percentage of YC's latest batch is just this.
> 70% of Y Combinator's Winter 2024 batch are AI startups. This is compared to ~57% of YC Summer 2023 companies and ~32% from the Winter batch one year ago (YC W23).
The thinking is: the models will get better, which will improve our product. But in reality, like the article states, the generalized models get better, so your value add diminishes as there's no need to fine-tune.
On the other hand, crypto funds made a killing off of "me too" blockchain technology before it got hammered again. So who knows about the 2-5 year term, but in 10 years we almost certainly won't have these billion-dollar companies that are just wrappers around LLMs.
How is being a wrapper for LLMs you don’t own any different from being a company based on cloud infrastructure you don’t own?
LLMs are a platform.
Bill Gates definition of a platform was “A platform is when the economic value of everybody that uses it exceeds the value of the company that creates it.”
It's relatively easy to move to different cloud infrastructure (or host your own) later on down the line.
If you rely on an OpenAI LLM for your business, they can basically do whatever they want to you. Oh, prices went up 10x? What are you gonna do, train your own AI?
An LLM wrapper adds near-zero value. If I type some text into a "convert to Donald Trump style" tool, it produces the exact same output as typing it into ChatGPT following "Convert this text to Donald Trump style:", because that's what the tool actually does. Implementing ChatGPT is 99.999% of the value creation. Prepending the prompt is 0.001%. The surprising fact is that the market assigns a non-zero value to the tool anyway.
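To make that concrete, the entire "product" in that example reduces to something like this, with `llm` standing in for whichever hosted model the wrapper resells:

```python
def donald_trump_style(llm, text: str) -> str:
    # the wrapper's whole value-add: one prepended instruction
    return llm("Convert this text to Donald Trump style:\n\n" + text)
```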
Startups that use cloud servers still write the software that goes on those servers, which is 90% of the value creation.
Controversial opinion: I don't believe in the bitter lesson. I just think that the current DNN+SGD approaches are just not that good at learning deep, general, expressive patterns. With less inductive bias the model memorizes a lot of scenarios and is able to emulate whatever real-world scenario you are trying to make the model learn. However, it fails to simulate this scenario well.
So it's kind of misleading to say that it's generally better to have less inductive bias. That is only true if your model architecture and optimization approach are just a bit crap.
My second controversial point regarding AI research and startups: doing research sucks. It's risky business. You are not guaranteed success. If you make it, your competitors will be hot on your tail and you will have to keep improving all the time. I personally would rather leave the model building to someone else and focus more on building products with the available models. There are exceptions like finetuning for your specific product or training bespoke models for very specific tasks at hand.
> I just think that the current DNN+SGD approaches are just not that good
I'll add even further: the transformers etc. that we are using today are not good either.
That's evidenced by the enormous amount of memory they need to do any task. We have just taken the one approach that was working a bit better for sensory tasks and pattern matching, and gone all in, adding hardware after hardware so we could brute-force some cognitive tasks out of it.
If we did the same for other ML architectures, I don't think they would lag far behind. And maybe some would get even better results.
I also don't believe in the 'bitter lesson' when it's extrapolated to apply to all 'AI application layer implementations' - at least in the context of asserting that the whole universe of problem scopes is affected by it.
I think it is true in an AI research context, but an unstated assumption is that you have complete data, E2E training, and the particular evaluated solution is not real-world unbounded.
It assumes infinite data, and it assumes the ability to falsify the resulting model output. Most valuable, 'real world' applications of AI when trying to implement in practice have an issue with one or both of those. So in other words: where a fully unsupervised AI pathway is viable due to the structure of the problem, absolutely.
I'm not convinced in the universality of this. Doesn't mean the core point of this essay on the futility of startups basing their business around one of the off the shelf LLMs isn't valid - I think for many they risk being generalized away.
The "bitter lesson" is self evidently true in one way as was a quantum jump in what AI's could do once we gave them enough compute. But as a "rule of AI" I think it's being over generalised, meaning it's being used to make predictions where it doesn't apply.
I don't see how the bitter lesson could not be true for the current crop of LLM's. They seem to have memorised just about everything mankind has written down, and squished it into something of the order of 1TB. You can't do that without a lot of memory to recognise the common patterns and eliminate them. The underlying mechanism is nothing like the zlib's deflate but when it comes to memory you have to throw at it they are the same in this respect. The bigger the compression window the better deflate does. When you are trying to recognise all the pattens in everything humans have written down to a deep level (such as discovering the mathematical theorems are generally applicable), the memory window and/or compute you have to use must be correspondingly huge.
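The deflate analogy is easy to check empirically; a small experiment with a pattern that only repeats at long range (exact byte counts will vary since the block is random):

```python
import os
import zlib

block = os.urandom(2000)   # a 2000-byte "pattern", incompressible on its own
data = block * 50          # the same pattern repeated 50 times

for wbits in (9, 12, 15):  # deflate window sizes: 512 B, 4 KiB, 32 KiB
    co = zlib.compressobj(9, zlib.DEFLATED, wbits)
    out = co.compress(data) + co.flush()
    print(f"window 2^{wbits}: {len(out):6d} bytes out of {len(data)}")
```

With the 512-byte window the repeats are out of reach and almost nothing compresses; with the larger windows the repeated structure collapses. The same intuition scales up to a model trying to exploit structure across everything humans have written.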
That was also true, to a lesser extent, when DeepMind taught an AI to play Pong in 2013. It had 1M pixels arriving 24 times a second, and it had to learn to pick out bats and balls in that sea of data. It's clearly going to require a lot of memory and compute to do that. Those resources simply weren't available on a researcher's budget much before 2013.
Since 2013, we've asked our AIs to ingest larger and larger datasets using much the same techniques used in 2013 (but known long before that) and been enchanted with the results. The "bitter lesson" predicts you need correspondingly more compute and memory to compress those datasets. Is it really a lesson, or an engineering rule of thumb that only became apparent when we had enough compute to do anything useful with AI?
I'm not sure this rule of thumb has much applicability outside of this "let's compress enormous amounts of data, looking for deep structure" realm. That's because if we look at neural networks in animals, most are quite small. A mosquito manages to find us for protein, find the right plant sap for food, find a mate, and find water with enough algae for its eggs, using data from vision, temperature sensors, and smell, and it uses that to activate wings, legs and god knows what else. It does all that with 100,000 neurons. That's not what a naive reading of "the bitter lesson" tells you it should take.
Granted, it may take an AI of enormous proportions to discover how to do it with 100,000 neurons. Nature did it by iteratively generating trillions upon trillions of these 100,000-neuron networks over millennia, and used a genetic algorithm to select the best at each step. If we have to do it that way, it will be a very bitter lesson. The 10-fold increases in compute every few years that made us aware of the bitter lesson are ending. If the prediction of the bitter lesson is that we have to rely on that continuing in order to build our mosquito emulation, then it's predicting it will take us centuries to build all the sorts of robots we need to do all the jobs we have.
But that's looking unlikely. We have an example. On one hand we have Tesla FSD, throwing more and more resources at conventional AI training, the way the bitter lesson says you must in order to progress. On the other we have Waymo using a more traditional approach. It's pretty clear which approach is failing and which is working - and it's not going the way the bitter lesson says it should.
> We have an example. On one hand we have Tesla FSD, throwing more and more resources at conventional AI training, the way the bitter lesson says you must in order to progress. On the other we have Waymo using a more traditional approach. It's pretty clear which approach is failing and which is working - and it's not going the way the bitter lesson says it should.
As I understand the article, it is going the way the bitter lesson predicts it would - the initial "more traditional" approach generates almost-workable solutions in the near term while the "bitter lesson" approach is unreliable in the near term.
Unless you think that FSD is already in the "far" term (i.e. already at the endgame), this is exactly what the article predicts happens in the near term.
> they're not going to sell it to a startup when they could package it up themselves

Except they won't package it themselves, because they are inept and inert. They still won't sell it to startups, though.
> I found that fine-tuning and RAG can be replaced with tool calling for some specialized domains

RAG is just a single-purpose instance of the more general process of tool calling, so that's not surprising.
Startups should really try to get such a moat. Chapter 2 will cover this.
Is this another way of saying "content is king"?
The process that feeds RAG is all about how you extract, transform, and load source data into the RAG database. Good RAG is the output of good ETL.
Seems to apply to AI as well.
If you believe, as I do, that these things are a lot further off than some people assume, I think there's plenty of time to build a successful business solving domain-specific workflows in the meantime, and eventually adapting the product as more general technology becomes available.
Let's say 25 years ago you had the idea to build a product that can now be solved more generally with LLMs–let's say a really effective spam filter. Even knowing what you know now, would it have been right at the time to say, "Nah, don't build that business, it will eventually be solved with some new technology?"
Mostly people either get bogged down into deep philosophical debates or simply start listing things that AI can and cannot do (and why they believe why that is the case). Some of those things are codified in benchmarks. And of course the list of stuff that AIs can't do is getting stuff removed from it on a regular basis at an accelerating rate. That acceleration is the problem. People don't deal well with adapting to exponentially changing trends.
At some arbitrary point when that list has a certain length, we may or may not have AGI. It really depends on your point of view. But of course, most people score poorly on the same benchmarks we use for testing AIs. There are some specific groups of things where they still do better. But also a lot of AI researchers working on those things.
Consider OpenAI's products as an example. GPT-3 (2020) was a massive step up in reasoning ability from GPT-2 (2019). GPT-3.5 (2022) was another massive step up. GPT-4 (2023) was a big step up, but not quite as big. GPT-4o (2024) was marginally better at reasoning, but mostly an improvement with respect to non-core functionality like images and audio. o1 (2024) is apparently somewhat better at reasoning at the cost of being much slower. But when I tried it on some puzzle-type problems I thought would be on the hard side for GPT-4o, it gave me (confidently) wrong answers every time. 'Orion' was supposed to be released as GPT-5, but was reportedly cancelled for not being good enough. o3 (2025?) did really well on one benchmark at the cost of $10k in compute, or even better at the cost of >$1m – not terribly impressive. We'll see how much better it is than o1 in practical scenarios.
To me that looks like progress is decelerating. Admittedly, OpenAI's releases have gotten more frequent and that has made the differences between each release seem less impressive. But things are decelerating even on a time basis. Where is GPT-5?
I resemble that remark ;)
>that can now be solved more generally with LLMs
Nope, sorry, not yet.
>"Nah, don't build that business, it will eventually be solved with some new technology?"
Actually I did listen to people like that to an extent, and started my business with the express intent of continuing to develop new technologies which would be adjacent to AI when it matured. Just better than I could at my employer where it was already in progress. It took a couple years before I was financially stable enough to consider layering in a neural network, but that was 30 years ago now :\
Wasn't possible to benefit with Windows 95 type of hardware, oh well, didn't expect a miracle anyway.
Heck, it's now been a full 45 years since I first dabbled in a bit of the ML with more kilobytes of desktop memory than most people had ever seen. I figured all that memory should be used for something, like memorizing, why not? Seemed logical. Didn't take long to figure out how much megabytes would help, but they didn't exist yet. And it became apparent that you could only go so far without a specialized computer chip of some kind to replace or augment a microprocessor CPU. What kind, I really had no idea :)
I didn't say they resembled 25-year-old ideas that much anyway ;)
>We've had a lot of progress over the last 25 years; much of it in the last two.
I guess it's understandable this has been making my popcorn more enjoyable than ever ;)
The original "Bitter Lesson" article referenced in the OP is about developing new AI. In that domain, its point makes sense. But for the reasons you describe, it hardly applies at all to applications of AI. I suppose it might apply to some, but they're exceptions.
I think it will be less than 5 years.
You seem to be assuming that the rapid progress in AI will suddenly stop.
I think if you look at the history of compute, that is ridiculous. Making the models bigger or work more is making them smarter.
Even if there is no progress in scaling memristors or any exotic new paradigm, high-speed memory organized to localize data in frequently used neural circuits, plus photonic interconnects, surely has multiple orders of magnitude of scaling gains left over the next several years.
And you seem to assume that it will just continue for 5 years. We've already seen the plateau start. OpenAI has tacitly acknowledged that they don't know how to make a next generation model, and have been working on stepwise iteration for almost 2 years now.
Why should we project the rapid growth of 2021–2023 another five years into the future? It seems far more reasonable to project the growth of 2023–2025, which has been fast but not earth-shattering, and then also factor in the second derivative we've seen in that time and assume that progress will actually continue to slow from here.
> I think if you look at the history of compute, that is ridiculous. Making the models bigger or work more is making them smarter.
It's better to talk about actual numbers to characterise progress and measure scaling:
" By scaling I usually mean the specific empirical curve from the 2020 OAI paper. To stay on this curve requires large increases in training data of equivalent quality to what was used to derive the scaling relationships. "[^2]
"I predicted last summer: 70% chance we fall off the LLM scaling curve because of data limits, in the next step beyond GPT4.
[…]
I would say the most plausible reason is because in order to get, say, another 10x in training data, people have started to resort either to synthetic data, so training data that's actually made up by models, or to lower quality data."[^0]
“There were extraordinary returns over the last three or four years as the Scaling Laws were getting going,” Dr. Hassabis said. “But we are no longer getting the same progress.”[^1]
---
[^0]: https://x.com/hsu_steve/status/1868027803868045529
[^1]: https://x.com/hsu_steve/status/1869922066788692328
[^2]: https://x.com/hsu_steve/status/1869031399010832688
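For a sense of what "staying on this curve" implies numerically, here is a minimal sketch of the dataset-size power law from that 2020 paper (Kaplan et al., "Scaling Laws for Neural Language Models"). The exponent and constant below are approximate, from-memory values, so treat the outputs as illustrative of the curve's shape only:

```python
# Data term of the 2020 scaling law: L(D) ~ (D_c / D) ** alpha_D.
# The constants are approximate and only meant to show the shape.
ALPHA_D = 0.095   # dataset-size exponent (approximate)
D_C = 5.4e13      # "critical" dataset size in tokens (approximate)

def loss_from_data(d_tokens: float) -> float:
    """Predicted loss as a function of training tokens in the data-limited regime."""
    return (D_C / d_tokens) ** ALPHA_D

for d in (3e11, 3e12, 3e13):  # 0.3T, 3T, 30T tokens
    print(f"{d:.0e} tokens -> predicted loss ~{loss_from_data(d):.2f}")

# Each 10x of comparable-quality data cuts the loss by only a factor of
# 10 ** -ALPHA_D, roughly 0.80, i.e. about a 20% improvement per 10x.
```

That is the point the quotes are making: the next constant-factor improvement needs another order of magnitude of data of the same quality, and that data may simply not exist.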
At one time I would have said you should be able to have an efficient office operation using regular typewriters, copiers, filing cabinets, fax machines, etc.
And then you get Office 97, zip through everything and never worry about office work again.
I was pretty extreme having a paperless office when my only product is paperwork, but I got there. And I started my office with typewriters, nice ones too.
Before long Google gets going. Wow. No-ads information superhighway, if this holds it can only get better. And that's without broadband.
But that's beside the point.
Now it might make sense for you to at least be able to run an efficient office on the equivalent of Office 97 to begin with. Then throw in the AI, or let it take over, and see what you get in terms of output by comparison. Microsoft is probably already doing this in an advanced way. I think a factor that can vary over orders of magnitude is how well the machine leverages the abilities and/or tasks of the nominal human "attendant".
One type of situation would be where a less-capable AI augmenting a given worker is more effective than even a fully automated alternative using a 10x more capable AI. There's always some attendant somewhere, so you never get a zero in this equation no matter how close you come.
It could be financial effectiveness or something else, and the dividing line could be a moving target for a while.
You could even go full paleo and train the AI on the typewriters and stuff just to see what happens ;)
But would you really be able to get the most out of it without the momentum of many decades of continuous improvement before capturing it at the peak of its abilities?
It isn't a specific model for any of those problems, but a "general" intelligence.
Of course, it's not perfect, and it's obviously not sentient or conscious, etc. - but maybe general intelligence doesn't require or imply that at all?
In other words, just AI, not AGI.
If a general AI model is a "drop-in remote worker", then UX matters not at all, of course. I would interact with such a system in the same way I would one of my colleagues and I would also give a high level of trust to such a system.
If the system still requires human supervision or works to augment a human worker's work (rather than replace it), then a specific tailored user interface can be very valuable, even if the product is mostly just a wrapper of an off-the-shelf model.
After all, many SaaS products could be built on top of a general CRM or ERP, yet we often find a vertical-focused UX has a lot to offer. You can see this in the AI space with a product like Julius.
The article seems to assume that most of the value brought by AI startups right now is adding domain-specific reliability, but I think there's plenty of room to build great experiences atop general models that will bring enduring value.
If and when we reach AGI (the drop-in remote worker referenced in the article), then I personally don't see how the vast majority of companies - software and others - are relevant at all. That just seems like a different discussion, not one of business strategy.
At the end of the day, someone has to be checking the work. This is true of humans and of any potential AI agent, and the UX of that is a big deal. I can get on a call and talk through the code another engineer on my team wrote and make sure I understand it and that it's doing the right thing before we accept it. I'm sure at some point I could do that with an LLM, but the worry is that the LLM has no innate loyalty or sense of its own accuracy or honesty.
I can mostly trust that my human coworker isn't bullshitting me and any mistakes are honest mistakes that we'll learn from together for the future. That we're both in the same boat where if we write or approve malicious or flagrantly defective code, our job is on the line. An AI agent that's written bad or vulnerable code won't know it, will completely seriously assert that it did exactly what it was told, doesn't care if it gets fired, and may say completely untrue things in an attempt to justify itself.
Any AI "remote worker" is a totally different trust and interaction model. There's no real way to treat it like you would another human engineer because it has, essentially, no incentive structure at all. It doesn't care if the code works. It doesn't care if the team meets its goals. It doesn't care if I get fired. I'm not working with a peer, I'm working with an industrial machine that maybe makes my job easier.
I suggest that before we satisfy _everyone_'s definition of AGI, more and more people may decide we are there as their own job is automated.
The UX at that point, maybe in 5 or 10 or X years, might be a 3d avatar that pops up in your room via mixed reality glasses, talks to you, and then just fires off instructions to a small army of agents on your behalf.
Nvidia actually demoed something a little bit like that a few days ago. Except it lives on your computer screen and probably can't manage a lot of complex tasks on its own. Yet.
Or maybe at some point it doesn't need sub agents and can just accomplish all of the tasks on its own. Based on the bitter lesson, specialized agents are probably going to have a limited lifetime as well.
But I think it's worth having the AGI discussion as part of this because it will be incremental.
Personally, I feel we must be pretty close to AGI because Claude can do a lot of my programming for me. I still have to make important suggestions, routinely even for obvious things, but it is much better than me at filling in all the details and has much broader knowledge.
And the models do keep getting more robust, so I seriously doubt that humans will be better programmers overall for much longer.
I suspect that we will still be filling out forms, because that’s a better UI for a routine business transaction. It’s easier to know what the bank needs from you if it’s laid out explicitly, and you can also review the information you gave them to make sure it’s correct.
AI could still be helpful for finding the right forms, auto-filling some fields, answering any questions you might have, and checking for common errors, but that’s only a mild improvement from what a good website already does.
And yes, it’s also helpful for the programmers writing the forms. But the bank still needs people to make sure that any new forms implement their consumer interactions correctly, that the AI assist has the right information to answer any questions, and that it’s all legal.
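A minimal sketch of what that mild improvement might look like, where the form stays the interface of record and the model only proposes values for the user to confirm. The schema, field names, and `call_llm` stub are made up for illustration:

```python
import json

def call_llm(prompt: str) -> str:
    # Placeholder for a real model call; returns a canned answer so the sketch runs.
    return '{"account_type": "checking", "date_of_birth": null, "annual_income_usd": 85000}'

FORM_SCHEMA = {
    "account_type": "checking or savings",
    "date_of_birth": "YYYY-MM-DD",
    "annual_income_usd": "integer",
}

def prefill_form(user_notes: str) -> dict:
    """Ask the model to propose values, but leave every field awaiting human review."""
    prompt = (
        "Fill this bank form from the notes below. "
        "Return JSON, using null for anything not explicitly stated.\n"
        f"Schema: {json.dumps(FORM_SCHEMA)}\nNotes: {user_notes}"
    )
    proposed = json.loads(call_llm(prompt))
    # The form itself remains the UI of record: the user still sees and confirms each value.
    return {field: {"value": proposed.get(field), "confirmed_by_user": False}
            for field in FORM_SCHEMA}

print(prefill_form("I'd like a checking account; I make about $85k a year."))
```

Anything the model can't find stays blank, and nothing is submitted until the person has reviewed it, which is exactly the "mild improvement over a good website" point.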
The article sits on an assumption that if we just wait long enough, the unreliability of deep learning approaches to AI will just fade away and we'll have a full-on "drop-in remote worker". Is that a sound assumption?
Then, LLMs entered the stage. Suddenly, we became able to both derive vastly better output from the data we got, and also offer our users easier ways to describe what they were looking for, find good keywords automatically, and actually deliver helpful results!
This was only possible because AI augments our product well and really provides a benefit in that niche, something that would just not have been possible otherwise. If you plan on founding a company around AI, the best advice I can give you is to choose a problem that similarly benefits from AI but would still exist without it.
how did the LLM help with that challenge?
The best strategy, from your experience, is to jump on a problem as soon as there is an opportunity to solve it and generate a lot of business value within the next 6 months. The trick is finding the subproblem that is worth a lot right now and could not be solved 6 months ago. A couple of AI-sales startups have "succeeded" quite well doing that (e.g. 11x); now they are in a good position to build from there (whether they will succeed in building a unicorn is another question, it just looks like they are in a good position now).
> 70% of Y Combinator’s Winter 2024 batch are AI startups. This is compared to ~57% of YC Summer 2023 companies and ~32% from the Winter batch one year ago (YC W23).
The thinking is: the models will get better, which will improve our product. But in reality, as the article states, the generalized models get better, so your value-add diminishes because there's less need to fine-tune.
On the other hand, the crypto fund made a killing off of "me too" blockchain technology before it got hammered again. So who knows about the 2-5 year term, but in 10 years we almost certainly won't have these billion-dollar companies that are wrappers around LLMs.
https://x.com/natashamalpani/status/1772609994610835505?mx=2
LLMs are a platform.
Bill Gates' definition of a platform was "A platform is when the economic value of everybody that uses it exceeds the value of the company that creates it."
If you rely on an OpenAI LLM for your business, they can basically do whatever they want to you. Oh, prices went up 10x? What are you gonna do, train your own AI?
Startups that use cloud servers still write the software that goes on those servers, which is 90% of the value creation.
My second controversial point regarding AI research and startups: doing research sucks. It's risky business. You are not guaranteed success. If you make it, your competitors will be hot on your tail and you will have to keep improving all the time. I personally would rather leave the model building to someone else and focus more on building products with the available models. There are exceptions like finetuning for your specific product or training bespoke models for very specific tasks at hand.
I'll go even further: the transformers etc. that we are using today are not that good either.
That's evidenced by the enormous amount of memory they need to do any task. We just took the one approach that was working a bit better for sensory tasks and pattern matching and went all in, piling on hardware after hardware so we could brute-force some cognitive tasks out of it.
If we did the same for other ML architectures, I don't think they would lag far behind. Maybe some would get even better results.
I think it is true in an AI research context, but there's an unstated assumption that you have complete data, end-to-end training, and that the particular solution being evaluated isn't unbounded in the real world.
It assumes infinite data, and it assumes the ability to falsify the resulting model output. Most valuable, 'real world' applications of AI run into an issue with one or both of those when you try to implement them in practice. In other words: where a fully unsupervised AI pathway is viable due to the structure of the problem, absolutely.
I'm not convinced of the universality of this. That doesn't mean the core point of this essay - the futility of startups basing their business around one of the off-the-shelf LLMs - isn't valid; I think many of them do risk being generalized away.
I don't see how the bitter lesson could not be true for the current crop of LLMs. They seem to have memorised just about everything mankind has written down and squished it into something on the order of 1TB. You can't do that without a lot of memory to recognise the common patterns and eliminate them. The underlying mechanism is nothing like zlib's deflate, but when it comes to the memory you have to throw at it, they are the same in this respect: the bigger the compression window, the better deflate does. When you are trying to recognise all the patterns in everything humans have written down, to a deep level (such as discovering that mathematical theorems are generally applicable), the memory window and/or compute you use must be correspondingly huge.
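That window-size point is easy to see directly; here's a minimal sketch with Python's standard zlib, where the only thing that changes is the history window (the data is just a synthetic example):

```python
import random, zlib

# Synthetic data with long-range repetition: one 2,000-byte pseudo-random
# block repeated 50 times. The redundancy only spans ~2 KB distances.
random.seed(0)
block = bytes(random.randrange(256) for _ in range(2000))
data = block * 50  # 100,000 bytes

def deflate_size(payload: bytes, wbits: int) -> int:
    """Compress with a 2**wbits-byte history window and return the output size."""
    co = zlib.compressobj(level=9, method=zlib.DEFLATED, wbits=wbits)
    return len(co.compress(payload) + co.flush())

print("512 B window  :", deflate_size(data, 9))   # window too small to see the repeats
print("32 KiB window :", deflate_size(data, 15))  # window covers the repeats, compresses hard
```

With the small window the output is roughly the size of the input; with the large one it collapses to a few kilobytes. The analogy is that a model hunting for patterns across everything ever written needs a correspondingly huge "window".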
That was also true, to a lesser extent, when DeepMind taught an AI to play Pong in 2013. They had a million pixels arriving 24 times a second, and it had to learn to pick out the bats and balls in that sea of data. It clearly requires a lot of memory and compute to do that, and those resources simply weren't available on a researcher's budget much before 2013.
Since 2013, we've asked our AIs to ingest larger and larger datasets using much the same techniques used in 2013 (though known long before that) and been enchanted with the results. The "bitter lesson" predicts you need correspondingly more compute and memory to compress those datasets. Is it really a lesson, or an engineering rule of thumb that only became apparent when we had enough compute to do anything useful with AI?
I'm not sure this rule of thumb has much applicability outside of this "let's compress enormous amounts of data, looking for deep structure" realm. That's because if we look at neural networks in animals, most are quite small. A mosquito manages to find us for protein, find the right plant sap for food, find a mate, and find water with enough algae for its eggs, using data from vision, temperature sensors, and smell, and it uses all that to activate wings, legs, and god knows what else. It does all of this with about 100,000 neurons. That's not what a naive reading of "the bitter lesson" tells you it should take.
Granted, it may take an AI of enormous proportions to discover how to do it with 100,000 neurons. Nature did it by iteratively generating trillions upon trillions of these 100,000-neuron networks over millennia, using a genetic algorithm to select the best at each step. If we have to do it that way, it will be a very bitter lesson. The 10-fold increases in compute every few years that made us aware of the bitter lesson are ending. If the prediction of the bitter lesson is that we have to rely on those increases continuing in order to build our mosquito emulation, then it's predicting it will take us centuries to build all the sorts of robots we need to do all the jobs we have.
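As a toy illustration of that kind of evolutionary search (a minimal sketch; the tiny 2-4-1 network and the XOR task are arbitrary stand-ins, nothing like a real 100,000-neuron mosquito brain):

```python
import numpy as np

rng = np.random.default_rng(0)

# Evolve the 17 weights/biases of a tiny 2-4-1 network to solve XOR,
# as a stand-in for "generate lots of small networks and keep the best".
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)
N_PARAMS = 2 * 4 + 4 + 4 * 1 + 1  # = 17

def forward(params, x):
    W1, b1 = params[:8].reshape(2, 4), params[8:12]
    W2, b2 = params[12:16].reshape(4, 1), params[16]
    h = np.tanh(x @ W1 + b1)
    return 1 / (1 + np.exp(-((h @ W2).ravel() + b2)))

def fitness(params):
    return -np.mean((forward(params, X) - y) ** 2)  # higher is better

pop = rng.normal(size=(200, N_PARAMS))
for _ in range(300):
    scores = np.array([fitness(p) for p in pop])
    elite = pop[np.argsort(scores)[-20:]]              # select the best 20
    children = elite[rng.integers(0, 20, size=180)]    # clone them...
    children = children + rng.normal(scale=0.2, size=children.shape)  # ...and mutate
    pop = np.vstack([elite, children])

best = max(pop, key=fitness)
print(np.round(forward(best, X), 2))  # ideally close to [0, 1, 1, 0]
```

Even this toy burns through tens of thousands of network evaluations for a four-row truth table; scaling that style of search up to anything mosquito-sized is exactly the "very bitter lesson" scenario.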
But that's looking unlikely. We have an example: on one hand we have Tesla FSD, throwing more and more resources at conventional AI training in the way the bitter lesson says you must in order to progress. On the other we have Waymo, using a more traditional approach. It's pretty clear which approach is failing and which is working - and it's not going the way the bitter lesson says it should.
As I understand the article, it is going the way the bitter lesson predicts it would - the initial "more traditional" approach generates almost-workable solutions in the near term while the "bitter lesson" approach is unreliable in the near term.
Unless you think that FSD is already in the "far" term (i.e. already at the endgame), this is exactly what the article predicts happens in the near term.