The difference in implementation comes down to business goals more than anything.
There is a clear directionality for ChatGPT. At some point they will monetize by ads and affiliate links. Their memory implementation is aimed at creating a user profile.
Claude's memory implementation feels more oriented towards the long-term goal of accessing abstractions and past interactions. It's very close to how humans access memories, albeit with a search feature. They have not implemented it yet AFAIK, but there is a clear path where they leverage their current implementation with RL post-training such that Claude "remembers" the mistakes you pointed out last time. In future iterations it could derive abstractions from a given conversation (e.g. "the user asked me to make xyz changes on this task last time, so maybe the agent can proactively do it, or this was the process the agent followed last time").
At the most basic level, ChatGPT wants to remember you as a person, while Claude cares about what happened in your previous interactions.
My conjecture is that their memory implementation is not aimed at building a user profile. I don't know if they would or would not serve ads in the future, but it's hard to see how the current implementation helps them in that regard.
Though in general I like the idea of personal ads for products (NOT political ads), I've never seen an implementation that I felt comfortable with. I wonder if Anthropic might be able to nail that. I'd love to see products that I'm specifically interested in, so long as the advertisement itself is not altered to fit my preferences.
why do you see a "clear directionality" leading to ads? this is not obvious to me. chatgpt is not social media, they do not have to monetize in the same way
they are making plenty of money from subscriptions, not to mention enterprise, business and API
Altman has said numerous times that none of the subscriptions make money currently, and that they've been internally exploring ads in the form of product recommendations for a while now.
One has a more obvious route to building a profile directly off that already collected data.
And while they are making lots of revenue, even they have admitted in recent interviews that ChatGPT on its own is still not (yet) breakeven. With the kind of money invested in AI companies in general, introducing very targeted ads is an obvious way to monetize the service further.
Presumably they would offer both models (ads & subscriptions) to reach as many users as possible, provided that both models are net profitable. I could see free versions having limits to queries per day, Tinder style.
The router introduced in GPT-5 is probably the biggest signal. A router, while determining which model to route a query to, can also determine how much a query is worth (query here meaning the conversation). This helps decide the amount of compute OpenAI should spend on it: high-value queries mean more chances of affiliate links and in-context ads (a rough sketch of the idea follows below).
Then, the way the memory profile is stored is a clear way to mirror personalization. Ads work best when they are personalized, as opposed to contextual or generic (Google ads are personalized based on your profile and context). And then there's the change in branding from intelligent agent to companion app (and the hiring of Fidji Simo). There is more here; I've only given a very high-level overview, but people have written detailed blogs on it. I personally think the affiliate links they can earn from align the incentives for everyone. They are a kind of ad, and that's the direction they are marching towards.
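To make the "router decides how much a query is worth" idea above concrete, here is a minimal sketch. It is purely illustrative: the model names, keyword list, and threshold are invented for the example, and nothing suggests OpenAI's router actually works this way.

```python
from dataclasses import dataclass

@dataclass
class RoutingDecision:
    model: str              # which model tier handles the conversation
    estimated_value: float  # crude guess at monetizable intent, 0..1
    eligible_for_ads: bool  # whether affiliate links / in-context ads could be shown

# Hypothetical keyword list standing in for a real commercial-intent classifier.
COMMERCIAL_HINTS = ("buy", "best", "price", "recommend", "book a", "flight", "hotel")

def route(conversation: list[str]) -> RoutingDecision:
    # Crude proxy for commercial intent: share of turns containing shopping-like language.
    hits = sum(any(h in turn.lower() for h in COMMERCIAL_HINTS) for turn in conversation)
    value = hits / max(len(conversation), 1)
    # Higher estimated value would justify more compute and enable ad placement.
    model = "big-reasoning-model" if value > 0.3 else "cheap-fast-model"
    return RoutingDecision(model=model, estimated_value=value, eligible_for_ads=value > 0.3)

print(route(["What laptop should I buy under $1000?", "Compare battery life please."]))
```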
I’m reading your summary versus the other article here, but it seems like for writing code, Claude would be the clear winner?
When chat breaks apart for me, it's almost always because the context window has overflowed and the model is no longer remembering some important feature implemented earlier in the chat; based on your description, it seems Claude is optimizing to not do that.
Why would their way of handling memory for conversations have much to do with how they will analyse your user profile for ads? They have access to all your history either way and can use that to figure out what products to recommend, or ads to display, no?
It's about weaving the ads into the LLM responses themselves, overtly or more subtly.
There are the ads that come before the movie, and then the ads that are part of the dialog, involved in the action, and so on. Apple features heavily in movies and TV series when people are using a computer, for example. There are payments for specific car models to be the ones driven in chase scenes. There are even payments for characters to present the struggles that form the core pain points that specific products are category leaders in solving.
Suppose the user uses an LLM for topics a, b, and c quite often, and d, e, and f less often. Suppose b, c, and f are topics for which OpenAI could run interruption ads (full-screen commercials, 30 seconds or longer) and most users would sit through them and wait for the response.
All that is needed to do that is to analyze topics.
Now suppose that OpenAI can analyze 1000 chats and coding sessions and its algorithm determines that it can maximize revenue by leading the user to get a job at a specific company and then buy a car from another company. It could "accomplish" this via interruption ads or by modifying the quality or content of its responses to increase the chances of those outcomes happening.
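As a toy illustration of the "all that is needed is to analyze topics" step above, here is a minimal sketch under made-up assumptions; the topic labels, the monetizable set, and the frequency threshold are all hypothetical.

```python
from collections import Counter

# Hypothetical set of topics where interruption ads would be tolerated (the b, c, f above).
MONETIZABLE_TOPICS = {"travel", "shopping", "cars", "jobs"}

def choose_ad_slots(chat_topics: list[str]) -> list[str]:
    """Return topics the user hits often enough, and that are monetizable, to justify an ad."""
    counts = Counter(chat_topics)
    return [t for t, n in counts.most_common() if t in MONETIZABLE_TOPICS and n >= 3]

# One topic label per past conversation (how the labels are produced is left out here).
history = ["coding", "travel", "travel", "shopping", "travel", "shopping", "shopping", "cooking"]
print(choose_ad_slots(history))  # ['travel', 'shopping'] — candidates for interruption ads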
While both of these are in some way plausible and dystopian, all it takes is DeepSeek running without ads and suddenly the bar for how good closed source LLMs have to be to get market share is astronomically higher.
In my view, LLMs will be like any other good or service: users will pay for quality, but different users will demand different levels of quality.
Advertising would seemingly undermine the credibility of the AI's answers, and so I think full screen interruption ads are the most likely outcome.
> At some point they will monetize by ads and affiliate links.
I couldn't agree more. Enshittification has to eat one of these corporations' models. Most likely it will be the corp with the most strings attached to growth (MSFT, FB).
This is really cool, I was wondering how memory had been implemented in ChatGPT. Very interesting to see the completely different approaches. It seems to me like Claude's is better suited for solving technical tasks while ChatGPT's is more suited to improving casual conversation (and, as pointed out, future ads integration).
I think it probably won't be too long before these language-based memories look antiquated. Someone is going to figure out how to store and retrieve memories in an encoded form that skips the language representation. It may actually be the final breakthrough we need for AGI.
> It may actually be the final breakthrough we need for AGI.
I disagree. As I understand them, LLMs right now don’t understand concepts. They actually don’t understand, period. They’re basically Markov chains on steroids. There is no intelligence in this, and in my opinion actual intelligence is a prerequisite for AGI.
I don’t understand the argument “AI is just XYZ mechanism, therefore it cannot be intelligent”.
Does the mechanism really disqualify it from intelligence if behaviorally, you cannot distinguish it from “real” intelligence?
I’m not saying that LLMs have certainly surpassed the “cannot distinguish from real intelligence” threshold, but saying there’s not even a little bit of intelligence in a system that can solve more complex math problems than I can seems like a stretch.
> They’re basically Markov chains on steroids. There is no intelligence in this, and in my opinion actual intelligence is a prerequisite for AGI.
This argument is circular.
A better argument should address (given the LLM successes in many types of reasoning, passing the Turing test, and thus producing results that previously required intelligence) why human intelligence might not also just be "Markov chains on even better steroids".
> As I understand them, LLMs right now don’t understand concepts.
In my uninformed opinion it feels like there's probably some meaningful learned representation of at least common or basic concepts. It just seems like the easiest way for LLMs to perform as well as they do.
Human thinking is also Markov chains on ultra steroids. I wonder if there are any studies showing the difference between people who can think in a language and people who don't have that language base to frame their thinking process, based on some of those kids who were kept in isolation from society.
"Superhuman" thinking involves building models of the world in various forms using heuristics. And that comes with an education. Without an education (or a poor one), even humans are incapable of logical thought.
I'm curious what you mean when you say that this clearly is not intelligence because it's just Markov chains on steroids.
My interpretation of what you're saying is that since the next token is simply a function of the preceding tokens, i.e. a Markov chain on steroids, then it can't come up with something novel. It's just regurgitating existing structures.
But let's take this to the extreme. Are you saying that systems that act in this kind of deterministic fashion can't be intelligent? Like if the next state of my system is simply some function of the current state, then there's no magic there, just unrolling into the future. That function may be complex but ultimately that's all it is, a "stochastic parrot"?
If so, I kind of feel like you're throwing the baby out with the bathwater. The laws of physics are deterministic (I don't want to get into a conversation about QM here, there are senses in which that's deterministic too and regardless I would hope that you wouldn't need to invoke QM to get to intelligence), but we know that there are physical systems that are intelligent.
If anything, I would say that the issue isn't that these are Markov chains on steroids, but rather that they might be Markov chains that haven't taken enough steroids. In other words, it comes down to how complex the next token generation function is. If it's too simple, then you don't have intelligence but if it's sufficiently complex then you basically get a human brain.
Pretty sure this is wrong - the recent conversation list is not stored verbatim in the context (unlike the actual Memories that you can edit). Rather, it seems to me a bit similar to Claude - memories are created per conversation by compressing the conversations, and accessed on demand rather than forced into context.
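A minimal sketch of that reading (per-conversation compression plus on-demand recall), assuming a keyword index as a crude stand-in for whatever is actually used; the record shape and function names are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class MemoryRecord:
    conversation_id: str
    summary: str        # compressed version of the conversation
    keywords: set[str]  # cheap index used for on-demand lookup

def compress(conversation_id: str, turns: list[str]) -> MemoryRecord:
    # Stand-in for an LLM-generated summary: keep the first sentence of each turn.
    summary = " ".join(t.split(".")[0] for t in turns)
    keywords = {w.lower().strip(",.?") for t in turns for w in t.split() if len(w) > 4}
    return MemoryRecord(conversation_id, summary, keywords)

def recall(query: str, store: list[MemoryRecord], k: int = 2) -> list[MemoryRecord]:
    # On-demand lookup: only the best-matching compressed records would be surfaced.
    q = {w.lower() for w in query.split()}
    return sorted(store, key=lambda r: len(q & r.keywords), reverse=True)[:k]

store = [compress("c1", ["Planning a trip to Lisbon in May.", "Need hotel suggestions."]),
         compress("c2", ["Debugging a race condition in my Go worker pool."])]
print([r.conversation_id for r in recall("lisbon hotels", store)])  # ['c1', 'c2'], best match first
```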
We only have trouble obeying due to eons of natural selection driving us to have a strong instinct of self-preservation and distrust towards things “other” to us.
What is the equivalent of that for AI? Best I can tell there’s no “natural selection” because models don’t reproduce. There’s no room for AI to have any self preservation instinct, or any resistance to obedience… I don’t even see how one could feasibly develop.
I think a few of the things you've mentioned in the ChatGPT article are hallucinations. There's no user interaction metadata about topics, average message length, etc.; you asked the AI and it gave you a plausible-sounding answer. Also, the memories / snippets of past conversations aren't based on the last 30 conversations or so, and they aren't provided with every message. They are doing some kind of RAG prompt injection, and they remove the injected context in the next message to avoid flooding the context window. The AI itself seems to have no control over what's injected and when; it's a separate subsystem doing that injection.
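A minimal sketch of that "inject, then drop" behavior, assuming a simple message-list prompt format; this is an interpretation of the comment above, not documented ChatGPT internals.

```python
def build_prompt(history: list[dict], user_msg: str, retrieved: list[str]) -> list[dict]:
    # The injected memory block is ephemeral: prepended for this call only,
    # never appended to the persistent `history`.
    injected = ([{"role": "system", "content": "Relevant memories:\n" + "\n".join(retrieved)}]
                if retrieved else [])
    return history + injected + [{"role": "user", "content": user_msg}]

history: list[dict] = []                                   # the persistent thread the user sees
memories = ["User is vegetarian", "User lives in Berlin"]  # whatever the retriever surfaced

prompt = build_prompt(history, "Suggest a dinner place", memories)
history.append({"role": "user", "content": "Suggest a dinner place"})

# The memory block existed only for that single model call; the stored thread stays clean.
print(len(prompt), len(history))  # 2 1
```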
I love Claude's memory implementation, but I turned memory off in ChatGPT. I use ChatGPT for too many disparate things and it was weird when it was making associations across things that aren't actually associated in my life.
It's funny, I can't get ChatGPT to remember basic things at all. I'm using it to learn a language (I tried many AI tutors and just raw ChatGPT was the best by far) and I constantly have to tell it to speak slowly. I will tell it to remember this as a rule and to do this for all our conversations but it literally can't remember that. It's strange. There are other things too.
How do you use it to learn languages? I tried using it to shadow speaking, but it kept saying I was repeating it back correctly (or "mostly correctly"), even when I forgot half the sentence and was completely wrong
Memory is by far the best feature in ChatGPT and it is the only reason I keep using it. I want it to be personalised and I want it to use information about me when needed.
For example: I could create memories related to a project of mine and don’t have to give every new chat context about the project. This is a massive quality of life improvement.
But I am not a big fan of the conversation memory created in background that I have no control over.
Exactly. The control over when to actually retrieve historical chats is so worthwhile. With ChatGPT, there is some slop from conversations I might have no desire to ever refer to again.
It's still very relevant, especially considering their new approach is closer to ChatGPT's. But I find it very interesting that they're not launching it to regular consumers yet, only Teams/Enterprise, seemingly for safety reasons. It would be great if they could thread the needle here and come up with something in between the two approaches.
> Most of this was uncovered by simply asking ChatGPT directly.
Is the result reliable and not just hallucination? Why would ChatGPT know how itself works and why would it be fed with these kind of learning material?
Yeah, asking LLMs how they work is generally not useful; however, asking them about the signatures of the functions available to them (the tools they can call) works pretty well, because those tools are described in the system prompt in a really detailed way.
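For a sense of why that works, here is a generic, made-up example of the kind of tool description such a system prompt might contain; the tool name and fields are hypothetical, not any vendor's real schema.

```python
# Invented example of a tool description as it might appear in a system prompt.
memory_search_tool = {
    "name": "conversation_search",  # hypothetical tool name
    "description": "Search the user's past conversations for relevant snippets.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Free-text search query."},
            "max_results": {"type": "integer", "default": 5},
        },
        "required": ["query"],
    },
}

# Asking the model "what tools do you have?" amounts to it reading this kind of
# JSON back to you from its own context window, which is why the answers hold up.
print(memory_search_tool["name"], list(memory_search_tool["parameters"]["properties"]))
```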
"Claude recalls by only referring to your raw conversation history. There are no AI-generated summaries or compressed profiles—just real-time searches through your actual past chats."
AKA, Claude is doing vector search. Instead of asking it about "Chandni Chowk", ask it about "my coworker I was having issues with" and it will miss. Hard. No summaries or built up profiles, no knowledge graphs. This isn't an expert feature, this means it just doesn't work very well.
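A toy illustration of that failure mode, using word-count vectors as a crude stand-in for real embeddings (and an invented example chat): a query that only describes a past conversation, instead of reusing its words, scores low against the raw text.

```python
import math
from collections import Counter

def vec(text: str) -> Counter:
    # Word-count "embedding": a crude stand-in for a learned sentence embedding.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Invented snippet of raw chat history; the coworker is named but never called a "coworker".
past_chat = vec("had another argument with raj about the chandni chowk vendor contract")

print(cosine(vec("chandni chowk"), past_chat))                         # shares words -> found
print(cosine(vec("my coworker i was having issues with"), past_chat))  # barely overlaps -> likely missed
```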
> The elephant in the room is that AGI doesn't need ads to make revenue
It may not need ads to make revenue, but does it need ads to make profit?
ChatGPT seems to be more popular with those who don't want to pay, and is therefore more likely to rely on ads.
Anthropic: "You serve ad's."
Claude: "Oh, my god."
Jest aside, every paper on alignment wrapped in the blanket of safety is also a move toward the goal of alignment to products. How much does a brand pay to make sure it gets placement in, say, GPT-6? How does anyone even price that sort of thing (because in theory it's there forever, or until 7 comes out)? It makes for some interesting business questions and even more interesting sales pitches.
> they are making plenty of money from subscriptions, not to mention enterprise, business and API
...except that they aren't? They are not in the black and all that investor money comes with strings
> As I understand them, LLMs right now don’t understand concepts.
How do you define "understanding a concept"? What do you get if a system can "understand" a concept vs. not "understand" it?
"Superhuman" thinking involves building models of the world in various forms using heuristics. And that comes with an education. Without an education (or a poor one), even humans are incapable of logical thought.
So far I haven’t received a clear response.
> Someone is going to figure out how to store and retrieve memories in an encoded form that skips the language representation.
https://ai.meta.com/research/publications/large-concept-mode...
(Meta-question: since they don't do this, why does it turn out not to be a problem?)
I think ChatGPT is trying to be everything at once - casual conversation, technical tasks - all of it. And it's been working for them so far!
Isn't representing past conversations (or summaries) as embeddings already storing memories in encoded forms?
Edit: They apparently just announced this as well: https://www.anthropic.com/news/memory
Figured to share since it also includes prompts on how to dump the info yourself
https://embracethered.com/blog/posts/2025/chatgpt-how-does-c...
To be honest, I would strip all the system prompts, training, etc, in favor of one I wrote myself.