OpenAI slams court order to save all ChatGPT logs, including deleted chats

Why could a court favor the interest of the New York Times in a vague accusation versus the interest and right of hundred millions people?

Billion people use the internet daily. If any organization suspects some people use the Internet for illicit purposes eventually against their interests, would the court order the ISP to log all activities of all people? Would Google be ordered to save the search of all its customers because some might use it for bad things? And once we start, where will we stop? Crimes could happen in the past or in the future, will the court order the ISP and Google to retain the logs for 10 years, 20 years? Why not 100 years? Who should bear the cost for such outrageous demands?

The consequences of such orders are of enormous impact the puny judge can not even begin to comprehend. Privacy right is an integral part of the freedom of speech, a core human right. If you don’t have private thoughts, private information, anybody can be incriminated against them using these past information. We will cease to exist as individuals and I argue we will cease to exist as human as well.

capnrefsmmat · 3 months ago

Courts have always had the power to compel parties to a current case to preserve evidence. (For example, this was an issue in the Google monopoly case, since Google employees were using chats set to erase after 24 hours.) That becomes an issue in the discovery phase, well after the defendant has an opportunity to file a motion to dismiss. So a case with no specific allegation of wrongdoing would already be dismissed.

The power does not extend to any of your hypotheticals, which are not about active cases. Courts do not accept cases on the grounds that some bad thing might happen in the future; the plaintiff must show some concrete harm has already occurred. The only thing different here is how much potential evidence OpenAI has been asked to retain.

dragonwriter · 3 months ago

> Courts have always had the power to compel parties to a current case to preserve evidence.

Not just that, even without a specific court order parties to existing or reasonably anticipated litigation have a legal obligation that attaches immediately to preserve evidence. Courts tend to issue orders when a party presents reason to believe another party is out of compliance with that automatic obligation, or when there is a dispute over the extent of the obligation. (In this case, both factors seem to be in play.)

golol · 3 months ago

So if Amazon sues Google, claiming that it is being disadvantaged in search rankings, a court should be able to force Google to log all search activity, even when users delete it?

lcnPylGDnU4H9OF · 3 months ago

So then the courts need to find who is setting their chats do be deleted and order them to stop. Or find specific infringing chatters and order OpenAI to preserve these specified users’ logs. OpenAI is doing the responsible thing here.

MeIam · 3 months ago

Time does not need user logs to prove such a thing if it was true. Times can show that it is possible so they can show how their own users can access the text. Why would they need other user's data?

mrtksn · 3 months ago

>Why could a court favor the interest of the New York Times in a vague accusation versus the interest and right of hundred millions people?

Probably because they bothered to pursue such a thing and hundreds of millions people did not.

How do you conclusively know if someone's content generating machine infringe with your rights? By saving all of its input/output for investigation.

It's ridiculous, sure but is it less ridiculous than AI companies claiming that the copyrights shouldn't apply to them because it will be bad for their business?

IMHO those are just growth pain. Back in the day people used to believe that the law don't apply on them because they did it on the internet and they were mostly right because the laws were made for another age. Eventually the laws both for criminal stuff and copyright caught up. Will be the same for AI, now we are in the wild west age of AI.

TimPC · 3 months ago

AI companies aren't seriously arguing that copyright shouldn't apply to them because "it's bad for business". The main argument is that they qualify for fair use because their work is transformative which is one of the major criteria for fair use. Fair use is the same doctrine that allows a school to play a movie for educational purposes without acquiring a license for the public performance of that movie. The original works don't have model weights and can't answer questions or interact with a user so the output is substantially different from the input.

shkkmo · 3 months ago

> It's ridiculous, sure but is it less ridiculous than AI companies claiming that the copyrights shouldn't apply to them because it will be bad for their business?

Since that wasn't ever a real argument, your strawman is indeed ridiculous.

The argument is that requiring people to have a special license to process text with an algorithm is a dramatic expansion of the power of copyright law. Expansions of copyright law will inherently advantage large corporate users over individuals as we see already happening here.

New York Times thinks that they have the right to spy on the entire world to see if anyone might be trying to read articles for free.

That is the problem with copyright. That is why copyright power needs to be dramatically curtailed, not dramatically expanded.

dogman144 · 3 months ago

You raise good points but the target of your support feels misplaced. Want private ai? You must self-host and inspect if it’s phoning home. No way around it in my view.

Otherwise, you are picking your data privacy champions as the exact same companies, people and investors that sold us social media, and did something quite untoward with the data they got. Fool me twice, fool me three times… where is the line?

In other words - OAI has to save logs now? Candidly they probably were already, or it’s foolish not to assume that.

jrm4 · 3 months ago

Love the spirit of what you say and I practice it myself, literally.

But also, no - Just self-host or it's all your fault is never ever a sufficient answer to the problem.

It's exactly the same as when Exxon says "what are you doing to lower your own carbon footprint?" It's shifting the burden unfairly; companies like OpenAI put themselves out there and thus must ALWAYS be held to task.

tailspin2019 · 3 months ago

> Privacy right is an integral part of the freedom of speech

I completely agree with you, but as a ChatGPT user I have to admit my fault in this too.

I have always been annoyed by what I saw as shameless breaches of copyright of thousands of authors (and other individuals) in the training of these LLMs, and I've been wary of the data security/confidentiality of these tools from the start too - and not for no reason. Yet I find ChatGPT et al so utterly compelling and useful, that I poured my personal data[0] into these tools anyway.

I've always felt conflicted about this, but the utility just about outweighed my privacy and copyright concerns. So as angry as I am about this situation, I also have to accept some of the blame too. I knew this (or other leaks or unsanctioned use of my data) was possible down the line.

But it's a wake up call. I've done nothing with these tools which is even slightly nefarious, but I am today deleting all my historical data (not just from ChatGPT[1] but other hosted AI tools) and will completely reassess my approach of using them - likely with an acceleration of my plans to move to using local models as much as I can.

[0] I do heavily redact my data that goes into hosted LLMs, but there's still more private data in there about me than I'd like.

[1] Which I know is very much a "after the horse has bolted" situation...

CamperBob2 · 3 months ago

Keeping in mind that the purpose of IP law is to promote human progress, it's hard to see how legacy copyright interests should win a fight with AI training and development.

100 years from now, nobody will GAF about the New York Times.

fluidcruft · 3 months ago

A pretty clear distinction is that all ISPs in the world are not currently involved in a lawsuit with New York Times and are not accused of deleting evidence. What OpenAI is accused of is significantly different from merely agnostically routing packets between A and B. OpenAI is not raising astronomical funds because they operate as an ISP.

DannyBee · 3 months ago

Lawyer here

First - in the US, privacy is not a constitutional right. It should be, but it's not. You are protected against government searches, but that's about it. You can claim it's a core human right or whatever, but that doesn't make it true, and it's a fairly reductionist argument anyway. It has, fwiw, also historically not been seen as a core right for thousands of years. So i think it's a harder argument to make than you think despite the EU coming around on this. Again, I firmly believe it should be a core right, but asserting that it is doesn't make that true.

Second, if you want the realistic answer - this judge is probably overworked and trying to clear a bunch of simple motions off their docket. I think you probably don't realize how many motions they probably deal with on a daily basis. Imagine trying to get through 145 code reviews a day or something like that. In this case, this isn't the trial, it's discovery. Not even discovery quite yet, if i read the docket right. Preservation orders of this kind are incredibly common in discovery, and it's not exactly high stakes most of the time. Most of the discovery motions are just parties being a pain in the ass to each other deliberately. This normally isn't even a thing that is heard in front of a judge directly, the judge is usually deciding on the filed papers.

So i'm sure the judge looked at it for a few minutes, thought it made sense at the time, and approved it. I doubt they spent hours thinking hard about the consequences.

OpenAI has asked to be heard in person on the motion, i'm sure the judge will grant it, listen to what they have to say, and determine they probably fucked it up, and fix it. That is what most judges do in this situation.

tiahura · 3 months ago

While the Constitution does not explicitly enumerate a "right to privacy," the Supreme Court has consistently recognized substantive privacy rights through Due Process Clause jurisprudence, establishing constitutional protection for intimate personal decisions in Griswold v. Connecticut (1965), Lawrence v. Texas (2003), and Obergefell v. Hodges (2015).

shkkmo · 3 months ago

> It has, fwiw, also historically not been seen as a core right for thousands of years. So i think it's a harder argument to make than you think despite the EU coming around on this.

This doesn't seem true. I'd assume you know more about this than I do though so can you explain this in more detail? The concept of privacy is definitely more than thousands of years old. The concept of a "human right", is arguably much newer. Do you have particular evidence that a right to privacy is a harder argument to make that other human rights?

While the language differs, the right to privacy is enshrined more or less explicitly in many constitutions, including 11 USA states. It isn't just a "european" thing.

zerocrates · 3 months ago

Even in the "protected against government searches" sense from the 4th Amendment, that right hardly exists when dealing with data you send to a company like OpenAI thanks to the third-party doctrine.

ComposedPattern · 3 months ago

> It has, fwiw, also historically not been seen as a core right for thousands of years.

Nothing has been seen as a core right for thousands of years, as the concept of human rights is only a few hundred years old.

pama · 3 months ago

Thanks. As an EU citizen am I exempt from this order? How does the judge or the NYTimes or OpenAI know that I am an EU citizen?

HardCodedBias · 3 months ago

"First - in the US, privacy is not a constitutional right"

What? The supreme court disagreed with you in Griswold v. Connecticut (1965) and Roe v. Wade (1973).

While one could argue that they were vastly stretching the meaning of words in these decisions the point stands that at this time privacy is a constitutional right in the USA.

Dead Comment

wat10000 · 3 months ago

ChatGPT isn’t like an ISP here. They are being credibly accused of basing their entire business on illegal activity. It’s more like if The Pirate Bay was being sued. The alleged infringement is all they do, and requiring them to preserve records of their users is pretty reasonable.

piombisallow · 3 months ago

Regardless of the details of this specific case, the courts are not democratic and do not decide based on the interest of the parties or how many they are, they decide based on the law.

brookst · 3 months ago

This is not true even in the slightest.

The law is not a deterministic computer program. It’s a complex body of overlapping work and the courts are specifically chartered to use judgement. That’s why briefs from two parties in a dispute will often cite different laws and precedents.

For instance, Winter v. NRDC specifically says that courts must consider whether an injunction is in the public interest.

dragonwriter · 3 months ago

> Why could a court favor the interest of the New York Times in a vague accusation versus the interest and right of hundred millions people?

Because the law favors preservation of evidence for an active case above most other interests. It's not a matter of arbitrary preference by the particular court.

cactusplant7374 · 3 months ago

> Why could a court favor the interest of the New York Times in a vague accusation versus the interest and right of hundred millions people?

It simply didn't. ChatGPT hasn't deleted any user data.

> "OpenAI did not 'destroy' any data, and certainly did not delete any data in response to litigation events," OpenAI argued. "The Order appears to have incorrectly assumed the contrary."

It's a bit of a stretch to think a big tech company like ChatGPT is deleting users' data.

blackqueeriroh · 3 months ago

This is incorrect. As someone who has had the opportunity to work in several highly=regulated industries, companies do not want to hold onto extra data about you that they don’t have to unless their business is selling that data.

OpenAI already has a business, and not one they want to violate by having a massive amount of customer data stolen if they get hacked.

Deleted Comment

resource_waste · 3 months ago

>Privacy right is an integral part of the freedom of speech, a core human right.

Are these contradictory?

If you overhear a friend gossiping, can't you spread that gossip?

Also, where are human rights located? I'll give you a microscope.(sorry, I'm a moral anti-realist/expressivist and I can't help myself)

152132124 · 3 months ago

I think you will have a better time arguing with a LLM

deadbabe · 3 months ago

OpenAI is a business selling a product, it’s not a decentralized network of computers contributing spare processing power to run massive LLMs. Therefore, you can easily point a finger at them and tell them to stop some activity for which they are the sole gatekeeper.

oersted · 3 months ago

I completely agree with you. But perhaps we should be more worried that OpenAI or Google can retain all this data and do pretty much what they want with it in the first place, without a judge getting into the picture.

fireflash38 · 3 months ago

In your arguments for privacy, do you consider privacy from OpenAI?

rvnx · 3 months ago

Cut a joke about ethics and OpenAI

humpty-d · 3 months ago

I fail to see how saving all logs advances that cause

trod1234 · 3 months ago

It doesn't, it favors longstanding caselaw and laws already on the books.

There is a longstanding precedent with regards to business document retention, and chat logs have been part of that for years if not decades. The article tries to make this sound like this is something new, but if you look at the e-retention guidelines in various cases over the years this is all pretty standard.

For a business to continue operating, they must preserve business documents and related ESI upon an appropriate legal hold to avoid spoliation. They likely weren't doing this claiming the data was deleted, which is why the judge ruled in favor against OAI.

This isn't uncommon knowledge either, its required. E-discovery and Information Governance are things any business must meet in this area; and those documents are subject to discovery in certain cases, where OAI likely thought they could avoid it maliciously.

The matter here is OAI and its influence rabble are churning this trying to do a runaround on longstanding requirements that any IT professional in the US would have reiterated from their legal department/Information Governance policies.

There's nothing to see here, there's no real story. They were supposed to be doing this and didn't, were caught, and the order just forces them to do what any other business is required to do.

I remember an executive years ago (decades really), asking about document retention, ESI, and e-discovery and how they could do something (which runs along similar lines to what OAI tried as a runaround). I remember the lawyer at the time saying, "You've gotta do this or when it goes to court you will have an indefensible position as a result of spoliation...".

You are mistaken, and appear to be trying to frame this improperly towards a point of no accountability.

I suggest you review the longstanding e-discovery retention requirements that courts require of businesses to operate.

This is not new material, nor any different from what's been required for a long time now. All your hyperbole about privacy is without real basis, they are a company; they must comply with law, and it certainly is not outrageous to hold people who break the law to account, and this can only occur when regulatory requirements are actually fulfilled.

There is no argument here.

References: Federal Rules of Civil Procedure (FRCP) 1, 4, 16, 26, 34, 37

There are many law firms who have written extensively on this and related subjects. I encourage you to look at those too.

(IANAL) Disclosure: Don't take this as legal advice. I've had the opportunity to work with quite a few competent ones, but I don't interpret the law; only they can. If you need someone to provide legal advice seek out competent qualified counsel.

Dead Comment

baobun · 3 months ago

It's a honeypot from the beginning y'all

rolandog · 3 months ago

> Why could a court favor the interest of the New York Times in a vague accusation versus the interest and right of hundred millions people?

Can't you use the same arguments against, say, Copyright holders? Billionaires? Corporations doing the Texas two-step bankruptcy legal maneuver to prevent liability from allegedly poisoning humanity?

I sure hope so.

Edit: ... (up to a point)

Dead Comment

huijzer · 3 months ago

> Why could a court favor the interest of the New York Times in a vague accusation versus the interest and right of hundred millions people?

Well maybe some people in power have pressured the court into this decision? The New York Times surely has some power as well via their channels

Note that this also applies to GPT models on the API

> That risk extended to users of ChatGPT Free, Plus, and Pro, as well as users of OpenAI’s application programming interface (API), OpenAI said.

This seems very bad for their business.

merksittich · 3 months ago

Interesting detail from the court order [0]: When asked by the judge if they could anonymize chat logs instead of deleting them, OpenAI's response effectively dodged the "how" and focused on "privacy laws mandate deletion." This implicitly admits they don't have a reliable method to sufficiently anonymize data to satisfy those privacy concerns.

This raises serious questions about the supposed "anonymization" of chat data used for training their new models, i.e. when users leave the "improve model for all users" toggle enabled in the settings (which is the default even for paying users). So, indeed, very bad for the current business model which appears to rely on present users (voluntarily) "feeding the machine" to improve it.

[0] https://cdn.arstechnica.net/wp-content/uploads/2025/06/NYT-v...

Kon-Peki · 3 months ago

Thank you for the link to the actual text!

So, the NYT asked for this back in January and the court said no, but asked OpenAI if there was a way to accomplish the preservation goal in a privacy-preserving manner. OpenAI refused to engage for 5 f’ing months. The court said “fine, the NYT gets what they originally asked for”.

Nice job guys.

blackqueeriroh · 3 months ago

That’s not an implicit admission, it’s refusing to argue something they don’t want to do.

neilv · 3 months ago

Some established businesses will need to review their contracts, regulations, and risk tolerance.

And wrapper-around-ChatGPT startups should double-check their privacy policies, that all the "you have no privacy" language is in place.

999900000999 · 3 months ago

I'm not going to look up the comment, but a few months back I called this out and said if you seriously want to use any LLM in a privacy sensitive context you need to self host.

For example, if there are business consequences for leaking customer data, you better run that LLM yourself.

Etheryte · 3 months ago

In the European privacy framework, and legal framework at large, you can't terms of service away requirements set by the law. If the law requires you to keep the logs, there is nothing you can get the user to sign off on to get you out of it.

cj · 3 months ago

> Some established businesses will need to review their contracts, regulations, and risk tolerance.

I've reviewed a lot of SaaS contracts over the years.

Nearly all of them have clauses that allow the vendor to do whatever they have to if ordered to by the government. That doesn't make it okay, but it means OpenAI customers probably don't have a legal argument, only a philosophical argument.

Same goes for privacy policies. Nearly every privacy policy has a carve out for things they're ordered to do by the government.

Chris2048 · 3 months ago

Just to be pedantic, could the company encrypt the logs with a third-party key in escrow, s.t they would not be able to access that data, but the third party could provide access e.g. for a court.

Wowfunhappy · 3 months ago

> And wrapper-around-ChatGPT startups should double-check their privacy policies, that all the "you have no privacy" language is in place.

If a court orders you to preserve user data, could you be held liable for preserving user data? Regardless of your privacy policy.

johnQdeveloper · 3 months ago

> This seems very bad for their business.

Well, it is gonna be all _AI Companies_ very soon so unless everyone switches to local models which don't really have the same degree of profitability as a SaaS, its probably not going to kill a company to have less user privacy because tbh people are used to not having privacy these days on the internet.

It certainly will kill off the few companies/people trusting them with closed source code or security related stuff but you really should not outsource that anywhere.

csomar · 3 months ago

Did an American court just destroy all American AI companies in favor of open weight Chinese models?

bsder · 3 months ago

> It certainly will kill off the few companies/people trusting them with closed source code or security related stuff but you really should not outsource that anywhere.

And how many companies have proprietary code hosted on Github?

mountainriver · 3 months ago

You can fine tune models on a multitenant base model and it’s often more profitable.

SchemaLoad · 3 months ago

>don't really have the same degree of profitability as a SaaS

They have a fair bit. Local models lets companies sell you a much more expensive bit of hardware. Once Apple gets their stuff together it could end up being a genius move to go all in on local after the others have repeated scandals of leaking user data.

ivape · 3 months ago

Going to drop a PG tweet:

https://x.com/paulg/status/1913338841068404903

"It's a very exciting time in tech right now. If you're a first-rate programmer, there are a huge number of other places you can go work rather than at the company building the infrastructure of the police state."

---

So, courts order the preservation of AI logs, and government orders the building of a massive database. You do the math. This is such an annoying time to be alive in America, to say the least. PG needs to start blogging again about what's going on now days. We might be entering the digital version of the 60s, if we're lucky. Get local, get private, get secure, fight back.

Kokouane · 3 months ago

If you were working with code that was proprietary, you probably shouldn't of been using cloud hosted LLMs anyways, but this would seem to seal the deal.

larrymcp · 3 months ago

I think you probably mean "shouldn't have". There is no "shouldn't of".

YetAnotherNick · 3 months ago

Why not? Assuming you believe you can use any cloud for backup or Github for code storage.

gpm · 3 months ago

I think it's fair to question how proprietary your data is.

Like there's the algorithm by which a hedge fund is doing algorithmic trading, they'd be insane to take the risk. Then there's the code for a video game, it's proprietary, but competitors don't benefit substantially from an illicit copy. You ship the compiled artifacts to everyone, so the logic isn't that secret. Copies of the similar source code have linked before with no significant effects.

consumer451 · 3 months ago

All GPT integrations I’ve implemented have been via Azure’s service, due to Microsoft’s contractual obligation for them not to train on my data.

As far as I understand it, this ruling does not apply to Microsoft, does it?

Descon · 3 months ago

I think when you spin up open AI in azure, that instance is yours, so I don't believe that would be subject to this order.

dinobones · 3 months ago

How? This is retention for legal risk, not for training purposes.

They can still have legal contracts with other companies, that stipulate that they don't train on any of their data.

paxys · 3 months ago

Your employees' seemingly private ChatGPT logs being aired in public during discovery for a random court case you aren't even involved in is absolutely a business risk.

godelski · 3 months ago

Retention means an expansion of your threat model. Specifically, in a way you have little to no control over.

It's one thing if you get pwned because a hacker broke into your servers. It is another thing if you get pwned because a hacker broken into somebody else's servers.

At this point, do we believe OpenAI has a strong security infrastructure? Given the court order, it doesn't seem possible for them to have sufficient security for practical purposes. Your data might be encrypted at rest, but who has the keys? When you're buying secure instances, you don't want the provider to have your keys...

antihipocrat · 3 months ago

Will a business located in another jurisdiction be comfortable that the records of all staff queries & prompts are being stored and potentially discoverable by other parties? This is more than just a Google search, these prompts contain business strategy and IP (context uploads for example)

CryptoBanker · 3 months ago

Right, because companies always follow the letter of their contracts.

lxgr · 3 months ago

Why would the reason matter for people that don't want their data retained at all?

Take8435 · 3 months ago

...Data that is kept can be exfiltrated.

jameshart · 3 months ago

Thinking about the value of the dataset of Enron’s emails that was disclosed during their trials, imagine the value and cost to humanity of all OpenAI’s api logs even for a few months being entered into court record..

ukuina · 3 months ago

Aren't most enterprise customers using AzureOpenAI?

bigfudge · 3 months ago

Will this apply to Azure OpenAI model APIs too?

m3kw9 · 3 months ago

Not when people have nowhere else to go, pretty much you cannot escape it, it’s too convenient to not use now. You think no other AI chat providers doesn’t need to do this?