jrochkind1 · a day ago
> If source code can now be generated from a specification, the specification is where the essential intellectual content of a GPL project resides.

Our foreparents fought for the right to implement work-alikes of corporate software packages, even if the so-called owners did not like it. We're ready to throw it all away, and let intellectual property owners get so much more control.

The implications will not end up being anti-large-corporation or pro-sharing. If you can prevent someone from re-implementing a spec, building a client that speaks your API, or building a work-alike, it will be the large corporations that exercise this power, as usual.

dogcomplex · 19 hours ago
We should be removing IP law entirely, not strengthening it to cover entire classes of problem even when implemented entirely differently. Same for anyone trying to claim "colorful monster creatures" as innately Pokemon IP. Just because someone climbed a mountain first doesn't mean they own it forever. Nobody should be honouring any of these claims.

Nor should we be treating AI models themselves as respected IP. They're built on everyone else's data. Throw away this whole class of law, it's irrelevant in this new world.

marcus_holmes · 18 hours ago
Good news! LLM output cannot be copyrighted. Everything that an LLM produces is automatically, irrevocably, in the public domain.
giancarlostoro · 7 hours ago
I would be okay with just keeping it but limiting it severely. If you release music and you can't sell enough albums in 20 years, that's not society's problem. A lot of artists release albums every 1-3 years anyway, so they're always selling some records, or were before streaming became the way to "own" music. Most make their money off of concerts anyway.

For movies and shows, charge an increasing fee to renew the copyright. Eventually studios will give up certain movies. The older the movie, the more you pay.

teaearlgraycold · 18 hours ago
> own it forever

Well we could try fixing the forever part. Copyright is out of control. I’d like to see a world with much less power given to IP. Sometimes I even say I want it eradicated entirely. But realistically we should start by cutting things back. Maybe give software an especially short copyright period.

LtWorf · 14 hours ago
The problem here is that large companies can do whatever they want and regular people cannot. Don't worry, they won't be allowing you the same rights as these companies.
jongjong · 13 hours ago
But some people designed their entire lives around the assumption of IP protections.

If we remove IP laws, we should remove all private property laws!

thayne · 17 hours ago
Yeah, I really don't think we want APIs to be protected by IP. But in this case it isn't just the API, there were also tests involved. I think you could make a pretty strong argument that if you used a test suite to get an agent to implement some code, the code is a derivative product of the test code.
direwolf20 · 8 hours ago
I really don't think a book is a derivative work of the AI model you used to proofread it.

RobRivera · 21 hours ago
Sounds very similar to that whole API lawsuit with oracle.
halJordan · 21 hours ago
And the whole Adobe PDF thing, and the whole Microsoft Word thing, and the whole IBM PC thing. Imagine if we were forced to keep using IBM, from when they lost their way until now, simply because anti-AI Luddites were able to scaremonger.
glhaynes · 3 hours ago
It'd be interesting (earnestly!) to see someone make a solid case for AI reimplementation being bad but that the original (afaik) "clean room" project, Compaq's reimplementation of IBM's PC BIOS (something most people seem to see as a righteous move toward openness and freedom), was good.
noemit · 10 hours ago
Copyright has always benefited those with power, down to the very first instance: Albrecht Dürer bullying little children who wanted to make inferior copies of his prints so that their families could enjoy the art. Dürer insisted the art was only for nobles. Ab initio.
zelphirkalt · 10 hours ago
It is not about throwing away the right to implement things. As long as it is done according to the license of the works modified or copied, one can do that. What this is against is people washing away a license that is meant to keep things open, transparent and free. It enables businesses to go back to completely proprietary systems, which will impact your rights.

I am for keeping the licenses in place, as long as there is any copyright at all on software. If we get rid of that, then we can get rid of copyleft licenses and all others too. But of course businesses and greedy people want to have their cake and eat it too. They want copyleft to disappear, but _their_ software, oh no, no one may copy that! Double standards at their best.

rcxdude · 10 hours ago
You're asking for exactly the same cake. You want for the GPL to pass through this process, but not the proprietary licenses that the original GNU tools were washing away.

(the paradox of copyleft is that it does tend to push free software advocates in a direction of copyright maximalism)

taint69 · 2 hours ago
The IP is not located in index.js, README.md, and it’s not even in New Planning Doc (1).docx

But in the lonely mind of a man.

thayne · 17 hours ago
What if there was a special exemption for using a specification if you open source (or open hardware) the result, for some definition roughly (or exactly) equivalent to the OSI definition of open source, or the FSF's definition of free software?

Although I think the chance of that happening is effectively zero.

red_admiral · 9 hours ago
The specification of chardet, which started this all off, is essentially forced by the Unicode standard though.
az226 · 7 hours ago
SCOTUS ruled on this already when Google copied Sun’s Java wholesale.
mikkupikku · 7 hours ago
Oracle's Java. Oracle bought Sun, including Java, then started throwing lawsuits over something they didn't even make. IP is absurd.

alterom · 21 hours ago
>Our foreparents fought for the right to implement works-a-like to corporate software packages, even if the so-called owners did not like it

Our "foreparents" weren't competing with corporations with unlimited access to generative AI trained on their work. The times, they're-a-changin'.

You're rehashing the argument made in one of the articles which this piece criticizes and directly addresses, while ignoring the entirety of what was written before the conclusion that you quoted.

If anyone finds themselves agreeing with the comment I'm responding to, please, do yourself a favor and read the linked article.

I would do no justice to it by reiterating its points here.

hathawsh · 21 hours ago
I believe the GP post is saying that if we react to the new AI-enabled environment by arbitrarily strengthening IP controls for IP owners, the greatest beneficiaries will almost certainly be lawyer-laden corporations, not communities, artists, or open source projects. That seems like a reasonable argument.

It seems like the answer is to adjust IP owner rights very carefully, if that's possible. It sounds very hard, though.

sobellian · 20 hours ago
I think the article in fact reaches the exact opposite of the conclusion it should. I'm not really sure how useful it is to talk about sharing and commons and morals when the point raised was about what is possible. The prescription includes copyleft APIs. These are not possible under Oracle v. Google. And correct me if I'm wrong, but the article doesn't discuss what would happen if Congress acted to reverse Oracle v. Google (IMO a cosmically bad idea).
matheusmoreira · 20 hours ago
Adding even more intellectual property nonsense isn't going to work. The real solution is to force AI companies to open up their models to all. We need free as in freedom LLMs that we can run locally on our own computers.
tpmoney · 20 hours ago
I agree with the comment and find the linked article motivated reasoning at best. It's easy to find something "morally good" when it aligns with what you wanted. But plenty of people at Oracle, at IBM, at Microsoft, at Nintendo, at Sony, and at plenty of other companies whose moats have been commoditized by open source knockoffs don't find such happenings to be "morally good". And even if in general you think that "more freedom" justifies these sorts of unauthorized clones, then Oracle v. Google was at best a lateral move, as Java was hardly a closed ecosystem. One also wonders how far the idea of "more freedom" = "good" goes. How does the author feel (or did, if Qualcomm's recent acquisition changes the position) about the various Chinese knockoff clones of the Arduino boards and systems? Undeniably they were a financial good for hobbyists and the maker world alike, and they were well within the "legal" limits, and certainly they "opened" the ecosystem more. But were they "good"? Was the fact that they competed with and undersold Arduino's work, without contributing anything back and making it harder financially for Arduino to continue their work, a "moral good"?

If "more freedom" is your goal, then this rewrite is inherently in that direction. It didn't "close" the old library down. The LGPL version remains under its license, for anyone to use and redistribute exactly as it always has. There is just now also an alternative that one can exercise different rights with. And that doesn't even get into the fact that "increased freedom" was never a condition of being allowed to clone a system from its interfaces in the first place. It might have been a fig leaf, but some major events in the legal landscape of all this came from closed reimplementations. Sony v. Connectix is arguably the defining case for dealing with cloning from public interfaces and behavior as it applies to emulators of all kinds, and Connectix Virtual Gamestation was very much NOT an open source or free product.

But to go a step further, the larger idea of AI-assisted rewrites being "good", even when the human developers may have seen the original code, seems to broadly increase freedoms overall. Imagine how much faster WINE development can go now that everyone who has seen any Microsoft source code can just direct Claude to implement an API. Retro gaming and the emulation scene are sure to see a boost from people pointing AIs at any tests in source leaks and letting them go to town. No, our "foreparents" weren't competing with corporations with unlimited access to AI trained on their work; they were competing with corporations with unlimited access to the real hardware and schematics and specifications. The playing field has always been un-level, which is why fighting for the right to re-implement what you can see with your own eyes and measure with your own instruments was so important. And with the right AI tools, scrappy and small teams of developers can compete on that playing field in a way that previous developers could only dream of.

So no, I agree with the comment that you're responding to. The incredible mad dash to suddenly find strong IP rights very very important now that it's the open source community's turn to see their work commoditized and used in ways they don't approve of is off-putting and in my opinion a dangerous road to tread that will hand back years of hard fought battles in an attempt to stop the tides. In the end it will leave all of us in a weaker position while solidifying the hold large corporations have on IP in ways we will regret in the years to come.

salawat · 21 hours ago
I mean. Yeah. GPL's genius was that it used Copyright, which proprietary enterprise wouldn't dare dismantle, to secure for the public a permanent public good.

Pretty sure no one (well, except me anyway) saw overt theft of IP coming, accomplished by ignoring IP law through redefinition. Admittedly, I couldn't have articulated for you how capital would skill-transfer and commoditize it in the form of pay-to-play data centers, but give me a break, I was a teenager/twenty-something at the time.

zmmmmm · a day ago
The really interesting question to me is if this transcends copyright and unravels the whole concept of intellectual property. Because all of it is premised on an assumption that creativity is "hard". But LLMs are not just writing software, they are rapidly being engineered to operate completely generally as knowledge creation engines: solving math proofs, designing drugs, etc.

So: once it's not "hard" any more, does IP even make sense at all? Why grant monopoly rights to something that required little to no investment in the first place? Even with vestigial IP law - let's say, patents: it just becomes an input parameter; the AI works around the patents like any other constraint.

palmotea · a day ago
> So: once it's not "hard" any more, does IP even make sense at all? Why grant monopoly rights to something that required little to no investment in the first place? Even with vestigial IP law - let's say, patents: it just becomes an input parameter; the AI works around the patents like any other constraint.

I think it still does: IIRC, the current legal situation is AI-output does not qualify for IP protections (at least not without substantial later human modification). IP protections are solely reserved for human work.

And I'm fine with that: if a person put in the work, they should have protections so their stuff can't be ripped off for free by all the wealthy major corporations that find some use for it. Otherwise: who cares about the LLMs.

robmccoll · a day ago
I think you have a rather idealized model of IP in mind. In practice, IP law tends to be an expensive weapon the wealthy major corporations use against the little guy. Deep enough pockets and a big enough war chest of broad patents will drain the little guy every time.
rlpb · a day ago
What if a person puts in the work, but the work was worthless or can be trivially reproduced without effort?

See also: https://en.wikipedia.org/wiki/Sweat_of_the_brow

jbergqvist · a day ago
Does this matter in practice though? By modifying some of the generated code and not taking a solution produced by an LLM end-to-end but borrowing heavily from it, can't a human claim full ownership of the IP even though in reality the LLM did most of the relevant work?
nkmnz · a day ago
> AI-output does not qualify for IP protections

I beg to differ. AI-output did not entitle the person creating the prompt for IP protections, so far – but my objection is not directed towards the "so far", but towards your omission of "the person creating the prompt", because if an AI outputs copyrighted material from the training data, that material is still copyrighted. AI is not a magical copyright removal machine.

utopiah · 13 hours ago
> operate completely generally as knowledge creation engines: solving math proofs, designing drugs, etc.

Any example of that? So far I haven't seen any but maybe I'm looking at the wrong places.

I've seen a lot of:

- "solving" math proofs that were properly formalized, often with numerous documented past attempts, re-verified by proper mathematicians, without necessarily any interesting results

- no designed drugs; the most I've seen was (again, with entire teams of experts behind it) finding slight optimizations

Basically all outputs I've seen so far have been both following existing trends (low-hanging fruit without any paradigm shift) and never alone, but rather as search support for teams of world-class experts. None of these would qualify IMHO as knowledge creation. Whenever such results were published, the publication seemed mostly to be promotion of the workflow itself more than of the actual results. DeepMind seems to be the prime example of that.

PS: for the epistemological distinction you can see a few past comments of mine (e.g. https://news.ycombinator.com/item?id=47011884 )

satvikpendem · a day ago
Good. Intellectual property is now a twisted concept by the elite, whatever its benefits were previously. As soon as Disney made Mickey popular, it was all downhill.
godd2 · 20 hours ago
Copyright is about originality and expression, not effort. US copyright law does not use "Sweat of the Brow" doctrine.
gbacon · 2 hours ago
The labor theory of value is bunk economics anyway.
rfw300 · a day ago
More likely: this is a transitional phase where our previously hard problems become easy, and we will soon set our sights on new and much harder problems. The pinnacle of creative achievement in the universe is probably not 2010s B2B SaaS.

It is entirely possible, however, that human beings will not be the primary drivers of progress on those problems.

gbacon · 2 hours ago
Finally, a perspective that looks beyond the buggy whips! As for your last comment, it depends on what you mean by the primary drivers. Figurative crank turners, maybe not. Creativity and insight, don’t count us out just yet.
treyd · 19 hours ago
> if this transcends copyright and unravels the whole concept of intellectual property.

I have been saying this for years. Intellectual property is based on the concept that ideas can be owned, which is fundamentally a contradiction with how reality operates. We've been able to write laws that paper over that contradiction by introducing concepts like "fair use", but it doesn't resolve it.

AI is just making the conflict arising out of that contradiction more intense in new ways and forcing us to reckon with it in this new technological landscape. You can follow two perfectly reasonable lines of logic and end up with contradictory solutions. So how are we going to get out of this mess? I don't know, not without rolling back (at least parts of) what intellectual property is in the first place.

prohobo · 12 hours ago
From what I understand, LLMs can't really generate anything meaningful that doesn't implicitly rely on the operator's choices. It's hard to make the right novel choices as soon as you leave well-defined problem spaces.

In terms of math and biochemistry the cost of generating candidates has collapsed, but the cost of validating them hasn't.

kindkang2024 · 19 hours ago
At some level, IP makes sense — creators should be rewarded. But IP only benefits those who claim it. The benefits rarely flow back to the humanity that made it all possible. Every LLM was trained on humanity's collective knowledge. The value was created collectively, then captured privately.

That's the reason I like the idea of DUKI/dju:ki/ — Decentralized Universal Kindness Income, similar to UBI but driven by voluntary kindness and sincere marketing rather than taxation. If AI makes creation trivially easy and IP loses its justification, the question becomes: how do we ensure a tiny part of the wealth generated flows back to everyone?

Eridrus · 16 hours ago
The point of IP is to encourage the creation of new things.

Not all protections have to be ones that give total control like copyright.

I think it's a mistaken assumption that costs will fall to zero. The low hanging fruit will get picked, and then we'll be doing expensive combined AI/wetlab search for new drugs.

If there is any meaningful headroom we will keep doing expensive things to make progress.

matheusmoreira · 15 hours ago
> The point of IP is to encourage the creation of new things.

Then why are corporations allowed to milk successful works for all eternity? Why do we have Disney monopolizing films made half a century ago? Why do we have Nintendo selling people the exact same Mario ROMs from the 80s every single console generation?

They should have like 10 years of copyright so they can turn a profit. Once it expires it's over and the work enters the public domain where it belongs. If they want to keep profiting they should have to keep creating new things. They shouldn't be able to turn shared culture into eternal intellectual property portfolios that they monopolize and then sit on like dragons.

LelouBil · 17 hours ago
This is similar :

https://www.vice.com/en/article/musicians-algorithmically-ge...

Two musicians generated every possible melody within an octave, and published them as Creative Commons Zero.

I never heard about this again though.
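The brute-force idea behind that project can be sketched in a few lines: treat a melody as a fixed-length sequence of pitches and enumerate the full Cartesian product. The pitch count and melody length below are assumptions chosen for illustration, not the project's actual parameters:

```python
from itertools import product

PITCHES = 12   # semitones in one octave (assumption)
LENGTH = 8     # notes per melody (assumption)

def melodies(pitches: int = PITCHES, length: int = LENGTH):
    """Yield every melody as a tuple of pitch indices 0..pitches-1."""
    return product(range(pitches), repeat=length)

# The total count is simply pitches ** length; with these parameters
# that is 12 ** 8 = 429,981,696 melodies.
total = PITCHES ** LENGTH
```

The generator is lazy, so enumerating and writing the sequences out (as the musicians did to MIDI) is bounded by disk, not memory.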

gnopgnip · 17 hours ago
Copyright doesn’t depend on the “sweat of the brow”. See Feist v. Rural Telephone Co. (1991).

Also copyright can protect something normally not eligible when the author chooses what information to include and exclude

js8 · a day ago
It might unravel intellectual property, just not in a fair way. When capitalism started, public land was enclosed to create private property. Despite this being in many cases a quite unfair process, we still respect this arrangement.

With AI, a similar process is happening - publicly available information becomes enclosed by the model owners. We will probably get a "vestigial" intellectual property in the form of model ownership, and everyone will pay a rent to use it. In fact, companies might start to gatekeep all the information to only their own LLM flavor, which you will be required to use to get to the information. For example, product documentation and datasheets will be only available by talking to their AI.

nradov · a day ago
Nothing changes for drug patents regardless of whether an LLM was used in the discovery process.
reverius42 · a day ago
Not sure why this should be true; the US Supreme Court recently chose to let precedent stand that AI creations are not copyrightable. https://www.reuters.com/legal/government/us-supreme-court-de...

That also seems relevant for this whole discussion, actually -- if a work can't be copyrighted it certainly can't have a changed license, or any license at all. (I guess it's effectively public domain to the extent that it's public at all?)

zmmmmm · a day ago
Even if all I have to do is tell my agent, "here is a patent for a drug, analyse the patent and determine an equivalent but non-infringing drug" and it chugs away for a couple of hours and spits out a drug along with all the specifications to manufacture it?

I guess the state of play will be that for new drugs the original manufacturer will already have done that, and ensured that literally anything that could be found as a workaround is included in the scope of the patent. But I feel like it will not be possible to keep that watertight.

eru · 20 hours ago
There's different kinds of intellectual property.

Copyright might rest on 'creativity is hard'. But patents and trademarks do not.

DonsDiscountGas · 20 hours ago
Trademarks don't, patents do. Different kind of creativity but still.
newyankee · a day ago
If you think about creative outcomes as n-dimensional 'volumes', AI expressions can cover more than humans can in many domains. These are precisely artistic styles, music styles, etc., and tbh not everyone can be a Mozart, but with AI a lot more people may be Mozart-lite. This raises the question of how much of creativity is appreciated as a shared experience.
matheusmoreira · 19 hours ago
Intellectual property never made any sense to begin with. It is logically reducible to ownership of numbers. It is that absurd. Computers made the entire concept irrelevant the second they were invented, but it kept holding on via lobbying power. Maybe AI will finally put the final nail in the coffin of intellectual property.

Sure, it's disgusting and hypocritical how these corporations enshrined all this nonsense into law only to then ignore it all the second LLMs were invented. It's ultimately a good thing though. The model weights are all that matters. All we need to do is wait for the models to hit diminishing returns, then somehow find a way to leak them so that everyone has access. If they refuse, then just force them. By law or by revolution.

paxys · a day ago
"Hard" or "easy" has never been part of the premise.

A company spends a decade and billions of dollars to develop a groundbreaking drug and patents it.

I think of a cool new character called "Mr Poop" and publish a short story about him with an hour of work.

Both of us get the exact same protection under the law (yes yes I know copyright vs patent etc., but ultimately they are all about IP protection).

keeda · 20 hours ago
Creativity is still hard. AI-generated content is called "slop" for a reason ;-)
AlienRobot · 21 hours ago
The basis of your argument is that AI-generated work isn't hard, but your conclusion is that ALL work, AI-generated or not, should lose IP rights?
spwa4 · a day ago
Don't worry. The courts have consistently sided with huge companies on copyright. In the US. In Europe. Doesn't matter.

Company incorporates GPL code in their product? Never once have courts decided to uphold copyright. HP did that many times. Microsoft got caught doing it. And yet the GPL was never applied to their products. Every time there was an excuse. An inconsistent excuse.

Schoolkid downloads a movie? 30,000 USD per infraction PLUS armed police officer goes in and enforces removal of any movies.

Or take the very subject here. AI training WAS NOT considered fair use when OpenAI violated copyright to train. Same with Anthropic, Google, Microsoft, ... They incorporated Harry Potter and the Linux kernel into ChatGPT, into the model itself. Undeniable. Literally. So even if you accept that it's changed now, OpenAI should still be forced to redistribute the training set, code, and everything needed to run the model for everything they did up to 2020. Needless to say... courts refused to apply that.

So just apply "the law", right. Courts' judgement of using AI to "remove GPL"? Approved. Using AI to "make the next Disney-style movie"? SEND IN THE ARMY! Whether one or the other violates the law according to rational people? Whatever excuse to avoid that discussion is good enough.

hyperman1 · a day ago
I've always thought the opposite: IP law was created to make sure creativity stays hard, and hence controllable by the elites.

Patents came along when farmers started making city goods, threatening guild secrets. Copyright came when the printing press made copying and translating the Bible easy and accessible to all. (Trademark admittedly does not fit this view, but doesn't seem all that damaging either.)

To Protect The Arts, and To Time Limit Trade Secrets were just the Protect The Children of old times, a way to confuse people who didn't look too hard at actual consequences.

This means that the future of IP depends on what lets the powers that be pull up the ladder behind them. Long term I'd expect e.g. copyright expansion and harder enforcement, just because cloning by AI gets easy enough to threaten the status quo.

cobbzilla · a day ago
> Trademark admittedly does not fit this view, but doesn't seem all that damaging either

Isn’t trademark the only thing keeping a certain cartoon mouse out of the public domain, despite the fact that his earliest animations are out of copyright? Not sure if you’d consider that damaging, or if anyone has yet tested the boundaries of the House of Mouse’s patience here.

jazzyjackson · 21 hours ago
:/ before copyright you just had patrons, which looks a lot more like the rich controlling what art gets made than what we have today
ordu · a day ago
I believe it is a narrow view of the situation. If we take a look into the history, into the reasons for inventing the GPL, we'll see that it was an attempt to fight copyright with copyright. The very name 'copyleft' is trying to convey the idea.

What AI is eroding is copyright. You can re-implement not just a GPL program, but also reverse engineer and re-implement a closed-source program; people have demonstrated it already, there were stories here on HN about it.

AI is eroding copyright, so there may no longer be a need for the GPL. GNU should stop and rethink its stance, chuck away the GPL as the main tool to fight evil software corporations and embrace LLM as the main weapon.

davidw · a day ago
> LLM as the main weapon

LLM's - to date - seem to require massive capital expenditures to have the highest quality ones, which is a monumental shift in power towards mega corporations and away from the world of open source where you could do innovative work on your own computer running Linux or FreeBSD or some other open OS.

I don't think that's an exciting idea for the Free Software Foundation.

Perhaps with time we'll be able to run local ones that are 'good enough', but we're not there yet.

There's also an ethical/moral question that these things have been trained on millions of hours of people's volunteer work and the benefits of that are going to accrue to the mega corporations.

Edit: I guess the conclusion I come to is that LLM's are good for 'getting things done', but the context in which they are operating is one where the balance of power is heavily tilted towards capital, and open source is perhaps less interesting to participate in if the machines are just going to slurp it up and people don't have to respect the license or even acknowledge your work.

ordu · a day ago
> LLM's - to date - seem to require massive capital expenditures to have the highest quality ones, which is a monumental shift in power towards mega corporations and away from the world of open source

Yeah, a bit of a conundrum. But I don't think that fighting for copyright now can bring any benefits for FOSS. GNU should bring Stallman back and see whether he can come up with any new ideas and a new strategy. Alternatively, they could try without Stallman. But the point is: they should stop and think again. Maybe they will find a way forward, maybe they won't, but it means that either they could continue their fight for freedom meaningfully, or they could just stop fighting and find some other things to do. Both options are better than fighting for copyright.

> There's also an ethical/moral question that these things have been trained on millions of hours of people's volunteer work and the benefits of that are going to accrue to the mega corporations.

I want to clarify this statement a bit. The thing with LLMs relying on the work of others is not against GNU philosophy as I understand it: algorithms have to be free. Nothing wrong with training LLMs on them or on programs implementing them. Nothing wrong with using these LLMs to write new (free) programs. What is wrong is corporations reaping all the benefits now and locking down new algorithms later.

I think it is important, because copyright is deemed to be an ethical thing by many (I think for most people it is just a deduction: abiding by the law is ethical, therefore copyright is ethical), but not for GNU.

zozbot234 · a day ago
> LLM's - to date - seem to require massive capital expenditures to have the highest quality ones

There are near-SOTA LLM's available under permissive licenses. Even running them doesn't require prohibitive expenses on hardware unless you insist on realtime use.

Aozora7 · a day ago
>Perhaps with time we'll be able to run local ones that are 'good enough', but we're not there yet.

Right now, we can get local models that run on consumer hardware and match the capabilities of state-of-the-art models from two years ago. The improvements to model architecture may or may not maintain the same pace in the future, but we will get a local equivalent to Opus 4.6, or whatever other benchmark of "good enough" you have, in the foreseeable future.

tmp10423288442 · a day ago
> LLM's - to date - seem to require massive capital expenditures to have the highest quality ones, which is a monumental shift in power towards mega corporations and away from the world of open source where you could do innovative work on your own computer running Linux or FreeBSD or some other open OS.

When the FSF and GPL were created, I don't think this was really a consideration. They were perfectly happy with requiring Big Iron Unix or an esoteric Lisp Machine to use the software - they just wanted to have the ability to customize and distribute fixes and enhancements to it.

jacquesm · a day ago
> There's also an ethical/moral question that these things have been trained on millions of hours of people's volunteer work and the benefits of that are going to accrue to the mega corporations.

This was already the case and it just got worse, not better.

thenewnewguy · a day ago
Is massive capital expenditure not also required to enforce the GPL? If some company steals your GPLed code and doesn't follow the license, you will have to sue them and somebody will have to pay the lawyers.
socalgal2 · a day ago
Maybe a good open source idea is to "seti at home" style crowd-source training, assuming that's possible.
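The "SETI@home for training" idea maps roughly onto federated learning, where volunteers train locally and send only weight updates back to a coordinator. A toy FedAvg-style sketch (illustrative only; no real distributed-training project works this simply):

```python
def federated_average(client_weights: list[list[float]]) -> list[float]:
    """Combine locally trained model weights from volunteer machines
    by simple coordinate-wise averaging (the FedAvg baseline)."""
    n = len(client_weights)
    return [sum(coords) / n for coords in zip(*client_weights)]

# Three volunteers each return their locally trained weights;
# the coordinator averages them into the next global model.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0], [2.0, 0.0]])
print(global_weights)  # → [2.0, 2.0]
```

Real distributed training is far harder (bandwidth, stragglers, malicious updates), which is why the idea remains mostly aspirational at LLM scale.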
shadowgovt · a day ago
How close are we to good enough and who's working on that? I would be interested in supporting that work; to my mind, many of the real objections to LLMs are diminished if we can make them small and cheap enough to run in the home (and, perhaps, trained with distributed shared resources, although the training problem is the harder one).
stebalien · a day ago
Copyleft is a mirror of copyright, not a way to fight copyright. It grants rights to the consumer where copyright grants rights to the creator. Importantly, it gives the end-user the right to modify the software running on their devices.

Unfortunately, there are cases where you simply can't just "re-implement" something. E.g., because doing so requires access to restricted tools, keys, or proprietary specifications.

ordu · a day ago
These are words of Stallman:

"So, I looked for a way to stop that from happening. The method I came up with is called “copyleft.” It's called copyleft because it's sort of like taking copyright and flipping it over. [Laughter] Legally, copyleft works based on copyright. We use the existing copyright law, but we use it to achieve a very different goal."

https://writings.hongminhee.org/2026/03/legal-vs-legitimate/

rileymat2 · a day ago
> It grants rights to the consumer where copyright grants rights to the creator.

It also grants one major right/feature to the creator, the ability to spread their work while keeping it as open as they intend.

johnofthesea · a day ago
> AI is eroding copyright, so there may no longer be a need for the GPL. GNU should stop and rethink its stance, chuck away the GPL as the main tool to fight evil software corporations and embrace LLM as the main weapon.

Is this LLM thing freely available or is it owned and controlled by these companies? Are we going to rent the tools to fight "evil software corporations"?

Aozora7 · a day ago
There already are LLMs with open weights that are better at code than state-of-the-art closed-source models from a year ago. For now, most people may have to rent the hardware to run those models, since owning something that can run inference on one trillion parameters is too expensive, but I wouldn't consider LLMs to be controlled by "evil software corporations" at this point.
josephg · a day ago
Open models do exist. They're nowhere near as good as frontier models, but they're getting better all the time.

It’s probably only a matter of time before open models are as good as Claude code is today.

cozzyd · a day ago
easy, we ask Claude to write an open-source freely-available version of Claude with equal or better capabilities.
Peritract · a day ago
> chuck away the GPL as the main tool to fight evil software corporations and embrace LLM as the main weapon.

LLMs are one of the primary manifestations of 'evil software corporations' currently.

dathinab · a day ago
> we'll see that it was an attempt to fight copyrights with copyrights

it's not that simple

yes, the GPL's origins include the idea that "everyone should be able to use"

but it is also about attributing the original author

and making sure people can't just de-facto "size public goods"

this kind of AI usage removes attribution and is often sizing public goods in a way far worse than most companies that simply ignored the license ever did

so today there is more need than ever in the last few decades for GPL-like licenses

amiga386 · a day ago
You've said "size" twice in comments, did you mean "seize"?
webstrand · a day ago
Its purpose is "if you run the software you should be able to inspect and modify that software, and to share those modifications with your peers", not explicitly to resist copyright. Yes, copyright is bad in that it often prevents one from doing that, but it is not the purpose of the GPL to dismantle copyright.

Reducing it to "well you can clone the proprietary software you're forced to use by LLM" is really missing the soul of the GPL.

pocksuppet · a day ago
If not for copyright, you could always do that and copyleft wouldn't be needed.
paxys · 21 hours ago
Until there is a capable open source open weight AI that is easily hostable by an average person - no, we still have a long way to go. You aren't going to have software freedom when the tool that enables it is controlled by a handful of powerful tech companies.
mikkupikku · a day ago
I agree with almost all of that, except the part about GNU changing their stance. I think GNU should stay true and consistent, if for no other reason than to not make the many of their supporters who aren't on board with AI feel betrayed, and have GNU's legacy soured. If the cause of LLMs conquering proprietary software needs an organization to champion it, let that be a new organization, not GNU.
cubefox · a day ago
That's naive. Copyright doesn't just apply to software. There already have been countless lawsuits about copying music long before the term "open source" was invented. No, changing the lyrics a bit doesn't circumvent copyright. Nor does translating a Stephen King novel to German and switching the names of the places and characters.

A court ordered the first Nosferatu movie to be destroyed because it had too many similarities to Dracula. Despite the fact that the movie makes rather large deviations from the original.

If Claude was indeed asked to reimplement the existing codebase, just in Rust and a bit optimized, that could well be a copyright violation. Just like rephrasing A Song of Ice and Fire a bit, and switching to a different language, doesn't remove its copyright.

zozbot234 · a day ago
Claude was asked to implement a public API, not an entire codebase. The definition of a public API is largely functional; even in an unusually complex case like the Java standard library (whose structure and organization are unusually creative even at the API level), Google's reimplementation was found to be fair use.
Marsymars · a day ago
> Just like rephrasing A Song of Ice and Fire a bit, and switching to a different language, doesn't remove its copyright.

There is some precedent for this, e.g. Alchemised is a recent best seller that had just enough changed from its Harry Potter fan fiction source in order to avoid copyright infringement: https://en.wikipedia.org/wiki/Alchemised

(I avoided the term “remove copyright” here because the new work is still under copyright, just not Harry Potter - related copyright.)

xantronix · a day ago
So not only are we moving the goalposts here, but we've decided the GNU team should join the other team? I don't understand how GNU would see mass LLM training as anything but the most flagrant violation of their ethos. LLM labs, in their view, would be among the most evil software corporations to have ever existed.
wolvesechoes · a day ago
> AI is eroding copyright

Unless it is IP of the same big corpos that consumed all content available. Good luck with eroding them.

re-thc · a day ago
> What AI are eroding is copyright.

At the moment it's people that are eroding copyright. E.g. in this case someone did something.

"AI" didn't have a brain, wake up, and suddenly decide to do it.

Realistically nothing to do with AI. Having a gun doesn't mean you randomly shoot.

thomastjeffery · a day ago
While I personally agree with you, Richard Stallman (the creator of the GPL) does not. He has always advocated in favor of strong copyright protection, because the foundation of the GPL is the monopoly power granted by copyright. The problem that the GPL is intended to solve is proprietary software.

Generative models (AI) are not really eroding copyright. They are calling its bluff. The very notion of intellectual property depends on a property line: some arbitrary boundary where the property begins and ends. Generative models blur that line, making it impractical to distinguish which property belongs to whom.

Ironically, these models are made by giant monopolistic corporations whose wealth is quite literally a market valuation (stock price) of their copyrights! If generative models ever become good enough to reimplement CUDA, what value will NVIDIA have left?

The reality is that generative models are nowhere near good enough to actually call the bluff. Copyright is still the winning hand, and that is likely to continue, particularly while IP holders are the primary authors of law.

---

This whole situation is missing the forest for the trees. Intellectual Property is bullshit. A system predicated on monopoly power can only result in consolidated wealth driving the consolidation of power; which is precisely what has happened. The words "starving artist" ring every bit as familiar today as any time in history. Copyright has utterly failed the very goals it was explicitly written with.

It isn't the GPL that needs changing. So long as a system of copyright rules the land, copyleft is the best way to participate. What we really need is a cohesive political movement against monopoly power; one that isn't conveniently ignorant of copyright as its most significant source.

pennomi · a day ago
Right, anything that can be copied instantly for free cannot be realistically owned.
martin-t · a day ago
This is naive. Advertisement and network effects win. Individuals cannot compete with corporations on equal ground here.
Gigachad · a day ago
Someone should put this to the test. Take the recently leaked Minecraft source code and have Copilot build an exact replica in another programming language and then publish it as open source. See if Microsoft believes AI is copyright infringement or not.
robmccoll · a day ago
As described, this would not be the same thing. If the AI is looking at the source and effectively porting it, that is likely infringement. The idea instead should be "implement Minecraft from scratch" but with behavior, graphics, etc. identical. Note that you'll need to have an AI generate assets or something since you can't just reuse textures and models.
Gigachad · a day ago
AI models have already looked at the source of GPL software and contain it in their dataset. Adding the minecraft source to the mix wouldn't seem much different. Of course art assets and trade marks would have to be replaced. But an AI "clean room" implementation has yet to be legally tested.
NiloCK · a day ago
A room "as clean" as the one under dispute (chardet) is very easy to replicate.

AI 1: - (reads the source), creates a spec + acceptance criteria

AI 2: - implements from spec

AI 1 is in the position of the maintainer who facilitated the license swap.
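The two-stage setup above can be sketched as a tiny pipeline; `spec_writer` and `implementer` are hypothetical stand-ins for calls to two separate models (no real LLM API is assumed):

```python
from typing import Callable

def clean_room_port(
    source: str,
    spec_writer: Callable[[str], str],  # "AI 1": sees the source, emits only a spec
    implementer: Callable[[str], str],  # "AI 2": sees only the spec, never the source
) -> str:
    """Two-model pipeline where only the derived spec crosses the wall."""
    spec = spec_writer(
        "Write a behavioral specification and acceptance criteria for:\n" + source
    )
    # The implementer is never handed `source`, only the spec.
    return implementer("Implement this specification from scratch:\n" + spec)
```

Whether a spec mechanically derived from the source actually launders the copyright is exactly the open question in this thread; the structure mirrors a human clean-room process but doesn't guarantee the legal outcome.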

yunnpp · 19 hours ago
> Note that you'll need to have an AI generate assets or something since you can't just reuse textures and models.

As far as I know, you can as long as you own a copy of the original. In other words, you can't redistribute the assets, but you can distribute the code that works with them. This is literally how every free/libre game remake works. The copyright of your new, from-scratch code, is in no way linked to that of the assets.

smsm42 · 20 hours ago
"Behavior, graphics, etc." would likely constitute separate IP from the code. I am not sure there's a model that allows you to make AI reproduce Minecraft without telling it what "Minecraft" is - which would likely contaminate it with IP-protected information.
u1hcw9nx · a day ago
This was not about legality.

> That question is this: does legal mean legitimate?

Just because something is legal does not mean it's a moral thing to do.

larodi · a day ago
this question should've been posed earlier, when the first LLMs were being trained. many people chose to ignore it, and now, several distillation epochs later, it is not a question that matters, as yes and no are both true, and not true.

is it legitimate for millions of people to exploit, and expound on, knowledge that was perhaps not legitimate to use to begin with? well, they already did. who's to judge the commons now?

Aboutplants · a day ago
I’ve often thought that the key to fighting this is through this exact method. Turn the tool against them
peacebeard · a day ago
The big question is: if copyrighted material was used in the training material, is the LLM's output copyright infringement when it resembles the training material? In your example, you are taking the copyrighted material and giving it to the LLM as input and instructing the LLM to process it. Regardless of where the legal cards fall, this is a much less ambiguous scenario.
wvenable · 21 hours ago
There's a couple of different issues here that all get mangled together. If you're producing effectively the same expression that's infringement. You draw Captain America from memory, it's still Captain America, and therefore infringement. If you draw Captain Canada by tracing around Captain America that's also infringement but of a different type.

When it comes to software, again it's the expression that matters -- literally the actual source code. Software that does the same thing but uses entirely different code to do it is not the same expression. Like with the tracing example above, if you read the original source code then it's harder to claim that it isn't the same expression. This is why clean room implementations are necessary.

LPisGood · a day ago
I think Disney ran into this with people generating Marvel characters etc
GuB-42 · 20 hours ago
I think it will become interesting when AI will be able to decompile binaries.
gowld · 20 hours ago
Decompiling binaries is easy when they are C# or Java, even before AI. C# is a Microsoft language, and C# games have thriving mod communities with deep hooks into the core game, and detailed documentation reverse-engineered from the binary.
fruitworks · 21 hours ago
this is the question of the hour. Imagine using this LLM proxy to license-strip major parts of leaked Windows source code to produce code for WINE.

On top of all of this, there are the attempts at binary decompilation using LLMs and other new tools that have been discussed on this site recently.

amelius · a day ago
You will probably run into design patents.
VorpalWay · a day ago
Software patents are not a thing in the EU.
martin-t · a day ago
They might not care. Products win not by quality or features but by advertisement, hype and network effects.

The original implementation would still have the upper hand here. OTOH if I as a nobody create something cool, there's nothing stopping a huge corporation from "reimplementing" (= stealing) it and using their huge advertising budget to completely overshadow me.

And that's how they like it.

Gigachad · a day ago
Given how hard companies like Nintendo and Microsoft have been taking down leaks or fan creations, it seems they very much do care about keeping this stuff locked down.
sharkjacobs · a day ago
> Blanchard's account is that he never looked at the existing source code directly. He fed only the API and the test suite to Claude and asked it to reimplement the library from scratch

This feels sort of like saying "I just blindly threw paint at that canvas on the wall and it came out in the shape of Mickey Mouse, and so it can't be copyright infringement because it was created without the use of my knowledge of Mickey Mouse"

Blanchard is, of course, familiar with the source code, he's been its maintainer for years. The premise is that he prompted Claude to reimplement it, without using his own knowledge of it to direct or steer.

dathinab · a day ago
> Blanchard is, of course, familiar with the source code, he's been its maintainer for years.

I would argue it's irrelevant whether they looked or didn't look at the code, as well as whether he was or wasn't familiar with it.

What matters is that they fed the original code into a tool which they set up to make a copy of it. How that tool works doesn't really matter. Neither does it make a difference if you obfuscate that it's a copy.

If I blindfold myself when making copies of books with a book scanner + printer, I'm still engaging in copyright infringement.

If AI is a tool, that should hold.

If it isn't "just" a tool, then it did engage in copyright infringement (as it created the new output side by side with the original), in the same way an employee might do so on the command of their boss. That still makes the boss/company liable for copyright infringement, and in general, just because you weren't the one who created an infringing product doesn't mean you aren't more or less as liable for distributing it as if you had created it yourself.

Legend2440 · a day ago
>that they fed the original code into a tool which they set up to make a copy of it

Well, no. They fed the spec (test cases, etc) into a tool which made a new program matching the spec. This is not a copy of the original code.

But also this feels like arguing over the color of the iceberg while the Titanic sinks. If you have a tool that can write code to spec, what is the value of source code anymore? Even if your app is closed-source, you can just tell Claude to write new code that does the same thing.

spullara · a day ago
if the actual text of the code isn't the same or obviously derivative, copyright doesn't apply at all.
margalabargala · a day ago
> If it isn't "just" a tool, then it did engage in copyright infringement

Copyright infringement is a thing humans do. It's not a human.

Just like how the photos taken by a monkey with a camera have no copyright. Human law binds humans.

logicprog · a day ago
I just don't see how it's relevant whether he looked or didn't. In my opinion, it's not just legally valid to re-implement something you've seen the code of, as long as it doesn't copy expressive elements; I think it's also ethically fine to use source code as a reference for re-implementing something, as long as it doesn't turn into an exact translation.
atomicnumber3 · a day ago
It's actually not legally fine, or at least it's extremely dangerous. Projects that re-implement APIs presented by extremely litigious companies specifically do not allow people who, for instance, have seen the proprietary source code to then work on the project.
simonw · a day ago
Right. The alternative is that we reward Dan for his 14 years of volunteer maintenance of a project... by banning him from working on anything similar under a different license for the rest of his life.
sarchertech · a day ago
Ignoring the legal or ethical concerns. Let’s say we live in a world where the cost of copying code is so close to zero that it’s indistinguishable from a world without copyright.

Anything you put out can and will be used by whatever giant company wants to use it with no attribution whatsoever.

Doesn’t that massively reduce the incentive to release the source of anything ever?

axus · a day ago
Oracle had its day in court with Google over the Java APIs. Reimplementing APIs can be done without copyright infringement, but Oracle must have tried to find real infringement during discovery.

In this case, we could theoretically prove that the new chardet is a clean reimplementation. Blanchard can provide all of the prompts necessary to re-implement again, and for the cost of the tokens anyone can reproduce the results.

Aurornis · a day ago
Can anyone find the actual quote where Blanchard said this?

My understanding was that his claim was that Claude was not looking at the existing source code while writing it.

duskdozer · 18 hours ago
That is what he claimed. However, his design document instructs the AI to download the codebase, references specific files in it, and asks for a rewrite of the same project by name. It seems very unlikely it didn't look at the code while working, even setting aside that it had likely already been trained on it.

He would have had a better argument if he created a matching spec from scratch using randomized names.

pklausler · a day ago
Conveniently ignoring the likelihood that Claude had been trained on the freely accessible source code.
mrgoldenbrown · a day ago
Does he have access to Claude's training data? How can he claim Claude wasn't trained on the original code?
NewsaHackO · a day ago
>This feels sort of like saying "I just blindly threw paint at that canvas on the wall and it came out in the shape of Mickey Mouse, and so it can't be copyright infringement because it was created without the use of my knowledge of Mickey Mouse"

IANAL, but that analogy wouldn't work because Mickey Mouse is a trademark, so it doesn't matter how it is created.

SpicyLemonZest · a day ago
Isn't this a red herring? An API definition is fair use under Google v. Oracle, but the test suite is definitely copyrightable code!

esafak · a day ago
If you only stick to the API and ignore the implementation, it is not Mickey Mouse anymore but a rodent. If it was just a clone it wouldn't be 50x as fast. Nevertheless, APIs apparently can be copyrightable. I generally disagree with this; API reimplementation is how PC compatibles took off, giving consumers better options.
amarant · a day ago
Wait, what? Didn't Oracle lose the case against Google? Have I been living in an alternate reality where API compatibility is fair use?

re-thc · a day ago
> This feels sort of like saying "I just blindly threw paint at that canvas on the wall and

> He fed only the API and the test suite to Claude and asked it

Difference being, Claude looked; so not blind. The equivalent is more like: I blindly took a photo of it and then used that to...

Technically, it did look.

amarant · a day ago
The article is poorly written. Blanchard was a chardet maintainer for years. Of course he had looked at its code!

What he claimed, and what was interesting, was that Claude didn't look at the code, only the API and the test suite. The new implementation is all Claude, and it is different enough to be considered original: a completely different structure and design, and hey, a 48x improvement in performance! It's just API-compatible with the original, which per the 2021 Google v. Oracle decision is to be considered fair use.

babypuncher · a day ago
What if we said that generative AI output is simply not copyrightable. Anything an AI spits out would automatically be public domain, except in cases where the output directly infringes the rights of an existing work.

This would make it so relicensing with AI rewrites is essentially impossible unless your goal is to transition the work to be truly public domain.

I think this also helps somewhat with the ethical quandary of these models being trained on public data while contributing nothing of value back to the public, and disincentivize the production of slop for profit.

kjksf · a day ago
We did in fact say so.

https://www.carltonfields.com/insights/publications/2025/no-...

> No Copyright Protection for AI-Assisted Creations: Thaler v. Perlmutter

> A recent key judicial development on this topic occurred when the U.S. Supreme Court declined to review the case of Thaler v. Perlmutter on March 2, 2026, effectively upholding lower court rulings that AI-generated works lacking human authorship are not eligible for copyright protection under U.S. law

idle_zealot · a day ago
> This would make it so relicensing with AI rewrites is essentially impossible unless your goal is to transition the work to be truly public domain.

That's not true at all. Anyone could follow these steps:

1. Have the LLM rewrite GPL code.

2. Do not publish that public domain code. You have no obligation to.

3. Make a few tweaks to that code.

4. Publish a compiled binary/use your code to host a service under a proprietary license of your choice.

munk-a · a day ago
I think the missing thing here is that the license violation already happened. Most of the big models trained on data in a manner that violated terms of service. We'll need a court case but I think it's extremely reasonable to consider any model trained on GPL code to be infected with open licensing requirements.
crazygringo · a day ago
You might wish that were true, but there are very strong arguments it's not. Training on copyleft licensed code is not a license violation. Any more than a person reading it is. In copyright terms, it's such an extreme transformative use that copyright no longer applies. It's fair use.

But agreed that we're waiting for a court case to confirm that. Although really, the main questions for any court cases are not going to be around the principle of fair use itself or whether training is transformative enough (it obviously is), but rather on the specifics:

1) Was any copyrighted material acquired legally (not applicable here), and

2) Is the LLM always providing a unique expression (e.g. not regurgitating books or libraries verbatim)

And in this particular case, they confirmed that the new implementation is 98.7% unique.
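The "98.7% unique" figure is only as meaningful as the metric behind it, which isn't specified here. One common rough proxy (an assumption for illustration, not the actual methodology used) is a diff-based similarity ratio from Python's standard difflib:

```python
import difflib

def uniqueness(original: str, rewrite: str) -> float:
    """1.0 minus difflib's matching-block similarity ratio:
    0.0 for identical text, approaching 1.0 for unrelated text."""
    return 1.0 - difflib.SequenceMatcher(None, original, rewrite).ratio()

old = "def detect(data):\n    return guess_encoding(data)\n"
new = "fn detect(data: &[u8]) -> Option<Encoding> { sniff(data) }\n"
assert uniqueness(old, old) == 0.0        # identical text is 0% unique
assert 0.0 < uniqueness(old, new) < 1.0   # a port still shares some surface text
```

Note that character-level diffs say nothing about structural or "expressive" similarity, which is what copyright analysis actually turns on.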

jazzyjackson · 21 hours ago
Transformative is not the only component of determining fair use, there’s also the economic displacement aspect. If you’re doing a book report and include portions of the original (or provide an interface for viewing portions à la Google Books) you aren’t a threat to the original authors ability to make a living.

If you’ve used copyrighted books and turned them into a free write-a-book machine, you are suddenly using the authors own works against them, in a way that a judge might rule is not very fair.

“ Effect of the use upon the potential market for or value of the copyrighted work: Here, courts review whether, and to what extent, the unlicensed use harms the existing or future market for the copyright owner’s original work. In assessing this factor, courts consider whether the use is hurting the current market for the original work (for example, by displacing sales of the original) and/or whether the use could cause substantial harm if it were to become widespread.”

https://www.copyright.gov/fair-use/

madeofpalk · a day ago
A human reading a work is not making a "copy". I'm pretty sure our legal systems agree that thought or sight is not copying something.

Training an LLM inherently requires making a copy of the work. Even the initial act of loading it from the internet and copying it into memory to then train the LLM is a copy that can be governed by its license and copyright law

pessimizer · 19 hours ago
> Training on copyleft licensed code is not a license violation. Any more than a person reading it is. In copyright terms, it's such an extreme transformative use that copyright no longer applies. It's fair use.

This is just an assertion that you're making. There's no argument here. I'm aware that this is also an assertion that some judges have made.

My claim is that LLMs are not human, therefore when you apply words like "training" to them, you're only doing it metaphorically. It's no more "training" than copying code to a different hard drive is training that hard drive. And it's no more "transformative" than rar'ing or zipping the code, then unzipping it. I can't sell my jpgs of pngs I downloaded from Getty.

I have no idea how LLMs can be considered transformative work that immunizes me from owing the least bit of respect to the source material, but if I sample 2-6 second snatches from 10 different songs, put them through over 9000 filters and blend them into a new work, I owe money to everyone involved. I might even owe money to the people who wrote the filters, depending on the licensing.

> 98.7% unique.

This doesn't mean anything. It is a meaningless arrangement of words. The way we figure out whether something is piracy is through provenance, not bizarre ad hoc measurements. If I read a book in Spanish and rewrite it in English, it doesn't suddenly become mine, even though it's 96.6492387% unique. Not even if I drop a few chapters, add in a couple of my own, and change the ending.

gspr · a day ago
> Training on copyleft licensed code is not a license violation. Any more than a person reading it is.

Some might hold that we've granted persons certain exemptions, on account of them being persons. We do not have to grant machines the same.

> In copyright terms, it's such an extreme transformative use that copyright no longer applies.

Has the model really performed an extreme transformation if it is able to produce the training data near-verbatim? Sure, it can also produce extremely transformed versions, but is that really relevant if it holds within it enough information for a (near-)verbatim reproduction?

Copyrightest · a day ago
The big difference between people reading code and LLMs reading code is that people have legal liability and LLMs do not. You can't sue an LLM for copyright infringement, and it's almost impossible for users to tell when it happens.

BTW in 2023 I watched ChatGPT spit out hundreds of lines of F# verbatim from my own GitHub. A lot of people had this experience with GitHub Copilot. "98.7% unique" is still a lot of infringement.

NewsaHackO · a day ago
I agree there has to be a court case about it. I think the current argument, however, is that it is transformative, and therefore falls under fair use.
munk-a · a day ago
Yeah, a finding that training is transformative would be pretty significant, and the precedent of thumbnail creation being deemed transformative would likely steer us toward such a finding. Transformative is always a hard thing to bank on because it is such a nebulous, judgment-based call. There are excellent examples of how precise and gritty this can get in audio sampling.
jazzyjackson · 21 hours ago
You don’t get to simply claim fair use based on how transformative your derivative work is.

“”” Section 107 calls for consideration of the following four factors in evaluating a question of fair use:

Purpose and character of the use, including whether the use is of a commercial nature or is for nonprofit educational purposes: Courts look at how the party claiming fair use is using the copyrighted work, and are more likely to find that nonprofit educational and noncommercial uses are fair. This does not mean, however, that all nonprofit education and noncommercial uses are fair and all commercial uses are not fair; instead, courts will balance the purpose and character of the use against the other factors below. Additionally, “transformative” uses are more likely to be considered fair. Transformative uses are those that add something new, with a further purpose or different character, and do not substitute for the original use of the work.

Nature of the copyrighted work: This factor analyzes the degree to which the work that was used relates to copyright’s purpose of encouraging creative expression. Thus, using a more creative or imaginative work (such as a novel, movie, or song) is less likely to support a claim of a fair use than using a factual work (such as a technical article or news item). In addition, use of an unpublished work is less likely to be considered fair.

Amount and substantiality of the portion used in relation to the copyrighted work as a whole: Under this factor, courts look at both the quantity and quality of the copyrighted material that was used. If the use includes a large portion of the copyrighted work, fair use is less likely to be found; if the use employs only a small amount of copyrighted material, fair use is more likely. That said, some courts have found use of an entire work to be fair under certain circumstances. And in other contexts, using even a small amount of a copyrighted work was determined not to be fair because the selection was an important part—or the “heart”—of the work.

Effect of the use upon the potential market for or value of the copyrighted work: Here, courts review whether, and to what extent, the unlicensed use harms the existing or future market for the copyright owner’s original work. In assessing this factor, courts consider whether the use is hurting the current market for the original work (for example, by displacing sales of the original) and/or whether the use could cause substantial harm if it were to become widespread. “””

https://www.copyright.gov/fair-use/

paxys · 21 hours ago
The act of training by itself has been ruled to be fair use over and over again, including for LLMs, and there isn't much debate left there.

The test for infringement is whether the output is transformative enough, and that is what NYT v. OpenAI and similar cases are arguing over.

steve_gh · 14 hours ago
Is the LLM acting as my agent? If the LLM has been exposed to the source code, have I effectively been exposed to it as well? If so, is a "clean room" implementation even possible?

kelseyfrog · a day ago
In the corporate world, we've started using reimplementation as a way to access tooling that security won't authorize.

Sec has a deny-by-default policy. Eng has a use-more-AI policy. Any code written in-house is accepted by default. You can see where this is going.

We've been using AI to reimplement tooling that security won't approve. The incentives conspired to produce the worst outcome, yet here we are. If you want a different outcome, you need to create different incentives.

kemitchell · a day ago
Not Invented Here's long, slow mutagenic march toward full antibiotic resistance continues apace.

There is a fundamental corpo-cognitive dissonance, to boot. If "AI" is cheap enough and good enough to implement security-relevant software from `git init` over and over, why isn't it also cheap enough and good enough to assess and approve the security of third-party software at the pace of internal adoption? Is there some basis to believe that LLMs' leverage on producing code differs from their leverage on analyzing existing code?

PaulDavisThe1st · a day ago
If Blanchard is claiming not to have been substantively involved in the creation of the new implementation of chardet (i.e. "Claude did it"), then the new implementation is machine-generated, and in the USA cannot be copyrighted and thus cannot be licensed.

If he is claiming to have been somehow substantively "enough" involved to make the code copyrightable, then his own familiarity with the previous LGPL implementation makes the new one almost certainly a derivative of the original.

sigmar · a day ago
>then his own familiarity with the previous LGPL implementation makes the new one almost certainly a derivative of the original.

The "clean room rewrite" is just an extreme way to build a bulletproof shield against litigation. Not doing it that way doesn't automatically make all the new code he writes derivative solely because he previously saw how the original code worked.

PaulDavisThe1st · a day ago
If the clean room rewrite was done entirely by Claude, then the result cannot be copyrighted in the USA, and thus there is no license at all.

And if he was in fact more involved (which he appears to deny), then it's a bit weak to claim that someone with huge familiarity with chardet could reimplement it without the result being derivative.