jhaile · 4 months ago
One aspect that I feel is ignored by the comments here is the geo-political forces at work. If the US takes the position that LLMs can't use copyrighted work or has to compensate all copyright holders – other countries (e.g. China) will not follow suit. This will mean that US LLM companies will either fall behind or be too expensive. Which means China and other countries will probably surge ahead in AI, at least in terms of how useful the AI is.

That is not to say that we shouldn't do the right thing regardless, but I do think there is a feeling of "who is going to rule the world in the future?" that underlies governmental decision-making on how much to regulate AI.

oooyay · 4 months ago
Well hell, by that logic average citizens should be able to launder corporate intellectual property because China will never follow suit in adhering to intellectual property law. I'm game if you are.
jowea · 4 months ago
Isn't that sort of logic precisely why China doesn't adhere to IP law?
rollcat · 4 months ago
Well I always felt rebellious about the contemporary face of "rules for thee but not for me", specifically regarding copyright.

Musicians remain subject to abuse by the recording industry; they're making pennies on each dollar you spend on buying CDs^W^W streaming services. I used to say, don't buy that; go to a concert, buy beer, buy merch, support directly. Nowadays live shows are being swallowed whole through exclusivity deals (both for artists and venues). I used to say, support your favourite artist on Bandcamp, Patreon, etc. But most of these new middlemen are ready for their turn to squeeze.

And now on top of all that, these artists' work is being swallowed whole by yet another machine, disregarding what was left of their rights.

What else do you do? Go busking?

seanmcdirmid · 4 months ago
In the long run private IP will eventually become very public despite laws you have, it’s been like that since the Stone Age. The American Industrial Revolution was built partially on stolen IP from Britain. The internet has just sped up diffusion. You can stop it if you are willing to cut the line, but legal action is only some friction and even then only in the short term
Bjorkbat · 4 months ago
I broadly agree in that sure, unfettered access to copyrighted material will make AI more capable, but more capable of what exactly?

For national security reasons I'm perfectly fine with giving LLMs unfettered access to various academic publications, scientific and technical information, that sort of thing. I'm a little more on the fence about proprietary code, but I have a hard time believing there isn't enough code out there already for LLMs to ingest.

Otherwise though, what is an LLM with unfettered access to copyrighted material better at vs one that merely has unfettered access to scientific / technical information + licensed copyrighted material? I would suppose that besides maybe being a more creative writer, the former is far more capable of reproducing copyrighted works.

In effect, the former is a more capable plagiarism machine than the latter, but not necessarily more intelligent, and otherwise doesn't really add any more value. What do we have to gain from condoning it?

I think the argument I'm making is a little easier to see in the case of image and video models. The model that has unfettered access to copyrighted material is more capable, sure, but more capable of what? Capable of making images? Capable of reproducing Mario and Luigi in an infinite number of funny scenarios? What do we have to gain from that? What reason do we have for not banning such models outright? Not like we're really missing out on any critical security or economic advantages here.

Teever · 4 months ago
If common culture is an effective substrate to communicate ideas as in we can use shared pop culture references to make metaphors to explain complex ideas then the common culture that large companies have ensnared in excessively long copyrights and trademarks to generate massive profits is a useful thing for an LLM that is designed to convey ideas to have embedded in it.

If I'm learning about kinematics maybe it would be more effective to have comparisons to Superman flying faster than a speeding bullet and no amount of dry textbooks and academic papers will make up for the lack of such a comparison.

This is especially relevant when we're talking about science-fiction which has served as the inspiration for many of the leading edge technologies that we use including stuff like LLMs and AI.

bigbuppo · 4 months ago
The real problem here is that AI companies aren't even willing to follow the norms of big business and get the laws changed to meet their needs.
johnnyanmac · 4 months ago
This is precisely why we need proportional fees for courts. We can't just let companies treat the law as a cost-benefit analysis. They should live in fear of a court result against their favor.
hulitu · 4 months ago
> One aspect that I feel is ignored by the comments here is the geo-political forces at work. If the US takes the position that LLMs can't use copyrighted work or has to compensate all copyright holders – other countries (e.g. China) will not follow suit.

Oh really? They didn't have any problem coming after people who installed copyrighted Windows. BSA. But now Microsoft turns a blind eye because it suits them.

stonogo · 4 months ago
Big "Mr. President, we cannot allow a mineshaft gap" energy going on, even if it's difficult for me personally to believe that LLMs contribute in any sense to ruling the world.
therouwboat · 4 months ago
If AI is so important, maybe it should be owned by the government and free to use for all citizens.
pc86 · 4 months ago
Name two non-military things that the government owns and aren't complete dumpster fires that barely do the thing they're supposed to do.

Even (especially?) the military is a dumpster fire but it's at least very good at doing what it exists to do.

bgwalter · 4 months ago
The same president that is putting 145% tariffs on China could put 1000% tariffs on Internet chat bots located in China. Or order the Internet cables to be cut as a last resort (citing a national emergency as is the new practice).

I'm not sure at all what China will do. I find it likely that they'll forbid AI at least for minors so that they do not become less intelligent.

Military applications are another matter that are not really related to these copyright issues.

pc86 · 4 months ago
How exactly does one add a tariff to a foreign-based chat bot?
gruez · 4 months ago
>Or order the Internet cables to be cut as a last resort (citing a national emergency as is the new practice).

what if they route through third countries?

arp242 · 4 months ago
I get what you're saying, but this is just a race to the bottom, no?

It's annoying to see the current pushback against China focusing so much on inconsequential matters with so much nonsense mixed in, because I do think we do need to push back against China on some things.

1vuio0pswjnm7 · 4 months ago
The design, manufacture and supply of electronics is far more important than one particular usage, e.g, "LLMs". It has never been a requirement to violate copyrights to produce electronics, or computer software. In fact, arguably there would be no "MicroSoft" were it not for Gates' lobbying for the existence and enforcement of "software copyright". The "Windows" franchise, among others, relies on it. The irony of Microsoft's support for OpenAI is amusing. Copyright enforcement for me but not for thee.
asddubs · 4 months ago
you could apply that same logic to any IP breaches though, not just AI
Ekaros · 4 months ago
Your employee steals your source code and sells it to multiple competitors. Why should you have any right to go after those competitors?
mattxxx · 4 months ago
Well, firing someone for this is super weird. It seems like an attempt to censor an interpretation of the law that:

1. Criticizes a highly useful technology
2. Matches a potentially-outdated, strict interpretation of copyright law

My opinion: I think using copyrighted data to train models for sure seems classically illegal. Despite that, Humans can read a book, get inspiration, and write a new book and not be litigated against. When I look at the litany of derivative fantasy novels, it's obvious they're not all fully independent works.

Since AI is and will continue to be so useful and transformative, I think we just need to acknowledge that our laws did not accommodate this use-case, and then we should change them.

madeofpalk · 4 months ago
> Humans can read a book, get inspiration, and write a new book and not be litigated against

Humans get litigated against for this all the time. There is such a thing as, charitably, being too inspired.

https://en.wikipedia.org/wiki/List_of_songs_subject_to_plagi...

jrajav · 4 months ago
If you follow these cases more closely over time you'll find that they're less an example of humans stealing work from others and more an example of typical human greed and pride. Old, well established musicians arguing that younger musicians stole from them for using a chord progression used in dozens of songs before their own original, or a melody on the pentatonic scale that sounds like many melodies on the pentatonic scale do. It gets ridiculous.

Plus, all art is derivative in some sense, it's almost always just a matter of degree.

zelphirkalt · 4 months ago
The law covers these cases pretty well, it is just that the law has very powerful extremely rich adversaries, whose greed has gotten the better of them again and again. They could use work released sufficiently long ago to be legally available, or they could take work released as creative commons, or they could run a lookup, to make sure to never output verbatim copies of input or outputs, that are within a certain string editing distance, depending on output length, or they could have paid people to reach out to all the people, whose work they are infringing upon. But they didn't do any of that, of course, because they think they are above the law.
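
A minimal sketch of the "lookup" idea above, assuming the training texts are available as a plain list of strings and that a fixed fraction of the output length is an acceptable edit-distance threshold. The names `edit_distance` and `too_close` and the `ratio` parameter are hypothetical, chosen only for illustration; a real system would need an index rather than a linear scan.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with two rows of the DP table."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def too_close(output: str, corpus: list[str], ratio: float = 0.1) -> bool:
    """True if output is within ratio * len(output) edits of any corpus text.

    The threshold grows with output length, as the comment suggests;
    the 0.1 default is an arbitrary assumption.
    """
    limit = int(len(output) * ratio)
    return any(edit_distance(output, work) <= limit for work in corpus)
```

With these assumptions, `too_close("the quick brown fox", ["the quick brown fix"])` flags the output, since one substitution is within the allowed limit for a 19-character string.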
nadermx · 4 months ago
I'm confused, so you're saying it's illegal? Because last I checked it's still in the process of going through the courts. And lest we forget, copyright's purpose is to advance the arts and sciences. Fair use is codified into law, which states each case is seen on a use by use basis, hence the litigation to determine if it is, in fact, legal.
ashoeafoot · 4 months ago
Obviously a revenue tracking weight should be trained in allowing the tracking and collection of all values generated from derivative works.
hochstenbach · 4 months ago
Humans are not allowed to do what AI firms want to do. That was one of the copyright office arguments: a student can't just walk into a library and say "I want a copy of all your books, because I need them for learning".

Humans are also very useful and transformative.

timdiggerm · 4 months ago
Or we could acknowledge that something could be a bad idea, despite its utility
ceejayoz · 4 months ago
> Despite that, Humans can read a book, get inspiration, and write a new book and not be litigated against.

You're still not gonna be allowed to commercially publish "Hairy Plotter and the Philosophizer's Rock".

WesolyKubeczek · 4 months ago
No, but you are most likely allowed to commercially publish "Hairy Potter and the Philosophizer's Rock", a story about a prehistoric community. The hero is literally a hairy potter who steals a rock from a lazy deadbeat dude who is pestering the rest of the group with his weird ideas.
anigbrowl · 4 months ago
You are if it's parody, cf 'Bored of the Rings'.
ActionHank · 4 months ago
Assuming this means copyright is dead, companies will be very upset and patents will likely follow.

The hold US companies have on the world will be dead too.

I also suspect that media piracy will be labelled as the only reason we need copyright, an existing agency will be bolstered to address this concern and then twisted into a censorship bureau.

regularjack · 4 months ago
Then they need to be changed for everyone and not just AI companies, but we all know that ain't happening.
dns_snek · 4 months ago
The problem with this kind of analysis is that it doesn't even try to address the reasons why copyright exists in the first place. This belief that training LLMs on content without permission should be allowed is incompatible with the belief that copyright is useful, you really have to pick a lane here.

Go back to the roots of copyright and the answers should be obvious. According to the US constitution, copyright exists "To promote the Progress of Science and useful Arts" and according to the EU, "Copyright ensures that authors, composers, artists, film makers and other creators receive recognition, payment and protection for their works. It rewards creativity and stimulates investment in the creative sector."

If I publish a book and tech companies are allowed to copy it, use it for "training", and later regurgitate the knowledge contained within to their customers then those people have no reason to buy my book. It is a market substitute even though it might not be considered such under our current copyright law. If that is allowed to happen then investment will stop and these books simply won't get written anymore.

p0w3n3d · 4 months ago
it's funny how a law becomes potentially-outdated only when big corporations want to violate it on a global scale.

As a private person I no longer feel incentivised to create new content online because I think that all I create will eventually be stolen from me...

franczesko · 4 months ago
> Piracy refers to the illegal act of copying, distributing, or using copyrighted material without authorization. It can occur in various forms

Processing of IP without a license AND offering it as a model for money doesn't seem like an unknown use-case to me

SilasX · 4 months ago
>My opinion: I think using copyrighted data to train models for sure seems classically illegal. Despite that, Humans can read a book, get inspiration, and write a new book and not be litigated against. When I look at the litany of derivative fantasy novels, it's obvious they're not all fully independent works.

Huh? If you agree that "learning from copyrighted works to make new ones" has traditionally not been considered infringement, then can you elaborate on why you think it fundamentally changes when you do it with bots? That would, if anything, seem to be a reversal of classic copyright jurisprudence. Up until 2022, pretty much everyone agreed that "learning from copyrighted works to make new ones" is exactly how it's supposed to work, and would be horrified at the idea of having to separately license that.

Sure, some fundamental dynamic might change when you do it with bots, but you need to make that case in an enforceable, operationalized way.

bitfilped · 4 months ago
Sorry but AI isn't that useful and I don't see it becoming any more useful in the near term. It's taken since ~1950 to get LLMs working well enough to become popular and they still don't work well.

jeroenhd · 4 months ago
Pirating movies is also useful, because I can watch movies without paying on devices that apps and accounts don't work on.

That doesn't make piracy legal, even though I get a lot of use out of it.

Also, a person isn't a computer so the "but I can read a book and get inspired" argument is complete nonsense.

Workaccount2 · 4 months ago
It's only complete nonsense if you understand how humans learn. Which we don't.

What we do know though is that LLMs, similar to humans, do not directly copy information into their "storage". LLMs, like humans, are pretty lossy with their recall.

Compare this to something like a search indexed database, where the recall of information given to it is perfect.

datavirtue · 4 months ago
And everyone here is downloading every show and movie in existence without even a hint of guilt.
apercu · 4 months ago
>Despite that, Humans can read a book, get inspiration, and write a new book and not be litigated against.

Corporations are not humans. (It's ridiculous that they have some legal protections in the US like humans, but that's a different issue). AI is also not human. AI is also not a chipmunk.

Why the comparison?

stevenAthompson · 4 months ago
Doing a cover song requires permission, and doing it without that permission can be illegal. Being inspired by a song to write your own is very legal.

AI is fine as long as the work it generates is substantially new and transformative. If it breaks and starts spitting out other people's work verbatim (or nearly verbatim) there is a problem.

Yes, I'm aware that machines aren't people and can't be "inspired", but if the functional results are the same the law should be the same. Vaguely defined ideas like your soul or "inspiration" aren't real. The output is real, measurable, and quantifiable and that's how it should be judged.

mjburgess · 4 months ago
I fear the lack of our ability to measure your mind might render you without many of the legal or moral protections you imagine you have. But go ahead, tear down the law to whatever inanity can be described by the trivial machines of the world's current popular charlatans. Presumably you weren't using society's presumption of your agency anyway.
toast0 · 4 months ago
> Doing a cover song requires permission, and doing it without that permission can be illegal.

I believe cover song licensing is available mechanically; you don't need permission, you just need to follow the procedures including sending the licensing fees to a rights clearing house. Music has a lot of mechanical licenses and clearing houses, as opposed to other categories of works.

datavirtue · 4 months ago
"If it breaks and starts spitting out other peoples work verbatim (or nearly verbatim) there is a problem."

Why is that? Seems all logic gets thrown out the window when invoking AI around here. References are given. If the user publishes the output without attribution, NOW you have a problem. People are being so rabid and unreasonable here. Totally bat shit.

vessenes · 4 months ago
Thank you - a voice of sanity on this important topic.

I understand people who create IP of any sort being upset that software might be able to recreate their IP or stuff adjacent to it without permission. It could be upsetting. But I don't understand how people jump to "Copyright Violation" for the fact of reading. Or even downloading in bulk. The Copyright controls, and has always controlled, creation and distribution of a work. In the nature even of the notice is embedded the concept that the work will be read.

Reading and summarizing have only ever been controlled in western countries via State's secrets type acts, or alternately, non-disclosure agreements between parties. It's just way, way past reality to claim that we have existing laws to cover AI training ingesting information. Not only do we not, such rules would seem insane if you substitute the word human for "AI" in most of these conversations.

"People should not be allowed to read the book I distributed online if I don't want them to."

"People should not be allowed to write Harry Potter fanfic in my writing style."

"People should not be allowed to get formal art training that involves going to museums and painting copies of famous paintings."

We just will not get to a sensible societal place if the dialogue around these issues has such a low bar for understanding the mechanics and the societal tradeoffs we've made so far, and is unable to discuss where we might want to go and what would be best.

datavirtue · 4 months ago
Exactly, it is an immense privilege to have your works preserved and promulgated through the ages for instant recall and automated publishing. It's literally what everyone wants. The creators and the consumers. The AI companies are not robbing your money or IP. Period.
caconym_ · 4 months ago
If it was as obvious as you claim, the legal issues would already be settled, and your characterization of what LLMs are doing as "reading and summarizing" is hilariously disingenuous and ignores essentially the entire substance of the debate (which is happening not just on internet forums but in real courts, where real legal professionals and scholars are grappling with how to fit AI into our framework of existing copyright law, e.g.^[1]).

Of course, if you start your thought by dismissing anybody who doesn't share your position as not sane, it's easy to see how you could fail to capture any of that.

^[1] https://arstechnica.com/tech-policy/2025/05/judge-on-metas-a...

jasonlotito · 4 months ago
> But I don't understand how people jump to "Copyright Violation" for the fact of reading.

The article specifically talks about the creation and distribution of a work. Creation and distribution of a work alone is not a copyright violation. However, if you take in input from something you don't own, and genAI outputs something, it could be considered a copyright violation.

Let's make this clear; genAI is not a copyright issue by itself. However, gen AI becomes an issue when you are using as your source stuff you don't have the copyright or license to. So context here is important. If you see people jumping to copyright violation, it's not out of reading alone.

> "People should not be allowed to read the book I distributed online if I don't want them to."

This is already done. It's been done for decades. See any case where content is locked behind an account. Only select people can view the content. The license to use the site limits who or what can use things.

So it's odd you would use "insane" to describe this.

> "People should not be allowed to write Harry Potter fanfic in my writing style."

Yeah, fan fiction is generally not legal. However, there are some cases where fair use covers it. Most cases of fan fiction are allowed because the author allows it. But no, generally, fan fiction is illegal. This is well known in the fan fiction community. Obviously, if you don't distribute it, that's fine. But we aren't talking about non-distribution cases here.

> "People should not be allowed to get formal art training that involves going to museums and painting copies of famous paintings."

Same with fan fiction. If you replicate a copyrighted piece of art, if you distribute it, that's illegal. If you simply do it for practice, that's fine. But no, if you go around replicating a painting and distribute it, that's illegal.

Of course, technically speaking, none of this is what gen AI models are doing.

> We just will not get to a sensible societal place if the dialogue around these issues has such a low bar for understanding the mechanics

I agree. Personifying gen AI is useless. We should stick to the technical aspects of what it's doing, rather than trying to pretend it's doing human things when it's 100% not doing that in any capacity. I mean, that's fine for the layman, but anyone with any ounce of technical skill knows that's not true.

wnevets · 4 months ago
> Minnesota woman to pay $220,000 fine for 24 illegally downloaded songs [1]

https://www.theguardian.com/technology/2012/sep/11/minnesota... [1]

gruez · 4 months ago
How is this relevant?

>The RIAA accused her of downloading and distributing more than 1,700 music files on file-sharing site KaZaA

Emphasis mine. I think most people would agree that whatever AI companies are doing with training AI models is different than sending verbatim copies to random people on the internet.

breakingcups · 4 months ago
Well, Facebook torrented the copyrighted material they used for training, which means they distributed all those files too. With the personal approval of Zuck. What is the difference according to you?

Source: https://futurism.com/the-byte/facebook-trained-ai-pirated-bo...

wnevets · 4 months ago
> I think most people would agree that whatever AI companies are doing with training AI models is different than sending verbatim copies to random people on the internet.

I think most artists who had their works "trained by AI" without compensation would disagree with you.

hulitu · 4 months ago
> How is this relevant?

She was training RI (real intelligence). Is it relevant now? Or does she have to be rich and pay some senators to be relevant?

jofla_net · 4 months ago
Who knew alls she needed was to change the tempo, pitch, timbre, add/remove lyrics, add/subtract a few notes, rearrange harmony, put it behind a web portal with a fancy name, claim it had an inspirational muse or assume all mortal beings as being without one in the first place so it doesn't matter, and proceed to make millions off of said process methodically rather than giving it away for free, and she'd be right as rain.
Workaccount2 · 4 months ago
I have yet to see someone explain in detail how transformer model training works (showing they understand the technical nitty gritty and the overall architecture of transformers) and also lay out a case for why it is clearly a violation of copyright.

You can find lots of people talking about training, and you can find lots (way more) of people talking about AI training being a violation of copyright, but you can't find anyone talking about both.

Edit: Let me just clarify that I am talking about training, not inference (output).

jfengel · 4 months ago
I'm not sure I understand your question. It's reasonably clear that transformers get caught reproducing material that they have no right to. The kind of thing that would potentially result in a lawsuit if you did it by hand.

It's less clear whether taking vast amounts of copyrighted material and using it to generate other things rises to the level of copyright violation or not. It's the kind of thing that people would have prevented if it had occurred to them, by writing terms of use that explicitly forbid it. (Which probably means that the Web becomes a much smaller place.)

Your comment seems to suggest that writers and artists have absolutely no conceivable stake in products derived from their work, and that it's purely a misunderstanding on their part. But I'm both a computer scientist and an artist and I don't see how you could reach that conclusion. If my work is not relevant then leave it out.

gruez · 4 months ago
>I'm not sure I understand your question. It's reasonably clear that transformers get caught reproducing material that they have no right to. The kind of thing that would potentially result in a lawsuit if you did it by hand.

Is that a problem with the tool, or the person using it? A photocopier can copy an entire book verbatim. Should that be illegal? Or is it the problem that the "training" process can produce a model that has the ability to reproduce copyrighted work? If so, what implication does that hold for human learning? Many people can recite an entire song's lyrics from scratch, and reproducing an entire song's lyrics verbatim is probably enough to be considered copyright infringement. Does that mean the process of a human listening to music counts as copyright infringement?

tensor · 4 months ago
If I write a math book, and you read it, then tell someone about the math within it. You are not violating copyright. In fact, you could write your OWN math book, or history book, or whatever, and as long as you're not copying my actual text, you are not violating copyright.

However, when an LLM does the same, people now want it to be illegal. It seems pretty straightforward to apply existing copyright law to LLMs in the same way we apply it to humans. If the actual text they generate is so similar to source material that it would constitute a copyright violation if a human had done it, then it should be illegal. Otherwise it should not.

edit: and in fact it's not even whether an LLM reproduces text, it's whether someone subsequently publishes that text. The person publishing that text should be the one taking on the legal hit.

Workaccount2 · 4 months ago
My comment is about training models, not model inference.

Most artists can readily violate copyright, but that doesn't mean we block them from seeing copyrighted work.

mr_toad · 4 months ago
> It's the kind of thing that people would have prevented if it had occurred to them, by writing terms of use that explicitly forbid it.

The AI companies will likely be arguing that they don’t need a license, so any terms of use in the license are irrelevant.

gitremote · 4 months ago
They never said model training is a violation of copyright. The ruling says model training on copyrighted material for analysis and research is NOT copyright infringement, but the commercial use of the resulting model is:

"When a model is deployed for purposes such as analysis or research… the outputs are unlikely to substitute for expressive works used in training. But making commercial use of vast troves of copyrighted works to produce expressive content that competes with them in existing markets, especially where this is accomplished through illegal access, goes beyond established fair use boundaries."

Workaccount2 · 4 months ago
The vast trove of copyright work has to refer to training. ChatGPT is likely on the order of 5-10TB in size. (Yes, Terabyte).

There are college kids with bigger "copyright collections" than that...

belorn · 4 months ago
I would also like to see such an explanation, especially one that explains how it differs from regular transformers found in video codecs. Why is a lossy compression a clear violation of copyright, but not a generative AI?
moralestapia · 4 months ago
Because it's a machine that reproduces other people's work, which is copyrighted. Copyright protects the essence of original work even after it's present in or turned into derivative work.

Some try to make the argument of "but that's what humans do and it's allowed", but that's not a real argument as it has not been proven, nor is it easy to prove, that machine learning equates to human reasoning. In the absence of evidence, the law assumes NO.

jsiepkes · 4 months ago
This isn't about training AI on a book, but AI companies never paying for the book at all. As in: They "downloaded the e-book from a warez site" and then used it for training.
xhkkffbf · 4 months ago
This is what's most offensive about it. At least buy one friggin copy.
autobodie · 4 months ago
I have yet to see someone explain in detail how writing the same words as another person works (showing they understand the technical nitty gritty and the overall architecture of the human mind) and also lay out a case for why it is clearly a violation of copyright. You can find lots of people talking about reading, and you can find lots (way more) of people talking about plagiarism being a violation of copyright, but you can't find anyone talking about both.
xhkkffbf · 4 months ago
A big part of copyright law is protecting the market for the original creator. Not guaranteeing them anything. Just preventing someone else from coming along and copying someone else's work in a way that hurts their sales.

While AIs don't reproduce things verbatim like pirates, I can see how they really undermine the market, especially for non-fiction books. If people can get the facts without buying the original book, there's much less incentive for the original author to do the hard research and writing.

dmoy · 4 months ago
Not a ton of expert programmer + copyright lawyers, but I bet they're out there

You can probably find a good number of expert programmer + patent lawyers. And presumably some of those osmose enough copyright knowledge from their coworkers to give a knowledgeable answer.

At the end of the day though, the intersection of both doesn't matter. The lawyers win, so what really matters is who has the pulse on how the Fed Circuit will rule on this

Also in this specific case from the article, it's irrelevant?

kranke155 · 4 months ago
It doesn’t matter how they work, it only matters what they do.
anhner · 4 months ago
because people who understand how training works also understand that it's not a violation of copyright...
nickpsecurity · 4 months ago
I did here, with proofs of infringement:

https://gethisword.com/tech/exploringai/

prvc · 4 months ago
The released draft report seems merely to be a litany of copyright holder complaints repeated verbatim, with little depth of reasoning to support the conclusions it makes.
bgwalter · 4 months ago
The required reasoning is not very deep though: If an AI reads 100 scientific papers and churns out a new one, it is plagiarism.

If a savant has perfect recall, remembers text perfectly and rearranges that text to create a marginally new text, he'd be sued for breach of copyright.

Only large corporations get away with it.

scraptor · 4 months ago
Plagiarism is not an issue of copyright law, it's an entirely separate system of rules maintained by academia. The US Copyright Office has no business having opinions about it. If a AI^W human reads 100 papers and then churns out a new one this is usually called research.
shkkmo · 4 months ago
> If a savant has perfect recall, remembers text perfectly and rearranges that text to create a marginally new text, he'd be sued for breach of copyright.

Any suits would be based on the degree the marginally new copy was fair use. You wouldn't be able to sue the savant for reading and remembering the text.

Using AI to create marginally new copies of copyrighted work is ALREADY a violation. We don't need a dramatic expansion of copyright law that says that just giving the savant the book to read is a copyright violation.

Plagiarism and copyright are two entirely different things. Plagiarism is about citations and intellectual integrity. Copyright is about protecting economic interests, has nothing to do with intellectual integrity, and isn't resolved by citing the original work. In fact, most of the contexts where you would be accused of plagiarism are places where reporting, criticism, education or research goals make fair use arguments much easier.

glial · 4 months ago
It reminds me of the old joke.

"To steal ideas from one person is plagiarism; to steal from many is research."

wizee · 4 months ago
Is reading and memorizing a copyrighted text a breach of copyright? I.e. is creating a copy of the text in your mind a breach of copyright or fair use? Is it a breach of copyright if a digital “mind” similarly memorizes copyrighted text? Or is it only a breach of copyright to output and publish that memorized text?

What about loosely memorizing the gist of a copyrighted text. Is that a breach or fair use? What if a machine does something similar?

This falls under a rather murky area of the law that is not well defined.

satanfirst · 4 months ago
That's not logical. If the savant has perfect recall and makes minor edits they are like a digital copy and aren't really like a human, neural network or by extension any other ML model that isn't over-fitted.
tantalor · 4 months ago
If AI really could "churn out a new scientific paper" we would all be ecstatically rejoicing in the dawning of an age of AGI. We are nowhere near that.
JKCalhoun · 4 months ago
My understanding — LLMs are nothing at all like a "savant with perfect recall".

More like a speed-reader who retains a schema-level grasp of what they’ve read.

Maxatar · 4 months ago
Plagiarism isn't illegal, has nothing to do with the law.
mr_toad · 4 months ago
> If a savant has perfect recall

AI don’t have perfect recall.

nadermx · 4 months ago
Not only does it read like a litany[0], it seems like the copyright holders are not happy with how the Meta case is working through court and are trying to sidestep fair use entirely.

https://www.copyright.gov/ai/Copyright-and-Artificial-Intell...

mr_toad · 4 months ago
Copyright holders have always hated fair use, and often like to pretend it doesn’t exist.

The average copyright holder would like you to think that the law only allows use of their works in ways that they specifically permit, i.e. that which is not explicitly permitted is forbidden.

But the law is largely the reverse; it only denies use of copyright works in certain ways. That which is not specifically forbidden is permitted.

raverbashing · 4 months ago
I don't have much spare sympathy here honestly

stevetron · 4 months ago
It's amazing the amount of bad deeds coming out of the current administration in support of special interests.
elif · 4 months ago
Intellectual property law is quickly becoming an institution of hegemonic corporate litigation of the spreading of ideas.

If it's illegal to know the entire contents of a book it is arbitrary to what degree you are able to codify that knowing itself into symbols.

If judges are permitted to rule here it is not about reproduction of commercial goods but about control of humanity's collective understanding.

throw0101c · 4 months ago
See "Copyright and Artificial Intelligence Part 3: Generative AI Training" (PDF):

* https://www.copyright.gov/ai/Copyright-and-Artificial-Intell...