Many news articles have social media posts as sources. Most articles have other articles as sources. And then Wikipedia takes the info from the news articles and compiles it. Now Google takes all of these and creates summaries again, and the AI summaries link to the original sources. The EU Commission seems very naive and out of step with the times. They are not gonna stuff the AI revolution back into the bottle no matter how hard they try.
Relying on AI to just ‘summarize and link back’ is like expecting a blender to cook a gourmet meal - it’s technically doing something, but the nuance gets lost. Meanwhile, millions of site owners are already watching their traffic drop like ice cubes in a hot sun. The EU isn’t ‘anti-AI,’ they’re just noticing the kitchen is on fire.
I still have not gotten anyone to provide a reasonable response to a simple question: how is training an AI on some content any different from you reading the same content and then folding it, along with other information, into your own understanding of the world?
Alternately, will you start paying royalties in perpetuity whenever you talk about some event, because you read an article or a book about that topic once and included something you learned from it?
Basically everything you know that is even somewhat recent is based on others’ content. Do you track and cite every single thing you’ve ever read and send the authors royalties with every conversation?
I’m not trying to defend these big corporations, but for me this is a fundamental question we need to be asking.
As consequential as it will be, for me, the answer is that as long as you paid the cost of accessing the content (be it free or a subscription price) and you fundamentally transform the information in ways that seem to fall under fair use, the original author cannot expect rights over the result, short of full copy/paste plagiarism.
A training dataset is a document, not a method of processing a document. This type of document regularly gets reproduced and distributed in a commercial environment. Even if the distribution is contained within a large corporation, it still counts as distribution. Should that be allowed within the scope of copyright law? This seems like a legitimate question.
I’m often horrified to follow them down the rabbit hole and see it is a Redditor’s comment. That should terrify you if you have ever used Reddit. Sometimes it is correct, but a lot of times it is very much not right.
I'm pretty excited to see how this will develop, especially in the context of "Google Zero". Proving the existence of an anti-competitive effect and quantifying it precisely could be difficult.
I wonder if this will turn into the equivalent of music streaming, where there's a pot of money that's allocated to different sources. Regardless, this is going to negatively impact the current news business model (as do ad blockers and sites that bypass paywalls).
Google's AI summaries are actively harming quite a lot of people. They're regularly filled with misinformation, but they're presented as facts, complete with references. Many people do not understand the limitations of this technology, and simply believe what they're presented.
I'm not convinced that Google understands the limitations, to be honest. The most charitable interpretation I can give of their motivations is that they're terrified of competition from OpenAI, and are trying to present an alternative. Unfortunately, they're presenting a woefully inadequate product.
It goes further though, into legitimate questions of copyright, which the tech industry has always fought against. (Take first, deal with it later is the MO.)
> The European Commission said it would examine whether the firm used data from websites to provide this service - and if it failed to offer "appropriate compensation" to publishers.
While the EU wastes their time with things like this, they fall further and further behind the curve, still wondering why no one wants to start a business there.
Somewhat more difficult to run a business when an American multinational steals your revenue and your content.
On the other hand, the complainer mentioned is the Daily Mail.
I'd much rather see a non-specific ruling on whether or not summarizing already short articles is copyright infringement, regardless of who's doing it. Copyright litigation and legislation tend to favor the richer party no matter where they happen.
Newspapers are notorious for lifting stories and photos from social media. They rarely bother to compensate the original creator either.
Perhaps a better approach is to make sure that the AI summaries are just as liable for libel actions, and regulator mandatory corrections, as the newspapers.
> make sure that the AI summaries are just as liable for libel actions
Is libel in AI generated summaries a problem?
Also, it seems you are fundamentally missing how AI is different. What would you expect a “regulator mandatory correction” to look like? A one-sentence summary that comes with a notice that it was corrected at some point?
AI is also going to make regulators and bureaucrats totally superfluous if done properly, where AI simply “regulates” based on laws written in a clear text and open weight manner.
Does Google still follow robots.txt? I think so. It should be easy to exclude Google's crawlers if that's what you're after. But of course everyone wants to profit off of Google's reach, so excluding them won't work for most either.
We've had ~20+ years to come up with something better than copyright, with nothing to show for it. First it was the plebs ignoring copyrights, then it was the search engines and social networks and their knowledge graphs, and now it's the billionaires and their AI companies that hoover up the web.
I thought here on HN we agreed that copying information was not stealing? You know, how you are not depriving the original website of their information or anything, because everything can be copied infinitely.
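For what it's worth, here is a minimal sketch of what that exclusion amounts to, using Python's standard `urllib.robotparser`. The robots.txt content, the `example.com` URL, the `SomeOtherBot` agent name and the `is_allowed` helper are all made up for illustration. It only shows what the rules say; whether a given crawler honors them is up to the crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an illustrative site:
# block Googlebot everywhere, allow every other crawler.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/news/some-article"  # placeholder URL
    for agent in ("Googlebot", "SomeOtherBot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, url))
```

Blocking Googlebot like this also keeps the pages from being crawled for ordinary search, which is exactly the trade-off described above.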
The idea that just because Europe still has some profitable businesses left, there is no need to compete for global technological leadership is so absurd that even putting it into words is hard.
E.g. Germany, the largest EU economy, is very dependent on its car export industry. Guess which industry isn't too hot right now? Do you think your salary will survive the EU losing its export markets? Mine surely will not.
Where "curve" = "exporting shiny toys without thought to long-term consequences". Good to see the EU is finally catching up to the harms of this and other US web tech.
I keep hearing this, especially on X which now hates the EU because it has fined X.
People need to understand that U.S. "tech" is barely considered tech in the EU as far as social media platforms and search engines go. You could cut off the Magnificent 7 completely and the EU would switch to new data sources and operating systems within a month.
U.S. "tech" is mostly entertainment, and the EU has also been behind Hollywood for the mass market movies for a long time.
In which bubble are you living right now? Almost all EU tech companies use AWS, Google Cloud or Microsoft Azure. Good luck recovering any data if you completely cut off the Mag7. Also, without iOS or the Android Play Store, you're back to using Nokia or a Chinese counterpart.
The pure ignorance Europeans have about their reliance on US tech is astounding.
It's not really about Google. I think the reason that HN in general is... annoyed at the actions of the EU is that they're worried these rulings will be far-reaching. This one in particular: fetching content from a website and feeding it to an AI to extract information or summarize it would require that the person doing it "compensate" the website operator. Well, there goes one of the most useful tools in the AI toolbox: being able to search the web for external information. It also codifies that accessing a webpage is a weird kind of transaction, which might also put ad blockers in a weird legal grey area.
I don't want my Kagi quick answers, Summarize page, or Ask questions about page buttons to be turned off, I find them extremely useful.
GDPR would have been an excellent idea if it had actually been enforced, which it wasn't. To their credit, the non-enforcement was consistent regardless of whether the offender was EU-based or not.
Yeah, everyone in the EU is just working on this one law case. The guy next to me just cooked the meals for the guy that made the paper the case was filed on and now has to take an extended break. /s
People can and will do many things at once, like actually pursuing monopoly issues AND trying to improve the situation for everyone else. It's almost like there is only a limited amount of one thing: space on page 1 of media outlets.
Genuine question, how are you able to do that? Searching by exact matches with some portions of the AI suggested "response"? Some other method?
They might as well just ban all non-EU tech at this point.
Somewhat more difficult to run a business when EU commissioners keep making up fines to steal your revenue.
The EU is a big place with a lot going on. You will persuade more people and learn more if you engage in a more open style.
Like investigations into Apple, X and others...
I can't begin to understand the level of delusion you have reached. You truly are fish unaware of the water you are in.
I wish the US would call their bluff and avenge those bullshit fines sevenfold with tariffs.