skipants · a year ago
It's pretty funny to me that this is used as a hit-piece against AI generated content. Here's an open-secret for everyone: web-based sports content companies have been automatically generating content articles for at least a decade; way before LLMs became popular.

It is and was mostly done for search ranking. The more seemingly applicable content, the better your SEO.

This situation has probably happened several times before AI but has gone unnoticed or noticed to little fanfare. It's more indicative of ESPN not having its finger on the pulse and not ensuring that one of the copywriters manually updated this particular article. It's not too surprising to me. They've always been known to favour quantity over quality.

source: I worked as a developer on the tech side of a company with this kind of content

omgJustTest · a year ago
Well AI certainly didn't make it better in this case.

The fact that it is profitable to make this generic sports-lingo-laced content is, on its own, pretty depressing.

fedeb95 · a year ago
the depression shouldn't arise from the generation, but from the consumption, for it is the latter that ultimately drives the former; that, in turn, can generate a reflection on what I myself consume.
YeahThisIsMe · a year ago
But sports reporting has pretty much always been this way.

Unless anything extraordinary happens, every post game interview features the same questions with the same answers and every article about the game looks the same with only the names of the participating players and their stats changing.

croes · a year ago
So you're telling me AI is as bad as the previous method, with more data and higher energy consumption?
serial_dev · a year ago
Well, previously you needed an intern working maybe an hour (if they actually wrote good content, obviously more), and that intern needs food, air conditioning / heating, so I'm not sure which consumes less energy. Now you have it in a minute and for a predictable low fee.
soperj · a year ago
> ensuring that one of the copywriters manually updated this particular article.

Clearly it was never "there" yet though previously, and obviously still isn't when this article is what's generated. You can tell that a lot of sports articles are essentially "fill in the blank", which is why they get the AP stories up right away, and then have their actual beat reporters come out with something later that night, or early next morning.

williamcotton · a year ago
Beat reporters will also write during the game, and if I’m remembering correctly, will work on differing versions of the ultimate winner at the same time.
skipants · a year ago
Pretty much. Though there are a lot of articles that are never touched, based on popularity.
add-sub-mul-div · a year ago
ESPN has been in decline for longer than LLMs have been around. More explicit use of AI is just accelerating the downward spiral.
bloomingeek · a year ago
Absolutely correct! In the old days it was mostly video replay and witty banter describing what you were watching, which was fun to watch. Now it's become something I couldn't care less about, which is less video and someone telling me what I should think about what I just watched.
nottorp · a year ago
What web have you been browsing in the past 5+ years? It's not only sports "content". Any kind of content is drowned in low quality SEO pages. Auto generated or not, it's just as useless. LLMs just generate it cheaper.
yunwal · a year ago
What a weird and rude way to call this out. The parent wasn’t talking about non-sports content and never suggested that sports writing was the only type of content plagued by SEO spam.
fedeb95 · a year ago
to me it doesn't appear particularly against or in favour of. Seemingly, LLMs don't completely avoid some of the mistakes that were already being made. Just as with any technology, the questions should be: by how much do the two error rates differ? Does the cost justify this margin?
skipants · a year ago
Maybe hit-piece was a strong word; but I do think it's saying this happened _because_ of AI when to me it's been an issue for a long time. ESPN was always pretty egregious when it came to penny-pinching on content and ignoring less popular sports. Ask ice hockey fans how they feel about ESPN.
airstrike · a year ago
> has gone unnoticed or noticed to little fanfare

To be fair, I don't think it's gone unnoticed at all

LordDragonfang · a year ago
Not by people in the know, no. But I'd say 80% of "normal" people reading sports websites probably had no idea until now, and the only difference is that (gen)AI is suddenly a "feature" to be boasted about to customers to make shareholders happy. And suddenly all of these normal people are starting to have opinions on it.
locallost · a year ago
What makes this article a hit piece actually? Seems like straightforward reporting on facts.

dwighttk · a year ago
Those articles were awful and at least some were noticeable

greenthrow · a year ago
You're missing the entire point. LLMs are super overhyped and this is the 9000th example of how garbage LLM generated "content" is. It doesn't matter that garbage was being generated before. If LLMs were only being sold as "it might be better than the spam generator you're using today" we wouldn't have the current bubble where people are losing their jobs because C-suite clowns believe the hype.
flappyeagle · a year ago
If they were overhyped people would not actually lose their jobs
mp05 · a year ago
> It's pretty funny to me that this is used as a hit-piece against AI generated content.

Is it though? My takeaway was that they lament the terrible "journalistic" standards that ESPN embraces.

stefan_ · a year ago
Not to blow your bubble but uh, yes, everyone is aware, they were garbage then, too.
jetrink · a year ago
This match was broadcast by ESPN on both ESPN2 and ESPN+. In that case, they are presumably paying two knowledgeable commentators to talk about the match before, during, and after it happens, adding context and describing the important events. Are they not providing a transcript of that to the writerbot? That seems like real low-hanging fruit, especially since the commentary is already live captioned.
jetrink · a year ago
I got nerd-sniped by this. I transcribed the match using whisper-small.en and then asked ChatGPT to create a summary using a neutral prompt:

Here is a transcript of a soccer match. In the style of an experienced professional sports reporter, please write a 200 word article about the match.

Its summary starts, "In her final professional match, Alex Morgan delivered a performance filled with emotion and resilience, though her San Diego Wave fell short in a 3-1 loss to North Carolina Courage. The game at Snapdragon Stadium in San Diego was more than just a contest; it was a tribute to one of soccer’s most iconic figures."

It did get the score wrong. Here's the rest:

1. https://chatgpt.com/share/de8c60d1-69ab-4291-99dc-d4d95af3d3...

regretaverse · a year ago
OpenAI transcription & LLM is likely more costly than what they're willing to splurge on this. Care to link soccer.txt, so I can try with Llama 3 / 3.1?
low_tech_love · a year ago
That’s a very good write up, so good it’s depressing. Am I the only one who is utterly uninterested in reading anything that an AI writes?
btown · a year ago
Sports have so much structured data, and such a high bar for describing it accurately (especially for a brand like ESPN), that there are significant risks to the hallucinations that might develop from a multi-hour transcript being fed into an LLM, especially with commentators excited about potential goals and other events that don't end up happening.

On the other hand, the rather simple task of "here's a set of goals, their times, who made them, who assisted... turn that into prose" could even be done without LLMs with a deterministic algorithm, and may very well have been in this case. Some of the grammar issues in the OP feel very pre-LLM in nature, like a combination of substitution rules gone awry.
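
To make the "deterministic algorithm" idea concrete, here is a minimal sketch of turning a list of goals into prose with plain string templates. The `Goal` fields, the sentence wording, and the team and player names are all hypothetical placeholders, not anything ESPN or its data providers are known to use.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Goal:
    minute: int
    scorer: str
    team: str
    assist: Optional[str] = None  # None when the goal was unassisted


def describe_goal(goal: Goal) -> str:
    """Render one goal as a sentence using a fixed template."""
    sentence = f"{goal.scorer} scored for {goal.team} in minute {goal.minute}"
    if goal.assist:
        sentence += f", assisted by {goal.assist}"
    return sentence + "."


def recap(home: str, away: str, goals: List[Goal]) -> str:
    """Build a recap from structured data only, with no language model involved."""
    tally = {home: 0, away: 0}
    for goal in goals:
        tally[goal.team] += 1  # raises KeyError if a goal names an unknown team

    if tally[home] == tally[away]:
        headline = f"{home} and {away} drew {tally[home]}-{tally[away]}."
    else:
        winner, loser = (home, away) if tally[home] > tally[away] else (away, home)
        headline = f"{winner} beat {loser} {tally[winner]}-{tally[loser]}."

    # Goals are narrated in chronological order regardless of input order.
    body = " ".join(describe_goal(g) for g in sorted(goals, key=lambda g: g.minute))
    return f"{headline} {body}".strip()


if __name__ == "__main__":
    goals = [
        Goal(70, "Player C", "Away United"),
        Goal(12, "Player A", "Home FC", assist="Player B"),
        Goal(55, "Player D", "Home FC"),
    ]
    print(recap("Home FC", "Away United", goals))
```

A generator like this can never misstate the score, but it also can only say what is in the feed, which is the trade-off described above.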

Now, could you create a system that repeatedly interrogates the statements made by a first pass of an LLM on summarizing a long transcript, and compares those results against structured data you know for accuracy? Would this lead to richer content and accessible error rates relative to the simpler approach? Would this be the type of thing that the best machine learning engineers in the world could probably prototype over a hackathon? The answer is very possibly yes to all three of these. But it's far from low-hanging fruit for any sizable, risk-averse organization. It's very difficult to fight against "the thing we have is imperfect, but at least it never gets the facts wrong."
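
As a rough illustration of the simplest piece of such a check, the sketch below compares score claims found in a generated draft against the score from a trusted feed. It is only a sketch under assumptions: the function names are made up, the regular expression is deliberately naive (it would also match things like hyphenated dates), and a real system would need to check far more than the score.

```python
import re
from typing import List, Tuple

# Naive pattern for "3-1" style score mentions; an assumption for illustration.
SCORE_PATTERN = re.compile(r"\b(\d+)\s*-\s*(\d+)\b")


def find_score_claims(text: str) -> List[Tuple[int, int]]:
    """Return every 'N-M' pair that appears in the text."""
    return [(int(a), int(b)) for a, b in SCORE_PATTERN.findall(text)]


def check_draft(draft: str, final_score: Tuple[int, int]) -> List[str]:
    """Compare score claims in a draft against the score from structured data.

    Returns a list of human-readable problems; an empty list means no
    mismatch was detected by this (very limited) check.
    """
    problems = []
    # Accept either ordering, since a draft may lead with winner or loser.
    expected = {final_score, final_score[::-1]}
    claims = find_score_claims(draft)
    if not claims:
        problems.append("no score mentioned")
    for claim in claims:
        if claim not in expected:
            problems.append(
                f"score {claim[0]}-{claim[1]} does not match feed "
                f"{final_score[0]}-{final_score[1]}"
            )
    return problems


if __name__ == "__main__":
    print(check_draft("The home side fell short in a 3-1 loss.", (1, 4)))
    print(check_draft("The visitors won 4-1.", (1, 4)))
```

A draft that fails the check could be regenerated or routed to a human, which is where the cost and risk questions above come back in.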

btown · a year ago
s/accessible/acceptable/ - guess I should have run my comment through the kind of check-for-typos LLM step that I described above!
nolok · a year ago
It's a company that sees itself as "media / journalism", and having consulted for a few of those I've always been amazed that their tech team is most often isolated from the content team, with very little access to said media. It's very different from what you would expect in tech (access to everything), or just common sense in general.

Note that I have no knowledge whatsoever of how ESPN works, I'm inferring from what I've seen elsewhere.

batesy · a year ago
This is what I was thinking too... Still early days I guess lol
sigmar · a year ago
If "this is the last game for Alex Morgan" wasn't included as any part of the input/prompt, how on earth could the AI summary have come up with this for inclusion in the game recap?

If some teenage intern was given a table with the goals scored (player and minute mark), they would have written a similar article... but that's definitely not a good excuse for news orgs and sports sites to just use generative AI for everything, so I can see why people are annoyed with ESPN.

tivert · a year ago
> If "this is the last game for Alex Morgan" wasn't included as any part of the input/prompt, how on earth could the AI summary have come up with this for inclusion in the game recap?

It couldn't, which is the problem. One of the big selling points of generative AI is to cut people out of the process of writing. If someone actually has to watch the game and describe what happened in the prompt, what's the point of the technology at all?

This is what an application of generative AI looks like: using low-quality input to generate something that looks like an article. This is our glorious future, brought to you by OpenAI.

kmoser · a year ago
I'm sure the fact that this would be Morgan's last game was somewhere out there on the Internet, and certainly should have been part of the training data, in which case I would guess that a more nuanced prompt to the AI could have elicited a more robust output.
HelloMcFly · a year ago
I don't think the critique is "how stupid of the AI" but rather "this is one example of how AI content falls short" even when it is supposedly reviewed by a human as ESPN claims

crazygringo · a year ago
Indeed. This doesn't really have anything to do with the limitations of AI inherently, but more about using it better.

It seems like it highlights a clear avenue for improvement: rather than just feeding the events of a game to the LLM and asking it to summarize them, it seems important for the prompt to include summaries of e.g. the last 10 news stories involving participants in the game (players, coaches, etc.) and maybe the 10 top all-time news stories as well.

Then the prompt can be asked to summarize the game (not the other information), but to draw from the other information where it might make the article better.

Seems like exactly the kind of thing LLMs can do, right?
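
A minimal sketch of what assembling such a prompt might look like, assuming the game events and story summaries are already available as plain strings. The section labels, instruction wording, and function name are illustrative assumptions; no particular LLM API is called here.

```python
from typing import List


def build_recap_prompt(
    events: List[str],
    recent_stories: List[str],
    max_stories: int = 10,
) -> str:
    """Combine game events and background stories into one prompt string."""
    background = recent_stories[:max_stories]  # cap how much context is sent
    lines = [
        "Write a recap of the game described under EVENTS.",
        "Summarize only the game itself; use BACKGROUND only where it makes the recap better.",
        "",
        "EVENTS:",
    ]
    lines += [f"- {event}" for event in events]
    lines += ["", "BACKGROUND:"]
    lines += [f"- {story}" for story in background]
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_recap_prompt(
        events=["12' goal, Player A (Home FC)", "70' goal, Player C (Away United)"],
        recent_stories=["Player A announced this will be their final match."],
    )
    print(prompt)
```

Keeping events and background in separate labeled sections is one way to ask the model to recap the game without summarizing the news stories themselves.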

qq66 · a year ago
It was well-known to people who cared about this (i.e., anyone who would read this recap) that this was Alex Morgan's last game. If ESPN can't capture this and incorporate it into their piece, given their wide reach in sports media, why should a publisher use the ESPN recap service instead of a random Y Combinator startup?
gs17 · a year ago
Really, they need to set up some kind of RAG system to include this kind of information in the prompt. You could definitely make this semi-automated, but it requires putting more effort into the system.
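
For a sense of the retrieval half of that idea, here is a toy sketch that picks the stories mentioning the most game participants. Note the swap: real RAG setups typically use embedding similarity over a vector index, while this uses plain keyword matching as a stand-in. The function name and sample data are hypothetical.

```python
from typing import List


def retrieve_stories(stories: List[str], participants: List[str], k: int = 3) -> List[str]:
    """Return up to k stories, ranked by how many participants each mentions."""
    scored = []
    for story in stories:
        text = story.lower()
        hits = sum(1 for name in participants if name.lower() in text)
        if hits:  # drop stories that mention nobody in this game
            scored.append((hits, story))
    # sort() is stable, so stories with equal hits keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [story for _, story in scored[:k]]


if __name__ == "__main__":
    stories = [
        "Player A announces retirement after this weekend's match",
        "Player B signs extension; Player A praises teammate",
        "League announces new schedule",
    ]
    print(retrieve_stories(stories, ["Player A", "Player B"], k=2))
```

The retrieved stories would then be placed in the prompt alongside the game data, so a retirement announcement has a chance of reaching the recap.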
karaterobot · a year ago
> ESPN made a point to note that “each AI-generated recap will be reviewed by a human editor to ensure quality and accuracy.” It’s unclear if the human editor failed to notice Morgan’s absence or also decided it was not worth mentioning.

I don't believe they'll have a human editor ensure quality and accuracy. The whole point of having AI write your stories is to minimize the number of people they have to pay, so paying a trained professional to thoroughly review every story is probably off the table. They may compromise on the thoroughness of the reviews, or the expertise of the editor, or they may just not have a human in the loop for every story, but what they will not do is pay a professional editor to do their job the right way, this I can guarantee.

GrinningFool · a year ago
> but what they will not do is pay a professional editor to do their job the right way, this I can guarantee

I don't think that holds up. They can save a lot of money by using generated articles, but they lose customer (advertiser) confidence if the content isn't accurate. One editor per ten replaced writers is still a significant cost savings.

A few iterations from now we might see editors getting replaced too, but I don't think we're there yet.

karaterobot · a year ago
The time required to write a good article—by write I mean compose sentences and type them—is a fraction of the time it takes to research it. Because the AI can't do actual reporting or journalism (interviewing people, emailing sources, tracking down documents and ingesting them, checking facts, etc.) then you either push all that work on to the editor, or you consign your publication to only writing stories that require no original reporting. If you push it all on to the editor, the economics no longer pencil out, because the editor is doing the work of 10 authors plus one editor. If you stop doing original reporting, you have a bad product.

Today, with the field of journalism in freefall, we actually have the worst of both worlds: not enough editors, not enough time to report, not enough original reporting being done, and too many AI or computationally generated articles. But, I don't see how getting rid of the humans and doubling down on the AI actually solves that problem in the medium and long terms.

slantedview · a year ago
> but they lose customer (advertiser) confidence if the content isn't accurate

The content isn't what it should have been, which is why the previous commenter rightly assumes that no editor looked at it.

reaperducer · a year ago
ESPN removed the humans from reporting, and the AI removed the humans from the report.

> In their announcement of the service, ESPN made a point to note that “each AI-generated recap will be reviewed by a human editor to ensure quality and accuracy.” It’s unclear if the human editor failed to notice Morgan’s absence or also decided it was not worth mentioning.

"Blame the intern" has been the great scapegoat for the last hundred years.

jimt1234 · a year ago
That was my read, too - basically, "We have a teenager in India that quickly scans through a thousand AI-generated articles per day, looking for offensive language or anything that could get us sued, but that's it."
XCSme · a year ago
I hate the AI match recaps of WTT (World Table Tennis) matches.

The AI highlights cut off during a point, skip entire points or even sets. Not to mention they don't account for important events that happen between points, they might cut off exciting commentary/celebrations that happened after a point, and even the handshake at the end of the match.

low_tech_love · a year ago
I hate everything that AI generates. Is there such a thing as a textual uncanny valley?
segasaturn · a year ago
How are these AI recaps generated? Are they fed a video file of the entire game and it spits out a summary, or maybe a score tally with timestamps for goals (written by a human) which the AI then pads with language and makes into a story?
skipants · a year ago
Back when I worked on it, before AI, you had all the information from a game in an API and you just filled in the template with it.

Now, I imagine they take that raw API call and just use a prompt like, "write a summary article for a game using this data" and it spits it out. And I assume the prompt is more thought out than that (or not? It is ESPN after all).

I don't ever remember "retiring_players" being part of an API response, though, ;P

edit: Oh and yes, the play by play recap is documented EXTREMELY well. You would be surprised. The more popular sports like Gridiron Football and Basketball would literally have player locations by the second. This data all comes from feeds like SportsRadar.

They probably wouldn't pipe the fine tuned stuff like that into a prompt, but you still have a decent summary like how many 3-pointers someone had and where they shot them from.
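
To illustrate the kind of condensed summary described here, the sketch below reduces a play-by-play event list to made 3-pointers per player and the zones they came from. The event schema (`type`, `player`, `points`, `made`, `zone`) is invented for this example and is not the actual format of SportsRadar or any other provider.

```python
from collections import defaultdict
from typing import Any, Dict, List


def three_point_summary(events: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Count made 3-pointers per player and collect the zones they came from."""
    zones_by_player = defaultdict(list)
    for event in events:
        is_made_three = (
            event.get("type") == "shot"
            and event.get("points") == 3
            and event.get("made")
        )
        if is_made_three:
            zones_by_player[event["player"]].append(event["zone"])
    return {
        player: {"made": len(zones), "zones": sorted(set(zones))}
        for player, zones in zones_by_player.items()
    }


if __name__ == "__main__":
    feed = [
        {"type": "shot", "player": "Player A", "points": 3, "made": True, "zone": "left corner"},
        {"type": "shot", "player": "Player A", "points": 3, "made": False, "zone": "top of key"},
        {"type": "shot", "player": "Player A", "points": 3, "made": True, "zone": "top of key"},
        {"type": "shot", "player": "Player B", "points": 2, "made": True, "zone": "paint"},
        {"type": "foul", "player": "Player B"},
    ]
    print(three_point_summary(feed))
```

A compact dictionary like this is much easier to drop into a template or a prompt than second-by-second location data.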

erickj · a year ago
If I was going to build this prototype I'd start with just a semistructured textual play by play recap as the input. Also including roster, injury, and schedule information with a fairly basic prompt would probably go a long way.

This data exists for most live games at this point via various web services. I'm sure ESPN has significant resources internally to source that info.

skipants · a year ago
I don't think ESPN does anything that takes significant resources. That's all handled by SportsRadar or ... there's another big provider but their name eludes me. They basically firehose you all the game information as structured data and you can use it programmatically however you'd like.
mason55 · a year ago
Yeah it feels like the ideal way is to feed in a transcript of the announcer audio + some standard stats. That would ensure you catch both the human stories & the factual content.

But I wonder if there are licensing issues with using the audio/transcript to generate your summary. I know that the raw stats are public domain but I wouldn't be surprised if they can't use the transcripts or audio.

mattmaroon · a year ago
There are a couple companies that provide real time sports data via API (or recaps after) so I’d bet they use that.
SoftTalker · a year ago
They use the box score and play-by-play events.
xyst · a year ago
The gap between expectations and reality of “genAI” is too vast at this point to ignore. If a multibillion dollar system breaks down because of “human error”, then maybe its capabilities are way overstated. If it needs carefully crafted queries (“prompt engineering”), 100% error proof data, a megaton of power, and humans to re-check the output, what have we gained?

Can we all just admit this AI phase is just a bubble?

maxwell · a year ago
What kind of Mickey Mouse organization...
CameronBanga · a year ago
Under-appreciated comment here.