Readit News logoReadit News
kragen · 6 months ago
What I'd really like is a service that edits down YouTube videos by removing all the stock footage and talking head crap, then speeding up the audio to fit over the remaining novel information—whether that's new battlefield footage, electron micrographs, demonstrations of machining techniques, or just elephant toothpaste. The talking head filler seems like it would be easy to recognize, but stock footage recognition presumably would have a significant false negative rate, which is okay.

This would reduce some videos to just a transcription, which would be the ideal outcome, I think. The less of my limited time on Earth I waste watching some dumbshit reading a script at a camera, the better. Summarizing the transcript further like this site does might be occasionally useful, of course.

satvikpendem · 6 months ago
I use SponsorBlock (be sure to enable all categories of blocking such as filler content, not just sponsors), DeArrow (de-clickbaits thumbnails and titles) and a video speed changer extension to enable much of your stated functionality, though of course not all. I've saved likely years of watching due to this combination.
eco · 6 months ago
One of the best things in SponsorBlock is Highlight segments. If the video is 10 minutes of filler building up to a single interesting/exciting moment you can often see exactly on the timeline where to jump to.
soraminazuki · 6 months ago
How did your viewing experience change after using DeArrow? I'm curious because DeArrow has a great reputation, yet my gut feeling tells me that I should avoid watching clickbait videos altogether.
_Algernon_ · 6 months ago
When you need 4 extensions (counting ublock as well) to make a site useable, maybe it is time to reconsider what site you spend your time on?
rcfox · 6 months ago
YouTube has a speed changer built in. I'm curious what your extension would add.
Xen9 · 6 months ago
I don't watch YouTube but if I would / I would if I'd cut out anything with faces or speech & use an LLM to summarize what's technically relevant from the transcript in a way that fits length of what remains.

Pipeline such content, but use weighted random videos, with low weights for types of content with clickbait headings & perhaps blacklist for words like meme or lol in transcript to cut out things with stock footage. I am not sure of exact best way to remove it, actually, other than "using the transcript for some computational technique of probabilitistic stock footage prediction" which I bet would be most effective.

xandrius · 6 months ago
You described SponsorBlock, really a game changer. Also works on mobile!
bbstats · 6 months ago
how would you remove stock footage? B-roll has voice over it, it's kind of a necessity.
kragen · 6 months ago
By "speeding up the audio to fit over the remaining novel information", as I said. "B-roll" doesn't necessarily imply worthless stock-footage filler; in https://www.youtube.com/watch?v=woj4vfMLpao or https://www.youtube.com/watch?v=DdF_nzMW_i8 or https://www.youtube.com/watch?v=Eu_crbcBdNM, for example, the B-roll is the remaining novel information. The third of these also includes the kind of talking-head filler I'd like to remove.
GrumpyNl · 6 months ago
Or ask the original creator if he can publish the script used.
dinkumthinkum · 6 months ago
If that's what you want then why would you want a service like this? Surely there would be a non-video news sources of electron micrographs or elephant toothpaste for which there would probably be hundreds or thousands of LLM TL;DR things.
kragen · 6 months ago
I think you don't have a very clear idea of what I am talking about. You cannot present electron micrographs as text. You can present individual electron micrographs as still images, but not animation. Similarly, video of elephant toothpaste can only be presented as video, even if still images can be arresting. There is no sense in which a textual description of machining techniques or a Ukrainian battlefield is a substitute for video footage of them. In 5 seconds they can convey information that no amount of linguistic description can. Sometimes that information is even true.

What I hate is when I'm trying to find such irreplaceable information and instead my search results are full of vapid stock footage and hubristic talking heads overconfidently reading a script out loud as they gaze at a video camera. It's like AI slop without the creativity.

moralestapia · 6 months ago
Cool.

What's stopping you from doing it?

kragen · 6 months ago
Watching too many YouTube videos, probably.
kragen · 6 months ago
Wow, I'm really surprised that my comment describing 95% of YouTubers as "some dumbshit" got voted up to +19. I guess I'm not the only old man shaking his fist at the surveillance capitalism incompetent confident shouty bullshitter cloud?
henry2023 · 6 months ago
What if we summarize all the information in the world into a few hundred volumes of human knowledge, then summarize those into a 10,000 pages book, then that into a 10 long form essays, then those into a 100,000 chars blog post, then that into a pamphlet and finally we summarize one more time into a single tweet.
flemhans · 6 months ago
Not a single tweet, but 10 brief sentences, as per the AI overlords:

1. The universe is vast, mostly empty, and runs on fundamental laws that we barely understand but exploit well.

2. Life is a self-replicating, entropy-defying phenomenon that emerged through chemistry, evolved through selection, and adapts through intelligence.

3. Humans are social primates who dominate the planet through cooperation, tool-making, storytelling, and an insatiable drive for meaning.

4. Societies form through shared beliefs, laws, and trade, but oscillate between progress and collapse due to power, greed, and ignorance.

5. Technology is humanity’s amplifier, accelerating knowledge, comfort, and destruction in equal measure, with unintended consequences at every turn.

6. Economies are trust-based systems of resource distribution, prone to cycles of boom, bust, innovation, and inequality.

7. Morality is a human construct, evolving with culture, often conflicting between collective well-being and individual freedom.

8. Knowledge is a fractal—deeper the dive, more there is to know—yet most wisdom is rediscovery of old truths in new contexts.

9. The future is uncertain but shaped by the tension between human ingenuity and our own worst tendencies.

10. The meaning of life? Whatever gets you up in the morning and lets you sleep at night.

silvestrov · 6 months ago
> entropy-defying phenomenon

entropy-exploiting phenomenon

is a much better description as life does not defy any fundamental laws.

pas · 6 months ago
can you please share the prompt? and in general reproduction steps? many thanks!
qiine · 6 months ago
nice lifr tldr
fifilura · 6 months ago
Some times I think that would actually be useful for some politicians that do not care about history and prior knowledge.

* Rule of law is a good idea

* Dictatorship is a bad idea

* Allowing Germany to occpy Sudetenland in the Münich appeasement 1938 was a bad idea. [1]

* ...

[1] https://snyder.substack.com/p/appeasement-at-munich?triedRed...

But that said! If this service works I think I could use it. I can handle long articles, but have no time to watch YouTube clips.

someothherguyy · 6 months ago
> into a single tweet

It would say something like, "This text attempts to summarize the entirety of human knowledge".

Still, IMO summarizing videos is useful. Even if the summary is not accurate or a 1:1 representation of the content, you can mostly get the gist of what is being said without being baited into watching advertisements.

Although, this site doesn't seem to do a great job at summaries. Kagi's universal summarizer has much better results, https://kagi.com/summarizer/index.html . However, it requires transcripts to be available for videos.

Sakos · 6 months ago
I think a lot of people are sort of missing the benefit of something like this.

How do you read a book effectively? You skim the table of contents. You skim the contents of each chapter and mark interesting paragraphs. Then you go through the book another 1-2 times, each time getting deeper into the text and cross-referencing information between different parts of the book.

What tools like this will do is allow us to apply this same workflow to videos, which can greatly enhance our understanding of videos we're interested in and help us contextualise it with the rest of our knowledge.

I've already been doing this and it's helped me expand my knowledge and understanding in ways that wouldn't have been possible without an unreasonable investment of time and effort.

rayalez · 6 months ago
Tried asking Claude to do that, ended up with something pretty beautiful:

Everything is made of atoms & energy, life evolves, math describes reality, knowledge builds on itself, humans need each other & Earth to survive – test ideas, learn from mistakes, be kind, stay curious.

userbinator · 6 months ago
The answer will be a single number, 42.
henry2023 · 6 months ago
Then summarize it one last time into a single bit. I like to think it'd be '1'.
xlii · 6 months ago
…but who will have the question?
guybedo · 6 months ago
Gemini's output:

Our understanding of reality is fundamentally shaped by the power of stories and narratives.

Humanity constantly seeks to impose order and structure on the world through systems and frameworks.

The inherent human drive to create and innovate defines our art, technology, and design.

We are bound by the complex interplay of connection, conflict, and cooperation in our relationships.

Time's relentless flow drives change, progress, and the unfolding narrative of history.

The vastness of the unknown perpetually challenges and defines the limits of human knowledge.

The search for purpose, values, and meaning is a central and ongoing human endeavor.

Abstract concepts and models are powerful tools for understanding and navigating reality.

All living things are interconnected within a complex web of life and ecological relationships.

The future of humanity presents both boundless potential and significant challenges to overcome.

hollerith · 6 months ago
That's really bad, but also excellent in a particular way, which we might call glibness.
zoogeny · 6 months ago
This reminds me of the famous Library of Babel story, where the entire corpus of a language is imagined to live in a library. Like, every permutation of the characters of an alphabet for pages of a certain number of characters in books of a certain number of pages.

The reducto ab asurdum of this library is an alphabet of 0 and 1, a page size of 2 characters and a page count per book of 2.

PaulRobinson · 6 months ago
I know you’re making a joke, but more seriously I think most yt videos have atrocious signal/noise ratios so information compression is likely very useful. Less so for many academic papers (although they have some pretty awful filler sometimes).
al_borland · 6 months ago
I was on YouTube a few weeks ago and saw a 20 minute video with a title that looked interesting. Under it was an AI summary that saved me 20 minutes, and had me skip the video completely. I wish that was under every video.

This week I got a notification about the AI added to YouTube to allow users to ask questions about a video. I haven’t had a chance to use it yet, but I can see that also being useful to get the main points from a long video. Up until now, I mainly use the popularity indicator on the progress bar. Since I watch most videos on my TV, it’s harder to use the AI, as I would need to pull out my phone, open the same video, and ask… that’s a bad workflow.

I do find it a little ridiculous that we need AI to summarize long videos full of fluff, when the only reason they are full of that fluff in the first place is YouTube’s own monetization policies which pushed the average video from 2-4 minutes to 10 minutes.

debeloo · 6 months ago
This is exactly the problem. There are so many 20 minute videos that should have been 2 minutes.

In a way, it's much easier to make the 20 minute video. Just hit record, rant an rave, stop recording and publish.

There are indeed justified long videos stuffed full with knowledge, insight and witty comments to make it fun.

Then there are "slow" videos but magical. Paul Sellers has a 30 min video on how to make mortise and tenons joint with hand tools. Just you and him in real time. You get a (recorded) private lesson from a master craftsman. It's magic. Every minute of it is knowledge transfer.

https://m.youtube.com/watch?v=aBodzmUGtdw

crakhamster01 · 6 months ago
Some people inflate their video durations intentionally, but I think the majority of people truly think they're using the time wisely. Have you ever tried making a quick travel vlog of a vacation and ended up with a 15 minute short film? That B-roll at the airport was definitely critical to include!

I think the reality is that there are a lot of amateur video creators. Elevating the few talented creators through social engagement metrics isn't perfect, but I think it works well enough. Or at least more so than what these anodyne summarizations would give us.

sethammons · 6 months ago
42
0x073 · 6 months ago
Like "the book"?

Deleted Comment

stong1 · 6 months ago
Hi HN! I'm the author of this service. Thank you for your support.

There may have been some temporary downtime due to residential proxy running out of bandwidth. I have purchased additional bandwidth. (I run this service for free.)

There also may be some errors with particular videos because they are not accessible in certain regions. For now all requests to YouTube originate from United States, but open to change in the future to some kind of round-robin or fallback system.

I know it's not perfect. I developed the tool originally for my own use. It's open source and I'm open to any patches or pull requests.

Enjoy!

NewUser76312 · 6 months ago
Hey this is really cool, I literally had the same idea about a month ago but ultimately decided to not pursue it. Glad someone else did.

A few quick Qs -

1) Do you use the available auto-generated transcripts from youtube? Or do you do any audio parsing? I know transcripts aren't always available.

2) Do you have any plans to monetize in some way, do you think it would be possible? It's definitely a neat product but a tad generic and replicable, so I'm curious.

stong1 · 6 months ago
1.) We do no TTS of our own. We either use the original transcripts uploaded manually by the YouTuber or we use the auto-generated ones supplied by Google.

2.) No, I plan to keep it free as the operational costs are relatively minimal.

ANighRaisin · 6 months ago
How does this compare to gemini thinking experimental with apps?
noname120 · 6 months ago
Out of curiosity which residential proxy service do you use?

Deleted Comment

arccy · 6 months ago
how expensive is the bandwidth if you're buying from residential proxies?
stong1 · 6 months ago
Cheap. Less than $10/GB and since we only scrape metadata and transcripts, the traffic usage is low.
JTyQZSnP3cQGa8B · 6 months ago
> open-source

> OPENAI_API_KEY

Choose one.

SomeoneOnTheWeb · 6 months ago
The service itself is open source ; but it relies on a closed source service. Both are not incompatible.
jamesy0ung · 6 months ago
If you don't like it, just edit the open-source code to point to a locally hosted open-source llm.
suprjami · 6 months ago
> OPENAI_BASE_URL

> OPENAI_API_HOST

CM30 · 6 months ago
Tried it on 3 random videos I watched, and the results were... mostly good, albeit mixed.

On the one hand, it got my video about a Mario & Luigi: Brothership glitch dead right, immediately listing where you'd need to die to get an item early and what you'd get out of it.

It also did an okay job summarising a Zelda dungeon analysis video by someone I'm subscribed to, with some info on why that dungeon was well-designed that clearly came from the video.

Unfortunately, it did a poor job at summarising a video about plagiarism in the YouTube speedrunning essay space, associating the problem with smaller creators rather than the person the video was about and leaving out far too many details to be useful.

This seems to confirm my assumptions about how an AI summariser would work in general; if the original media is a straightforward piece about one easily understandable topic, then it'll do fine and work about as well as a human would. If it's a longer piece with multiple points backed by various examples, then it'll struggle to summarise it in a way that makes sense.

braiamp · 6 months ago
> If it's a longer piece with multiple points backed by various examples, then it'll struggle to summarise it in a way that makes sense

I've found the same problem with humans too, so it's not like an improvement over humans.

fullshark · 6 months ago
So what is everyone doing with all this free time they've now accumulated from not reading, watching, or listening to media?
righthand · 6 months ago
Chatting with AI bots.

I agree that this mentality of works being “too long so don’t ingest it”, is not a healthy way to go about life and thinking in general.

DennisP · 6 months ago
It's not that, it's "too long with low information density so ingest it more efficiently." That way we can spend our time on things that are more productive or enjoyable.
BeFlatXIII · 6 months ago
Sleeping. Occasionally hitting a drum.
tonyhart7 · 6 months ago
we automate so much, we end up making human redundant

Deleted Comment

zorgmonkey · 6 months ago
I tried a couple videos in both this site and Kagi's summarizer, both were decent but each time Kagi did better.
mtlynch · 6 months ago
Whoa, I've been using Kagi for two years and didn't realize the summarizer could do videos!
freeAgent · 6 months ago
It only works when there’s a transcript. It’s not “watching” the video. That said, it works very well most of the time for me.
gloosx · 6 months ago
Idea hackneyed since LLM's appeared. Cool that implementation is open-source, though yt automatic captions are sometimes completely off-point, especially when people talking in the video don't have a diction of a tv show host.

I wonder if an idea found it's niche after all? Do you guys summarise you videos to short texts and that leaves you satisfied? For me video is video, I can relax, sit and watch/listen to it. With text it is different, it is a mental exercise to read and process it, so turning video into text feels like an essential downgrade. I would prefer watching at 1.5/2x speed instead of text summary if I want to finish it faster.

amenhotep · 6 months ago
If I want to watch a video for the sake of watching a video, then no, a text summary would not at all be the same thing.

But most of the time I don't want to watch a video, I just want to get information. A text summary then would be strictly superior.

Dead Comment

righthand · 6 months ago
> Idea hackneyed since LLM's appeared.

It’s an idea that’s been around long before LLMs. Check out Yahoo under Marissa Mayer acquiring a news summary app. Though it is still hackneyed.

https://finance.yahoo.com/news/yahoo-acquires-summly-app-150...

downsplat · 6 months ago
> For me video is video, I can relax, sit and watch/listen to it. With text it is different, it is a mental exercise to read and process it, so turning video into text feels like an essential downgrade.

Exact opposite for me. Reading goes at my pace, in silence. Video is much more invasive, so I avoid it except for the highest quality stuff.

gloosx · 6 months ago
Interesting, I was always thinking that audio/visual information is naturally much easier to consume. For instance: I can watch a video and count to 10 in my head at the same time – I will still get everything what was in that video – but with text it's a much harder task since the head is fully occupied with "narrating" the text what I'm reading, so reading in the end turns into podcast inside the head before actually get consumed.
nottorp · 6 months ago
2x video speed is still a lot slower than text.
gus_massa · 6 months ago
I tried with a few Thunderf00t videos. He has good analysis, but the guys repeats everything too many times. Many are about silly impossible "inventions" / scams, but this is an experiment that he published in Nature Chemistry:

https://tldw.tube/?v=LmlAYnFF_s8 "High speed camera reveals why sodium explodes! --> "Coulombic explosion. (Sodium and water reaction)"

trainsarebetter · 6 months ago
I used to enjoy watching his stuff, but he became hyperbolic, and now has too much rage bait for the algorithm.
dvh · 6 months ago
That was the first thing that came to my mind, all his 20 minutes videos contains at best 2 minutes of information.

Deleted Comment