Readit News
nicd · 2 years ago
It would be really neat if after pulling the captions, an LLM was used to reword the content into an idiomatic "blogpost" (since speech is typically different than writing). Using LLMs, we could even choose the level of summarization and the output tone!

As someone who strongly prefers reading to watching instructional videos, I'd pay for this service :)

jakear · 2 years ago
I made something sorta like that specific to recipe videos. Basically converts recipe into an idiomatic format (inlines ingredients, detects and renders timers) and links each step in the recipe to its timestamp in the video for easy indexing while you're busy in the kitchen. (I spent too much time trying to scrub to that one spot where "how it's supposed to look" is shown while busy making it look that way)

See example: https://rexipie.com/watch?v=JiJXdoTjw8M

Just s/youtube/rexipie/ in any recipe video URL.

(full disclosure the step/transcript linking is paid-only as it requires a GPT-4 call, everything else is available to demo on free tier)
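For anyone who wants to script the URL swap, here is a rough Python sketch. The `to_rexipie` helper just does the s/youtube/rexipie/ substitution described above. The `find_timers` helper is only a guess at how timer detection could work (a simple regex over step text), not how the site actually does it. Both function names are made up for illustration.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def to_rexipie(url: str) -> str:
    """Swap the youtube.com host for rexipie.com, keeping path and query."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if host != "youtube.com":
        raise ValueError(f"not a youtube.com URL: {url}")
    return urlunsplit(("https", "rexipie.com", parts.path, parts.query, parts.fragment))


# Hypothetical timer detection: a number followed by a time unit.
TIMER_RE = re.compile(
    r"(\d+)\s*(hours?|hrs?|minutes?|mins?|seconds?|secs?)\b", re.IGNORECASE
)
UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def find_timers(step: str) -> list[int]:
    """Return every duration mentioned in a recipe step, in seconds."""
    return [
        int(amount) * UNIT_SECONDS[unit[0].lower()]
        for amount, unit in TIMER_RE.findall(step)
    ]


print(to_rexipie("https://www.youtube.com/watch?v=JiJXdoTjw8M"))
# https://rexipie.com/watch?v=JiJXdoTjw8M
print(find_timers("Bake for 25 minutes, then rest 1 hour"))
# [1500, 3600]
```

Short links (youtu.be) and mobile hosts are rejected here on purpose, since the comment only mentions swapping the name in a regular video URL.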

Timon3 · 2 years ago
That's really cool! For comparison, this is the recipe written by the same guy: https://www.allrecipes.com/toy-box-tomato-ricotta-cheese-tor...

I've gotta say, your website might be easier to use during cooking, since it provides the information in-line (especially serving sizes etc.)!

extraduder_ire · 2 years ago
Cool website. Much better than the SEO spam I came across earlier this week when I did a web search for "pear qwerty horse" after seeing it in the tags under a Binging with Babish video.

Love the timers and jumping to sections of the video. Though, the second video I tried viewing didn't have linked steps.

jonny_eh · 2 years ago
The hyperlink "Food Wishes" at the top of the page is broken. It'd be nice too if there was a way on that page to request a new recipe (via video ID or whatever).
addandsubtract · 2 years ago
This is neat! Do you plan to support videos containing multiple recipes?
evrimoztamur · 2 years ago
I regularly use https://www.summarize.tech/ for this purpose.

Not exactly a blog post format, but it must've saved me a hundred hours, no joke!

column · 2 years ago
You can absolutely already do what you describe with GPT-4 plugins (Plus membership required), using VoxScript and Video Insight:

https://chat.openai.com/share/229e3ac8-3924-48e4-abd5-35bcb2...

Etrnl_President · 2 years ago
Except for the complaining GPT will do, and some censorship based on the whims of its programming group. No thanks; I'll stick to scripts, where the video dictates the content.
micw · 2 years ago
It seems that many "my script can do [something] with [information in a different form]" projects can be superseded by LLMs already or in the near future, and the quality is way better than what the scripts are capable of.

I just wonder what the price of this is. I can run most of these scripts on an old laptop. But for the LLM I need a pricey and beefy computer or (even worse) a paid subscription to a big tech company's service.

jdthedisciple · 2 years ago
Honestly, I like the GPT 3.5 version I posted here way better.
jdthedisciple · 2 years ago
There you go, friend:

Step right up, folks! Gather around and feast your eyes on the magnificent creatures before us – the elephants! Now, what makes these majestic beings so fascinating, you ask? Well, let me tell you – it's all about their incredibly, unbelievably long... um, trunks! Yes, you heard that right. These gentle giants sport trunks that seem to stretch on for ages, and let me tell you, it's nothing short of impressive. So, as we stand here in awe of these marvelous creatures, remember, it's the little (or should I say, not-so-little) things like their remarkable trunks that make them truly stand out. And that, my friends, wraps up the lowdown on our pachyderm pals – fascinating trunks and all!

mikeravkine · 2 years ago
I've been working on just such a tool [1] to help me digest podcasts and senate hearings.

[1] https://github.com/the-crypt-keeper/tldw

xenonite · 2 years ago
Did you consider directly taking the subs from yt?
Shrezzing · 2 years ago
>since speech is typically different than writing

Is a scripted video significantly different to a written blogpost? It might be a symptom of the type of YT videos I watch, but most of them seem to be essay-style "intro/thesis/points 1, 2, 3/counterpoint/conclusion", and the only thing that hints at speech is the umming-and-arring of the presenter.

tyingq · 2 years ago
It is to me...an example from a CNN transcript:

"Former chief-of-staff, Mark Meadows asking a federal judge to put his surrender on hold, while deciding whether to move his trial to federal court, and former DOJ official Jeffrey Clark, seeking the same, making a pretty remarkable argument in his filing."

That's someone doing a sort of play-by-play explanation of what viewers are seeing in a video. Compare to a purposefully written story:

"A federal judge in Georgia rejected a request by former White House chief of staff Mark Meadows to postpone his surrender and arrest in Fulton County, Georgia, as an attempt to move the case to federal court is litigated, according to a court order issued Wednesday."

It seems like there could be some value in an LLM that would rewrite the first into something more like the second.

zigzag312 · 2 years ago
Or at least chunk the transcription into logical groups.

Chunking on the example webpage[1] is poor.

[1] https://obra.github.io/Youtube2Webpage/example/
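One cheap way to get more logical groups, sketched in Python below, is to start a new paragraph whenever the speaker pauses for longer than some threshold. This is only an assumption about what might work, not what Youtube2Webpage does. The `Cue` type, the `chunk_by_pause` name, the 1.5 second default, and the sample cues are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds from the start of the video
    end: float
    text: str


def chunk_by_pause(cues: list[Cue], max_gap: float = 1.5) -> list[str]:
    """Join consecutive cues into paragraphs, splitting on long pauses."""
    paragraphs: list[list[str]] = []
    prev_end = None
    for cue in cues:
        # A silence longer than max_gap is treated as a paragraph break.
        if prev_end is None or cue.start - prev_end > max_gap:
            paragraphs.append([])
        paragraphs[-1].append(cue.text.strip())
        prev_end = cue.end
    return [" ".join(lines) for lines in paragraphs]


# Made-up sample cues, only to show the grouping.
sample = [
    Cue(0.0, 2.0, "First we chop the onions"),
    Cue(2.1, 4.0, "and put them in the pan."),
    Cue(7.0, 9.0, "Next, the sauce."),
]
for paragraph in chunk_by_pause(sample):
    print(paragraph)
```

Pauses are a crude signal. Splitting on sentence-ending punctuation or on topic changes would likely group better, at the cost of more machinery.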

furkangok · 2 years ago
Can I self-promote here? We're not doing exactly the same thing, but we transcribe videos ourselves (no auto YT captions). If you want to read a high-quality transcript and summarize videos, you can do that at https://alphy.app
cookingrobot · 2 years ago
This is a blank page on iOS Chrome.
jve · 2 years ago
A question keeps popping into my head: can I search the captions of a specific YouTube video?

OK, this may be an answer... but is there an online service that, given a YT URL, would spit the captions out for me? Or maybe a browser extension?

Maybe YouTube even has a hidden link somewhere where I could see all the text?

This submission prompted me to do some research, and I found this gem: https://filmot.com/

The guy who created it: https://www.reddit.com/r/linguistics/comments/oo8xbd/search_...

kalleboo · 2 years ago
On the YouTube website, if you click the "・・・" button next to share/clip/save, there is a "Show transcript" option, and you can use your browser's in-page search to search in it.
kishmat · 2 years ago
That's for a specific video. The site shared above lets you search across its whole index: over 760 million captions across 687 million videos and 52 million channels.
jshmrsn · 2 years ago
There’s a website designed for language learning from watching YouTube captions with inline translations and dictionary lookup. It also has support for searching videos by subtitle content. But it has a limited index and isn’t free for all features. I thought its source was available but I can’t find it now… https://languageplayer.io/
prometheon1 · 2 years ago
Show HN: YouTube Full Text Search – Search all of a channel from the commandline

https://news.ycombinator.com/item?id=36009774

mmahemoff · 2 years ago
Searching transcripts is really something YouTube itself should be doing as part of just regular search (and fed into Google search too). I have a feeling the regular search already does it to some extent, as the system presumably is tagging videos based on its caption extraction. However, that would only apply to somewhat broad topics, not specific combinations of words and the matching text is not surfaced in the UI.
adhvaryu · 2 years ago
That's a good website you posted.

I encountered the "need" for this functionality a few years ago to find the video of a YouTuber specifically saying something.

Back then I used a website that's actually specifically dedicated to the YouTuber (Northernlion): https://babypig.men/nlss-search?q=Basmati

I'm surprised the website is still live!

samuelg123 · 2 years ago
YouTube just added a feature to search transcripts. Seems like I might be in an experiment though
twayt · 2 years ago
Also try www.askYouTube.ai, not exactly pure text search but it can help you find videos that answer your query using LLMs
jve · 2 years ago
I wanted to, but when I press ENTER, it asks me to register... I click cancel and notice the PRICING page. I click on it and again it asks for a login. That is NOT how one onboards users.
iamflimflam1 · 2 years ago
You really have to wonder what YouTube are doing. Trying to play catch up with TikTok instead of innovating with what they have.

They’ve had transcriptions for ages - but it’s hidden away and practically useless.

The things they could do with a bit of creative thinking…

gumballindie · 2 years ago
> creative thinking…

The death of creative thinking is management and processes. Assembly line work doesn't permit creativity. Google is heading the same route as IBM and other once-great corporations made by creative people. When the discussion shifts from ideas to processes and trivialities, it's game over. Long live whatever replaces Google. It's over.

conception · 2 years ago
There is a term in business for this I can’t remember. Where basically your money comes from business process X and so you protect X at all costs. Which makes it very difficult to innovate away from supporting X. It’s almost impossible for a company to pivot to making their money from Y. You see this classically in things like Sears, Woolworths etc. unable to keep up with the times.

The solution to this seems to be to basically start a new company inside your company, more or less completely separate from your core business. Meta seems to have initially done this very well with their pivot from Facebook to Instagram. Perhaps less great in their metaverse pivot, but we’ll see. Google set themselves up for success at this with Alphabet, but I don’t think we’ve really seen them find something they feel they can pivot into yet, so business X continues to be the focus.

cal85 · 2 years ago
Hidden away yes, but it’s amazing. It creates a totally new way of consuming informational videos and everyone here should try it. You can scroll through the transcription vertically and tap on any bit and the video instantly jumps to that point - basically the transcription is the new seek bar. And about 1000x better at the job. No more skipping back to somewhere roughly where you stopped paying attention and just letting it play for a bit until you get back into the thread - now you can jump around with surgical precision. It’s like how you can easily skim back and forth in a text article, but with a video. It’s a total game changer and I find it bizarre that it’s so hidden away so most people won’t find it. It also works particularly well when casting from your phone to a TV, using the phone as the navigator. Oh and the transcriptions are about 99% accurate, which is good enough for me.
iamflimflam1 · 2 years ago
That’s really interesting - on my own website I’ve extracted the captions to my videos and I was thinking of wiring it up so you could navigate the videos. I may actually get round to doing this now.
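If it helps, wiring captions up for navigation can be sketched in a few lines of Python: search the cues for a phrase and build a link that jumps to that moment. The `t=<seconds>s` query parameter on YouTube watch URLs is real. The `search_cues` name, the `(start_seconds, text)` cue format, the `VIDEO_ID` placeholder, and the sample cues are assumptions made up for this example.

```python
def search_cues(
    cues: list[tuple[float, str]], query: str, video_id: str
) -> list[tuple[str, str]]:
    """Return (timestamped URL, cue text) for each cue containing the query."""
    needle = query.casefold()
    hits = []
    for start, text in cues:
        if needle in text.casefold():
            # Round down so playback starts just before the matching words.
            url = f"https://www.youtube.com/watch?v={video_id}&t={int(start)}s"
            hits.append((url, text))
    return hits


# Made-up cues and a placeholder video ID, only to show the output shape.
cues = [
    (12.0, "Add the basmati rice"),
    (40.5, "Let it simmer"),
    (95.4, "Rinse the Basmati again"),
]
for url, text in search_cues(cues, "basmati", "VIDEO_ID"):
    print(url, "|", text)
```

On your own site with an embedded player, the same start times could feed the player's seek function instead of a link.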
hypercube33 · 2 years ago
I really wish I could search transcripts more easily across YT.
chocological · 2 years ago
My friend and I made something similar a few years ago as a college hackathon project - it features automatic scene transition detection and a rough editor before publishing the final results.

(The demo site is down, but you can clone the repo and run the code locally)

https://gitlab.com/chocological00/bitcamp-2021

bonyt · 2 years ago
This is actually potentially helpful for me as a lawyer for generating a paper record, and something I was talking about (and meaning to write up a script for) the other day. Sometimes I want to use a YouTube video in a court filing (for example, as prior art in a patent case), and submitting a rough paper record of the video like this is helpful along with the actual video.
politelemon · 2 years ago
If I'm remembering correctly the example they've used is the first video uploaded to YouTube. https://obra.github.io/Youtube2Webpage/example/
obrajesse · 2 years ago
That’s correct.
frabcus · 2 years ago
Related, this is a super useful end user app that summarises videos using an LLM:

https://www.summarize.tech/

stenioaraujo · 2 years ago
This is interesting. I think it should only be used for videos whose message isn't subtle, since it seems to miss things like sarcasm. I gave it a try with KRAZAM's video and the answer is hilarious when you consider the video intended exactly the opposite.

https://www.summarize.tech/www.youtube.com/watch?v=_o7qjN3KF...

> In "The Hustle," the narrator shares their jam-packed daily routine that exemplifies their dedication to productivity. From an early morning workout to late-night preparations for the next day, their schedule is filled with various activities. They efficiently manage their time, incorporating work, social media updates, and even a well-deserved happy hour. The narrator's commitment to self-improvement is also evident through their habit of reading before bed and tweeting inspiring quotes. Overall, this video highlights the narrator's hustle and structured approach to maximizing their day.

riedel · 2 years ago
I just had to write a research report for a funding agency with many subprojects and couldn't get input for one of them, so I took a short video of a pitch presentation, converted it to captions using speech-to-text, and then turned those into a research summary. It was really impressive.