FFmpeg by Example - Readit News

I've enjoyed using ffmpeg 1000% more since I stopped the tedious manual task of Googling for Stack Overflow answers and cobbling them into a command, and got ChatGPT to write the commands for me instead.

simonw · a year ago

I use ffmpeg multiple times a week thanks to LLMs. It's my top use-case for my "llm cmd" tool:

  uv tool install llm
  llm install llm-cmd

  llm cmd use ffmpeg to extract audio from myfile.mov and save that as mp3

https://github.com/simonw/llm-cmd
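
For readers who want to see roughly what such a prompt resolves to, here is a minimal Python sketch that only builds the argument list. It is not the output of `llm cmd`; the helper name `extract_audio_cmd` is made up for illustration, and the choice of `-vn` (drop the video stream), `libmp3lame` and `-q:a 2` is an assumption about reasonable defaults. If your ffmpeg build lacks libmp3lame, leaving out `-c:a` lets ffmpeg pick an encoder from the output extension.

```python
from pathlib import Path
from typing import List


def extract_audio_cmd(src: str, quality: int = 2) -> List[str]:
    """Build an ffmpeg argv that drops video and encodes the audio as MP3.

    Hypothetical helper: the output name is derived by swapping the suffix.
    """
    dst = str(Path(src).with_suffix(".mp3"))
    return [
        "ffmpeg",
        "-i", src,            # input file
        "-vn",                # no video in the output
        "-c:a", "libmp3lame", # MP3 encoder (assumes it is compiled in)
        "-q:a", str(quality), # VBR quality setting for libmp3lame
        dst,
    ]
```

Passing the returned list to `subprocess.run` would execute it; printing it first makes it easy to check before running.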

resonious · a year ago

I tried this (though with a different tool called aichat) for extremely simple stuff like just "convert this mov to mp4" and it generated overly complex commands that failed due to missing libraries. When I removed the "crap" from the commands, they worked.

So, much like code assistance, they still need a fair amount of babysitting. A good boost for experienced operators, but it might suck for beginners.

pmarreck · a year ago

A while back I simply wrote my own bash function for this called `please`

as in

    bash> please "use ffmpeg to extract audio from myfile.mov and save it as mp3"

It will then courteously show you the command it wants to run before you agree to do it.

Here is the whole thing, with its two dependent functions, so that people stop writing their own versions of this lol. All it needs is an OPENAI_API_KEY, feel free to modify for other LLMs

EDIT: Moved to a gist: https://gist.github.com/pmarreck/9ce17f7996347dd532f3e20a2a3...

Suggestions welcome; for example, I want to add a feature that either just copies the command (for further modification) or prepopulates the command line with it somehow (possibly for further modification, or even for skipping the approval step).
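
The "show the command, then ask" step can be sketched in a few lines of Python. This is not the code from the gist; `run_with_approval`, `ask_on_terminal` and `DEMO_COMMAND` are hypothetical names, and the LLM call that would produce the command string is left out entirely.

```python
import shlex
import subprocess
import sys
from typing import Callable, Optional


def run_with_approval(command: str, approve: Callable[[str], bool]) -> Optional[int]:
    """Run `command` only if `approve(command)` returns True.

    Returns the exit code, or None when the command was declined.
    """
    if not approve(command):
        return None  # declined: nothing is executed
    # Split into argv ourselves so the string is never handed to a shell.
    return subprocess.run(shlex.split(command)).returncode


def ask_on_terminal(command: str) -> bool:
    """Interactive approval: show the proposed command and wait for y/N."""
    answer = input(f"About to run:\n  {command}\nProceed? [y/N] ")
    return answer.strip().lower() == "y"


# Harmless stand-in for a generated command: print the Python version.
DEMO_COMMAND = shlex.join([sys.executable, "--version"])
```

Calling `run_with_approval(DEMO_COMMAND, ask_on_terminal)` gives the interactive behaviour; swapping in a different `approve` callable is one way to get the "skip approval" or "copy instead of run" variants.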

atoav · a year ago

Did you just invent the LLM-equivalent of curl-piping unread shell scripts into sh?

I am sure that will never cause any problems.

dekhn · a year ago

"The future is already here. It's just not very well distributed"

(honestly, the work you share is very inspiring)

zahlman · a year ago

>This will then be displayed in your terminal ready for you to edit it, or hit <enter> to execute the prompt. If the command doesn't look right, hit Ctrl+C to cancel.

I appreciate the UI choice here. I have yet to do anything with AI (consciously and deliberately, anyway) but this sort of thing is exactly what I imagine as a proper use case.

mvonballmo · a year ago

Hypertalk <https://en.wikipedia.org/wiki/HyperTalk> lives.

Beijinger · a year ago

uv?

th0ma5 · a year ago

You should figure out what went wrong for the other commenter and fix your tool.

mrweasel · a year ago

While I love that that works, I still feel like just maybe ffmpeg needs a better interface. Not necessarily a GUI, just a better designed command line.

Waterluvian · a year ago

I think I’m finally sold on actually attempting to add some LLM to my toolbelt.

As a helper and not a replacement, this sounds grand. Like the most epic autocomplete. Because I hate how much time I waste trying to figure out the command line incantation when I already know precisely what I want to do. It’s the weakest part of the command line experience.

levocardia · a year ago

For the longest time I had ffmpeg in the same bucket as regex: "God I really need to learn this but I'm going to hate it so much." Then ChatGPT came along and solved both problems!

zxvkhkxvdvbdxz · a year ago

Interesting. Being able to use regexps for text processing through my career has probably saved me a few thousand hours of programming one-off solutions so far. It is one of those skills that really pays off to learn proper.

And speaking of ffmpeg, or tooling in general, I tend to make notes. After a while you end up with a pretty decent curated reference.

earnestinger · a year ago

Not sure about ffmpeg, but you should definitely try memorising regexp. Casual Search&replace that becomes possible is worth it.

teaearlgraycold · a year ago

Gotta be honest, years of configuring automod on Reddit have honed me into a regex God.

jmb99 · a year ago

For me, it wasn’t so much learning ffmpeg, as it was understanding containers/codecs/encoders/streams/etc. Learning all of the intricacies there made ffmpeg make a lot more sense.

shlomo_z · a year ago

... Then ChatGPT came along and I had 3 problems! https://regex.info/blog/2006-09-15/247

hackingonempty · a year ago

CSS has entered the ChatGPT.

manbitesdog · a year ago

Same here, it's one of these things where AI has taken over completely and I'm just a broker that copy-pastes error traces.

jjcm · a year ago

In addition to the many others mentioned, here's a script I just threw together that simplifies a lot of these chained commands - llmpeg: https://github.com/jjcm/llmpeg

If you have ffmpeg installed and an OpenAI env api key set, it should work out of the box.

Demo: https://image.non.io/1c7a92ef-0917-49ef-9460-6298c7a9116c.we...

magarnicle · a year ago

My experience got even better once I learned how complex filters worked.

dylan604 · a year ago

learning how to use splits to do multiple things all in one command is a godsend. only needing to read the source and convert to baseband video once is a great savings.

i started with avisynth, and it took time for my brain to switch to ffmpeg. i don't know how i could function without ffmpeg at this point
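
A sketch of what "split once, do several things" can look like: one decode, two scaled outputs. The function name, the stream labels (`a`, `b`, `big`, `small`) and the output file names are made up for illustration; `split`, `scale`, `-filter_complex` and the optional audio map `0:a?` are standard ffmpeg features.

```python
from typing import List


def two_renditions_cmd(src: str, wide: int = 1280, narrow: int = 640) -> List[str]:
    """Build an argv that decodes `src` once and writes two scaled copies."""
    graph = (
        "[0:v]split=2[a][b];"            # duplicate the decoded video
        f"[a]scale={wide}:-2[big];"      # first copy, height kept even
        f"[b]scale={narrow}:-2[small]"   # second copy
    )
    return [
        "ffmpeg", "-i", src,
        "-filter_complex", graph,
        # Options placed before an output file apply to that output only.
        "-map", "[big]", "-map", "0:a?", "big.mp4",
        "-map", "[small]", "-map", "0:a?", "small.mp4",
    ]
```

The `?` after `0:a` makes the audio mapping optional, so the command should still work on sources without an audio track.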

NetOpWibby · a year ago

Truly, a net positive to my life. Just a few days ago I asked my AI buddy (Claude) to create a zsh script to organize my downloads folder according to the Johnny Decimal system. I’ve since modified it to move the files to a JD setup on my desktop.

The sense of elation I get when I wonder aloud to my digital friend and they generate what I thought was too much to expect. Well worth the subscription.

bambax · a year ago

Basic syntax for re-encoding a video file did take me some time to memorize, but isn't in fact too hard:

  ffmpeg <Input file(s)> <Codec(s)> <MAPping of streams> <Video Filters> output_file

- input file: -i, can be repeated for multiple input files, like so:

  ffmpeg -i file1.mp4 -i file2.mkv

If there is more than one input file then some mapping is needed to decide what goes out in the output file.

- codec: -c:x where x is the type of codec (v: video, a: audio or s:subtitles), followed by its name, like so:

  -c:v libx265

I usually never set the audio codec as the guesses made by ffmpeg, based on output file type, are always right (in my experience), but deciding the video codec is useful, and so is the subtitles codec, as not all containers (file formats) support all codecs; mkv is the most flexible for subtitles codecs.

- mapping of streams: -map <input_file>:<stream_type>:<order>, like so:

  -map 0:v:0 -map 1:a:1 -map 1:a:0 -map 1:s:4

Map tells ffmpeg what stream from the input files to put in the output file. The first number is the position of the input file in the command, so if we're following the same example as above, '0' would be 'file1.mp4' and '1' would be 'file2.mkv'. The parameter in the middle is the stream type (v for video, a for audio, s for subtitles). The last number is the position of the stream IN THE INPUT FILE (NOT in the output file).

The position of the stream in the output file is determined by the position of the map command in the command line, so for example in the command above we are inverting the position of the audio streams (taken from 'file2.mkv'), as audio stream 1 will be in first position in the output file, and audio stream 0 (the first in the second input file) will be in second position in the output file.

This map thing is for me the most counter-intuitive because it's unusual for a CLI to be order-dependent. But, well, it is.

- video filters: -vf

Video filters can be extremely complex and I don't pretend to know how to use them by heart. But one simple video filter that I use often is 'scale', for resizing a video:

  -vf scale=<width>:<height>

width and height can be exact values in pixels, or one of them can be '-1' and then ffmpeg computes it based on the current aspect ratio and the other provided value, like this for example:

  -vf scale=320:-1

This doesn't always work because the computed value should be an even integer; if it's not, ffmpeg will raise an error and tell you why; then you can replace the -1 with the nearest even integer (I wonder why it can't do that by itself, but apparently, it can't).

And that's about it! ffmpeg's options are immense, but this gets me through 90% of my video encoding needs without looking at a manual or asking an LLM. (The only other options I use often are -ss and -t for start time and duration, to time-crop a video.)
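
The structure described above (inputs, codec, maps, filter, output) can be written down as a small Python helper that assembles the argument list in that order. This is a sketch only: `build_cmd` and its parameter names are hypothetical, and placing `-ss`/`-t` after the inputs (so they act as output options) is one of several valid placements.

```python
from typing import List, Optional, Sequence


def build_cmd(
    inputs: Sequence[str],
    output: str,
    video_codec: Optional[str] = None,
    maps: Sequence[str] = (),
    vf: Optional[str] = None,
    start: Optional[str] = None,
    duration: Optional[str] = None,
) -> List[str]:
    """Assemble: ffmpeg <inputs> [-ss/-t] [-c:v] [-map ...] [-vf] output."""
    cmd: List[str] = ["ffmpeg"]
    for path in inputs:
        cmd += ["-i", path]             # input order defines the 0, 1, ... indexes
    if start is not None:
        cmd += ["-ss", start]           # start time of the crop
    if duration is not None:
        cmd += ["-t", duration]         # duration of the crop
    if video_codec is not None:
        cmd += ["-c:v", video_codec]
    for spec in maps:
        cmd += ["-map", spec]           # map order defines output stream order
    if vf is not None:
        cmd += ["-vf", vf]
    cmd.append(output)
    return cmd
```

For example, `build_cmd(["file1.mp4", "file2.mkv"], "out.mkv", video_codec="libx265", maps=["0:v:0", "1:a:1", "1:a:0"], vf="scale=320:-2")` reproduces the swapped-audio example from the comment.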

izacus · a year ago

> This doesn't always work because the computed value should be an even integer; if it's not, ffmpeg will raise an error and tell you why; then you can replace the -1 with the nearest even integer (I wonder why it can't do that by itself, but apparently, it can't).

It's not about being an integer; some of the dimensions need to be even. You can use `-vf scale=320:-2` to ensure that.
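
The arithmetic behind `-1` versus `-2` can be sketched as follows. This approximates the behaviour (round the aspect-preserving value to the nearest multiple of 1 or 2); ffmpeg's exact rounding in edge cases may differ, and `derived_height` is a made-up name.

```python
def derived_height(src_w: int, src_h: int, dst_w: int, multiple: int = 1) -> int:
    """Height that keeps the aspect ratio, rounded to a multiple of `multiple`.

    multiple=1 mimics scale=W:-1, multiple=2 mimics scale=W:-2.
    """
    exact = src_h * dst_w / src_w
    # Round half up to the nearest multiple (avoids Python's banker's rounding).
    return int(exact / multiple + 0.5) * multiple
```

For a hypothetical 1000x566 source scaled to width 320, the exact height is 181.12: `-1` gives 181 (odd, which some encoders reject), while `-2` gives 182.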

0x38B · a year ago

A practical example of mapping streams:

    ffmpeg -i <movie-with-many-tracks.mkv> -map 0:0 -map 0:5 -map 0:12 -vcodec copy -acodec copy -scodec copy "output-movie.mkv"

Use: sometimes I have a file with a lot of audio and/or subtitle streams but only want one or two of each; here, 0:0 is the video, 0:5 is English audio, and 0:12 was the subtitle track I wanted. Setting the codecs to “copy” means nothing gets reencoded.
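
Finding which stream indexes to map is usually the tedious part. One option is to ask ffprobe for JSON (`ffprobe -v error -show_streams -of json input.mkv`) and pick streams by language tag. The sketch below parses such output; `SAMPLE_PROBE` is invented data for illustration (not taken from any real file), and `map_args` is a hypothetical helper.

```python
import json
from typing import List

# Made-up, trimmed-down example of ffprobe's JSON output.
SAMPLE_PROBE = """
{"streams": [
  {"index": 0, "codec_type": "video"},
  {"index": 1, "codec_type": "audio", "tags": {"language": "fra"}},
  {"index": 2, "codec_type": "audio", "tags": {"language": "eng"}},
  {"index": 3, "codec_type": "subtitle", "tags": {"language": "eng"}}
]}
"""


def map_args(probe_json: str, language: str = "eng") -> List[str]:
    """Keep all video streams plus audio/subtitle streams in `language`."""
    args: List[str] = []
    for stream in json.loads(probe_json)["streams"]:
        lang = stream.get("tags", {}).get("language")
        if stream["codec_type"] == "video" or lang == language:
            args += ["-map", f"0:{stream['index']}"]
    # "-c copy" copies every mapped stream without re-encoding.
    return args + ["-c", "copy"]
```

The result slots between `-i input.mkv` and the output file name; `-c copy` is shorthand for setting the video, audio and subtitle codecs to copy individually.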

jmb99 · a year ago

> then you can replace the -1 with the nearest even integer (I wonder why it can't do that by itself, but apparently, it can't).

Likely because the aspect ratio will no longer be the same. There will either be lost information (cropping), compression/stretching, or black bars, none of which should be default behaviour. Hence, the warning.

pdyc · a year ago

I ended up creating my own tool to generate ffmpeg commands https://newbeelearn.com/tools/videoeditor/

Over2Chars · a year ago

I think you're onto something. I've had hit or miss experiences with code from LLMs but it definitely makes the searching part different.

I had a problem I'd been thinking about for some time and I thought "I'll have some LLM give me an answer" and it did - it was wrong and didn't work, but it got me thinking about the problem in a slightly different way, and my quacks after that got me an exact solution to this problem.

So I'm willing to give the AI more than partial credit.

sathishvj · a year ago

I would like to throw in a tool that I built into the ring: gencmd - https://gencmd.com/. There is a web version and also a CLI version.

If the CLI is installed, you can do: gencmd -c ffmpeg extract first 1 minute of video

Or you can just search for the same in the browser page.

nine_k · a year ago

I do it the old way: I write down the commands as a shell script, and reuse later.

But really what ffmpeg is missing is an expressive language to describe its operation. Something well-structured, like what jq does for JSON.

skydhash · a year ago

It already does. It’s the cli flags. What you’re missing is the semantic which you can get with learning about containers, codecs, and other stuff. You don’t use grep and sed with no understanding of what a text file is.

urda · a year ago

For me it was using a container of it, instead of having to install all the things FFmpeg needs on a machine.

michaelcampbell · a year ago

ffmpeg and jq are 2 commands I've about given up trying to "use" with any facility and am more than happy to pawn that off to one of the Gippity's; chat, claude, etc.

archerx · a year ago

Why not just use Handbrake? It’s just FFMpeg but with a GUI.

wildzzz · a year ago

That's fine for encoding but Handbrake doesn't let you do video streaming to my knowledge.

skirge · a year ago

llm - Clippit of 202x, but for the original Pentium was enough.

This reminds me I need to publish my write up on how I've been converting digitized home video tapes into clips using scene detection, but in case anyone is googling for it, here's a gist I landed on that does a good job of it [0] but sometimes it's fooled by e.g. camera flashes or camera shake so I need to give it a start and end file and have ffmpeg concatenate them back together [1]

Weird thing is I got better performance without "-c:v h264_videotoolbox" on the latest Mac update, maybe some performance regression in Sequoia? I don't know. The equivalent flag for my Windows machine with an Nvidia GPU is "-c:v h264_nvenc". I wonder why ffmpeg doesn't just auto detect this? I get about an 8x performance boost from this. Probably the one time I actually earned my salary at work was when we were about to pay out the nose for more cloud servers with GPUs to process video, when I noticed the version of ffmpeg that came installed on the machines was compiled without GPU acceleration!

[0] https://gist.githubusercontent.com/nielsbom/c86c504fa5fd61ae...

[1] https://gist.githubusercontent.com/jazzyjackson/bf9282df0a40...
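
Since ffmpeg does not choose a hardware encoder on its own, a script can check what the installed build offers by parsing the output of `ffmpeg -hide_banner -encoders`. A minimal sketch follows; `pick_h264_encoder` is a hypothetical helper, `SAMPLE_ENCODERS` is an illustrative excerpt rather than real captured output, and the preference order is an assumption. An encoder appearing in the list means it was compiled in, not that the matching hardware is present.

```python
from typing import Optional, Sequence

# Illustrative excerpt; the flag column differs between ffmpeg versions.
SAMPLE_ENCODERS = """\
 V....D libx264              libx264 H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 (codec h264)
 V....D h264_nvenc           NVIDIA NVENC H.264 encoder (codec h264)
 A....D aac                  AAC (Advanced Audio Coding)
"""


def pick_h264_encoder(
    encoders_text: str,
    preferred: Sequence[str] = ("h264_videotoolbox", "h264_nvenc", "libx264"),
) -> Optional[str]:
    """Return the first preferred encoder name listed in `encoders_text`."""
    available = set()
    for line in encoders_text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            available.add(parts[1])  # second column is the encoder name
    for name in preferred:
        if name in available:
            return name
    return None
```

In a real script the text would come from something like `subprocess.run(["ffmpeg", "-hide_banner", "-encoders"], capture_output=True, text=True).stdout`.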

jack_pp · a year ago

> Probably the one time I actually earned my salary at work was when we were about to pay out the nose for more cloud servers with GPUs to process video, when I noticed the version of ffmpeg that came installed on the machines was compiled without GPU acceleration!

Issue with cloud CPU's is that they don't come with any of the consumer grade CPU built-in hardware video encoders so you'll have to go with the GPU machines that cost so much more. To be honest I haven't tried using HW accel in the cloud to have a proper price comparison, are you saying you did it and it was worth it?

radicality · a year ago

Are the hardware encoders even good? I thought that unless you need something realtime, it's always better to spend the CPU cycles on a better encode with the software encoder. Or have things changed?

jazzyjackson · a year ago

We were a quick and dirty R&D team that had to do a lot of video processing quickly, we were not very cost sensitive and didn’t have anything other than AWS to work with, so I can’t speak to whether it was worth it :)

dekhn · a year ago

I used ffmpeg for empty scene detection- I have a camera pointed at the flight path for SFO, and stripped out all the frames that didn't have motion in them. You end up with a continuous movie of planes passing through, with none of the boring bits.
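
The comment does not say which filter was used. One commonly used option for dropping near-identical frames is `mpdecimate`, followed by `setpts` to regenerate timestamps so the remaining frames play back to back. The sketch below only builds the command; the function name is made up, and audio is dropped with `-an` because it would no longer line up with the shortened video.

```python
from typing import List


def drop_static_frames_cmd(src: str, dst: str) -> List[str]:
    """Build an argv that removes near-duplicate frames (assumed approach)."""
    return [
        "ffmpeg", "-i", src,
        # mpdecimate drops frames that barely differ from the previous one;
        # setpts rewrites timestamps so the output has no gaps.
        "-vf", "mpdecimate,setpts=N/FRAME_RATE/TB",
        "-an",  # no audio: it would be out of sync after frames are removed
        dst,
    ]
```

How aggressively frames are dropped depends on mpdecimate's thresholds, which would need tuning for a noisy outdoor camera.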

hnuser123456 · a year ago

Then can you merge all the clips starting when motion starts and see hundreds of planes fly across at once?

rahimnathwani · a year ago

  -c:v h264_nvenc

This is useful for batch encoding, when you're encoding a lot of different videos at once, because you can get better encoding throughput.

But in my limited experiments a while back, I found the output quality to be slightly worse than with libx264. I don't know if there's a way around it, but I'm not the only one who had that experience.

ziml77 · a year ago

IIRC they have improved the hardware encoder over the generations of cards, but yes NVENC has worse quality than libx264. NVENC is really meant for running the compression in real-time with minimal performance impact to the system. Basically for recording/streaming games.

icelancer · a year ago

So counterintuitive that nvenc confers worse quality than QSV/x264 variants, but it is both in theory and in my testing as well.

But for multiple streams or speed requirements, nvenc is the only way to fly.

xnx · a year ago

Co-signing. Encode time was faster with nvenc, but quality was noticeably worse even to my untrained eye.

Gormo · a year ago

> I wonder why ffmpeg doesn't just auto detect this?

Hardware encoding is often less configurable and involves greater trade-offs than using sophisticated software codecs, and doesn't produce exactly equivalent results even with equivalent parameters. On top of that, systems often have multiple hardware APIs to choose from, and these often offer different features.

FFMpeg is a complex command-line tool intended for users who are willing to learn its intricacies, so I'm not sure it makes sense for it to set defaults based on assumptions.

Trixter · a year ago

In your snippets, you don't appear to be deinterlacing. If your pre-digitized clips are already deinterlaced, that's fine, but if they're not, you're encoding interlaced material as progressive, and mangling the quality. Try adding a bwdif filter so that your 30i content gets encoded as 60p (which will look more like the original videotapes).
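
As a sketch of that suggestion: `bwdif` with `mode=send_field` emits one frame per field, which doubles the frame rate (roughly 30i in, 60p out). The helper name and the `libx264` and `-crf 18` settings below are assumptions for illustration, not taken from the gists.

```python
from typing import List


def deinterlace_cmd(src: str, dst: str, crf: int = 18) -> List[str]:
    """Build an argv that deinterlaces with bwdif, one frame per field."""
    return [
        "ffmpeg", "-i", src,
        "-vf", "bwdif=mode=send_field",  # field-rate output, e.g. 30i -> 60p
        "-c:v", "libx264",
        "-crf", str(crf),                # assumed quality setting
        dst,
    ]
```

If other filters such as scene detection are in the chain, deinterlacing would normally go first so that later filters see progressive frames.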

    self.process = (
        ffmpeg.input(
            "pipe:",
            format="rawvideo",
            pix_fmt="yuyv422",
            s="{}x{}".format(1280, 720),
            threads=8,
        )
        .output(fname, pix_fmt="yuv422p", vcodec="libx264", crf=13)
        .overwrite_output()
        .global_args("-threads", "8")
        .run_async(pipe_stdin=True)
    )