I know there are options for everything in FFmpeg, and I’m thankful to the community and maintainers for providing such a powerful and well-documented tool. I sometimes just want a quick video edit that doesn’t involve reading the man pages for minutes or hours.
If you ever needed to do that more than once, using a basic video editor (of which there are many, free and open-source, no need for a commercial behemoth like Final Cut), playing with it for ten minutes once would give you all the knowledge you need forever, even when you are without access to your LLM. And you can keep the app installed, you don’t need to download it every time. There’s also no need to watch YouTube videos, most of these basic editors have evident interfaces that anyone could figure out on their own for simple tasks. People did figure out things before YouTube tutorials. Or hey, if you’re that keen on LLMs, ask them where the option you want is.
Furthermore, you have not addressed at all the crux of the point. How are you even getting the exact time stamps to give to the LLM of FFmpeg for the cut? Or how do you decide that 2x is the exact speedup you need? Or how do you know what size and position and text font and colour even make sense?
All of those are visual decisions which need confirmation because video is visual. It doesn’t make sense to blindly run lengthy FFmpeg commands over and over to see if the result is any good.
Not all video work is visual-first storytelling. In engineering/lab contexts you often just need “good enough” trims, concatenation, speedups, and a few labels to document an experiment. As I said in another comment, I usually get the timestamps by noting them down while watching the video, or in rarer cases from timestamped sensor data.
Sorry if my explanation wasn't good enough...