Working on replacing my wireless keyboard and trackpad with some "gloves" so I can use them while on hikes or just generally outside. Then, gonna integrate some custom AR and ML/GPT.
(disclaimer: I used to work there)
This was the approach I tried first too (I also tried the frequency-based one fwiw, which has its own, worse drawbacks). But using loudness runs into issues if the source loudness isn't (relatively) even across the entire source media. A single sensitivity setting like this would be a problem if:
* recording gain is set to automatic, and there are sudden changes in noise floor like wind (if recorded in 24-bit or lower)
* crew adjusts gain partway through recording (big no-no but happens)
* talent/host moves in and out of microphone sweet spot
* talent/host adjusts themselves in a squeaky chair during silence or a transition-to-silence (or coughs, or breathes loudly, or an ambulance goes by...)
If you apply the edit w/ a single sensitivity and something like the above is true, it will cut in the wrong place. Unfortunately, the only way to know whether it ever got a cut wrong is to watch the entire show, skipping to each boundary with your full attention.
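To make the failure mode concrete, here's a minimal sketch of the single-threshold approach in TypeScript. All names and default values are mine, not from any real editor: a fixed RMS threshold classifies each window as silence, so anything in the list above that shifts the level relative to the fixed threshold (wind, a mid-recording gain change, the host drifting off-mic) flips windows to the wrong side of the line and moves the cut.

```typescript
// Sketch of threshold-based silence detection over mono PCM samples
// normalized to [-1, 1]. `threshold` is the single sensitivity setting
// discussed above.
interface SilentSpan {
  startSec: number;
  endSec: number;
}

function findSilences(
  samples: Float32Array,
  sampleRate: number,
  threshold = 0.02,     // RMS level treated as "silence"
  minDurationSec = 0.5, // ignore spans shorter than this
  windowSec = 0.05,     // RMS window size
): SilentSpan[] {
  const win = Math.max(1, Math.floor(windowSec * sampleRate));
  const spans: SilentSpan[] = [];
  let spanStart = -1;

  for (let i = 0; i < samples.length; i += win) {
    // RMS over one window
    let sum = 0;
    const end = Math.min(i + win, samples.length);
    for (let j = i; j < end; j++) sum += samples[j] * samples[j];
    const rms = Math.sqrt(sum / (end - i));

    if (rms < threshold) {
      if (spanStart < 0) spanStart = i; // silence begins
    } else if (spanStart >= 0) {
      // silence ends; keep the span only if it's long enough
      if ((i - spanStart) / sampleRate >= minDurationSec) {
        spans.push({ startSec: spanStart / sampleRate, endSec: i / sampleRate });
      }
      spanStart = -1;
    }
  }
  // close a span that runs to the end of the media
  if (spanStart >= 0 && (samples.length - spanStart) / sampleRate >= minDurationSec) {
    spans.push({ startSec: spanStart / sampleRate, endSec: samples.length / sampleRate });
  }
  return spans;
}
```

Note there is no adaptation anywhere: `threshold` is compared against absolute level, so a 6 dB gain ride partway through the recording silently reclassifies every window after it.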
IMO the solution will probably be a combination of browser-based UI and cloud-based processing. The drawback of this approach is that the server would need to host the project files.
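As a rough illustration of that architecture (the endpoint, URL, and response shape here are hypothetical, not a real API): the browser uploads the project file and the server keeps it around for processing, which is exactly where the hosting burden comes from.

```typescript
// Browser-side sketch: hand the project file to a processing server.
async function uploadProject(file: File): Promise<string> {
  const res = await fetch("https://example.com/projects", {
    method: "POST",
    headers: { "Content-Type": "application/octet-stream" },
    body: file,
  });
  if (!res.ok) throw new Error(`upload failed: ${res.status}`);
  const { projectId } = await res.json();
  return projectId; // the server now hosts the project file for processing
}
```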
The latter should work but the former requires an extra step for now iirc: https://bun.sh/docs/cli/install#lifecycle-scripts
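For reference, the extra step described in the linked docs: Bun skips lifecycle scripts (e.g. postinstall) for packages it doesn't trust, so you opt a package in via `trustedDependencies` in package.json and re-run `bun install` (the package name below is a placeholder):

```json
{
  "trustedDependencies": ["some-native-package"]
}
```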