Having worked with some MP4 demuxing for my extension [1], I feel the pain. Lots of times I would play the video only to find inexplicable issues such as drifting audio. I highly recommend using an mp4 inspector tool, such as mp4box [2], to debug these issues.
Nice. One weekend, while playing around to see if I could use IPFS as a transport layer for streaming video, I got hung up because most video formats I tried behaved very poorly with inconsistent streams where you may not have the beginning. I ended up on MPEG-TS as the best behaved of the bunch. It felt a little weird, as I was sort of expecting something more modern to perform better, but since my goal was not to evaluate video formats but just to ship them around, I accepted it and moved on.
Thinking back on it now, I just did a little trial and error until I found something that worked, but what would I search for if I was trying to find data on how... ?streamable? an encoding is?
If curious, I got my proof of concept working, but it was unpleasantly slow. I blindly chunked the incoming stream into megabyte-sized chunks, registered the chunks on IPFS, then used IPFS pubsub to announce each chunk to any watchers. The watcher would watch the pubsub channel for announcements, download the chunk, and try to reassemble the chunks in order and play them. One neat side effect I found was that once the stream was done, if I had stored all the IPFS addresses, I could generate a whole IPFS file structure you could use to download the stream at a later date.
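For anyone who wants to poke at the idea, here is a minimal in-memory sketch of that chunk/announce/reassemble loop. It does not talk to IPFS at all: the SHA-256 digest is a hypothetical stand-in for an IPFS address, a plain dict stands in for the network, and `chunk_stream`, `publish`, and `Watcher` are names made up for illustration. The sequence number in each announcement is also an assumption, added so the watcher can restore order.

```python
import hashlib
from typing import Dict, Iterable, Iterator, List, Tuple

CHUNK_SIZE = 1024 * 1024  # "megabyte-sized chunks", as in the comment above


def chunk_stream(data: bytes, size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Blindly split the incoming bytes into fixed-size chunks."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def publish(chunks: Iterable[bytes], store: Dict[str, bytes]) -> List[Tuple[int, str]]:
    """Store each chunk and return (sequence number, address) announcements.

    The hex digest is a hypothetical stand-in for a real IPFS address.
    """
    announcements = []
    for seq, chunk in enumerate(chunks):
        address = hashlib.sha256(chunk).hexdigest()
        store[address] = chunk
        announcements.append((seq, address))
    return announcements


class Watcher:
    """Reassembles chunks in order even if announcements arrive out of order."""

    def __init__(self, store: Dict[str, bytes]) -> None:
        self.store = store
        self.pending: Dict[int, bytes] = {}
        self.next_seq = 0
        self.output = bytearray()

    def on_announcement(self, seq: int, address: str) -> None:
        self.pending[seq] = self.store[address]  # "download" the chunk
        # Flush every chunk that is now contiguous with what was already played.
        while self.next_seq in self.pending:
            self.output += self.pending.pop(self.next_seq)
            self.next_seq += 1
```

The list returned by `publish` doubles as the manifest mentioned above: keep it after the stream ends and it is enough to fetch the whole recording again later.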
Can someone explain how an existing media player understands the new mdat format without modification? I assume that if it finds a completed moov at the end of the file, it would recognize the file as an unfragmented mp4. It should then try to find a list of recognized codecs directly inside the mdat (like in the first picture), but instead it will find another moov, a bunch of moofs and sub-mdats, all of which are clearly not proper for an unfragmented mp4. Why doesn't the player report this as an "unrecognizable, badly formatted" mp4 file?
The mdat box does not have a defined structure, and the specification actually states that attempting to define a structure is almost certainly a mistake. In order to find the data the player is looking for it has to read the moov box, which contains the byte offsets and sizes of "chunks" of data. Since there is no requirement for chunks to be contiguous, or even in the same file, we can simply skip over the fragmentation-related boxes within the data box.
The moov contains a list of byte offsets which the player can use to directly access media data. You can skip the moofs and other headers inside by using gaps in the offsets.
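A toy sketch of the two answers above, under heavy assumptions: the "file" is built in memory, the moov box is left empty, and a plain Python list of (offset, size) pairs stands in for the real chunk offset and sample size tables (stco/co64 and stsz) that live inside moov. The helper names are made up for illustration. It only shows that a reader following absolute byte offsets never looks at the moof headers sitting inside mdat.

```python
import struct
from typing import List, Tuple


def box(box_type: bytes, payload: bytes) -> bytes:
    """Serialize a basic box: 32-bit big-endian size, 4-byte type, payload."""
    return struct.pack(">I", 8 + len(payload)) + box_type + payload


def top_level_boxes(data: bytes) -> List[Tuple[str, int, int]]:
    """List (type, offset, size) of top-level boxes; 64-bit sizes are not handled."""
    boxes, pos = [], 0
    while pos + 8 <= len(data):
        (size,) = struct.unpack_from(">I", data, pos)
        if size < 8:
            break  # size 0 (to end of file) and 1 (64-bit size) are out of scope
        boxes.append((data[pos + 4:pos + 8].decode("ascii"), pos, size))
        pos += size
    return boxes


def build_demo_file() -> Tuple[bytes, List[Tuple[int, int]]]:
    """Build ftyp + mdat + moov, with fragment headers hidden inside mdat."""
    ftyp = box(b"ftyp", b"isom\x00\x00\x00\x00")
    payload_start = len(ftyp) + 8  # absolute offset of the first mdat payload byte
    mdat_payload = bytearray()
    table = []  # simplified stand-in for the offset/size tables in moov
    for sample in (b"SAMPLE-1", b"SAMPLE-2"):
        mdat_payload += box(b"moof", b"frag-header")  # ignored by the offset table
        table.append((payload_start + len(mdat_payload), len(sample)))
        mdat_payload += sample
    data = ftyp + box(b"mdat", bytes(mdat_payload)) + box(b"moov", b"")
    return data, table


def read_samples(data: bytes, table: List[Tuple[int, int]]) -> List[bytes]:
    """Read media data purely by absolute offset, never parsing mdat itself."""
    return [data[offset:offset + size] for offset, size in table]
```

From the outside the file is just ftyp, mdat, moov. The nested moof boxes only exist for a reader that chooses to parse the fragmented view; the offset table simply has gaps where they sit.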
This is awesome work. I’ve coded some extensions for mp4 livestream to handle dozens of real-time streams and I’d love to try out the multi stream mux / demux…
> It kind of hurts that several days of work and research can be summed up in a couple paragraphs, but that's what the "pain" part in the subtitle is for.
Having recently written my own fragmented-MP4 remuxing library, I felt this pain too, and my soon-to-be-published writeup has very similar things to say about the ISO's paywalling practices.
I think one of the hardest parts of ISO-BMFF, aside from spec availability, is that it's pretty hard to implement "cleanly", making existing code confusing to use as reference. (My own implementation is certainly not clean either)
> Having recently written my own fragmented-MP4 remuxing library, I felt this pain too, and my soon-to-be-published writeup has very similar things to say about the ISO's paywalling practices.
Would be curious to hear what goals you had with writing a muxer yourself as well, given that most people just use LibAV/GStreamer/GPAC and call it a day.
> I think one of the hardest parts of ISO-BMFF, aside from spec availability, is that it's pretty hard to implement "cleanly", making existing code confusing to use as reference. (My own implementation is certainly not clean either)
I certainly wouldn't call the OBS implementation "clean" either. It's very much inspired by the FFmpeg/LibAV implementation since that one is fairly straightforward (not a lot of abstraction), and gets the job done (and also is GPL/LGPL so not a huge concern looking at it).
The short answer is, it's for an exploit. It involves some slightly less-well-trodden boxes, and adding specially crafted metadata to live-generated videos in real-time, which existing libraries couldn't help me with much (and I did spend some time fighting a few libraries, but couldn't make them do precisely what I wanted).
"Library" is perhaps an overstatement, it does the things I need and not much more.
I always forget about GStreamer but I think I have a perfect application for it. Hopefully it’s easier to use as a library than MediaFoundation or FFmpeg.
MP4. The answer to the question of "Is there a way to make RIFF and AVI even worse somehow?" It makes you genuinely pine for MPEG2 Transport Streams. ISO 13818 for life.
1: https://github.com/Andrews54757/FastStream
2: https://gpac.github.io/mp4box.js/test/filereader.html
Would love to see MP4 Hybrid supported in popular packages like mp4-muxer [1] and mp4box [2] someday.
1: https://github.com/Vanilagy/mp4-muxer
2: https://github.com/gpac/mp4box.js