Smells like rationalization to me.
Traditional codecs have always focused on trade-offs among encode complexity, decode complexity, and latency, where complexity means compute. If every target device ran a 4090 at full power, we could go far below 22kbps with traditional codec techniques for content like this. 22kbps isn't particularly impressive given that level of compute.
This is my field, and trust me we (MPEG committees, AOM) look at "AI" based models, including GANs constantly. They don't yet look promising compared to traditional methods.
Oh, and benchmarking against a video compression standard that's over twenty years old doesn't do much for the plausibility of these methods either.
Learned video codecs definitely do look promising: Microsoft's DCVC-FM (https://github.com/microsoft/DCVC) beats H.267 in BD-rate. Another benefit of the learned approach is being able to run on soon-to-be-commodity NPUs, with no special hardware accommodation required.
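For anyone not familiar with BD-rate: it's the average bitrate difference between two codecs over a shared quality range, computed from their rate-distortion curves (the standard Bjøntegaard calculation). A minimal sketch in Python, with made-up RD points purely for illustration:

    import numpy as np

    def bd_rate(rates_ref, psnrs_ref, rates_test, psnrs_test):
        # Average bitrate change (%) of the test codec vs. the reference
        # over the overlapping PSNR range. Negative = test needs fewer bits.
        lr_ref, lr_test = np.log10(rates_ref), np.log10(rates_test)

        # Fit cubic polynomials: log-rate as a function of PSNR.
        p_ref = np.polyfit(psnrs_ref, lr_ref, 3)
        p_test = np.polyfit(psnrs_test, lr_test, 3)

        # Integrate both fits over the PSNR interval they both cover.
        lo = max(min(psnrs_ref), min(psnrs_test))
        hi = min(max(psnrs_ref), max(psnrs_test))
        int_ref = np.polyval(np.polyint(p_ref), hi) - np.polyval(np.polyint(p_ref), lo)
        int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)

        # Average log-rate difference -> percentage bitrate change.
        avg_diff = (int_test - int_ref) / (hi - lo)
        return (10 ** avg_diff - 1) * 100

    # Hypothetical (kbps, dB) points, not real codec measurements:
    print(bd_rate([100, 200, 400, 800], [32.0, 35.1, 37.8, 40.2],
                  [ 90, 180, 350, 700], [32.2, 35.3, 38.0, 40.4]))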
In the CLIC challenge, hybrid codecs (traditional + learned components) have been the best so far, so that has been a letdown for pure end-to-end learned codecs, agreed. But something like H.267 isn't cheap to run either.
This is intuitive; as the competition organisers put it, compression is prediction.
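"Compression is prediction" can be made concrete: under arithmetic coding, a symbol the model predicts with probability p costs about -log2(p) bits, so a sharper predictor (learned or hand-designed) directly shrinks the bitstream. A toy sketch with illustrative, made-up probabilities:

    import math

    def ideal_code_length(probs_of_actual_symbols):
        # Total bits an ideal arithmetic coder would spend, given the
        # probability the model assigned to each symbol that actually occurred.
        return sum(-math.log2(p) for p in probs_of_actual_symbols)

    # Same 4-symbol message under two models:
    weak_model   = [0.25, 0.25, 0.25, 0.25]   # uniform guess -> 8 bits
    strong_model = [0.90, 0.80, 0.95, 0.85]   # confident, correct predictions -> ~0.8 bits
    print(ideal_code_length(weak_model), ideal_code_length(strong_model))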
To see what a lossy generator hallucinating patterns means in practice, I recommend comparing HiFiC output against the originals here: https://hific.github.io/