More than demoed, they've shipped DLSS in quite a few games now. The 1.0 version was underwhelming but the 2.0 version works extremely well in practice.
However Nvidia are treating DLSS as their secret sauce and not publishing any details, so Facebook's more open research is interesting even if it's not as refined yet.
Using the 4x4 upsampling at a target resolution of 1080p, a Titan V GPU takes 24.42 ms, or 18.25 ms in "fast" mode. This blows out the 11 ms budget you have to render a frame at 90 Hz (6.9 ms at 144 Hz), and it doesn't appear to include rendering costs at all... that time is purely in upsampling.
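For reference, the frame budget is just 1000 ms divided by the refresh rate; a quick back-of-the-envelope check (the 24.42/18.25 ms figures are from the paper, everything else is arithmetic):

    # Per-frame budget vs. the quoted 4x4 upsampling cost (Titan V, 1080p target).
    upsample_ms = {"default": 24.42, "fast": 18.25}

    for hz in (90, 144):
        budget_ms = 1000.0 / hz  # ms available per frame at this refresh rate
        for mode, cost in upsample_ms.items():
            verdict = "fits" if cost <= budget_ms else "over budget"
            print(f"{hz} Hz ({budget_ms:.1f} ms/frame): {mode} {cost} ms -> {verdict}")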
Cool tech but a ways to go in order to make it useful for VR.
> it doesn't appear to include rendering costs at all... that time is purely in upsampling
That part wouldn't be an issue if the plan is to render low resolution images in the cloud and stream them to a device that can upsample them locally. There wouldn't be any local rendering costs.
I'd be very surprised if that's what it's intended for. The technique requires color, depth and motion vectors. That's three separate video channels, and two of them contain data that isn't usually stuffed into videos.
Any compression artifacts are going to stick out like a sore thumb, so you'll need to stream very high quality, and you're going to have weird interactions between different layers being compressed differently.
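Rough numbers on what those extra channels would cost to stream uncompressed (a sketch only; the 960x540 input resolution, 90 fps and per-channel bit depths are my assumptions, not anything from the paper):

    # Uncompressed bandwidth for the three per-frame inputs the upsampler needs.
    width, height, fps = 960, 540, 90  # assumed low-res input and frame rate
    bytes_per_pixel = {
        "color (8-bit RGB)": 3,
        "depth (16-bit)": 2,
        "motion vectors (2 x 16-bit)": 4,
    }
    for name, bpp in bytes_per_pixel.items():
        mbit_s = width * height * bpp * fps * 8 / 1e6
        print(f"{name}: {mbit_s:,.0f} Mbit/s uncompressed")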
Foveated rendering is in a weird spot. The software seems to be there, but mostly only in academia. The hardware is almost nowhere to be found, because it is an added expense, and people prefer spending that extra money on a better computer, since that improves every VR experience rather than just the possibility of (part of) some future experiences.
FVR needs a hook: what can it do that "dumb" VR headsets don't?
Do you need every single frame to be perfectly upsampled? Maybe there's a proportion of frames that could be rendered faster but with a less accurate method?
Oh interesting. I wonder if you could combine temporal antialiasing techniques with this to get a pseudo upsampling by only upsampling portions of the screen. Maybe focus on edges every other frame, and do different flat surfaces every few frames in between the edge passes. Then use TAA concepts to blend the pixels over frames.
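Roughly the standard TAA accumulation idea, if it helps to picture it (a toy sketch: real TAA reprojects the history with motion vectors and clamps it against the current frame's neighbourhood, both of which are skipped here, and the 0.9 weight is an arbitrary choice):

    import numpy as np

    def taa_blend(history, current, alpha=0.9):
        # Exponential accumulation: keep most of the (reprojected) history,
        # take a small fraction from the newly rendered frame.
        return alpha * history + (1.0 - alpha) * current

    rng = np.random.default_rng(0)
    truth = rng.random((4, 4, 3))                    # "ground truth" image
    accum = truth + rng.normal(0, 0.1, truth.shape)  # first noisy frame
    for _ in range(8):
        frame = truth + rng.normal(0, 0.1, truth.shape)
        accum = taa_blend(accum, frame)
    print(np.abs(accum - truth).mean())  # error drops as frames accumulate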
No idea what "some of the input" means, or why you think "Low Resolution Input" is disingenuous.
It uses color, depth and subpixel motion vectors of 1-4 previous frames. All things that modern game engines can easily calculate.
You didn't even need to read the paper to get this info, it's literally in a picture on the blog post.
Right - so a single low-res image should not be paired with the high-res one and labelled as input and output, because that implies the algorithm turned the one into the other, which it did not do.
In contrast to DLSS1, the output of the NN is not color values, but sampling locations and weights, to look up the color values from the previous low-resolution frames.
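If it helps to picture that last step: something along these lines, with predicted per-pixel weights blending colors looked up from K previous low-res frames (a toy sketch only; it ignores the predicted sampling locations and the actual upsampling, and the shapes and softmax are my own simplification):

    import numpy as np

    def blend_previous_frames(prev_frames, weights):
        # prev_frames: (K, H, W, 3) colors from K reprojected low-res frames.
        # weights:     (K, H, W) per-pixel weights that sum to 1 over K.
        return (weights[..., None] * prev_frames).sum(axis=0)

    K, H, W = 4, 2, 2
    rng = np.random.default_rng(1)
    frames = rng.random((K, H, W, 3))
    logits = rng.normal(size=(K, H, W))   # stand-in for the network's output
    weights = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)
    print(blend_previous_frames(frames, weights).shape)  # (2, 2, 3)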
There is a big difference between latency and throughput. FPS is throughput. If you assume the entire system only ever works on the current frame, then the two are directly correlated. But most systems, especially game engines/hardware, always have multiple things in flight in parallel.
The H.264 encoder on my CPU introduces >16.7ms of latency into a video stream, but it can encode hundreds of frames per second of SD video all day. Adding ~1 more frame of latency may be worth a quadrupling in image quality/resolution in most circumstances.
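To make the distinction concrete: with the stages pipelined, latency is the sum of the stages while throughput is set by the slowest one (the stage times below are made up for illustration):

    # Hypothetical per-frame stage times in ms; each stage runs on its own
    # unit, so several frames are in flight at once.
    stages_ms = {"simulate": 4.0, "render": 8.0, "upsample": 10.0}

    latency_ms = sum(stages_ms.values())      # one frame, end to end
    bottleneck_ms = max(stages_ms.values())   # slowest stage gates the pipeline
    print(f"latency ~{latency_ms} ms, throughput ~{1000.0 / bottleneck_ms:.0f} fps")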
> However Nvidia are treating DLSS as their secret sauce and not publishing any details
https://developer.nvidia.com/gtc/2020/video/s22698
Spending precious milliseconds perfecting the corners of the image for VR seems like a complete waste.
I suppose the assumption is that future work can build on this paper's contributions and make it faster.
> It uses color, depth and subpixel motion vectors of 1-4 previous frames.
The inputs are similar:
https://www.nvidia.com/content/dam/en-zz/Solutions/geforce/n...
Great start but definitely needs additional work to be usable in games.