An Nvidia Engineer Wrote a Vulkan Driver That Works on Older Raspberry Pi

Excellent effort.

I wonder whether any compositor will support this driver so that desktop environments can take advantage of it. I have made several attempts to get a smoother desktop experience[1] on the RPi 3:

•LXDE + Openbox on Raspbian + X server

•Xfce4 + VC4 + X server + Arch Linux ARM + USB SSD

•Enlightenment + Wayland + Arch Linux ARM + USB SSD

Although Enlightenment on Wayland with OpenGL was the smoothest of them all, it wasn't usable (frequent crashes on the RPi), and since the frame buffer was limited to 2048x2048, none of them supported my 2560x1080 monitor.
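
A 2560x1080 mode has fewer pixels than a 2048x2048 surface, so it may not be obvious why it fails. A minimal sketch of the check, assuming the 2048x2048 limit applies to each dimension separately (the function name and constants are illustrative, not taken from any driver):

```python
# Assumption: the 2048x2048 limit is a per-dimension limit,
# not a limit on the total number of pixels.
MAX_FB_WIDTH = 2048
MAX_FB_HEIGHT = 2048


def fits_framebuffer(width, height, max_w=MAX_FB_WIDTH, max_h=MAX_FB_HEIGHT):
    """Return True if the mode fits within both per-dimension limits."""
    return width <= max_w and height <= max_h


if __name__ == "__main__":
    for w, h in [(1920, 1080), (2560, 1080)]:
        # Print the pixel count too, to show it is not the deciding factor.
        print(f"{w}x{h}: {w * h} pixels, fits={fits_framebuffer(w, h)}")
```

Under that assumption the ultrawide mode is rejected on width alone, even though its total pixel count is well under 2048 * 2048.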

Xfce4 + VC4 on Arch Linux is more usable, but still not as stable as default Raspbian. I didn't see any productivity merit in continuing this adventure, so I decided to reclaim the memory from the GPU and go back to a headless setup [Arch + SSD] running motionEye, which processes three 720p camera streams simultaneously at an average of ~50% CPU on all 4 cores when the feed isn't being watched live (but motion detection is active).

[1]https://abishekmuthian.com/getting-smoother-desktop-experien...

mmm_grayons · 6 years ago

I found Arch to be quite stable on mine, though I didn't use a desktop most of the time. When I did, i3 was by far the fastest, so maybe give that a try.

Abishek_Muthian · 6 years ago

I agree regarding Arch ARM; maybe it's just VC4 that's causing the issue in the desktop environment. I'll give i3 a try if I pursue this, but IMHO RPis older than the Pi 4 are best suited for headless operation.

How hard would it be to reverse engineer CUDA and make something like WINE that translates all CUDA operations into OpenCL or directly into third-party GPU instructions?

Obviously wouldn't be as fast as on NVIDIA hardware but it would potentially be much faster than the CPU versions of those pieces of software.

my123 · 6 years ago

Why reverse-engineer something that is very well documented? (And hell, PTX even has GCC and LLVM backends while we are at it.)

The issue here is manpower, and not obfuscation in any way.

aspaceman · 6 years ago

It’s a lot of work.

That’s actually it with the CUDA stuff. Nvidia made a massive investment, and is still doing so, so the work got done. OpenCL just doesn’t have the same cash backing it up.

Oh, also, this is basically what AMD’s HIP does. The issue is that the project hasn’t been moving very fast.
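
To give a feel for why this is "a lot of work", here is a toy sketch of the most naive approach: textually renaming a few CUDA built-ins to their OpenCL C spellings. The mapping table covers only a handful of well-known correspondences, and the function name is hypothetical. This is not how HIP or any real translation layer is implemented.

```python
import re

# A few well-known CUDA -> OpenCL C correspondences.
# Illustrative only; a real translator needs far more than renaming.
CUDA_TO_OPENCL = {
    "__global__": "__kernel",
    "__shared__": "__local",
    "__syncthreads()": "barrier(CLK_LOCAL_MEM_FENCE)",
    "threadIdx.x": "get_local_id(0)",
    "blockIdx.x": "get_group_id(0)",
    "blockDim.x": "get_local_size(0)",
    "gridDim.x": "get_num_groups(0)",
}

# Match longer keys first so no key is shadowed by a shorter one.
_PATTERN = re.compile(
    "|".join(re.escape(k) for k in sorted(CUDA_TO_OPENCL, key=len, reverse=True))
)


def translate_cuda_to_opencl(source: str) -> str:
    """Replace known CUDA spellings with their OpenCL C equivalents."""
    return _PATTERN.sub(lambda m: CUDA_TO_OPENCL[m.group(0)], source)


if __name__ == "__main__":
    cuda_kernel = """
__global__ void scale(float *data, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    data[i] *= factor;
}
"""
    # The output is still not valid OpenCL C: the pointer argument would
    # need an address-space qualifier such as __global, which plain text
    # substitution cannot infer.
    print(translate_cuda_to_opencl(cuda_kernel))
```

Even this tiny kernel comes out incomplete, and the sketch ignores the host-side runtime API, templates, libraries, and precompiled kernels entirely.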

namibj · 6 years ago

Prepare for loads of reverse-engineering, as there are binary-only CUDA kernels in applications. This does tend to break when new GPUs come out, but a software update takes care of that. I assume some do that for obfuscation, while others are certainly using it for performance, see e.g. https://github.com/bryancatanzaro/nervana-lib-gpu-performanc... for some practical-ish examples.

bootloop · 6 years ago

It's easier to replace the CUDA kernels with properly designed OpenCL kernels instead. Edit: For third party: That's what the driver does; it compiles OpenGL and OpenCL code into GPU machine code. In the case of Mesa, that is based on reverse engineering.

dheera · 6 years ago

> It's easier to replace the CUDA kernels

If it's your software, yes. But not if you're trying to fix an already-written framework or run other people's machine learning models.

pcwalton · 6 years ago

A few years back I wrote an N64 emulator graphics module for the Raspberry Pi because I was unhappy with the poor performance of the existing ones and I would have really wanted this. Drawcall overhead for Broadcom's official drivers is extreme: you can't have more than a single-digit number of drawcalls per frame and still maintain 60 FPS. I ended up having to go to ubershaders for everything just to maintain 30 FPS. I'm certain the hardware was capable of much more, but the drivers were holding it back.
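
The arithmetic behind that limit can be sketched as follows. The 2 ms per-draw-call overhead below is a hypothetical figure chosen for illustration, not a measurement of Broadcom's drivers, and the function names are made up for this example.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a target frame rate."""
    return 1000.0 / fps


def max_draw_calls(fps: float, per_call_overhead_ms: float,
                   other_work_ms: float = 0.0) -> int:
    """Upper bound on draw calls per frame if each costs a fixed overhead."""
    budget = frame_budget_ms(fps) - other_work_ms
    if budget <= 0:
        return 0
    return int(budget // per_call_overhead_ms)


if __name__ == "__main__":
    HYPOTHETICAL_OVERHEAD_MS = 2.0  # assumption, not a measured value
    for fps in (60, 30):
        calls = max_draw_calls(fps, HYPOTHETICAL_OVERHEAD_MS)
        print(f"{fps} FPS: {frame_budget_ms(fps):.2f} ms budget, "
              f"at most {calls} draw calls")
```

With a fixed cost per call, the only way to stay inside the frame budget is to issue fewer calls, which is what folding everything into ubershaders achieves.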

mobilio · 6 years ago

This is one of the RPi's limitations: closed-source firmware and drivers. But the hardware is much better...

dividuum · 6 years ago

Only partially true on the Pi4: The complete OpenGL stack is now open source using Mesa (it was optional on previous models). Video decoding is slowly moving over from the closed MMAL stack to KMS and V4L2. The boot firmware is still closed though.

0xfaded · 6 years ago

The VideoCore documentation for the GPU is open, and you can actually write code for the GPU (an open-source assembler exists).

app4soft · 6 years ago

> I'm certain the hardware was capable of much more, but the drivers were holding it back.

Exactly like the iPhone story, where performance on older devices was reduced "for battery life" each time the next model was rolled out.

oceanswave · 6 years ago

Or like Samsung phones, where they just stop giving you updates after 2 major releases.

soylentgraham · 6 years ago

Do you mean hardware module? Have you got any blogs or docs about this? Came across the same thing doing a baremetal kernel (fun!) and this never crossed my mind!

jeroenhd · 6 years ago

Very nice, this can speed up some of the old Pis still hanging around quite significantly if software authors start making use of Vulkan on ARM.

I wonder how Nvidia is looking at this, given their terrible anti-open-source mindset. I hope that being one of their engineers, and writing a video driver with experience gained at the company, doesn't bring the author any repercussions.

int_19h · 6 years ago

Not just old - this also supports Zero and Zero W, which are in a category of their own.

tosh · 6 years ago

repo: https://github.com/Yours3lf/rpi-vk-driver

ensiferum · 6 years ago

Just makes me wonder whether there could be any conflict of interest here and who owns the copyright. I know some of the big corps assume ownership of whatever their employees produce, even outside working hours. Or maybe the project was signed off?

squarefoot · 6 years ago

Any chance of seeing something similar ported to older Mali GPUs, such as the ones in lower-end Allwinner ARM SoCs? So far it seems only newer ones are (officially) supported. https://developer.arm.com/solutions/graphics-and-gaming/apis...

stelf · 6 years ago

Does it mean you’ll be able to speed up the GUI compositor and Firefox, for example?

It doesn't look like this supports GLSL, so that would have to be addressed first. Furthermore, there's no Vulkan backend yet for WebRender.

simcop2387 · 6 years ago

Actually it might be; Zink just got OpenGL 3.0 support.

Zink is an OpenGL implementation on top of Vulkan, and that might actually be a good way to do full OpenGL on these devices.