The stand-alone code generator (a statically linked executable written in Go with no dependencies outside the Go standard library) generates stand-alone POSIX C code for the neural net, requiring only gcc to compile.
Also see Fabrice Bellard's LibNC:
C API. Small library, no external dependencies, available for Linux and Windows.
If you need a little bit of inference (say, 20 ResNet50s per second per CPU core) as part of a larger system, there's nothing cheaper. If you're doing only a small amount of inference, perhaps limited by other parts of the system, you can't keep a GPU fed, and the GPU is a huge waste of money.
AVX-512, with its masked operations and dual-input permutations, is an expressive and powerful SIMD instruction set. It's a pleasure to write code for, but we need good hardware support (which is literally years overdue).
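To make "masked operations" and "dual-input permutations" concrete, here is a minimal portable C sketch that models the per-lane semantics in scalar code, so it compiles with plain gcc on any machine. It is an illustration only, not code from the generator: the function names `masked_add`, `permute2`, and `avx512_model_demo` are hypothetical. On real hardware the two operations correspond roughly to the intrinsics `_mm512_mask_add_ps` and `_mm512_permutex2var_ps`.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 16 /* a 512-bit register holds sixteen 32-bit floats */

/* Merge-masked add: lane i becomes a[i] + b[i] when bit i of mask is set,
   otherwise it keeps src[i]. A temporary allows dst to alias any input. */
void masked_add(float dst[LANES], const float src[LANES], uint16_t mask,
                const float a[LANES], const float b[LANES]) {
    float tmp[LANES];
    for (int i = 0; i < LANES; i++)
        tmp[i] = ((mask >> i) & 1) ? a[i] + b[i] : src[i];
    for (int i = 0; i < LANES; i++)
        dst[i] = tmp[i];
}

/* Dual-input permute: the low 4 bits of idx[i] pick a lane, and bit 4
   picks the source table (0 selects a, 1 selects b). */
void permute2(float dst[LANES], const float a[LANES],
              const uint32_t idx[LANES], const float b[LANES]) {
    float tmp[LANES];
    for (int i = 0; i < LANES; i++) {
        uint32_t off = idx[i] & 15;
        tmp[i] = (idx[i] & 16) ? b[off] : a[off];
    }
    for (int i = 0; i < LANES; i++)
        dst[i] = tmp[i];
}

/* Small demo: interleave the low halves of two vectors with one permute,
   then add the vectors in the low eight lanes only. Returns 0 on success. */
int avx512_model_demo(void) {
    float a[LANES], b[LANES], keep[LANES], out[LANES];
    uint32_t idx[LANES];
    for (int i = 0; i < LANES; i++) {
        a[i] = (float)i;
        b[i] = (float)(100 + i);
        keep[i] = -1.0f;
        /* even lanes read a[i/2], odd lanes read b[i/2] */
        idx[i] = (uint32_t)(i / 2) + ((i % 2) ? 16u : 0u);
    }
    permute2(out, a, idx, b);
    assert(out[0] == 0.0f && out[1] == 100.0f);
    assert(out[14] == 7.0f && out[15] == 107.0f);

    masked_add(out, keep, 0x00FF, a, b);
    assert(out[0] == 100.0f && out[7] == 114.0f); /* lanes 0..7 are summed */
    assert(out[8] == -1.0f && out[15] == -1.0f);  /* lanes 8..15 keep src */
    return 0;
}
```

The point of the sketch is that each operation does in one instruction what would otherwise need several: the mask removes a separate blend step, and the two-table permute removes the shuffle-then-merge sequence needed when a permute can read from only one register.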