37ef_ced3 commented on AVX 512 will be the future   realworldtech.com/forum/?... · Posted by u/fulafel
37ef_ced3 · 3 years ago
Linus Torvalds:

  And I claim that that is the real problem with AVX-512 (and pretty much any vectorization). I personally cannot find a single benchmark that does anything I would ever do - not even remotely close. So if you aren't into some chess engine, if you aren't into parsing (but not using) JSON, if you aren't into software raytracing (as opposed to raytracing in games, which is clearly starting to take off thanks to GPU support), what else is there?
Answer? Neural net inference, e.g., https://NN-512.com

If you need a little bit of inference (say, 20 ResNet-50s per second per CPU core) as part of a larger system, there's nothing cheaper. If you're doing a small amount of inference, perhaps limited by other parts of the system, you can't keep a GPU fed, and the GPU is a huge waste of money.

AVX-512, with its masked operations and dual-input permutations, is an expressive and powerful SIMD instruction set. It's a pleasure to write code for, but we need good hardware support (which is literally years overdue).

37ef_ced3 commented on Mapping Out the HPC Dependency Chaos   arxiv.org/abs/2211.05118... · Posted by u/setheron
37ef_ced3 · 3 years ago
There are completely stand-alone systems like:

https://NN-512.com

The stand-alone code generator (a statically linked executable written in Go with no dependencies outside the Go standard library) generates stand-alone POSIX C code for the neural net, requiring only gcc to compile.

Also see Fabrice Bellard's LibNC:

https://bellard.org/libnc/

C API. Small library, no external dependencies, available for Linux and Windows.

37ef_ced3 commented on Tinygrad: A simple and powerful neural network framework   tinygrad.org/... · Posted by u/masterofsome
JacobiX · 3 years ago
I love those tiny DNN frameworks, some examples that I studied in the past (I still use PyTorch for work related projects) :

Thinc, by the creators of spaCy https://github.com/explosion/thinc

nnabla by Sony https://github.com/sony/nnabla

LibNC by Fabrice Bellard https://bellard.org/libnc/

Dlib dnn http://dlib.net/ml.html#add_layer

37ef_ced3 · 3 years ago
37ef_ced3 commented on Incremental Parsing in Go   dev-nonsense.com/posts/in... · Posted by u/willdaly
xyzzy4747 · 3 years ago
For max optimization, wouldn’t it be better to create a Rust or C library for parsing that Go links into? I personally don’t see the usefulness of trying to optimize Go itself too much as it’s handicapped by the runtime and garbage collection.
37ef_ced3 · 3 years ago
You're in for a big surprise. Try using the language.

Spend some time using Go, and you will be impressed by its performance.

You'll wonder, "Were all those haters on Hacker News misinformed?"


37ef_ced3 commented on Decompiling x86 Deep Neural Network Executables   github.com/monkbai/DNN-de... · Posted by u/matt_d
bertr4nd · 3 years ago
By “fully fused” do you mean no function call boundaries? (“Fused” is such an overloaded term)
37ef_ced3 · 3 years ago
Convolutions are fused into convolutions, elementwise operations are fused into convolutions, everything is inlined except where function calls are needed for pthread work units (and those work units are all custom/arbitrary).
37ef_ced3 commented on Decompiling x86 Deep Neural Network Executables   github.com/monkbai/DNN-de... · Posted by u/matt_d
rootw0rm · 3 years ago
exe sample?
37ef_ced3 · 3 years ago
So...

...you were unable to decipher this Hacker News comment thread...

...unable to find some C code and build it with GCC into an executable for yourself...

...but you think you can reverse engineer the executable?

37ef_ced3 commented on Decompiling x86 Deep Neural Network Executables   github.com/monkbai/DNN-de... · Posted by u/matt_d
userbinator · 3 years ago
If you can do it, show me. But I know you can't

I've been out of the cracking scene for over a decade now, but I take that as nothing less than a challenge, having seen how far publicly available decompilers have progressed.

37ef_ced3 · 3 years ago
Here is the C code for a DenseNet-121 generated by NN-512:

https://nn-512.com/browse/DenseNet121

Even if you had the C code available to you, you would have a hard time producing the input graph.

Good luck reverse engineering it after GCC has compiled it!

NN-512 has an incredibly flexible code generator. It can easily be tweaked to produce completely different code for the same convolution, so everyone can apply their own twist to defeat the reverse engineers ("the intellectual property thieves").

37ef_ced3 commented on Decompiling x86 Deep Neural Network Executables   github.com/monkbai/DNN-de... · Posted by u/matt_d
37ef_ced3 · 3 years ago
This will not be able to reverse engineer fully-customized, fully-fused neural networks generated by NN-512:

https://NN-512.com

NN-512 generates custom code for all the operations, custom units of work for the threads, custom code around tensor edges, everything is fused and unrolled and customized. If they can deduce the network graph specification from the AVX-512 code, I will be astonished.

If you can do it, show me. But I know you can't.

Anyone who cares about model privacy will use their own variant of a tool like NN-512. It's security through obscurity, but that's the best you can hope for if you are distributing an executable.


u/37ef_ced3

Karma: 898 · Cake day: December 2, 2020
About
Generate stand-alone C code for AVX-512 neural nets:

https://NN-512.com

Jonathan Aylard's NN-512 ResNet50 example:

https://github.com/jonatron/test_nn512
