Readit News
version_five · 3 years ago
Context: https://twitter.com/a1k0n/status/1674644631156555780

"I reimplemented GPT-2 from scratch in C++ as an exercise to really understand the nuts and bolts of LLMs. GPT2-117M isn't a super great model, but it's extremely satisfying to get it to generate basically the same thing as other reference implementations."

"I" refers to the guy that wrote this, I, version_five have nothing to do with it, I just thought it looked cool.

metiscus · 3 years ago
Indeed it does, thanks for posting this.
kken · 3 years ago
From the same twitter thread, a dense implementation in less than 100 lines of plain C:

https://github.com/davidar/eigenGPT/tree/c
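The workhorse of such a "dense" implementation is little more than nested-loop matrix multiplies over flat arrays. A naive sketch of that core op (my own illustration, not code from eigenGPT, which uses its own kernels):

```cpp
#include <vector>

// Naive dense matmul C = A * B, with A (m x k) and B (k x n) stored
// row-major in flat vectors. Real implementations typically call BLAS
// or use SIMD, but this loop is conceptually all a dense GPT port needs.
std::vector<float> matmul(const std::vector<float>& A,
                          const std::vector<float>& B,
                          int m, int k, int n) {
    std::vector<float> C(static_cast<std::size_t>(m) * n, 0.0f);
    for (int i = 0; i < m; ++i)
        for (int p = 0; p < k; ++p)
            for (int j = 0; j < n; ++j)
                C[i * n + j] += A[i * k + p] * B[p * n + j];
    return C;
}
```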

eclectic29 · 3 years ago
I found this extremely helpful even though it's not in C++: https://jaykmody.com/blog/gpt-from-scratch/.
JPLeRouzic · 3 years ago
Yes, it's very nice of the author to provide such a readable text. I saved it to read later.
eugenhotaj · 3 years ago
This is pretty cool. I had the same idea but in zig: https://github.com/EugenHotaj/zig_gpt2

It's not fully finished yet: I haven't gotten around to implementing BPE encoding/decoding, and only some ops use BLAS.
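For context on the missing piece (my own simplified sketch, not code from the zig repo): the decoding half of BPE is essentially a table lookup, mapping each token id to its vocabulary string and concatenating. The encoding half, which greedily applies learned merge rules, is the harder part; real GPT-2 BPE also remaps a unicode "byte encoder" alphabet back to raw bytes.

```cpp
#include <string>
#include <vector>

// Simplified BPE decode: each token id indexes a vocabulary of strings,
// and the decoded text is just their concatenation. Omits GPT-2's
// byte-level remapping for clarity.
std::string bpe_decode(const std::vector<int>& ids,
                       const std::vector<std::string>& vocab) {
    std::string out;
    for (int id : ids) out += vocab[id];
    return out;
}
```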

rhelz · 3 years ago
Well, I downloaded and compiled it (cool! Thanks!) but no matter what prompt I give it, it just prints gibberish... where do I go now to learn how to use it properly?