tjake commented on Vector indexing all of Wikipedia on a laptop   foojay.io/today/indexing-... · Posted by u/tjake
gfourfour · a year ago
Ah, didn’t realize it was every language. Yes, I’m using a lightweight open model, but my use case doesn’t require anything heavyweight anyway. Wikipedia articles are very feature-dense and readily distinguishable from one another, so it doesn’t require a massive feature vector to create meaningful embeddings.
tjake · a year ago
It's 35M 1024-dimensional vectors, plus the text.
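
For scale, a quick back-of-envelope sketch (my arithmetic from the numbers in the comment above, not figures from the post): 35M vectors at 1024 F32 dimensions is roughly 143 GB before any index structure, which is why raw embedding storage alone is the hard part on a laptop.

```java
public class FootprintEstimate {
    public static void main(String[] args) {
        // Assumed from the comment above: 35M vectors, 1024 dims, F32 (4 bytes each).
        long vectors = 35_000_000L;
        long dims = 1024;
        long bytes = vectors * dims * Float.BYTES;
        System.out.printf("Raw F32 vectors: %.1f GB%n", bytes / 1e9); // ~143.4 GB
    }
}
```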
tjake commented on Jlama (Java) outperforms llama.cpp in F32 Llama 7B Model   twitter.com/tjake/status/... · Posted by u/tjake
version_five · 2 years ago
Where does the performance difference come from? And on what kind of processor and GPU? I didn't even know llama.cpp had a 32-bit option. For now I'm pretty skeptical that it's a fair comparison.
tjake · 2 years ago
The default for `convert.py` is F32. This is just a SIMD CPU comparison.

Jlama uses the Vector API in Java 20, but also does better thread scheduling with work stealing and zero allocation.
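
For readers who haven't seen it, here is a minimal sketch of the kind of SIMD kernel the Vector API (incubating in Java 20 as `jdk.incubator.vector`) enables: an illustrative dot product, not Jlama's actual code.

```java
import jdk.incubator.vector.FloatVector;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

public final class SimdDot {
    // Widest vector shape the CPU supports (e.g. 8 floats with AVX2, 16 with AVX-512).
    private static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

    static float dot(float[] a, float[] b) {
        float sum = 0f;
        int i = 0;
        // Process the array in full SIMD lanes.
        for (int bound = SPECIES.loopBound(a.length); i < bound; i += SPECIES.length()) {
            FloatVector va = FloatVector.fromArray(SPECIES, a, i);
            FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
            sum += va.mul(vb).reduceLanes(VectorOperators.ADD);
        }
        // Scalar tail for the leftover elements.
        for (; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }
}
```

Compile and run with `--add-modules jdk.incubator.vector`; the work-stealing scheduling and zero-allocation pieces tjake mentions sit on top of kernels like this.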

u/tjake

Karma: 915 · Cake day: March 12, 2008
About
coder, dreamer...

http://twitter.com/tjake
