lanceflt commented on Apple M3 Ultra   apple.com/newsroom/2025/0... · Posted by u/ksec
bearjaws · 6 months ago
Not sure why you're being downvoted; we already know the performance numbers due to memory bandwidth constraints on the M4 Max chips, and the same would apply here.

Going from 525 GB/s to 1000 GB/s will double the TPS at best, which is still quite low for large LLMs.

lanceflt · 6 months ago
DeepSeek R1 (full, Q1) runs at 14 t/s on an M2 Ultra, so this should be around 20 t/s.
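
A quick sketch of the scaling logic behind that estimate, assuming decode is memory-bandwidth-bound so t/s scales roughly linearly with GB/s; the ~800 GB/s M2 Ultra figure is an assumption from public specs, the rest are numbers quoted in this thread:

```python
# If token generation is memory-bandwidth-bound, tokens/s scales roughly
# linearly with memory bandwidth. 800 GB/s (M2 Ultra) is an assumed spec
# figure; 14 t/s and 1000 GB/s are the numbers quoted in this thread.

def scaled_tps(measured_tps: float, measured_bw_gbs: float,
               target_bw_gbs: float) -> float:
    """Project a measured decode speed onto a different memory bandwidth."""
    return measured_tps * target_bw_gbs / measured_bw_gbs

print(scaled_tps(14, 800, 1000))  # ~17.5 t/s, i.e. the "around 20 t/s" ballpark
```
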
lanceflt commented on Tencent Hunyuan-Large   github.com/Tencent/Tencen... · Posted by u/helloericsf
Tepix · 10 months ago
I'm no expert on these MoE models with "a total of 389 billion parameters and 52 billion active parameters". Do hobbyists stand a chance of running this model (quantized) at home? For example, on something like a PC with 128GB (or 512GB) RAM and one or two RTX 3090 24GB VRAM GPUs?
lanceflt · 10 months ago
At 4-bit, the weights take about 1GB per 2 billion parameters, so you will want 256GB RAM and at least one GPU. If you only have one server and one user, the full parameter count is what matters. (If you have multiple GPUs/servers and many users in parallel, you can shard and route so that each GPU/server only needs the active parameter count. So it's cheaper at scale.)
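
As a sketch of that arithmetic (model sizes from the thread above; KV cache and runtime overhead are ignored):

```python
# 4-bit weights are ~0.5 bytes per parameter, i.e. ~1 GB per 2B parameters.
# KV cache, activations, and runtime overhead are not counted here.

def weight_gb(params_billion: float, bits: float = 4.0) -> float:
    """Approximate storage for the weights alone, in GB."""
    return params_billion * bits / 8  # 1e9 params and 1e9 bytes/GB cancel out

print(weight_gb(389))  # total parameters: ~195 GB -> a 256 GB box fits it
print(weight_gb(52))   # active parameters: ~26 GB per GPU/server when sharded
```
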
lanceflt commented on Leak claims RTX 5090 has 600W TGP, RTX 5080 hits 400W   tomshardware.com/pc-compo... · Posted by u/quxinxin
kiririn · a year ago
Even the 250W 2080Ti (+150W Intel) is oppressive to be in the same room with during warmer months. I know it probably won't be, but it should be a hard sell in countries where air conditioning isn't standard, not to mention the noise needed to cool that much heat.
lanceflt · a year ago
I'm running a 4090 at 280W and seeing ~96% of its 450W performance. There's no need to run it at full power.
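
For reference, one way to set such a cap programmatically is through NVML, the same interface `nvidia-smi -pl 280` uses. A minimal sketch, assuming the nvidia-ml-py (`pynvml`) package is installed and the script runs with admin rights:

```python
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU

watts = pynvml.nvmlDeviceGetPowerManagementLimit(handle) / 1000
print(f"current limit: {watts:.0f} W")

# NVML takes milliwatts; this call needs root/admin to succeed
pynvml.nvmlDeviceSetPowerManagementLimit(handle, 280_000)
pynvml.nvmlShutdown()
```
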
lanceflt commented on Have Swiss scientists made a chocolate breakthrough?   bbc.co.uk/news/articles/c... · Posted by u/cmsefton
lanceflt · a year ago
This is just an ad for the Swiss chocolate industry. The only people quoted are funded directly by chocolate manufacturers.
lanceflt commented on Extracting concepts from GPT-4   openai.com/index/extracti... · Posted by u/davidbarker
realPtolemy · a year ago
Indeed, and the very last section about how they’ve now “open sourced” this research is also a bit vague. They’ve shared their research methodology and findings… But isn’t that obligatory when writing a public paper?
lanceflt · a year ago
https://github.com/openai/sparse_autoencoder

They actually open-sourced it, for GPT-2, which is an open model.
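For flavor, here is a minimal sketch of the kind of model that repo trains, a top-k sparse autoencoder over GPT-2 residual activations; the latent count and k below are illustrative assumptions, not the repo's actual configs:

```python
# Encode a residual-stream vector into many latents, keep only the top-k
# activations, and reconstruct the input from that sparse code.
import torch
import torch.nn as nn

class TopKSAE(nn.Module):
    def __init__(self, d_model: int = 768, n_latents: int = 32768, k: int = 32):
        super().__init__()
        self.k = k
        self.encoder = nn.Linear(d_model, n_latents)
        self.decoder = nn.Linear(n_latents, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        latents = torch.relu(self.encoder(x))
        # keep only the k largest activations per example; zero out the rest
        vals, idx = latents.topk(self.k, dim=-1)
        sparse = torch.zeros_like(latents).scatter(-1, idx, vals)
        return self.decoder(sparse)

sae = TopKSAE()
recon = sae(torch.randn(4, 768))  # 4 dummy GPT-2-sized residual vectors
print(recon.shape)                # torch.Size([4, 768])
```
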

lanceflt commented on Llama 3-V: Matching GPT4-V with a 100x smaller model and 500 dollars   aksh-garg.medium.com/llam... · Posted by u/minimaxir
nomel · a year ago
It's Llama 3's training cost plus their cost. Meta "kindly" covered the first $700M.

> We add a vision encoder to Llama3 8B

lanceflt · a year ago
They didn't train the vision encoder either; it's an unchanged SigLIP from Google.
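
A sketch of that recipe: a frozen, pretrained SigLIP tower with only a small projection into the LLM's embedding space left trainable. The checkpoint name here is an assumption for illustration; 4096 is Llama3-8B's hidden size:

```python
# Bolt-on VLM recipe: frozen vision encoder + trainable projector.
import torch
import torch.nn as nn
from transformers import SiglipVisionModel

vision = SiglipVisionModel.from_pretrained("google/siglip-so400m-patch14-384")
vision.requires_grad_(False)  # "unchanged SigLIP": the encoder is never trained

# the only new parameters: map vision features (1152-d for this checkpoint)
# into Llama3-8B's 4096-d token embedding space
projector = nn.Linear(vision.config.hidden_size, 4096)

pixels = torch.randn(1, 3, 384, 384)                # a dummy 384x384 image tensor
feats = vision(pixel_values=pixels).last_hidden_state
image_tokens = projector(feats)                     # sequence of pseudo-tokens for the LLM
print(image_tokens.shape)                           # [1, 729, 4096] for this checkpoint
```
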
lanceflt commented on Llama 3-V: Matching GPT4-V with a 100x smaller model and 500 dollars   aksh-garg.medium.com/llam... · Posted by u/minimaxir
lanceflt · a year ago
- LLaVA is not the SOTA open VLM; InternVL-1.5 is: https://huggingface.co/spaces/opencompass/open_vlm_leaderboa...

You need to compare the evals against strong open VLMs, including this one and CogVLM.

- This is not the "first-ever multimodal model built on top of Llama3"; there's already a LLaVA on Llama3-8B: https://huggingface.co/lmms-lab
