Readit News
qeternity commented on Writing Speed-of-Light Flash Attention for 5090 in CUDA C++   gau-nernst.github.io/fa-5... · Posted by u/dsr12
doctorpangloss · 2 days ago
Hmm, but supposing the accelerated NVIDIA-specific inference data types were available in Triton, then you would just use those? Why not contribute to Triton? They accept PRs. So what if you do free product-ecosystem development for NVIDIA and giant corporations by contributing to Triton?
qeternity · 2 days ago
Second line of the post:

> The main objective is to learn writing attention in CUDA C++, since many features are not available in Triton, such as MXFP8 / NVFP4 MMA for sm120.
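The MXFP8 format mentioned here comes from the OCP microscaling spec: blocks of 32 FP8 (E4M3) elements share one power-of-two (E8M0) scale. A simplified value-level sketch in Python, which omits the actual 8-bit mantissa rounding that real hardware performs:

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite FP8 E4M3 value (1.75 * 2**8)
E4M3_EMAX = 8         # exponent of that largest value

def quantize_mxfp8_block(block):
    """Quantize one 32-element block to MXFP8: FP8 E4M3 elements sharing a
    power-of-two (E8M0) scale. Value-level simulation only; rounding of
    each element to 8-bit storage is omitted."""
    assert len(block) == 32
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return list(block), 1.0
    # Shared scale: 2**(floor(log2(amax)) - emax), so scaled values land
    # near the top of the E4M3 representable range.
    scale = 2.0 ** (math.floor(math.log2(amax)) - E4M3_EMAX)
    # Scale and clamp each element into the E4M3 range.
    q = [max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, x / scale)) for x in block]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]
```

The hardware point in the comment stands apart from this sketch: Triton exposes no way to emit the sm120 block-scaled MMA instructions that consume these formats, which is why the post drops to CUDA C++.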

qeternity commented on Mark Zuckerberg freezes AI hiring amid bubble fears   telegraph.co.uk/business/... · Posted by u/pera
Jagerbizzle · 4 days ago
I guess there's no 'M' in "FAANG" but there's this:

https://www.geekwire.com/2025/im-good-for-my-80-billion-what...

qeternity · 4 days ago
FAANG has been replaced by Mag7: Alphabet, Amazon, Apple, Broadcom, Meta, Microsoft, and Nvidia.
qeternity commented on GPT-5 vs. Sonnet: Complex Agentic Coding   elite-ai-assisted-coding.... · Posted by u/intellectronica
arcticfox · 17 days ago
> Note that Claude 4 Sonnet isn’t the strongest model from Anthropic’s Claude series. Claude Opus is their most capable model for coding, but it seemed inappropriate to compare it with GPT-5 because it costs 10 times as much.

Well - I would have been interested in GPT-5 vs. Opus. Claude Code Max is affordable with Opus.

qeternity · 17 days ago
> Claude Code Max is affordable with Opus

Because Anthropic is presumably massively subsidizing the usage.

qeternity commented on GPT-5 vs. Sonnet: Complex Agentic Coding   elite-ai-assisted-coding.... · Posted by u/intellectronica
chromejs10 · 17 days ago
This should have been compared with Opus. I know OP says he didn't because of cost, but if you're comparing who is better then you need to compare the best to the best. If Claude Opus 4.1 is significantly better than GPT-5, that could offset the extra expense. Not saying it will, but forget cost if we want to compare quality alone.
qeternity · 17 days ago
> but forget cost if we want to compare solely the quality

I think this is the whole reason not to compare it to Opus...

qeternity commented on OpenAI Leaks 120B Open Model on Hugging Face   twitter.com/main_horse/st... · Posted by u/skadamat
Nerd_Nest · 24 days ago
Whoa, 120B? That’s huge.
qeternity · 24 days ago
120B MoE. The 20B is dense.

As far as dense models go, it’s larger than many but Mistral has released multiple 120B dense models, not to mention Llama3 405B.
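The MoE/dense distinction above is about how many parameters each token actually touches. A quick arithmetic sketch with hypothetical numbers (not the leaked model's real configuration):

```python
def moe_active_params(expert_params, num_experts, experts_per_token, shared_params):
    """Parameters touched per token in a mixture-of-experts model: the shared
    (attention/embedding) weights plus only the routed experts."""
    per_expert = expert_params / num_experts
    return shared_params + experts_per_token * per_expert

# Hypothetical split for illustration: a 120B-total MoE that routes each
# token to 4 of 64 experts, with 10B shared parameters.
active = moe_active_params(expert_params=110e9, num_experts=64,
                           experts_per_token=4, shared_params=10e9)
print(f"{active / 1e9:.1f}B params active per token, of 120B total")
```

With these numbers only ~17B parameters are active per token, while a 20B dense model uses all 20B on every token, which is why a "120B MoE" is not simply six times the model a 20B dense one is.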

qeternity commented on Study mode   openai.com/index/chatgpt-... · Posted by u/meetpateltech
Spivak · a month ago
Professor might be overselling it, but lecturer for undergrad and intro graduate courses for sure.
qeternity · a month ago
I think this is overselling most professors.
qeternity commented on Claude Code weekly rate limits    · Posted by u/thebestmoshe
johnpaulkiser · a month ago
I doubt that's what they want. They want a static fixed price, $5k a month for example, and never have to think about it.
qeternity · a month ago
Take the API and assume 24/7 usage (or whatever working hours are). That’s your fixed cost.

It’s more likely that this sum is higher than they want. So really it’s not about predictability.
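The suggested ceiling is simple arithmetic: assume constant usage at API rates and that is the worst-case fixed bill. A sketch with entirely hypothetical throughput and prices:

```python
def monthly_api_ceiling(tokens_per_min, input_frac, in_price_per_mtok,
                        out_price_per_mtok, hours_per_day=24, days=30):
    """Worst-case monthly API bill assuming round-the-clock usage: the
    'fixed cost' anchor the comment describes. All inputs are hypothetical."""
    minutes = hours_per_day * 60 * days
    total_tokens = tokens_per_min * minutes
    in_tok = total_tokens * input_frac
    out_tok = total_tokens - in_tok
    return (in_tok * in_price_per_mtok + out_tok * out_price_per_mtok) / 1e6

# e.g. an agent streaming ~10k tokens/min, 80% of them input,
# at assumed $3 / $15 per million input/output tokens:
cost = monthly_api_ceiling(10_000, 0.8, 3.0, 15.0)
print(f"${cost:,.0f}/month ceiling")
```

If that ceiling comes out above the flat price a user is willing to pay, the dispute is about price level, not predictability, which is the comment's point.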

qeternity commented on People kept working, became healthier while on basic income: report (2020)   cbc.ca/news/canada/hamilt... · Posted by u/jszymborski
throaway955 · a month ago
Those labour markets are in shambles at the moment for most people who aren't upper middle class.
qeternity · a month ago
In shambles compared to when? Quality of life is the highest it's ever been across socioeconomic strata. It's just our expectations outpace reality.
qeternity commented on Cerebras launches Qwen3-235B, achieving 1.5k tokens per second   cerebras.ai/press-release... · Posted by u/mihau
Voloskaya · a month ago
There is no way not to use SRAM on a GPU/Cerebras/most accelerators. This is where the cores fetch the data.

But that doesn’t mean you are only using SRAM; that would be impractical, like using a CPU by keeping everything in the L3 cache and never going to RAM. Unless I am missing something from the original link, I don’t know how you got to the conclusion that they only used SRAM.

qeternity · a month ago
> I don’t know how you got to the conclusion that they only used SRAM.

Because they are doing 1,500 tokens per second.
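The inference from tokens/sec to SRAM is a bandwidth argument: single-stream decoding has to stream all active weights once per token, so required bandwidth is active bytes times tokens per second. A rough sketch, assuming ~22B active parameters for this MoE model at 16-bit weights (and ignoring KV cache and batching):

```python
def required_bandwidth_tb_s(active_params, bytes_per_param, tokens_per_s):
    """Memory bandwidth needed for single-stream decoding, where every
    generated token reads all active weights once. Rough model only:
    ignores KV cache traffic and any batching effects."""
    return active_params * bytes_per_param * tokens_per_s / 1e12

# ~22B active parameters, 2 bytes each, at 1,500 tokens/sec:
bw = required_bandwidth_tb_s(22e9, 2, 1500)
print(f"~{bw:.0f} TB/s required")
```

That works out to tens of TB/s, well beyond a single GPU's HBM (a few TB/s) but within reach of wafer-scale on-chip SRAM, which is the basis for the SRAM conclusion.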

qeternity commented on People kept working, became healthier while on basic income: report (2020)   cbc.ca/news/canada/hamilt... · Posted by u/jszymborski
strken · a month ago
> Lewchuk added that while some people did stop working, about half of them headed back to school in hopes of coming back to a better job.

I believe previous UBI experiments have shown the same results: most people keep working, some people stop, but they usually have decent reasons. Education, extending parental leave, or being a caregiver aren't necessarily things we want to discourage if they result in a greater return.

qeternity · a month ago
> if they result in a greater return.

Greater return than what and to whom?

We already have existing labor markets that are very capable of determining returns.

u/qeternity · karma 8641 · joined May 4, 2016