Readit News
qeternity commented on Writing Speed-of-Light Flash Attention for 5090 in CUDA C++   gau-nernst.github.io/fa-5... · Posted by u/dsr12
doctorpangloss · 2 days ago
Hmm, but supposing the accelerated NVIDIA-specific inference data types were available in Triton, then you would just use those? Why not contribute to Triton? They accept PRs. So what if you do free product-ecosystem development for NVIDIA and giant corporations by contributing to Triton?
qeternity · 2 days ago
Second line of the post:

> The main objective is to learn writing attention in CUDA C++, since many features are not available in Triton, such as MXFP8 / NVFP4 MMA for sm120.
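The MXFP8 format mentioned here comes from the OCP microscaling spec: blocks of 32 FP8 (E4M3) elements share one power-of-two (E8M0) scale. A simplified value-level sketch in Python, which omits the actual 8-bit mantissa rounding that real hardware performs:

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite FP8 E4M3 value (1.75 * 2**8)
E4M3_EMAX = 8         # exponent of that largest value

def quantize_mxfp8_block(block):
    """Quantize one 32-element block to MXFP8: FP8 E4M3 elements sharing a
    power-of-two (E8M0) scale. Value-level simulation only; rounding of
    each element to 8-bit storage is omitted."""
    assert len(block) == 32
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return list(block), 1.0
    # Shared scale: 2**(floor(log2(amax)) - emax), so scaled values land
    # near the top of the E4M3 representable range.
    scale = 2.0 ** (math.floor(math.log2(amax)) - E4M3_EMAX)
    # Scale and clamp each element into the E4M3 range.
    q = [max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, x / scale)) for x in block]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]
```

The hardware point in the comment stands apart from this sketch: Triton exposes no way to emit the sm120 block-scaled MMA instructions that consume these formats, which is why the post drops to CUDA C++.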

qeternity commented on Mark Zuckerberg freezes AI hiring amid bubble fears   telegraph.co.uk/business/... · Posted by u/pera
Jagerbizzle · 4 days ago
I guess there's no 'M' in "FAANG" but there's this:

https://www.geekwire.com/2025/im-good-for-my-80-billion-what...

qeternity · 4 days ago
FAANG has been replaced by Mag7: Alphabet, Amazon, Apple, Broadcom, Meta, Microsoft, and Nvidia.
qeternity commented on GPT-5 vs. Sonnet: Complex Agentic Coding   elite-ai-assisted-coding.... · Posted by u/intellectronica
arcticfox · 17 days ago
> Note that Claude 4 Sonnet isn’t the strongest model from Anthropic’s Claude series. Claude Opus is their most capable model for coding, but it seemed inappropriate to compare it with GPT-5 because it costs 10 times as much.

Well - I would have been interested in GPT-5 vs. Opus. Claude Code Max is affordable with Opus.

qeternity · 17 days ago
> Claude Code Max is affordable with Opus

Because Anthropic is presumably massively subsidizing the usage.

qeternity commented on GPT-5 vs. Sonnet: Complex Agentic Coding   elite-ai-assisted-coding.... · Posted by u/intellectronica
chromejs10 · 17 days ago
This should have been compared with Opus. I know OP says he didn't because of cost, but if you're comparing who is better then you need to compare the best to the best. If Claude Opus 4.1 is significantly better than GPT-5, that could offset the extra expense. Not saying it will, but forget cost if we want to compare quality alone.
qeternity · 17 days ago
> but forget cost if we want to compare solely the quality

I think this is the whole reason not to compare it to Opus...

qeternity commented on OpenAI Leaks 120B Open Model on Hugging Face   twitter.com/main_horse/st... · Posted by u/skadamat
Nerd_Nest · 24 days ago
Whoa, 120B? That’s huge.
qeternity · 24 days ago
120B MoE. The 20B is dense.

As far as dense models go, it’s larger than many but Mistral has released multiple 120B dense models, not to mention Llama3 405B.
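The MoE/dense distinction above is about how many parameters each token actually touches. A quick arithmetic sketch with hypothetical numbers (not the leaked model's real configuration):

```python
def moe_active_params(expert_params, num_experts, experts_per_token, shared_params):
    """Parameters touched per token in a mixture-of-experts model: the shared
    (attention/embedding) weights plus only the routed experts."""
    per_expert = expert_params / num_experts
    return shared_params + experts_per_token * per_expert

# Hypothetical split for illustration: a 120B-total MoE that routes each
# token to 4 of 64 experts, with 10B shared parameters.
active = moe_active_params(expert_params=110e9, num_experts=64,
                           experts_per_token=4, shared_params=10e9)
print(f"{active / 1e9:.1f}B params active per token, of 120B total")
```

With these numbers only ~17B parameters are active per token, while a 20B dense model uses all 20B on every token, which is why a "120B MoE" is not simply six times the model a 20B dense one is.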

qeternity commented on Study mode   openai.com/index/chatgpt-... · Posted by u/meetpateltech
Spivak · a month ago
Professor might be overselling it, but lecturer for undergrad and intro graduate courses for sure.
qeternity · a month ago
I think this is overselling most professors.
qeternity commented on Claude Code weekly rate limits    · Posted by u/thebestmoshe
johnpaulkiser · a month ago
I doubt that's what they want. They want a static fixed price, $5k a month for example, and never have to think about it.
qeternity · a month ago
Take the API and assume 24/7 usage (or whatever working hours are). That’s your fixed cost.

It’s more likely that this sum is higher than they want. So really it’s not about predictability.
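The suggested ceiling is simple arithmetic: assume constant usage at API rates and that is the worst-case fixed bill. A sketch with entirely hypothetical throughput and prices:

```python
def monthly_api_ceiling(tokens_per_min, input_frac, in_price_per_mtok,
                        out_price_per_mtok, hours_per_day=24, days=30):
    """Worst-case monthly API bill assuming round-the-clock usage: the
    'fixed cost' anchor the comment describes. All inputs are hypothetical."""
    minutes = hours_per_day * 60 * days
    total_tokens = tokens_per_min * minutes
    in_tok = total_tokens * input_frac
    out_tok = total_tokens - in_tok
    return (in_tok * in_price_per_mtok + out_tok * out_price_per_mtok) / 1e6

# e.g. an agent streaming ~10k tokens/min, 80% of them input,
# at assumed $3 / $15 per million input/output tokens:
cost = monthly_api_ceiling(10_000, 0.8, 3.0, 15.0)
print(f"${cost:,.0f}/month ceiling")
```

If that ceiling comes out above the flat price a user is willing to pay, the dispute is about price level, not predictability, which is the comment's point.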

qeternity commented on People kept working, became healthier while on basic income: report (2020)   cbc.ca/news/canada/hamilt... · Posted by u/jszymborski
throaway955 · a month ago
Those labour markets are in shambles at the moment for most people who aren't upper middle class.
qeternity · a month ago
In shambles compared to when? Quality of life is the highest it's ever been across socioeconomic strata. It's just our expectations outpace reality.
qeternity commented on Cerebras launches Qwen3-235B, achieving 1.5k tokens per second   cerebras.ai/press-release... · Posted by u/mihau
Voloskaya · a month ago
There is no way not to use SRAM on a GPU/Cerebras/most accelerators. This is where the cores fetch the data.

But that doesn’t mean you are only using SRAM; that would be impractical, like using a CPU by keeping everything in the L3 cache and never going to RAM. Unless I am missing something from the original link, I don’t know how you got to the conclusion that they only used SRAM.

qeternity · a month ago
> I don’t know how you got to the conclusion that they only used SRAM.

Because they are doing 1,500 tokens per second.
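The inference from tokens/sec to SRAM is a bandwidth argument: single-stream decoding has to stream all active weights once per token, so required bandwidth is active bytes times tokens per second. A rough sketch, assuming ~22B active parameters for this MoE model at 16-bit weights (and ignoring KV cache and batching):

```python
def required_bandwidth_tb_s(active_params, bytes_per_param, tokens_per_s):
    """Memory bandwidth needed for single-stream decoding, where every
    generated token reads all active weights once. Rough model only:
    ignores KV cache traffic and any batching effects."""
    return active_params * bytes_per_param * tokens_per_s / 1e12

# ~22B active parameters, 2 bytes each, at 1,500 tokens/sec:
bw = required_bandwidth_tb_s(22e9, 2, 1500)
print(f"~{bw:.0f} TB/s required")
```

That works out to tens of TB/s, well beyond a single GPU's HBM (a few TB/s) but within reach of wafer-scale on-chip SRAM, which is the basis for the SRAM conclusion.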

qeternity commented on People kept working, became healthier while on basic income: report (2020)   cbc.ca/news/canada/hamilt... · Posted by u/jszymborski
strken · a month ago
> Lewchuk added that while some people did stop working, about half of them headed back to school in hopes of coming back to a better job.

I believe previous UBI experiments have shown the same results: most people keep working, some people stop, but they usually have decent reasons. Education, extending parental leave, or being a caregiver aren't necessarily things we want to discourage if they result in a greater return.

qeternity · a month ago
> if they result in a greater return.

Greater return than what and to whom?

We already have existing labor markets that are very capable of determining returns.

u/qeternity · karma 8641 · joined May 4, 2016