mxwsn (u/mxwsn) - Readit News

mxwsn commented on AI is different antirez.com/news/155... · Posted by u/grep_it

mxwsn · 12 days ago

AI with ability but without responsibility is not enough for dramatic socioeconomic change, I think. For now, the critical unique power of human workers is that you can hold them responsible for things.

edit: ability without accountability is the catchier motto :)

mxwsn commented on Unlike ChatGPT, Anthropic has doubled down on Artifacts ben-mini.com/2025/claude-... · Posted by u/bewal416

mxwsn · a month ago

Has anyone come across any really cool artifacts? I'd be curious to see

mxwsn commented on Web3 Onboarding Was a Flop – and Thank Goodness tomhadley.link/blog/web3-... · Posted by u/solumos

gregmac · 2 months ago

Am I the only one struggling to decipher this?

I thought web3 was supposed to be some kind of decentralized compute, where rather than run on your own hardware or IaaS/PaaS you could make use of compute resources that vary wildly day-to-day in availability, performance, and cost, because they were somehow also mining rigs or something? But it's "decentralized" because there's not one entity running the thing.

There is not a mention of that in the article.

Is it actually supposed to just be microtransactions paid with cryptocurrency? Where's the "decentralized" part of that?

Anyway, as best I can tell, this article instead seems to be talking about how it turns out people aren't using blockchain for buying things, and it draws the (apparently) shocking conclusion "the one thing people always wanted: money that just works."

mxwsn · 2 months ago

Stablecoins transferred $27 trillion in 2024 - more than Visa and Mastercard combined. This is right in the article.

Stablecoins operate using decentralized ledgers on e.g. Ethereum which use decentralized compute. This isn't mentioned explicitly because the target audience knows this already.

mxwsn commented on Claude 4 anthropic.com/news/claude... · Posted by u/meetpateltech

throwaway314155 · 3 months ago

Gemini can beat the game?

mxwsn · 3 months ago

Gemini has beaten it already, but using a different and notably more helpful harness. The creator has said they think harness design is the most important factor right now, and that the results don't mean much for comparing Claude to Gemini.

mxwsn commented on The booming, high-stakes arms race of airline safety videos thehustle.co/originals/th... · Posted by u/gmays

mxwsn · 5 months ago

Huh, I imagined this was because of relaxing regulation.

mxwsn commented on Deep Learning Is Not So Mysterious or Different arxiv.org/abs/2503.02113... · Posted by u/wuubuu

cgdl · 5 months ago

Agreed, but PAC-Bayes or other descendants of VC theory are probably not the best explanation. The notion of algorithmic stability provides a (much) more compelling explanation. See [1] (particularly Sections 11 and 12).

[1] https://arxiv.org/abs/2203.10036

mxwsn · 5 months ago

Good read, thanks for sharing

mxwsn commented on Some thoughts on autoregressive models wonderfall.dev/autoregres... · Posted by u/Wonderfall

mxwsn · 6 months ago

> But what is the original purpose of AI research? I will speak for myself here, but I know many other AI researchers will say the same: the ultimate goal is to understand how humans think. And we think the best (or the funniest) way to understand how humans think is to try to recreate it.

Eh. To riff on Dijkstra, this is like submarine engineers saying their ultimate goal is to understand how fish swim.

mxwsn commented on AI is stifling new tech adoption? vale.rocks/posts/ai-is-st... · Posted by u/kiyanwang

mxwsn · 6 months ago

This ought to be called the qwerty effect, for how the qwerty keyboard layout can't be usurped at this point. It was at the right place at the right time, even though arguably its main design choices are no longer relevant, and there are arguably better layouts like dvorak.

Python and React may similarly be enshrined for the future, for being at the right place at the right time.

English as a language might be another example.

mxwsn commented on Mini-R1: Reproduce DeepSeek R1 "Aha Moment" philschmid.de/mini-deepse... · Posted by u/jonbaer

mxwsn · 7 months ago

What's surprising about this is how sparsely defined the rewards are. Even if the model learns the formatting reward, if it never chances upon a solution, there isn't any feedback/reward to push it to learn to solve the game more often.

So what are the chances of randomly guessing a solution?

The toy Countdown dataset here has 3 to 4 numbers, which are combined with 4 symbols (+, -, x, ÷). With 3 numbers there are 3! * 4^3 = 384 possible symbol combinations, with 4 there are 6144. By the tensorboard log [0], even after just 10 learning steps, the model already has a success rate just below 10%. If we make the simplifying assumption that the model hasn't learned anything in 10 steps, then the probability of 1 (or more) success in 80 chances (8 generations are used per step), guessing randomly for a success rate of 1/384 on 3-number problems, is 1 - (383/384)^80, or about 18.8%, and the probability of the roughly 8 successes that a ~10% rate implies is far smaller, well under one in a million. One interpretation is to take the latter as a p-value, and reject that the model's base success rate is completely random guessing - the base model already has a slightly above-chance success rate at solving the 3-number Countdown game.
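
As a rough check of the arithmetic above, here is a minimal Python sketch. It takes the n! * 4^n count at face value (parenthesization is ignored, as in the comment), assumes 10 steps x 8 generations = 80 independent guesses, and reads "just below 10%" as roughly 8 successes in 80. The function names are made up for illustration.

```python
from math import comb, factorial


def expression_count(n_numbers: int, n_ops: int = 4) -> int:
    """Orderings of the numbers times operator choices, using the n! * 4^n count."""
    return factorial(n_numbers) * n_ops ** n_numbers


def prob_at_least(k: int, trials: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(trials, p), summing the upper tail directly."""
    return sum(
        comb(trials, i) * p**i * (1 - p) ** (trials - i)
        for i in range(k, trials + 1)
    )


if __name__ == "__main__":
    trials = 10 * 8                      # assumed: 10 steps, 8 generations each
    p_guess = 1 / expression_count(3)    # random guessing on 3-number problems

    print("3-number combinations:", expression_count(3))
    print("4-number combinations:", expression_count(4))
    print("P(at least 1 success): ", prob_at_least(1, trials, p_guess))
    print("P(at least 8 successes):", prob_at_least(8, trials, p_guess))
```

Under these assumptions the at-least-one-success probability evaluates to roughly 0.19, while the chance of 8 or more successes from pure guessing is far below one in a million.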

This aligns with my intuition - I suspect that with proper prompting, LLMs should be able to solve CountDown decently OK without any training. Though maybe not a 3B model?

The model likely "parlays" its successes on 3 numbers to start to learn to solve 4 numbers. Or has it? The final learned ~50% success rate matches the frequency of 4-number problems in Jiayi Pan's CountDown dataset [1]. Phil does provide examples of successful 4-number solutions, but maybe the model hasn't become consistent at 4 numbers yet.

[0]: https://www.philschmid.de/static/blog/mini-deepseek-r1/tenso...

[1]: https://huggingface.co/datasets/Jiayi-Pan/Countdown-Tasks-3t...

mxwsn commented on I still like Sublime Text ohdoylerules.com/workflow... · Posted by u/james2doyle

mxwsn · 7 months ago

I used Sublime from 2013 to 2021. It was great. Since then, I've switched to VS Code and haven't looked back.