Muon Is Scalable for LLM Training

Posted by u/renonce 10 months ago

Muon Is Scalable for LLM Training github.com/MoonshotAI/Moo...

yorwba · 10 months ago

For people who want to know more about the Muon optimizer: https://kellerjordan.github.io/posts/muon/