Apart from the idea that Claude 3.5 Sonnet used SAEs to improve its coding ability, I'm not aware of any actual practical use of them yet beyond Golden Gate Claude (and Golden Gate Gemma: https://x.com/swyx/status/1818711762558198130).
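For context, the "Golden Gate" trick is roughly: run the model, encode a residual-stream activation through the SAE, clamp one feature to a high value, and decode back. A toy numpy sketch of that steering step, with all weights, shapes, and names hypothetical (real APIs like Goodfire's wrap this behind feature-edit endpoints):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_features = 16, 64

# Toy SAE weights -- in practice these come from a trained SAE.
W_enc = rng.normal(size=(d_model, n_features))
W_dec = rng.normal(size=(n_features, d_model))

def sae_encode(x):
    # ReLU encoder: sparse feature activations.
    return np.maximum(x @ W_enc, 0.0)

def sae_decode(f):
    return f @ W_dec

def steer(x, feature_idx, value):
    """Clamp one SAE feature to a fixed value and reconstruct
    the residual-stream activation ("Golden Gate"-style steering)."""
    f = sae_encode(x)
    f[..., feature_idx] = value
    return sae_decode(f)

x = rng.normal(size=(d_model,))
x_steered = steer(x, feature_idx=3, value=10.0)
```

The steered reconstruction then replaces the original activation at that layer for the rest of the forward pass.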
Has anyone tried out Anthropic's matching SAE API yet? I'm wondering how it compares with Goodfire's and whether there's any known practical use.
We have a notebook about that here: https://docs.goodfire.ai/notebooks/dynamicprompts
The optillm authors suggest that the additional computation in entropix doesn't bring any better results compared with simple CoT decoding (though I'm not sure whether they also checked efficiency): https://x.com/asankhaya/status/1846736390152949966
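For reference, CoT decoding (Wang & Zhou) is cheap: branch on the top-k first tokens, greedily decode each branch, and prefer the path whose tokens have the largest top-1 vs top-2 probability margin. A toy sketch where `next_token_probs` is a hypothetical stand-in for an LM forward pass:

```python
import numpy as np

VOCAB = 20

def next_token_probs(prefix):
    # Stand-in for an LM forward pass: deterministic per prefix (hypothetical).
    local = np.random.default_rng(hash(tuple(prefix)) % (2**32))
    logits = local.normal(size=VOCAB)
    e = np.exp(logits - logits.max())
    return e / e.sum()

def cot_decode(prompt, k=5, max_len=8):
    """CoT decoding: branch on the top-k first tokens, greedy-decode each
    branch, score by the mean top1-top2 probability margin (confidence)."""
    p0 = next_token_probs(prompt)
    first_tokens = np.argsort(p0)[-k:]
    best_path, best_conf = None, -1.0
    for t in first_tokens:
        path, margins = list(prompt) + [int(t)], []
        for _ in range(max_len - 1):
            p = next_token_probs(path)
            top2 = np.sort(p)[-2:]
            margins.append(top2[1] - top2[0])  # top-1 minus top-2 probability
            path.append(int(np.argmax(p)))
        conf = float(np.mean(margins))
        if conf > best_conf:
            best_path, best_conf = path, conf
    return best_path, best_conf

path, conf = cot_decode([0], k=3)
```

No extra sampling machinery or entropy thresholds needed, which is presumably the point of the comparison.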
It looks to me like many problems with LLMs come from something like semantic leakage, or distraction by irrelevant information (as in the GSM-Symbolic paper) - maybe there is some room for improving attention too.
I wrote a couple of blog posts on these subjects: https://zzbbyy.substack.com/p/semantic-leakage-quick-notes, https://zzbbyy.substack.com/p/llms-and-reasoning, https://zzbbyy.substack.com/p/o1-inference-time-turing-machi...