Related work:
Interpreting Modular Addition in MLPs https://www.lesswrong.com/posts/cbDEjnRheYn38Dpc5/interpreti...
Paper Replication Walkthrough: Reverse-Engineering Modular Addition https://www.neelnanda.io/mechanistic-interpretability/modula...
And more recently, [Language Models Use Trigonometry to Do Addition](https://arxiv.org/abs/2502.00873)
The link in the paper to their Java implementation is now broken. Does anyone have a current link?