There are no watermarks in the arena.
There are no visible watermarks, but model makers can use steganographic codes to identify outputs from their own models.
I like this benchmark because it's based on user votes, so overfitting is not as easy (after all, if users prefer your result, you've won).
https://arxiv.org/pdf/2510.06525