This metric would go up if you leave almost no comments. Would it not be better to find a metric that rewards you for generating many comments which are addressed, not just having a high relevance?
You even mention this challenge yourselves: "Sadly, even with all kinds of prompting tricks, we simply could not get the LLM to produce fewer nits without also producing fewer critical comments."
If the model were dropping critical comments along with the nits, it doesn't sound like your performance metric would reflect that.
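To make the concern concrete, here is a rough sketch with made-up numbers. The counts and the `address_rate` helper are my own illustration, and I am assuming the metric is roughly "addressed comments divided by posted comments", which may not match how the article computes it.

```python
def address_rate(addressed, total):
    """Share of posted comments that were addressed; 0.0 when nothing was posted."""
    return addressed / total if total else 0.0


# Hypothetical reviewers, numbers invented for illustration only.
cautious = {"total": 2, "addressed": 2}    # posts almost nothing
thorough = {"total": 20, "addressed": 12}  # posts more, including some nits

cautious_rate = address_rate(cautious["addressed"], cautious["total"])
thorough_rate = address_rate(thorough["addressed"], thorough["total"])

print(f"cautious: rate={cautious_rate:.2f}, addressed={cautious['addressed']}")
print(f"thorough: rate={thorough_rate:.2f}, addressed={thorough['addressed']}")
```

Under these assumptions the cautious reviewer wins on rate while getting far fewer comments addressed in absolute terms, so reporting the addressed count alongside the rate would likely give a fuller picture.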
1) They are doing it for opportunistic reasons. They can't afford to be enemies with Trump.
2) They legitimately changed their opinion about a wide array of things they used to believe strongly enough to be outspoken about.
3) They believe that while Trump's core beliefs are not aligned with theirs, the alternative is worse. They may also believe some sort of over-correction is needed to fix what has happened over the last 4 years.