Thanks for posting this. I'd like to learn more about your work. I was originally drawn to the topic after hearing about Goodhart's Law from developers, even though it interestingly originated in observations on the UK's monetary policy.
I think the same thing applies to some biomarkers; cholesterol, for one.
Cholesterol is an accepted biomarker for stroke & heart disease. Thus, Big Pharma can "prove" that lowering your cholesterol by M adds N.n years to your life. Therefore, take our statins. Daily. Forever.
I'm not buying it. That's optimizing the measure. "Exercise more & lose weight" is thought impossible for the average person, or at least, unlikely to be heeded, so give them a statin instead.
That was my initial understanding, which left me confused.
But they're taking the top n according to the model, then selecting the best of those according to the proxy, not the actual objective. This avoids, with reasonable probability, the Winner's Curse problem of taking the model's single top-ranked item.
They then compare this to the highest-scoring actual preference.
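To make the Winner's Curse concrete, here's a minimal sketch (a hypothetical setup of my own, not the experiment from the post) showing that the candidate that maximizes a noisy proxy typically falls short of the candidate that maximizes the true objective:

```python
import random

random.seed(0)

def winners_curse_gap(n_candidates=1000, noise=1.0, trials=500):
    """Average gap between the best true value and the true value
    of the candidate chosen by maximizing a noisy proxy score."""
    gap = 0.0
    for _ in range(trials):
        # True values are standard normal; the proxy adds independent noise.
        true_vals = [random.gauss(0, 1) for _ in range(n_candidates)]
        proxy = [v + random.gauss(0, noise) for v in true_vals]
        # Select by the proxy, but evaluate with the true objective.
        picked_idx = max(range(n_candidates), key=lambda i: proxy[i])
        gap += max(true_vals) - true_vals[picked_idx]
    return gap / trials

print(winners_curse_gap())  # gap is typically well above zero
```

The more aggressively you optimize the proxy (larger n_candidates, noisier proxy), the larger the gap gets, which is exactly the Goodhart's-Law failure mode the thread is discussing.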
(I wrote a (non-math, non-AI) version to learn more about Goodhart's Law here: https://unintendedconsequenc.es/new-morality-of-attainment-g...)