I think this is more like Extremal Goodhart in Garrabrant’s taxonomy: there’s a distributional shift inherent to high U.
Maybe it’s similar, but high U is not necessary.