that your more recent writing on Goodhart-style problems suggests that you think we can deal with such problems to the best of our ability by just modelling everything we already know about our uncertainty and about our preferences (e.g., that they have diminishing returns).
To a large extent I do, but there may be some residual effects similar to the above, so some anti-optimising pressure might still be useful.