Safetywashing describes a phenomenon that is real, inevitable, and profoundly unsurprising (I am still surprised whenever I see it, but that’s my fault for knowing something is probable and being surprised anyway). Things like this are fundamental to human systems; people who read the Sequences know this.
This post doesn’t prepare people, at all, for the complexity of how this would play out in reality. It’s possible that most posts would fail to prepare people, because these posts change goalposts: in the mundane process of following their incentives, both adversaries and wishful thinkers (and everything in between) automatically adapt around the cultural expectations set. However, it is a critical first step and vastly superior to nothing at all.
Anticipating ways for Goodhart’s law to play out in reality isn’t a nerdy hobby, and it isn’t even a way of life; it’s being an adult/agent in the real world.