Common failures aren’t common because they happen most of the time, they’re common because, conditioned on a failure happening, they’re likely.
The example is a bit contrived, but safety goals being poorly specified or outright inconsistent and contradictory seems quite plausible in general, as they have to try to incorporate input from PR, HR, legal compliance, etc. And this will always be a cost center, so minimal effort as long as it’s not making the model too painfully stupid.
Common failures aren’t common because they happen most of the time, they’re common because, conditioned on a failure happening, they’re likely.
The example is a bit contrived, but safety goals being poorly specified or outright inconsistent and contradictory seems quite plausible in general, as they have to try to incorporate input from PR, HR, legal compliance, etc. And this will always be a cost center, so minimal effort as long as it’s not making the model too painfully stupid.