That doesn’t mean making zero attempts to mitigate this, but at some point the whole effort becomes counterproductive: it creates the very context that produces what it is worried about, without giving you much chance of winning.
Curious what you’re imagining here in particular.
I agree with your overall take on AI control, but one of its elements, as I understand it, is paying or striking deals with AIs that pay out after the acute risk period ends, and, more generally, trying to interact fairly with them. That seems like at least an attempt to address the “creates what it is worried about” failure mode, so I’m not sure what you’re expecting instead.