I’m not sure it’s totally true, though; the public doesn’t seem that rational.
I don’t know who would be responsible for such defenses and deliberately not do it. I’m unfortunately not in charge of humanity’s strategy on AI.
If we do a bad job on those defenses just because we tend to do a bad job on things like that, that would be good evidence that we do a similarly bad job on alignment and defense against AGI or ASI.
But yes, I can see how that might go wrong: it could look like deliberate sandbagging, when we might have gotten better results if we had just mounted even a decent defense.
That’s a good point.