I also think it is very unlikely that AIs 4 SDs above the human range would be controllable; I'd expect them to be able to fairly easily sabotage research they were given without humans noticing. When I think of intelligence gaps like that in humans, it feels pretty insurmountable.
Seems reasonable. It’s worth noting there might be a distinction between:
1. We can extract useful work from these systems.
2. We can prevent these systems from successfully causing egregious security failures.
I think both are unlikely but plausible (given massive effort), and that preventing egregious security failures seems plausible even without massive effort (though still unlikely).