“build superintelligence and use it to take unilateral world-scale actions in a manner inconsistent with existing law and order”
The whole point of the pivotal act framing is that you are looking for something to do that you can do with the least advanced AI system. This means it’s definitely not a superintelligence. If you have an aligned superintelligence this I think makes that framing not really make sense. The problem the framing is trying to grapple with is that we want to somehow use AI to solve AI risk, and for that we want to use the very dumbest AI that we can use for a successful plan.
I know, this is what I pointed at in footnote 1. Although “dumbest AI” is not quite right: the sort of AI MIRI envision is still very superhuman in particular domains, but is somehow kept narrowly confined to acting within those domains (e.g. designing nanobots). The rationale mostly isn’t assuming that at that stage it won’t be possible to create a full superintelligence, but assuming that aligning such a restricted AI would be easier. I have different views on alignment, leading me to believe that aligning a full-fledged superintelligence (sovereign) is actually easier (via PSI or something in that vein). On this view, we still need to contend with the question, what is the thing we will (honestly!) tell other people that our AI is actually going to do. Hence, the above.
I always thought “you should use the least advanced superintelligence necessary”. I.e., in not-real-example of “melting all GPUs” your system should be able to design nanotech advanced enough to target all GPUs in open enviroment, which is superintelligent task, while not being able to, say, reason about anthropics and decision theory.
The whole point of the pivotal act framing is that you are looking for something to do that you can do with the least advanced AI system. This means it’s definitely not a superintelligence. If you have an aligned superintelligence this I think makes that framing not really make sense. The problem the framing is trying to grapple with is that we want to somehow use AI to solve AI risk, and for that we want to use the very dumbest AI that we can use for a successful plan.
I know, this is what I pointed at in footnote 1. Although “dumbest AI” is not quite right: the sort of AI MIRI envision is still very superhuman in particular domains, but is somehow kept narrowly confined to acting within those domains (e.g. designing nanobots). The rationale mostly isn’t assuming that at that stage it won’t be possible to create a full superintelligence, but assuming that aligning such a restricted AI would be easier. I have different views on alignment, leading me to believe that aligning a full-fledged superintelligence (sovereign) is actually easier (via PSI or something in that vein). On this view, we still need to contend with the question, what is the thing we will (honestly!) tell other people that our AI is actually going to do. Hence, the above.
I always thought “you should use the least advanced superintelligence necessary”. I.e., in not-real-example of “melting all GPUs” your system should be able to design nanotech advanced enough to target all GPUs in open enviroment, which is superintelligent task, while not being able to, say, reason about anthropics and decision theory.