Very interesting paper. Thanks for sharing! I agree with several of the limitations suggested in the paper, such as the correlation between the number of uses of the oracle AI and catastrophic risk, the analogy of AI to a nuclear power plant (obviously with the former having potentially much worse consequences), and the disincentives for corporations to cooperate with containment safety measures. However, one area I would like to question you on is the potential dangers of superintelligence. It's referred to throughout the paper, but never really explicitly explained.
I agree that superintelligent AI, as opposed to human-level AI, should probably be avoided, but if we design the containment system well enough, I would like to know how having a superintelligent AI in a box would really be that dangerous. Sure, the superintelligent AI could theoretically make subtle suggestions that end up changing the world (a la the toothpaste example you use), or exploit other strategies we are not aware of, but even in the worst case I feel that still buys us valuable time to solve alignment.
Regarding open-weight models, I agree that at some point regulation has to be put in place to prevent unsafe AI development (possibly at an international level). This may not be so feasible, but regardless, I view comprehensive alignment as unlikely to be achieved before 2030, so I still see containment as the best safety strategy to pursue if existential risk mitigation is our primary concern.
The only use case of superintelligence is as a weapon against other superintelligences. Solving aging and space exploration can be done with 300 IQ.