Agree—either we have a ludicrously broad basin for alignment and it’s easy, and would likely not require much work, or we almost certainly fail because the target is narrow, we get only one shot, and it needs to survive tons of pressures over time.
Yes.
I think this depends a lot on the quality of the “society of ASIs”. If they are nasty to each other, compete ruthlessly with one another, stand on the brink of war among themselves, and are careless with the dangerous superpowers they have, then our chances with that kind of ASI society are about zero (their own chances of survival are also very questionable in such a situation, given their supercapabilities).
If the ASIs competently address their own existential risks of destroying themselves and their neighborhood, and their society is “decent”, our chances might be quite reasonable in the limit (the transition period is still quite risky and unpredictable).
So, to the extent that it depends at all on what we do, we should perhaps spend a good chunk of AI existential safety research effort on what we can do during the period of ASI creation to increase the chances of their society being sustainably decent. They should be able to take care of that on their own, but initial conditions might matter a lot.
The rest of the AI existential safety research effort should probably focus on 1) making sure that humans are robustly included in the “circle of care” (conditional on the ASI society being decent to its own members, which should make this much more tractable), and 2) the uncertainties of the transition period. The transition period, with its intricate balances of power and great uncertainties, is much harder to understand: solving the problem in the limit is one thing, but solving the uncertain “gray zone” in between is much more difficult. That's what worries me the most; it's the nearest period in time, and the least understood.