Super interesting post! I’m agnostic on whether this will happen and when, but I have something to add to the “what we should do” section.
You are basically only talking there about alignment action on the new models. I think that would be good to do, but at the same time I’m sceptical about alignment as a solution. Reasons include that I’m uncertain about the offense-defense balance in a multipolar scenario, and very sceptical that the goals we set for an ASI in a unipolar scenario will be good in the medium term (>10 yrs), even if we solve technical alignment. I don’t think humanity is ready for having a god, even a steerable god. In addition, it could of course be that technical alignment does not get solved (in time), which is a more mainstream worry on LW.
Mostly for these reasons I put more trust in a regulatory approach. In this approach, we’d first need to inform the public about the dangers of superintelligence (incl. human extinction), which is what I have worked on for the past four years, and then states would coordinate to arrive at global regulation (e.g. via our proposal, the Conditional AI Safety Treaty). By now, similar approaches are fairly mainstream at MIRI (for technical alignment reasons), in EA, FLI, PauseAI, and lots of other orgs. Hardware regulation is the most common way to enforce such treaties, with sub-approaches such as FlexHEGs and HEMs.
If AGI needed far fewer FLOPs, this would get a lot more difficult. I think it’s plausible that we arrive at this situation due to a new paradigm. Some say hardware regulation is not feasible at all anymore in such a case. I think it depends on the specifics: how many FLOPs are needed, how much societal awareness do we have, and which regulation is feasible?
I think that in addition to your “what we should do” list, we should also:
Try our best to find out how many FLOPs, how much memory, and how much money are needed for takeover-level AI (a probability distribution may be a sensible output; see the sketch below these items).
For the most likely outcomes, figure out hardware regulation plans that would likely be able to pause or switch off development if political support is available. (My org will work on this as well.)
Double down on FlexHEG/HEM hardware regulation options, while taking into account the scenario that far fewer FLOPs, and far less memory and money, might be needed than previously expected.
Double down on increasing public awareness of x-risk.
Explore options beyond hardware regulation that might succeed in enforcing a pause/off switch for a longer time, while doing as little damage as possible.
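To make the “probability distribution as output” idea in the first item above concrete, here is a minimal, purely illustrative Monte Carlo sketch in Python. All parameters (the lognormal location/scale values and the $/FLOP factor) are made-up placeholders, not actual estimates of takeover-level compute or cost; the point is only the shape of the output, i.e. percentiles rather than a single number.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 100_000

# Assumption: uncertainty over required training compute is roughly lognormal,
# spanning several orders of magnitude (placeholder: centered near 1e26 FLOPs).
log10_flops = rng.normal(loc=26.0, scale=1.0, size=n_samples)

# Assumption: cost scales with compute via an uncertain $/FLOP factor
# (placeholder: centered near 10^-17.5 dollars per FLOP).
log10_dollars_per_flop = rng.normal(loc=-17.5, scale=0.5, size=n_samples)
log10_cost = log10_flops + log10_dollars_per_flop

# Summarize the resulting distributions as percentiles instead of a point estimate.
for q in (10, 50, 90):
    print(f"P{q}: ~1e{np.percentile(log10_flops, q):.1f} FLOPs, "
          f"~$1e{np.percentile(log10_cost, q):.1f}")
```

A real version of this would replace the placeholder distributions with ones grounded in evidence (scaling-law extrapolations, expert elicitation, hardware price trends), but the same structure would let regulation plans be matched to, say, the P10 rather than the median compute requirement.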