I left some comments noting disagreements, but I thought it would be helpful to note some areas of agreement:
I agree that AI could be (highly) risky and think that it's good to acknowledge this (as you do).
I agree that you could have a maximally obedient superintelligent AI. (Some questions around manipulation could be philosophically tricky, but this seems resolvable at least in principle.)
I agree that obedience (instruction following) is a good target, though I have some caveats about this (and some remaining uncertainties).
I agree there are substantial risks from arms races (which cause a race to the bottom on safety), AI-enabled authoritarianism (which could be ~indefinitely stable), and undesirable societal shifts due to unemployment and general upheaval. I'm most worried about misalignment risks, though these risks might be increased by these other factors, particularly arms races.
I agree that pure internal deployment, a single monopoly, and underinvestment in safety would (probably) make these risks worse, though I might think of their downsides as somewhat different from the ones you're focusing on.
I agree that the offense-defense issue can probably be handled if most intelligence is in the hands of good actors and careful (and potentially quite strong) defensive actions are taken. (I'm somewhat skeptical that sufficiently strong actions will be taken in practice, for reasons similar to those discussed here, though I don't agree with the overall bottom line of the linked post.)
Thank you! I am happy to see that there are so many points of agreement!