Thanks for the link!
I think there’s space for the versions of “AI control” he lays out to be impossible, while it’s still possible to build AI that makes the future go much better than it otherwise would have.
For example, one desideratum he has is that our current selves, “H0”, shouldn’t be bossed around (via the AI) by versions of ourselves that have e.g. gone through some simulated dispute-resolution procedure. That is a defensible consequence of “control,” but it’s, I think, way too strong if all we want is for the future to be good.
Thanks for your reaction!
I think this is generally my vision, after thinking about it a bit more, as well.
It also seems to me that if there’s absolutely, really no way at all to make an agent smarter than you do things that are good for you, then an agent that realizes that wouldn’t FOOM.