[only sort of on-topic] I’m concerned about coercion within CEV. It seems like to compute CEV, you’re in some sense asking for human agency to develop further, whether within a simulation, within hypothetical reasoning / extrapolation by an AI, or in meatspace. But what a human is seems to require interacting with other humans. And if you have lots of humans interacting, by default many of them will in some contexts be in some sense coercive, e.g. making threats to extort things from each other; in particular, people might push other people to cut off their own agency within CEV (or might have done that beforehand).
The CEV operation tries to return a fixed point of idealized value-reflection. Running immortal people forward inside of a simulated world is very much insufficiently idealized value-reflection, for the reasons you suggest, so simply simulating people interacting for a long time isn’t running their CEV.
How would you run their CEV? I’m saying it’s not obvious how to do it in a way that both captures their actual volition and avoids coercion. You’re saying “idealized reflection”, but what does that mean?
Yeah, fair—I dunno. I do know that one incremental improvement on simulating a bunch of people philosophizing in an environment would be to do that, but while also running an algorithm that prevents coercion, for example.
I imagine that the complete theory of these incremental improvements (for example, also not running a bunch of moral patients for many subjective years while computing the CEV) is the final theory we’re after, but I don’t have it.
Like, encoding what “coercion” is would be an expression of values. It’s more meta, and more universalizable, and stuff, but it’s still something that someone might strongly object to, and so it’s coercion in some sense. We could try to talk about what possible reflectively stable people / societies would consider as good rules for the initial reflection process, but it seems like there would be multiple fixed points, and probably some people today would have revealed preferences that distinguish those possible fixed points of reflection, still leaving open conflict.
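One way to see the multiple-fixed-points worry concretely is with a toy numerical sketch. Nothing here models actual value reflection (the update rule and numbers are invented purely for illustration), but it shows how an iterated “reflection” operator can have several stable endpoints, so which one you land on depends on where you start:

```python
# Toy illustration only: the update rule below is made up, not a model of
# real value-reflection. It has three fixed points (x = -1, 0, +1), and
# iterating it carries different starting values to different stable
# endpoints, so "run reflection to a fixed point" alone doesn't pick a
# unique answer.

def reflect(x: float) -> float:
    """One hypothetical step of 'reflection': nudge x toward -1 or +1."""
    return x - 0.1 * x * (x**2 - 1)

def run_reflection(x0: float, steps: int = 1000) -> float:
    """Iterate the reflection step until (numerically) converged."""
    x = x0
    for _ in range(steps):
        x = reflect(x)
    return round(x, 6)

# Starting points that differ only slightly end at different fixed points:
print(run_reflection(0.05))   # converges to 1.0
print(run_reflection(-0.05))  # converges to -1.0
```

The analogy is loose, of course; the point is just that when a reflection process has more than one fixed point, the process itself underdetermines the outcome, and initial conditions (including who got coerced beforehand) do the remaining work.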
Then that isn’t the CEV operation.
Cf. https://www.lesswrong.com/posts/CzufrvBoawNx9BbBA/how-to-prevent-authoritarian-revolts?commentId=3LcHA6rtfjPEQne4N