Solving population ethics etc. can probably wait until we’ve escaped immediate disaster.
So there will be some way for people living inside the VR to change the AI’s values later, and it won’t just be a fixed utility function encoding whatever philosophical views the people building the AI have? If that’s the case (and you’ve managed to avoid bugs and potential issues like value drift and the AI manipulating people’s philosophical reasoning) then I’d be happy with that. But I don’t see why it’s easier than FAI. Sure, you don’t need to figure out how to translate preferences from one domain to another in order to implement it, but then you don’t need to do that to implement CEV either. You can let CEV try to figure that out, and if CEV can’t, it can do the same thing you’re suggesting here: have the FAI implement a VR universe on top of the physical one.
Your idea actually seems harder than CEV in at least one respect, because you have to solve how human-like consciousness relates to underlying physics for arbitrary laws of physics (otherwise, what happens if your AI discovers that the laws of physics are not what we think they are?), which doesn’t seem necessary to implement CEV.
The idea that CEV is simpler (because you can “let it figure things out”) is new to me! I always felt CEV was very complex and required tons of philosophical progress, much more than solving the problem of consciousness. If you think it requires less, can you sketch the argument?
I think you may have misunderstood my comment. I’m not saying CEV is simpler overall; I’m saying it’s not clear to me why your idea is simpler, if you’re including the “feature” of allowing people inside the VR to change the AI’s values. That seems to introduce problems that are analogous to the kinds of problems that CEV has. Basically you have to design your VR universe to guarantee that the people who live inside it will avoid value drift and eventually reach correct conclusions about what their values are. That’s where the main difficulty in CEV lies also, at least in my view. What philosophical progress do you think CEV requires that your idea avoids?
The way I imagined it, people inside the VR wouldn’t be able to change the AI’s values. Population ethics seems like a problem that people can solve by themselves, negotiating with each other under the VR’s rules, without help from AI.
CEV requires extracting all human preferences, extrapolating them, determining coherence, and finding a general way to map them to physics. (We need to either do it ourselves or teach the AI how to do it; the difference doesn’t matter to the argument.) The approach in my post skips most of these tasks by letting humans describe a nice normal world directly, and requires mapping only one thing (consciousness) to physics. Though I agree with you that the loss of potential utility is huge, the idea is intended as a kind of lower bound.