In a previous thread I suggested starting by explicitly defining something like a CEV for a simple worm. After thinking about it, I think perhaps a norn, or some other simple hypothetical organism, might be better. To make the situation as simple as possible, start with a universe where norns are the most intelligent life in existence.
A norn (or something simpler than a norn) has explicitly defined drives, meaning the utility functions of individual norns could potentially be approximated very accurately.
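To make that concrete, here is a minimal sketch (in Python) of what "explicitly defined drives read off as a utility function" could look like. The drive names, weights, and linear form are illustrative assumptions, not taken from any actual Creatures implementation:

```python
from dataclasses import dataclass

@dataclass
class NornDrives:
    """Drive levels in [0, 1]; higher means more urgent. Names are illustrative."""
    hunger: float = 0.0
    pain: float = 0.0
    boredom: float = 0.0
    loneliness: float = 0.0

def utility(drives: NornDrives, weights=None) -> float:
    """Toy utility: the negated weighted sum of drive pressures.

    A state that leaves every drive satisfied scores 0 (the maximum);
    worse states score lower. The linear form and the weights are
    simplifying assumptions, not a claim about real norn biochemistry.
    """
    weights = weights or {"hunger": 1.0, "pain": 2.0, "boredom": 0.5, "loneliness": 0.5}
    return -sum(weights[name] * getattr(drives, name) for name in weights)

# Example: a hungry, slightly bored norn.
print(utility(NornDrives(hunger=0.8, boredom=0.3)))  # -> -0.95
```

If even this much can be written down for an organism, approximating its utility function becomes a curve-fitting problem rather than a philosophical one, which is the appeal of starting with norns.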
The biggest weakness of this idea is that a norn, or worm, or cellular automaton, can’t really participate in the process of approving or rejecting the resulting set of extrapolated solutions. For some people, I suspect, this indicates that you can’t do CEV on something that isn’t sentient. It only makes me wonder: what if we are literally too stupid to comprehend the best possible CEV that could be offered to us? I don’t think that is unlikely.
I think this doesn’t matter, if we can
1) successfully define the CEV concept itself,
2) define a suitable reference class,
3) build a superintelligence, and
4) ensure that the superintelligence continues to pursue the best CEV it can find for the appropriate reference class. (A toy sketch of how these pieces might fit together follows below.)
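Here is a deliberately toy Python sketch of where steps 1, 2, and 4 would plug in, continuing the drive-based utility idea above. Everything in it is a placeholder assumption: `extrapolate`, `cev`, and `pursue` are invented names, the identity "extrapolation" dodges the entire hard problem of step 1, and averaging is just one aggregation rule, not something CEV is committed to:

```python
from typing import Callable, List

# A world-state scorer: maps a state to how much one agent values it.
Utility = Callable[[object], float]

def extrapolate(agent_utility: Utility) -> Utility:
    # Step 1 placeholder: a real definition of CEV would have to specify
    # what it means to idealize an agent's volition (knew more, thought
    # faster, etc.). The identity function stands in for that unsolved step.
    return agent_utility

def cev(reference_class: List[Utility]) -> Utility:
    # Step 2: the reference class is represented as a list of agents'
    # utility functions. Averaging is an illustrative aggregation rule.
    extrapolated = [extrapolate(u) for u in reference_class]
    return lambda state: sum(u(state) for u in extrapolated) / len(extrapolated)

def pursue(cev_utility: Utility, candidate_states: List[object]) -> object:
    # Step 4 placeholder: the superintelligence keeps choosing the best
    # state it can find under the aggregated utility.
    return max(candidate_states, key=cev_utility)
```

The point of the sketch is only to show that steps 1 and 4 are where all the difficulty hides: step 2 is a list, and step 3 is an engineering problem this outline takes as given.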
Well, it would be helpful if we could also:
2.5) work out a reliable test for whether a given X really is an instance of the CEV concept for the given reference class.
Which seems to depend on our having some kind of understanding of that concept in the first place.
Lacking that, we are left having to trust that whatever the SI we’ve built is doing is actually what we “really want” it to do, even when we don’t seem to want it to do that, which is an awkward place to be.
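To make 2.5 concrete, here is one hypothetical shape such a test could take: a bundle of property checks that a candidate must pass. `is_cev_instance` and the checks are invented for illustration; the catch, as noted above, is that writing checks we would actually trust already presupposes the understanding we lack:

```python
from typing import Callable, Iterable

# Hypothetical interface for a 2.5-style test: a check takes a candidate
# CEV and the reference class, and returns whether a property holds.
Check = Callable[[object, Iterable], bool]

def is_cev_instance(candidate, reference_class, checks: Iterable[Check]) -> bool:
    # Example properties one might try to formalize: "no member of the
    # reference class is made strictly worse off than the status quo",
    # "the result is invariant to relabeling of agents", and so on.
    # The candidate counts as a CEV instance only if every check passes.
    return all(check(candidate, reference_class) for check in checks)
```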
You’re the first to suggest something approaching a model on this thread :-)