Yeah, I also had the idea about utility being conjunctive and mentioned it in a deleted reply to Wei, but then realized that Eliezer’s version (fragility of value) already exists and is better argued.
On the other hand, maybe the worst hellscapes can be prevented in one go if we “just” solve the problem of consciousness and tell the AI what suffering means. We don’t need all of human value for that. Hellscapes without suffering can also be pretty bad in terms of human value, but not quite as bad, I think. Of course, solving consciousness is still a very tall order, but it might be easier than solving all the philosophy required for FAI, and it could lead to other shortcuts like the ones in my recent post (not that I’d propose them seriously).
Some people at MIRI might be thinking about this under the heading of nonperson predicates. (Eliezer’s view on which computations matter morally differs from the one Brian endorses, though.) And maybe it’s important not to limit FAI options too much by preventing mindcrime at all costs: if there are benefits against other very bad failure modes (or, cooperatively, just increased controllability for the people who care a lot about utopia-type outcomes), maybe some mindcrime in the early stages to ensure goal alignment would be the lesser evil.