Psy-Kosh: Hrm. I’d think “avoid destroying the world” itself to be an ethical injunction too.
The problem is that this is phrased as an injunction over positive consequences. Deontology does better when it’s closer to the action level and negative rather than positive.
Imagine trying to give this injunction to an AI. Then it would have to do anything that it thought would prevent the destruction of the world, without other considerations. Doesn’t sound like a good idea.
So, I realize this is really old, but it helped trip the threshold for this idea I’m rolling between my palms.
Do we suspect that a proper AI would interpret “avoid destroying the world” as something like
avoid(prevent self from being cause of)
destroying(analysis indicates destruction threshold ~= 10% landmass remaining habitable, etc.)
the world(interpret as Earth, human society...)
(like a modestly intelligent genie)
or do we have reason to suspect that it would hash out the phrase to something more like how a human would read it (given that it’s speaking English, which it learned from humans)?
This idea isn’t quite fully formed yet, but I think there might be something to it.