Not to be too harsh, but this seems like several steps backwards in designing a Friendly AI. To extend your metaphor, it’s as if NASA has started calculating the potential trajectory of a rocket and you’re running in shouting “Guys, guys, all this math is going to make the project way too confusing! We can see the moon—let’s just aim our rocket at that!”.
I think part of the problem is determining the goal of your post. Are you trying to come up with a simple, non-weird-sounding explanation of FAI that LWers can give people without sounding low-status? If so, good job. This sounds reassuring and intuitive, and I'm fairly confident I could explain it to anyone without inferential-gap problems. Or are you trying to create a blueprint for how FAI "might be pursued in the real world"? If that is your goal, I'm concerned.
Having managers, bureaucrats, and lawyers create a set of laws for an AI to follow just won't work. Human deliberation and consensus can't solve simple, human-created problems—why on earth do we think it'll solve AI problems? And constraining an AI with today's value system stifles all moral progress—possibly forever.
> Then you would get your programmers and your cognitive scientists to implement that goal condition in a way such that the symbols have the meanings that they are supposed to have.
That’s the whole problem. We don’t really know what we mean by “right”, and we don’t even know what “happiness” and “freedom” and similar concepts mean to us. There’s no reason to believe doing this would be any easier than CEV, and some evidence to suggest it’s impossible.