Here is a game you can play by yourself or with others:
a) You have to decide on five dishes, and a recipe for each dish that can be cooked by any reasonably competent chef.
b) From tomorrow onwards, everyone on Earth can only ever eat food if it is one of those dishes, prepared according to the recipes you decided on.
c) Tomorrow, every single human on Earth, including you and everyone you know, will also have their taste buds (and related neural circuitry) randomly swapped with someone else’s.
This means that you are operating under the veil of ignorance: you should make sure that the dishes you decide on are tasty to whoever you turn out to be once the law takes effect.
Multiplayer: the first player to convince all the others of their choice of dishes wins.
Single player: If you play alone, you just need to convince yourself.
Good luck!
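As a side note, one way to make the veil-of-ignorance logic concrete is a Rawlsian maximin strategy: pick the menu that makes whoever you’d least want to wake up as, as well off as possible. Here is a minimal sketch in Python; the people, dishes, and tastiness scores are all invented for illustration.

```python
import itertools
import random

# Toy model: each person assigns a tastiness score to each candidate dish.
# Under the veil you don't know whose taste buds you'll get, so maximin
# says: choose the five dishes that maximize the satisfaction of the
# worst-off person. All names and scores below are made up.
random.seed(0)
people = [f"person_{i}" for i in range(8)]
dishes = ["dal", "pizza", "pho", "tagine", "sushi", "stew", "tacos", "congee"]
scores = {p: {d: random.uniform(0, 10) for d in dishes} for p in people}

def worst_case(menu):
    # Everyone eats their favourite dish on the menu; the menu is only as
    # good as the satisfaction of whoever fares worst.
    return min(max(scores[p][d] for d in menu) for p in people)

best_menu = max(itertools.combinations(dishes, 5), key=worst_case)
print(best_menu, round(worst_case(best_menu), 2))
```

With only 8 people and 8 dishes, brute force over all 56 five-dish menus is fine; the point is just that the game rewards optimizing for the worst draw, not for your current taste buds.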
A decent analogy for the challenge of reaching consensus.
However, the point should not focus on taste so much as on the dishes in general; focusing on taste misrepresents the idea too much. Nutrition, availability of ingredients, and so on should also factor in for the comparison to hold. Just agree on the dishes in general. You don’t need to swap taste buds; you can swap whole people instead and still reach the veil condition.
You aim for a stable default that people can live with: a minimum acceptable outcome.
That’s a key point that a lot of people are missing when it comes to AI alignment.
The scenarios people worry about most, such as the AI killing or enslaving everyone, or making paperclips with no regard for anyone who happens to be made of usable resources, are immoral by pretty much any widely held human standard. If the AI disagrees with some humans about morality, but that disagreement stays within the parameters along which modern, Western humans already disagree, the AI is for all practical purposes aligned.
The point I was trying to make was that, in my opinion, morality is not something that can be “solved”.
If I prefer Chinese and you prefer Greek, I’ll want to get Chinese and you’ll want to get Greek. There’s not much more to be said. The best we can hope for is to reach some Pareto frontier, so that we’re not deliberately screwing ourselves over; but along that Pareto frontier we’ll be pulling in opposite directions.
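A small sketch of what that Pareto frontier looks like here (Python again; all numbers invented): if ten meals a week are split between Chinese and Greek, and each of us just wants more of our own cuisine, every split is Pareto-optimal, so any move along the frontier helps one of us exactly as much as it hurts the other.

```python
# Toy model: 10 meals a week split between Chinese and Greek. My utility
# is the number of Chinese meals, yours the number of Greek meals.
splits = [(c, 10 - c) for c in range(11)]  # (chinese, greek)

def utilities(split):
    chinese, greek = split
    return (chinese, greek)  # (my utility, your utility)

def dominated(s, all_splits):
    # s is Pareto-dominated if some other split is at least as good for
    # both of us and strictly better for at least one of us.
    u_me, u_you = utilities(s)
    return any(a >= u_me and b >= u_you and (a, b) != (u_me, u_you)
               for a, b in map(utilities, all_splits))

frontier = [s for s in splits if not dominated(s, splits)]
print(frontier)  # all 11 splits: the whole line is the frontier
```

Nothing off the frontier survives (there are no wasted meals to reclaim), so all that’s left is the tug-of-war along it.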
Perhaps a better example would have been music: only one genre of music can be played from now on.