My summary of the post would be “The Friendly AI plan is to enforce human values by means of a singleton. This alternative plan suggests enforcing human values by means of social norms”.
If I were going to give a one-paragraph summary of the idea, it might be something along the lines of:
“Hey kids. Welcome to the new school. You’ve all got your guns, I see. Good. You’ll notice today that there are a few adults around carrying big guns and wearing shiny badges. They’ll be gone tomorrow. Tomorrow morning you’ll be on your own, and so you’ve got a choice to make. Tomorrow might be like the Gunfight at the OK Corral—bloody, and with few survivors left standing at the end of the day. Or you can come up with some rules for your society, and a means of enforcing them, that will improve the odds of most of you surviving to the end of the week. To give you time to think, devise, and agree on rules, we’ve provided you with some temporary sheriffs, and a draft society plan that might last a day or two. Our draft plan beats no plan, but it likely wouldn’t last a week unimproved, so we suggest you use the stress-free day we’ve gifted you to come up with an improved version. Feel free to tear it up and start from scratch, or do what you like. All we insist upon is that you take a day to think your options over carefully, before you find yourselves forced into shooting it out from survival terror.”
So, yes, there will be drift in values from whatever definition of ‘politeness’ the human plan starts them off with. But the drift will be a planned one—a direction planned cooperatively by the AIs, with their own survival (or objectives) in mind. The $64,000 question is whether their objectives are, on average, likely to be such that a stable cooperative society is seen to be in the interests of those objectives. If so, then it seems likely they have at least as good a chance as we would of devising a stable ruleset for the society, one that would deal increasingly well with the problems of drift and power imbalance.
Whether, if a majority of the initially released AIs have some fondness for humanity, this fondness would be a preserved quantity under that sort of scenario is a secondary question (albeit one of high importance to our particular species). And one I’d be interested in hearing reasoned arguments on, from either side.