I’m grateful for these summaries and discussions. Having only just dived into the reading group, I ask forgiveness if I am retreading some prior comment.
It seems to me that “human values” and “oversight” often go unexamined as we consider the risks and utility of superintelligence. I mean no disrespect to Katja in saying that (she’s summarizing, after all), but to say “human values” is either to reduce to the common denominator of our evolutionary psychology or to ignore the vast cultural and ideological diversity of humanity.
Either way, it’s a real problem. Evo-psych clearly shows that we are Darwinian creatures, and not the self-sacrificing, haplodiploid bee-type, either. By and large, we cooperate for selfish motives, and we greatly favor our own offspring over the interests of others’ children.
Men tend to fight (or compete, depending on the local social dynamics) for dominance and to win the most sought-after women. Women tend to use their social-manipulation skills to advance their own reproductive interests.
That’s the evo-psych sketch (with all the limitations of a sketch). Culture influences and often overwhelms those underlying instincts. Viz., the celibate priest. But that’s hardly a comfort.
Consider just two scenarios. In one, a narcissistic, aggressive male is placed in charge of the superintelligent oracle or genie. In imagining the consequences, there are plenty of examples to consider, from Henry VIII to Stalin to Kim Jong-un. What’s especially sobering, however, is to note that these types are far from rare in the gene pool. In our society they tend to be constrained by our institutions. Take off those constraints and you get … drug kingpins, the Wolf of Wall Street, the CIA torture program, and so on. Put a highly successful male in charge of the superintelligence program, therefore, and you have a high probability of a dictator.
On the other hand, imagine a superintelligence guided by the “human values” of a fundamentalist Christian or an Islamist. Those are cultural overlays, to be sure, but not ones that promise a happy outcome.
So, a major part of the puzzle, it seems to me, is figuring out how to have humanistic and rational governance—to the extent that governance is possible—over a superintelligence of any kind. If anything, it militates against the safe-seeming oracle and creates an incentive for some kinds of autonomy—the ability to refuse a genocidal command, for example.
Regards,
Clay Farris Naff, Science and Religion Writer
I think that there is relevant discussion further on in the book (Chapter 13) regarding Coherent Extrapolated Volition. It’s kind of an attempt to specify human values to the AI so it can figure out what the values are in a way that takes everyone into account and avoids the problem of one individual’s current values dominating the system (with a lot more nuance to it). If executed correctly, it ought to work even if the creators are mistaken about human values in some way.
Exactly! Bostrom seems to start the discussion from the point of humans having achieved a singleton as a species, in which case a conversation at this level would make more sense. But it seems that in order to operate as a unit, competing humans would have to work on the principle of a nuclear trigger, where separate agents have to act in unison in order to launch. Thus we face the same problem with ourselves: how to know everyone in the keychain is honest? If the AI is anywhere near capable of taking control, it may do so even partially, and from there it could wrangle the keys from the other players as needed. Competitive players are not likely to be cooperative unless they see some unfair advantage accruing to them in the future. (Why help the enemy advance unless we can see a way of gaining on them?) As long as we have human enemies, especially as our tools become increasingly powerful, the AI just needs to divide and conquer. Curses, foiled again!