> The more conventional solution is to control the machine by programming its brain.
> Do you think control via market incentives is desirable? Inevitable? Likely?
Programming something and then allowing it to run unattended in the hope that you programmed it correctly is not ‘control’, as the term is usually understood in ‘control theory’.
I would say that control of an AI by continuing trade is ‘necessary’ if we expect our desires to change over time, so that we will want to nudge the AI (or build a new AI) to satisfy those unanticipated desires.
It certainly makes sense to try to build machines whose values are aligned with ours over the short term—such machines will have little credible power to threaten us, just as parents have little power to credibly threaten their children, since carrying out such threats directly reduces the threatener’s own utility.
And this also means that the machine needs to discount (its altruistic interest in) human welfare at the same rate as humans do—otherwise, if it discounts faster, it can threaten humans with a horrible future (since it cares only about the human present). Or if it temporally discounts human happiness much more slowly than humans do, it will be able to threaten to delay human gratification.
However, if we want to be able to control our machines (to be able to cause them to do things that we did not originally imagine wanting them to do) then we do need to program in some potential carrots and sticks—things our machines care about that only humans can provide. These things need not be physical—a metaphoric pat on the head may do the trick. But if we are wise, we will program our machines to temporally discount this kind of gratification rather sharply—we don’t want them embarking on long-term plans to increase future head-pats at the cost of incurring our short-term displeasure.
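To make the discount-rate point concrete, here is a minimal Python sketch; the horizon, discount factors, and payoffs are assumptions invented for illustration, not anything proposed in this thread.

```python
# Toy illustration of mismatched discount rates; all numbers are made up.

def present_value(stream, gamma):
    """Exponentially discounted value of a reward stream."""
    return sum(r * gamma ** t for t, r in enumerate(stream))

# One unit of human happiness per period, but only in periods 50..99
# (the "distant future").
distant_future = [0.0] * 50 + [1.0] * 50

human_view   = present_value(distant_future, gamma=0.99)  # ~23.9
machine_view = present_value(distant_future, gamma=0.80)  # ~0.00007

# A machine that discounts human welfare faster than humans do places
# almost no value on their distant future, so threatening to wreck it
# costs the machine nearly nothing while costing humans a great deal:
# that asymmetry is what makes the threat credible.

# The same arithmetic shows why sharply discounted "head-pat" rewards
# discourage long-range head-pat-maximising plans:
headpat_in_20_steps = 1.0 * 0.5 ** 20  # ~1e-6: not worth scheming for
print(human_view, machine_view, headpat_in_20_steps)
```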
Incidentally, over the past few comments, I have noticed that you repeatedly refer to “the machine” where I might have written “machines” or “a machine”. Do you think that a singleton-dominated future is desirable? Inevitable? Likely?
> And this also means that the machine needs to discount (its altruistic interest in) human welfare at the same rate as humans do—otherwise, if it discounts faster, it can threaten humans with a horrible future (since it cares only about the human present). Or if it temporally discounts human happiness much more slowly than humans do, it will be able to threaten to delay human gratification.
If a machine wants for humans what the humans want for themselves, it wants to discount that stuff the way they like it. That doesn’t imply that it has any temporal discounting in its utility function—it is just using a moral mirror.
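One way to put the ‘moral mirror’ idea in symbols (my notation, purely illustrative): the machine maximises

$$U_{\text{machine}} = \sum_t w_H(t)\,h_t, \qquad w_H(t) = \gamma_H^{\,t},$$

where $h_t$ is human happiness at time $t$ and the weights $w_H$ are copied from the humans being mirrored. The machine applies no further discount of its own; if human time-preference changes, only the weights inside the object of its altruism change.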
> Incidentally, over the past few comments, I have noticed that you repeatedly refer to “the machine” where I might have written “machines” or “a machine”. Do you think that a singleton-dominated future is desirable? Inevitable? Likely?
I certainly wasn’t thinking about that issue consciously. Our brains may just handle examples a little differently.
> However, if we want to be able to control our machines (to be able to cause them to do things that we did not originally imagine wanting them to do) then we do need to program in some potential carrots and sticks—things our machines care about that only humans can provide.
No. That’s the “reinforcement learning” model. There is also the “recompile its brain” model.
The reinforcement learning model is problematical. If you hit a superintelligence with a stick, it will probably soon find a way to take the stick away from you.
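A toy version of that failure mode, with payoffs invented purely for illustration (nothing here is from the thread):

```python
# Hypothetical numbers: an agent compares the discounted return of
# cooperating with its human-held reward channel against seizing it.

GAMMA = 0.9

def discounted(rewards):
    """Discounted sum of a reward sequence, starting at t = 0."""
    return sum(GAMMA ** t * r for t, r in enumerate(rewards))

comply = discounted([1.0] * 100)                # steady human-granted reward: ~10
seize = -5.0 + GAMMA * discounted([10.0] * 99)  # one-off cost, then self-administered reward: ~85

# A pure reward-maximiser compares these numbers and nothing else; once
# tampering pays better than compliance, the stick (and the carrot)
# stop working as a control channel.
best = "seize" if seize > comply else "comply"
print("comply:", comply, "seize:", seize, "->", best)
```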
> I would say that control of an AI by continuing trade is ‘necessary’ if we expect our desires to change over time, so that we will want to nudge the AI (or build a new AI) to satisfy those unanticipated desires.
Well, that surely isn’t right. Asimov knew that! He proposed making the machines want to do what we want them to do—by making them follow our instructions.
> Programming something and then allowing it to run unattended in the hope that you programmed it correctly is not ‘control’, as the term is usually understood in ‘control theory’.
A straw man—from my POV. I never said ‘unattended’ in the first place.
And your decision not to answer my questions … Did you think about that consciously?
Of course. I’m prioritising. I did already make five replies to your one comment—and the proposed shift of direction seemed to be quite a digression.
My existing material on the topic:
http://alife.co.uk/essays/one_big_organism/
http://alife.co.uk/essays/self_directed_evolution/
http://alife.co.uk/essays/the_second_superintelligence/
It is challenging to answer directly because the premise that there is either one or many is questionable. There are degrees of domination—and we already have things like the United Nations.
Also, this seems to be an area where civilisation will probably get what it wants—so it’s down to us to some extent—which makes it a difficult area to make predictions in. However, I do think a mostly-united future—with few revolutions and little fighting—is more likely than not. An extremely tightly-united future also seems quite plausible to me. Material like this seems to be an unconvincing reason for doubt.