To be sure I understand: in this story, the core actions Agent-5 takes to push the world towards the CEO’s control are very slightly sabotaging its opponents and making the media Agent-5 controls slightly biased in favor of the CEO (to the maximal extent before it becomes obvious)?
I don’t think this would be likely to make a big difference. In AI2027, Agent-5 isn’t vastly superintelligent at politics (merely superhuman) [Edit: as pointed out in the comments, this is wrong and Agent-5 is wildly superhuman at politics in AI2027, which makes the scenario more plausible to me. I’ll keep the rest of the paragraph, but applied to an AI weaker than Agent-5 at politics], and so it seems really hard for such subtle manipulations to move the probability that the CEO becomes de facto world dictator by more than a Bayes factor of 10 (which is higher than I would like, but I think the probability of an AI company becoming world dictator without AI secret loyalty is sufficiently small that even a Bayes factor of 10 does not make success likely). (But this is a low-confidence guess; my skepticism relies on my not-very-informed priors about how hard it is to shift the political balance of power by doing things like manipulating social media algorithms. Maybe it is much easier than I think it is.)
(There are many other ways things could go wrong if Agent-5 is loyal to the CEO, but I think it’s worth cataloguing the most important ones when trying to mitigate secret loyalties.)
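To make the Bayes-factor arithmetic above concrete, here is a minimal sketch (the 1% prior is my own illustrative assumption, not a number from the post). A Bayes factor multiplies the prior odds, not the probability, so a factor of 10 applied to a small prior still leaves a smallish posterior:

```python
def update_odds(prior_prob: float, bayes_factor: float) -> float:
    """Posterior probability after multiplying the prior odds by bayes_factor."""
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1.0 + posterior_odds)

# Illustrative assumption: a 1% prior that the CEO becomes de facto world
# dictator absent secret loyalties. A Bayes factor of 10 only moves this
# to roughly 9%, so success still isn't likely.
print(update_odds(0.01, 10))  # ~0.092
```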
Here are some more thoughts on a superintelligence-run persuasion campaign:
As Daniel wrote in a comment, it’s good to think of Agent-5 as distributed and able to nudge things all over the internet. The nudges could be highly personalized to demographics and individuals, and responsive to the kinds of subtle emotional triggers the superintelligence learns about each individual.
Many people’s opinions today already seem significantly shaped by social media and disinformation. This makes me think a similar process that is much more agentic, personalized, and superintelligence-optimized could be very potent.
There’s the possibility of mind-hacking too, though I chose to leave that out of the blogpost.
The CEO is probably well-positioned to take credit for a lot of the benefits Agent-5 seems to bring to the world (some of these benefits are genuine, some illusory).
In an earlier iteration of this scenario I had a military coup rather than this gradual political ascension via persuasion. But then I decided that a superintelligence capable of controlling the robots well enough to disempower the human military would probably also be powerful enough to do something less heavy-handed like what’s in the scenario.
Agent-5 isn’t vastly superintelligent at politics (merely superhuman)
Look at the November 2027 section of the forecast’s Race Ending. In December 2027, Agent-5 is supposed to have a score of 4.0 at politics and 3.9 at forecasting, meaning that it would be “wildly superhuman” at both.
My bad! Will edit.
When I read AI2027, the “November 2027: Superhuman Politicking” section felt much less extreme than “you can make someone who had a <0.01 chance of winning be the clear favorite”. I guess AI2027 didn’t want to make a very strong statement about what is possible with wildly superhuman skills, and so they used a relatively mild example (making the race continue despite Xi being willing to make large sacrifices, which seems to me to have probability >0.1 even without AI manipulation).
I am still unsure how much can be done with Agent-5. I know some people who don’t buy that you get “magical” political abilities during the first few years of the intelligence explosion (for a combination of not believing that a very fast takeoff takes you to extremely advanced political skills, and not believing that extremely advanced political skills would be that useful), but I am not very sympathetic to their views, and I agree that if you do get wildly superhuman political skills, the sort of manipulation you describe seems >0.5 likely to succeed.
FWIW I don’t think Agent-5 needs to be vastly superhuman at politics to succeed in this scenario, merely top-human level. Analogy: A single humanoid robot might need to be vastly superhuman at fighting to take out the entire US army in a land battle. But a million humanoid robots could probably do it if they were merely expert at fighting. Agent-5 isn’t a single agent, it’s a collective of millions.
But a million humanoid robots could probably do it if they were merely expert at fighting. Agent-5 isn’t a single agent, it’s a collective of millions.
Something close to this might be a big reason why Amdahl’s law / parallelization bottlenecks might not matter for the software singularity: the millions of AIs are much, much closer to one single AI doing deep serial research than to an entire human field with thousands or millions of people.
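For reference, a minimal sketch of the Amdahl’s law bound being invoked here (the serial fractions are illustrative assumptions, not claims about actual AI research workloads):

```python
def amdahl_speedup(serial_fraction: float, n_workers: int) -> float:
    """Upper bound on speedup from n_workers when serial_fraction
    of the work cannot be parallelized (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_workers)

# Even 5% unavoidably serial work caps a million workers at ~20x:
print(amdahl_speedup(0.05, 1_000_000))   # ~20.0
# If coordinated AI copies behave like one fast serial researcher, the
# effective serial fraction shrinks and the bound relaxes:
print(amdahl_speedup(0.001, 1_000_000))  # ~999
```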
There is also something here about the lack of powerful in-context learning: currently, millions of AI instances are basically one AI because they can’t change rapidly in response to new information, but once they can, they will form a tree of AIs, created by copying the instances that have insights.