Why I don’t believe in doom

In my previous, very divisive post I was told more than once that I was being downvoted because I was not providing any arguments. This is an attempt to correct that and to expand on my current model of doom and AGI. I think it will be clearer in the form of a Q and A.

Q: First, what are you arguing against?

A: Against the commonly held belief (in this community) that the first AGI means doom in a very short period of time (I would say days/weeks/months).

Q: What makes you think that?

A: We live in a complex world where successfully pulling off a plan that kills everyone in a short amount of time might be beyond what is achievable, the same way that winning against AlphaZero after giving it a 20 stone handicap is impossible even for a God-like entity with infinite computational resources.

Q: But you can’t prove it is NOT possible, can you?

A: No, the same way that you can NOT prove that it is possible. We are dealing here with probabilities only, and I feel they have been uncritically thrown out of the window by assuming that an AGI will automatically take control of the world and spread its influence across the universe at the speed of light. There are many reasons why things could go a different way. You can derive from the orthogonality principle and instrumental convergence that an AGI might try to attack us, but it might very well be that it is not powerful enough and decides not to for that reason.

Q: I can think of plans to kill humans successfully.

A: So can I. I can't think of plans that successfully kill all humans in a short span of time.

Q: But an AGI would

A: Or maybe not. The fact that people can't come up with realistic plans makes me think that those plans are not that easy, because absence of evidence is evidence of absence.
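
To make that point concrete, here is a minimal Bayesian sketch with made-up numbers (the specific values are my own illustration, not a claim about the real probabilities). Let E be "easy, fast plans that kill everyone exist" and O be "after years of public discussion, nobody has exhibited one". With illustrative values P(E) = 0.5, P(O|E) = 0.3 and P(O|¬E) = 0.9:

$$P(E \mid O) = \frac{P(O \mid E)\,P(E)}{P(O \mid E)\,P(E) + P(O \mid \neg E)\,P(\neg E)} = \frac{0.3 \times 0.5}{0.3 \times 0.5 + 0.9 \times 0.5} = 0.25$$

The observation O lowers P(E) from 0.5 to 0.25; absence of evidence is (weak) evidence of absence whenever P(O|E) < P(O|¬E).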

Q: What about {evil plan}?

A: I think you don’t realise that every single plan contains multiple moving parts that can go wrong for many reasons, and an AGI would see that too. If the survival of the AGI is part of its utility function and it correctly infers that its life might be at risk, the AGI might decide not to engage in that plan, or to bide its time, potentially for years, which invalidates the premise we started with.

Q: Does it really matter if it is not 2 weeks but 2 years?

A: It matters dearly, because it gives us the opportunity to build other AGIs in that time window.

Q: What would that change? They would kill you too.

A: No, we just established that this does not happen right away. I can take those AGIs and put them to work on, for instance, AGI safety.

Q: How do you coerce the machine into doing that?

A: We don't know well enough what the machine will look like at this point, but there might be different ways. One way is to set up a competition between different AGIs. Other people might have many different ideas.

Q: The machines will trick you into thinking that they are solving the alignment problem when they aren't.

A: Or maybe not. Maybe they understand that if they provide a proof that humans cannot verify, they will be shut down.

Q: Nobody will be interested in putting the AGIs to work on this.

A: I think that, thanks to Eliezer et al.'s work over the years, many people and organisations would be more than willing to put in the money and effort once we reach that stage.

Q: I find your attitude not very constructive. Let's say the machine does not kill everyone: that would still be a problem.

A: I never said otherwise. That's why I do think it is great that people are working on this.

Q: In that case, it makes sense for us to plan for the worst-case scenario, don't you think?

A: No, because that has a cost too. If you work in cybersecurity and think that a hacker can not only steal your password but also break into your house and eat your dog, it makes sense for you to spend time and resources on building a 7 m high wall and hiring a bodyguard. That is the price you pay for having an unrealistic take on the real danger.

Q: That is unfair! How is the belief in doom harming this community?

A: There are people who despair because they feel helpless. This is bad. It is also probably preventing the exploration of certain lines of research that are quickly dismissed with "the AGI would kill you as soon as it is created".

Think of the now classic AGI-in-a-box scenario. EY made it very clear that you can't make it work because the AGI eventually escapes. I think this is wrong: the AGI eventually escapes IF the box is not good enough and IF the AGI is powerful enough. If you automatically assume that the AGI is powerful enough, you simply ignore that line of research. But those two ifs are big assumptions that can be relaxed. What if we start designing very powerful boxes?

Q: I'm really not convinced that a box is the solution to the alignment problem. What makes you so confident about it?

A: I am not convinced either, nor am I advocating for this particular line of research. I am pointing out that there are lines of research that probably go unexplored if we assume that the creation of an AGI implies our immediate death.

Q: So far I haven't seen any convincing reason why the AGI won't be that powerful. Imagine an Einstein-like brain running 1000 times faster, without getting tired.

A: That does not guarantee that you can come up with a plan that meets all the requirements we stated earlier: a high chance of killing all humans, a high chance of working out, and virtually zero margin for error or risk of retaliation.

Q: We are going in circles. Tell me, what does the universe look like to you in world A, where doom won't happen, versus world B, where doom will happen? What are the bits of evidence that would make you change your mind?

A: What are the bits of evidence that make you think that your body temperature at 12:34 on the 5th of November will be 36.45 °C and not 36.6 °C?

Q: I don't think there is any, but you aren't answering the question.

A: I am, in fact. I am saying that I don't think we can distinguish those two worlds beforehand, because it is not always possible to do that. Thinking otherwise is exactly the kind of flawed reasoning I would expect from people who have enshrined intelligence and rationality to the point where they don't see their limitations. When I read the news about progress in ML (DALL-E 2), it does not move my probabilities in any significant direction, because I am not denying that AGIs will be created; that much is obvious. But I can still think that we live in world A for the same reason I have been stating over and over: that taking over the world is not that easy.