“I am in fact, I am saying that I don’t think we can distinguish those two worlds beforehand, because is not always possible to do that.”
I don’t understand how this sentence works in context with the rest of the article, which is saying over and over again that you believe in World A and not World B. If you don’t think we can distinguish World A from World B before AGI, why are you confident in World A over World B? Shouldn’t P(World A | AGI) be the same as P(World B | AGI) if there’s no evidence we currently have that can preference one over the other?
I agree that this part of the article can be confusing. Let me put it this way:
P(World A | AGI) = 0.999
P(World B | AGI) = 0.001
So I think the evidence already makes me prefer World A over World B. What I don’t know is what observations would make me alter that posterior distribution. That’s what I meant.
I notice I’m still confused.
Whatever observations caused you to initially shift towards A, wouldn’t the opposite observation make you shift towards B? For instance, one observation that caused you to shift towards A is “I can’t think of any actionable plans an AGI could use to easily destroy humanity without a fight”. Thus, wouldn’t an observation of “I can now think of a plan”, or “I have fixed an issue in a previous plan I rejected” or “Someone else thought of a plan that meets my criteria” be sufficient to update you towards B?
Yes, hearing those plans would probably make me change my mind. But they would need to be bulletproof plans; otherwise they would fall into the category of “probably not doable in practice / too risky / too slow.” Thank you for engaging constructively anyway.
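The update being debated above can be made concrete with a small odds-form Bayes sketch. The 0.999/0.001 prior is from the thread; the likelihood ratio of 10 per observation is a made-up illustrative number, not something either commenter claimed:

```python
def bayes_update(p_a, likelihood_ratio_b_over_a):
    """Posterior P(World A) after an observation that is
    `likelihood_ratio_b_over_a` times more likely under World B
    than under World A (odds-form Bayes rule)."""
    p_b = 1 - p_a
    prior_odds = p_a / p_b                          # odds A : B
    posterior_odds = prior_odds / likelihood_ratio_b_over_a
    return posterior_odds / (1 + posterior_odds)

p_a = 0.999                       # prior stated in the thread
# One "someone produced a plan that meets my criteria" observation,
# assumed (hypothetically) 10x more likely in World B:
p_a = bayes_update(p_a, 10)       # ~0.990
# A second independent such observation:
p_a = bayes_update(p_a, 10)       # ~0.909
```

The point of the sketch is that a 0.999 prior is not fragile to one weak piece of evidence, but a handful of even moderately B-favoring observations erodes it quickly, which is consistent with the commenter's claim that plan-shaped observations should update toward World B.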