Didn’t the AI-2027 forecast imply that it’s the race between the leaders that threatens to misalign the AIs and destroy mankind? In the forecast itself, the two leading countries are the USA and China, and China, alas, has somewhat less compute than OpenBrain, the leading American company, alone. This drives the arms race between OpenBrain and DeepCent, the collective holding all of China’s compute. As a result, DeepCent stubbornly refuses to implement the measures that would let it end up with an aligned AI.
Suppose that the probability that the AI created by a racing company is misaligned is p, while[1] the probability that the AI created by a perfectionist monopolist is misaligned is q. Then P(Consensus-1 is misaligned | OpenBrain and DeepCent race) is p^2, whereas the probability that a perfectionist misaligns its ASI is q. So we would have to compare q with p^2, not with p.
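To make the step to p^2 explicit (this is my reading of the implicit assumption; AI-2027 does not spell it out): suppose the two projects’ alignment outcomes are independent, and that Consensus-1, the successor the two racing systems jointly produce, ends up misaligned only if both of them are misaligned. Then

$$P(\text{Consensus-1 misaligned}\mid\text{race}) = P(\text{OpenBrain AI misaligned})\cdot P(\text{DeepCent AI misaligned}) = p\cdot p = p^2.$$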
The authors of AI-2027 assume that p is close to 1 and q is close to zero.
It might also depend on the compute budget of the perfectionist. Another complication is that p could also differ for the USA if, say, OpenAI, Anthropic, GDM, and xAI decide to have their four AIs co-design the successor, while DeepCent, having far less compute, picks a single model to do the research.
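For concreteness, under the same strong assumptions as above (independent outcomes, and a jointly designed successor being misaligned only if every AI contributing to the design is misaligned), the four-lab co-design would give something like

$$P(\text{US successor misaligned}) = p_{\text{OpenAI}}\, p_{\text{Anthropic}}\, p_{\text{GDM}}\, p_{\text{xAI}} \approx p^4,$$

while DeepCent, relying on a single model, stays at p; the quantity to compare with q shrinks as more independent designers are added.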
Not sure how you get P(Consensus-1 is misaligned | race) = p^2 (EDIT: this was explained to me and I was just being silly). Maybe there’s an implicit assumption there I’m not seeing.
But I agree that this is roughly what AI-2027 says. It also does seem that racing in itself causes bad outcomes.
However, I feel like “create a parliament of AIs that are all different and hope one of them is sorta aligned” is a different strategy, based on very different assumptions? (It is a strategy I do not agree with, but I think the AI-2027 logic would be unlikely to convince a person who agreed with it, and that would be reasonable given such a person’s priors.)