The Mad Scientist Decision Problem

Consider Alice, the mad computer scientist. Alice has just solved general artificial intelligence and the alignment problem. On her computer she has two files, each containing a seed for a superintelligent AI: one is aligned with human values, the other is a paperclip maximizer. The two AIs differ only in their goals/values; the rest of the algorithms, including the decision procedures, are identical.

Alice decides to flip a coin. If the coin comes up heads, she starts the friendly AI; if it comes up tails, she starts the paperclip maximizer.

The coin comes up heads. Alice starts the friendly AI, and everyone rejoices. Some years later the friendly AI learns about the coinflip and about the paperclip maximizer.

Should the friendly AI counterfactually cooperate with the paperclip maximizer?
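It may help to make the trade concrete. Here is a minimal, purely illustrative sketch (not from the post itself) of how an updateless-style evaluation of the bargain could look, assuming a 50/50 coinflip and a made-up concave utility function over resources. Under a linear utility the trade would be value-neutral, so any gains from trade in this toy model come entirely from the assumed diminishing returns.

```python
import math

# Toy expected-utility model of the counterfactual bargain, with made-up
# numbers. The policy is evaluated from the vantage point *before* the
# coinflip, since both AIs share the same decision procedure, so whatever
# policy the friendly AI adopts, the paperclipper would mirror in the
# tails-world.

P_HEADS = 0.5  # probability that the friendly AI is the one launched

def human_value_utility(resource_share: float) -> float:
    # Hypothetical utility of human values as a function of the fraction of
    # resources devoted to them. The square root (diminishing returns) is a
    # pure assumption for illustration.
    return math.sqrt(resource_share)

def expected_human_utility(concession: float) -> float:
    # Policy: the launched AI devotes `concession` of its resources to the
    # other AI's values; concession = 0.0 is defection.
    u_heads = human_value_utility(1.0 - concession)  # friendly AI runs, keeps the rest
    u_tails = human_value_utility(concession)        # paperclipper runs, honors the mirrored bargain
    return P_HEADS * u_heads + (1.0 - P_HEADS) * u_tails

for c in (0.0, 0.25, 0.5):
    print(f"concession={c:.2f} -> expected human utility {expected_human_utility(c):.3f}")
```

In this toy model, full defection yields 0.500 while a 50/50 split yields about 0.707, so the counterfactual trade looks attractive to an agent that evaluates policies from behind the coinflip; an agent that only conditions on the observed outcome sees no reason to concede anything.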

What do various decision theories say in this situation?

What do you think is the correct answer?