Adele Lopez comments on Adele Lopez’s Shortform

Adele Lopez 22 Apr 2022 5:44 UTC
[Epistemic status: very speculative]

One ray of hope that I’ve seen discussed is that we may be able to do some sort of acausal trade with even an unaligned AGI, such that it will spare us (e.g. it would give a humanity-aligned AGI control of a few stars, in exchange for us giving it control of several stars in the worlds where we win).

I think Eliezer is right that this wouldn’t work.

But I think there are possible trades which don’t have this problem. Consider the scenario in which we Win, with an aligned AGI taking control of our future light-cone. Assuming the Grabby aliens hypothesis is true, we will eventually run into other civilizations, which will either have Won themselves or be AGIs that ate their mother civilizations. I think Humanity will be very sad at the loss of the civilizations that didn’t make it because they failed at the alignment problem. We might even be willing to give up several star systems to an AGI that kept its mother civilization intact on a single star system. This trade wouldn’t have the issue Eliezer brought up, since it doesn’t require us to model such an AGI correctly in advance, only that the AGI be able to model Humanity well enough to know that we would want this and would honor the implicit trade.

So symmetrically, we might hope that there are alien civilizations that both Win and value meeting other alien civilizations strongly enough to offer the same kind of trade for us. In such a scenario, “dignity points” are especially aptly named: think of how much less embarrassing it would be to have gotten a little further at solving alignment when the aliens ask us why we failed so badly.