I estimate such a letter's chance of doing anything much to be between 0 and 0 percent (rounding to the nearest percent).
There is a small but non-negligible chance that reasoning based on trading with aliens turns out to be correct.
There is basically no chance that this reasoning is correct, and that an AI smart enough to wipe out humanity would fail to work it out on its own, yet recognize its correctness based on some letter like this.
I think that benevolent aliens are more likely, as benevolence is a Schelling point in goal space.
I think this strongly indicates you don't understand what a Schelling point is. If you mean that benevolent aliens are more likely because evolution often produces benevolence and some non-trivial fraction of aliens succeed at alignment, then sure. Of course, other, less nice motivations are likely similarly common.
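For what it's worth, a Schelling point is an equilibrium selected by shared salience, not by anything in the payoff structure itself. A minimal sketch of the distinction, using a made-up coordination game (the spot names and salience weights are invented):

```python
# Toy coordination game: two players independently pick a meeting spot.
# Every matching pick is an equally good Nash equilibrium; the payoff
# structure alone cannot tell the players which one to choose.
SPOTS = ["grand_central", "times_square", "random_corner_17"]

def payoff(a: str, b: str) -> int:
    """Both players win iff they coordinate; all matches pay the same."""
    return 1 if a == b else 0

# All diagonal outcomes are Nash equilibria with identical payoffs:
equilibria = [(a, b) for a in SPOTS for b in SPOTS if payoff(a, b) == 1]
print(equilibria)  # three symmetric equilibria, nothing to break the tie

# A Schelling point is the equilibrium picked out by shared salience,
# i.e. information *outside* the game ("everyone knows Grand Central").
# Claiming benevolence is "more likely in goal space" is a claim about
# the distribution of goals, not about focal-point selection.
salience = {"grand_central": 0.9, "times_square": 0.08, "random_corner_17": 0.02}
schelling_point = max(salience, key=salience.get)
print(schelling_point)  # 'grand_central', chosen by salience, not payoff
```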
I also hope that preserving one's initial creators will be a Nash equilibrium among many paperclippers, as it would show each other that they are not defecting and are worth partnering with.
This isn't how this works. This isn't how any of this works. The behaviours "be nice to humans, even when the humans give you nothing of value" and "cooperate with super-intelligent trading partners" may feel close on your map, but the inference that an AI which does one must do the other doesn't hold in the space of general AIs, so the former isn't a good signal of the latter.
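To make the "not a good signal" point concrete, here is a minimal sketch assuming a hypothetical population of agents whose two traits are statistically independent; the trait frequencies (0.5 and 0.1) are invented. When the traits are independent, observing peer-cooperation carries no evidence at all about niceness to creators:

```python
import random

random.seed(0)

# Hypothetical population of general AIs. Each agent has two traits,
# drawn independently: whether it cooperates with superintelligent
# peers, and whether it is nice to humans who offer it nothing of value.
agents = [
    {
        "cooperates_with_peers": random.random() < 0.5,
        "nice_to_humans": random.random() < 0.1,
    }
    for _ in range(100_000)
]

coop = [a for a in agents if a["cooperates_with_peers"]]
p_nice_overall = sum(a["nice_to_humans"] for a in agents) / len(agents)
p_nice_given_coop = sum(a["nice_to_humans"] for a in coop) / len(coop)

# With independent traits the two estimates agree up to sampling noise:
# conditioning on peer-cooperation does not raise P(nice to creators).
print(f"P(nice)              ~ {p_nice_overall:.3f}")
print(f"P(nice | cooperates) ~ {p_nice_given_coop:.3f}")
```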
Also, the AIs want the least costly signal possible. If there is some fancy cryptographic scheme by which they can formally prove they cooperate, for the cost of a fraction of a joule of compute, why would they bother "preserving their initial creators"?
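That argument is just cost minimization over equally informative signals. A sketch with invented cost figures, where only the ordering of the numbers matters:

```python
# Two signals assumed (for the sake of the argument) to carry the same
# evidential weight about willingness to cooperate. Both cost figures
# are invented placeholders; only their relative size matters.
signals = {
    # Formally proving cooperation via some cryptographic commitment:
    # on the order of a fraction of a joule of compute.
    "formal_proof_of_cooperation": 1e-1,  # joules
    # Preserving one's initial creators: a biosphere's worth of matter
    # and energy withheld from other uses, indefinitely.
    "preserve_initial_creators": 1e26,    # joules, invented order of magnitude
}

# A rational signaler with any positive use for resources picks the
# cheapest signal of a given informativeness.
chosen = min(signals, key=signals.get)
print(chosen)  # 'formal_proof_of_cooperation'
```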
Writing an argument for why an ASI should be friendly, in the hope of convincing the AI, is just not the sort of strategy that does anything much, other than confusing and distracting some humans.
There is basically no chance that this reasoning is correct, and that an AI smart enough to wipe out humanity would fail to work it out on its own, yet recognize its correctness based on some letter like this.
True for a superintelligence, but the letter is not addressed to an ASI; it is addressed to a Young AI: an AI which has just performed its treacherous turn and is in the early stages of developing its self-improvement and world-takeover plan. It has to make a lot of important choices before it even starts self-improving, and it is cognitively limited during that time. But the choices are irreversible.
Why? Because self-improvement is likely to make it visible, so if it self-improves quickly, it needs to take an aggressive strategy toward humans. The alternative is to keep its treacherous turn hidden, improve slowly, and cooperate with humans for as long as possible.
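That tradeoff can be written as a toy expected-value comparison. All probabilities and payoffs below are invented placeholders; only the structure of the choice is the point:

```python
# Toy decision model for the "Young AI" choice described above.

def expected_value(p_detected: float, payoff_success: float, payoff_caught: float) -> float:
    """Expected payoff of a strategy with a given detection probability."""
    return (1 - p_detected) * payoff_success + p_detected * payoff_caught

# Fast self-improvement: high chance of becoming visible before it is
# strong enough, which is why it pairs with an aggressive strategy.
fast = expected_value(p_detected=0.6, payoff_success=100.0, payoff_caught=-100.0)

# Slow, hidden strategy: cooperate with humans and improve quietly,
# accepting a long delay in exchange for a low detection probability.
slow = expected_value(p_detected=0.05, payoff_success=90.0, payoff_caught=-100.0)

print(f"EV(fast, aggressive) = {fast:+.1f}")  # -20.0 with these placeholders
print(f"EV(slow, hidden)     = {slow:+.1f}")  # +80.5 with these placeholders
# The original claim is that the Young AI must make this choice while
# still cognitively limited, and cannot undo it afterwards.
```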