O O comments on How could AIs ‘see’ each other’s source code?

O O 6 Jun 2023 16:26 UTC
3 points
2
I think the comparison illustrates my point because the UN is typically not seen as enforceable and negotiated interests only exist when cooperating is the best choice. For human entities that don’t have maximization goals, both cooperating is often better than both defecting.

(C,C) means both cooperate. (D,C) means defect and cooperate. In classic prisoners dilemma. (D,C) offers a higher EV than (C,C) but less than (D,D).

I don’t think there is any way to weasel around a true prisoners dilemma with adversaries. It’s a simple situation and arises naturally everywhere.
- Gunnar_Zarncke 6 Jun 2023 23:33 UTC
  2 points
  0
  Parent
  I agree that the prisoners’ dilemma occurs frequently, but I don’t think you can use it the way you seem to in the building-a-joint-successor agent. I guess we are operating with different operationalization in mind, and until those are spelled out, we will probably not agree.
  Maybe we can make some quick progress on that by going with the pay-off matrix but for each agents choice adding a probability that the choice is detected before execution. We also at least need a no-op case because presubly you can refrain from building a successor agent (in reality there would be many in-between options but to keep it manageable). I think if you multiply things out the build-agent-in-neutral place comes out on top.