cousin_it comments on Two agents can have the same source code and optimise different utility functions

cousin_it 11 Jul 2018 0:02 UTC
3 points
0
Yeah. But I think that can only happen if the agents aren’t very smart :-) A very smart agent would’ve self-modified at startup before seeing any observations, and adopted a utility function that’s a weighted sum of selfish utilities of all copies, with the weights given by its prior of being this or that copy.
- Joar Skalse 11 Jul 2018 0:55 UTC
  3 points
  0
  Parent
  Yes, agents with different preferences are incentivised to cooperate provided that the cost of enforcing cooperation is less than the cost of conflict. Agreeing to adopt a shared utility function via acausal trade might potentially be a very cheap way to enforce cooperation, and some agents might do this just based on their prior. However, this is true for any agents with different preferences, not just agents of the type I described. You could use the same argument to say that you are in general unlikely to find two very intelligent agents with different utility functions.
  - cousin_it 11 Jul 2018 10:24 UTC
    3 points
    0
    Parent
    Agents with identical source code will reason identically before seeing any observations, so the “acausal trade” in this case barely feels like trade at all, just making your preferences updateless over possible future observations. That’s much simpler than acausal trade between agents with different source code, which we can’t even formalize yet.
- Chris van Merwijk 11 Jul 2018 0:44 UTC
  1 point
  0
  Parent
  Here are some counterarguments:
  There can be scenario’s where the agent cannot change his source code without processing observations. e.g. the agent may need to reprogram himself via some external device.
  The agent may not be aware that there are multiple copies of him.
  It seems that for many plausible agent designs, it would require a significant change in the architecture to change his utility function. E.g. if two human sociopaths would want to change their utility function into a weighted average of the two, they couldn’t do so without significantly changing their brain architecture. A TDT agent could do this, but I think it is not prudent to assume that all actually future existing AGI’s we will deal with will be TDT’s (in fact, most likely most of them won’t be it seems to me).
  So I don’t think your comment invalidates the relevance of the point made by the poster.
  - cousin_it 11 Jul 2018 0:54 UTC
    2 points
    0
    Parent
    Yeah, I was talking mostly about idealized UDT agents, not humans.