About caring about other TDT agents: it feels to me like the kind of thing that should follow from the right decision theory. Here’s one idea. Imagine you’re a TDT agent that has just been started / woken up. You haven’t yet observed anything about the world, and you haven’t observed your utility function either; it’s written in a sealed envelope in front of you. You have a choice: take a peek at your utility function and at the world, or use this moment of ignorance to precommit to cooperate with everyone else who’s in the same situation. That includes all other TDT agents who ever woke up or will ever wake up and are smart enough to realize they have this choice.
It seems likely that such wide cooperation will increase total utility. Since you don’t yet know which of these agents you will turn out to be, higher total utility also means higher expected utility for each agent (ignoring anthropics for the moment). So it makes sense to make the precommitment first, and only then open your eyes and start observing the world, your utility function and so on. In your proposed problem, where a TDT agent has the opportunity to kill another TDT agent in their sleep to steal five dollars from them (destroying more utility for the other than gaining for themselves), the precommitment would stop them from doing it. Does this make sense?
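To make the expected-utility step concrete, here is a rough Python sketch of the five-dollar example. Everything in it is an assumption for illustration: the payoff numbers, the idea that utilities of different agents can be added on one scale, the 50/50 chance of ending up as killer or victim, and the names `expected_utility` and `prefers_precommitment`. It is a toy model of the argument, not a statement of what TDT actually computes.

```python
from fractions import Fraction

# Hypothetical payoffs, chosen only for illustration.
GAIN_TO_KILLER = 5       # utility the killer gets from the stolen five dollars
LOSS_TO_VICTIM = 1000    # utility the victim loses (assumed to be much larger)


def expected_utility(policy_kill: bool, p_killer: Fraction = Fraction(1, 2)) -> Fraction:
    """Expected utility for an agent that does not yet know its role.

    Assumes every agent in the same situation adopts the same policy,
    and that the agent ends up as the killer with probability p_killer
    and as the victim otherwise.
    """
    if not policy_kill:
        # Under the cooperative precommitment nobody kills, nobody gains or loses.
        return Fraction(0)
    p_victim = 1 - p_killer
    return p_killer * GAIN_TO_KILLER - p_victim * LOSS_TO_VICTIM


def prefers_precommitment(p_killer: Fraction = Fraction(1, 2)) -> bool:
    """True if precommitting to cooperate beats the 'kill when you can' policy."""
    return expected_utility(False, p_killer) > expected_utility(True, p_killer)


if __name__ == "__main__":
    print(expected_utility(True))    # policy: kill when you get the chance
    print(expected_utility(False))   # policy: precommit to cooperate
    print(prefers_precommitment())
```

With these made-up numbers the cooperative policy wins from behind the sealed envelope. The result leans on the symmetry assumption: if an agent were almost certain to end up as the killer (say `p_killer = Fraction(999, 1000)`), this toy calculation would no longer favor precommitting.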
I don’t fully understand Vanessa’s approach yet.