One point of this framework is to distinguish “sharing values” from “actually trusting each other”: there are cases where agents share values but don’t trust each other, or where they get stuck in coordination traps.
In Wei Dai’s thinking, having the same values/utility function means that two agents care about exactly the same things. This is formalized in UDT, but it’s also a requirement you can add to most decision theories, e.g. CDT with reflective oracles (or some other mostly lawful incomplete measure). It is normally described as requiring that the utility function have no “indexical components,” i.e. components that point at something about the particular agent that is running the utility function. This is slightly confusing, so it may help to note that, with indexical components, two deterministic and non-pseudorandomizing robots may have different utility functions (by Wei Dai’s definition) even if they are exactly identical in code and physical construction, and differ only in, e.g., being placed so that one faces the other.
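To make that concrete, here is a minimal sketch in Python (the world model, the battery-level payoffs, and names like `indexical_utility` are my own illustrative assumptions, not anything from Wei Dai): two bit-identical robots disagree about every world state under an indexical utility function, but agree under a non-indexical one.

```python
from dataclasses import dataclass

@dataclass
class World:
    # Toy world state: battery level of each robot, indexed by position.
    battery: dict  # e.g. {"left": 3, "right": 7}

def indexical_utility(world: World, my_position: str) -> float:
    """Utility with an indexical component: 'the battery of *this* agent'.

    Two robots running identical code get different numbers, because
    `my_position` points at the agent doing the evaluating.
    """
    return world.battery[my_position]

def non_indexical_utility(world: World) -> float:
    """Utility with no indexical component: a function of the world alone."""
    return sum(world.battery.values())

w = World(battery={"left": 3, "right": 7})

# Identical code, different utility functions in Wei Dai's sense:
print(indexical_utility(w, "left"), indexical_utility(w, "right"))  # 3 7
# One shared utility function: both robots agree on every world state.
print(non_indexical_utility(w))  # 10
```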
Yes, as per my other reply to Wei, I was just using the dictatorship example as the closest real-world example. Even when agents have the same utility function, with no indexical components, I claim they might still face a trust bottleneck, because they have different beliefs. In particular, neither can be certain that the other agent actually has the same values, so all else equal each would prefer to hold resources itself rather than hand them to the other, which recreates a bunch of standard problems like prisoner’s dilemmas. (This is related to the epistemic prisoner’s dilemma, but more general.)
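A hedged toy calculation of that bottleneck (the payoff numbers and the credence parameter `p_same_values` are assumptions for illustration only, not from the original discussion): an agent transfers resources only when its credence that the other shares its values clears a break-even threshold, so two agents who do in fact share values can still each rationally hoard.

```python
def expected_value_of_transfer(p_same_values: float,
                               value_if_shared: float = 10.0,
                               value_if_kept: float = 6.0,
                               value_if_misaligned: float = 0.0) -> float:
    """Expected utility of giving resources away, minus keeping them.

    Assumed toy numbers: the recipient uses the resources more
    efficiently (10 > 6) *if* it truly shares your values, and
    wastes them (0) otherwise.
    """
    ev_transfer = (p_same_values * value_if_shared
                   + (1 - p_same_values) * value_if_misaligned)
    return ev_transfer - value_if_kept  # positive => transferring is better

# Even if both agents in fact share values, each transfers only when its
# credence is high enough: with these numbers the break-even point is 0.6.
for p in (0.5, 0.6, 0.7):
    print(p, expected_value_of_transfer(p))
```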
I’ll note that “not being sure which utility functions are in play” is generally (in the colloquial sense) not how standard game theory works. I don’t seem to be competent enough at standard game theory to clearly write down the edge cases I suspect exist that could help with your understanding; treat this paragraph as a placeholder for when I develop that competence.
As for non-standard game theory: you say here[1] that you’re reading the 2009 book The Bounds of Reason, and I wonder if you’ve heard of the newer “Translucent players: Explaining cooperative behavior in social dilemmas” by Valerio Capraro and J. Y. Halpern. It is substantially related to your topic of the decision process itself being part of what gets evaluated, in a way that is not fully instrumental (i.e. via normative procedures).
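For readers who don’t know the paper, here is a rough sketch of the translucency idea as I understand it (my paraphrase, with assumed standard prisoner’s-dilemma payoff labels; this is not code from the paper): a player treats its own deviation from cooperation as detectable with some probability, which changes when defecting pays.

```python
def cooperate_is_rational(alpha: float, R: float = 3, T: float = 5,
                          P: float = 1) -> bool:
    """Translucency check for a symmetric prisoner's dilemma.

    Assumed setup (my paraphrase of the Capraro-Halpern model): if I
    secretly deviate from cooperation, the other player detects it with
    probability alpha and defects in response. Standard PD labels:
    R = reward for mutual cooperation, T = temptation payoff,
    P = punishment for mutual defection, with T > R > P.
    """
    ev_defect = alpha * P + (1 - alpha) * T
    return R >= ev_defect

# With payoffs T=5, R=3, P=1, cooperation becomes rational once the
# detection probability reaches (T - R) / (T - P) = 0.5.
for a in (0.25, 0.5, 0.75):
    print(a, cooperate_is_rational(a))
```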
This article fails to cite chapter 7 of the older book Good and Real by Gary Drescher, published in 2006, which is a partially flawed discussion of similar topics. The analysis, which is substantially J. Y. Halpern’s across multiple articles, is clearer about its own limitations than Drescher’s, and it lets you set up new variations of social dilemmas that can then be attacked with standard mathematical techniques. This is unlike the current state of a hypothetical “UDT 1.0 game theory,” which is itself the algorithmic-similarity-based subset of Drescher’s proposal[2].
Though UDT 1.0 itself was, by all accounts, developed in parallel to Drescher’s project, with ideas from Vladimir Nesov and Eliezer Yudkowsky added to a CDT-like idea from the 1990s intended for agents embedded in quantum physics[3][4][5].
https://www.lesswrong.com/posts/zk6TiByFRyjETpTAj/economic-efficiency-often-undermines-sociopolitical-autonomy?commentId=4PWDzmnarLCq3kjch
https://www.lesswrong.com/posts/ophhRzHyt44qcjnkS/trying-to-understand-my-own-cognitive-edge?commentId=gxAFgbmbQco5PrLmF
https://web.archive.org/web/20160917233328/http://fennetic.net/irc/finney.org/~hal/udassa/summary1.html
https://www.lesswrong.com/posts/QmWNbCRMgRBcMK6RK/the-absolute-self-selection-assumption#Problem__3__The_Born_Probabilities